A marker-less, vision-only 6-DoF pose estimator for bottles, syringes, and medical trays — eliminating physical markers and enabling repeatable setup in clinical environments.
This project develops a marker-less 6-DoF pose estimation system for common medical objects — bottles, syringes, and IV trays — designed to run onboard the WPI nursing assistance robot. By eliminating the need for physical fiducial markers, the system makes deployment in real clinical spaces significantly more practical.
The estimator relies on a multi-view fusion pipeline fed by Intel RealSense D435 depth cameras, outputting 6-DoF pose (position + orientation) for each detected object in the robot's workspace.
Vision-Only Pipeline: The system processes synchronized RGB-D frames from multiple calibrated cameras. Feature extraction and depth back-projection are used to initialize pose hypotheses, which are then refined through iterative alignment.
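The depth back-projection step can be sketched with the standard pinhole model: each pixel with a valid depth reading is lifted into a 3-D point in the camera frame using the camera intrinsics. The intrinsic values below are hypothetical placeholders; in practice they come from the RealSense calibration.

```python
import numpy as np

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth into a 3-D point in the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Hypothetical D435-like intrinsics; real values come from calibration.
fx, fy, cx, cy = 615.0, 615.0, 320.0, 240.0
point = backproject(400, 300, 0.5, fx, fy, cx, cy)  # 0.5 m away from the camera
```

A cloud of such points, one per detected-object pixel, is what the iterative alignment stage would refine against an object model.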
Multi-View Fusion: Estimates from individual viewpoints are fused probabilistically to reduce ambiguity caused by occlusions or specular surfaces common in medical environments (plastic syringes, reflective trays).
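One common way to realize this kind of probabilistic fusion, shown here as a hedged sketch rather than the project's exact method, is inverse-covariance (information-form) weighting of per-view position estimates: views with tighter covariance, e.g. an unoccluded camera, dominate the fused result. Orientation fusion is more involved (quaternion averaging) and is omitted here.

```python
import numpy as np

def fuse_positions(means, covs):
    """Fuse independent Gaussian position estimates via inverse-covariance weighting."""
    infos = [np.linalg.inv(c) for c in covs]          # information matrices
    info_sum = sum(infos)
    weighted = sum(info @ m for info, m in zip(infos, means))
    fused_cov = np.linalg.inv(info_sum)
    return fused_cov @ weighted, fused_cov

# Two views of the same bottle; the second view is noisier (e.g. partly occluded).
m1, c1 = np.array([0.10, 0.20, 0.50]), np.eye(3) * 1e-4
m2, c2 = np.array([0.12, 0.20, 0.52]), np.eye(3) * 4e-4
fused, cov = fuse_positions([m1, m2], [c1, c2])  # pulled toward the cleaner view
```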
Embedded Deployment: The model runs on an NVIDIA Jetson Nano. A model quantization and pruning roadmap is in progress to push from the current ~2.0 s/frame research prototype toward real-time operation.
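As one illustration of the pruning half of that roadmap, unstructured magnitude pruning zeroes the smallest-magnitude fraction of a weight tensor; the resulting sparsity can then be exploited by the runtime. This is a generic numpy sketch, not the project's deployment code.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of a weight tensor."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

w = np.array([[0.9, -0.01], [0.05, -0.7]])
pruned = magnitude_prune(w, 0.5)  # removes the two smallest-magnitude weights
```

Quantization (e.g. float32 to int8) is complementary and typically yields the larger speedup on Jetson-class hardware.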
Eliminating physical markers reduced experimental setup time by approximately 30%. Pose estimation error remained under 5% across all tested object categories in controlled hospital testbed conditions.