An end-to-end perception and autonomy stack for a bi-manual mobile nurse assistant — from raw sensor data to shared-autonomy task execution in hospital environments.
This project develops perception and autonomy capabilities for a bi-manual mobile robot designed to assist nursing staff in hospital environments. The goal is to reduce the manual burden on nurses by enabling the robot to recognize, localize, and interact with common medical objects — bottles, syringes, IV trays — with minimal human intervention.
The system is built on a ROS2 architecture and runs on an NVIDIA Jetson Nano for embedded, real-time deployment in controlled hospital testbeds. This is an ongoing research project at WPI's Human-Inspired Robotics (HiRO) Lab.
Perception Stack: The system begins with multi-camera calibration across Intel RealSense D435 units to establish a common reference frame. Depth data from multiple viewpoints is fused to generate dense 3D representations of the workspace, enabling robust object detection even under partial occlusions.
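The fusion step above can be sketched as follows. This is a minimal illustration, not the project's actual calibration code: the function names, the toy point clouds, and the hand-written extrinsics are all hypothetical. It only shows the core idea of applying each camera's calibrated 4x4 extrinsic to bring its depth points into the shared reference frame before merging.

```python
import numpy as np

def to_homogeneous(points):
    """Append a 1 to each 3D point so a 4x4 extrinsic can be applied."""
    return np.hstack([points, np.ones((points.shape[0], 1))])

def fuse_clouds(clouds, extrinsics):
    """Transform each camera's point cloud into the common reference
    frame using its calibrated 4x4 extrinsic, then concatenate."""
    fused = []
    for pts, T in zip(clouds, extrinsics):
        fused.append((to_homogeneous(pts) @ T.T)[:, :3])
    return np.vstack(fused)

# Two toy cameras: one at the origin, one translated 1 m along x.
cloud_a = np.array([[0.0, 0.0, 1.0]])
cloud_b = np.array([[0.0, 0.0, 1.0]])
T_a = np.eye(4)
T_b = np.eye(4)
T_b[0, 3] = 1.0
merged = fuse_clouds([cloud_a, cloud_b], [T_a, T_b])
```

In practice the extrinsics would come from the multi-camera calibration of the RealSense D435 units rather than being written by hand, and fusion over dense depth maps would typically go through a voxel or TSDF representation rather than raw concatenation.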
Object Recognition: A trained deep learning pipeline (PyTorch) runs inference on fused RGB-D frames to detect and classify medical objects in real time. The detector is coupled with a 6-DoF pose estimator that recovers the position and orientation of target objects for downstream manipulation.
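One simple way the detector and pose estimator can be coupled is by back-projecting a detection through the depth map. The sketch below is hypothetical (the `Detection` and `Pose6D` types, intrinsics tuple, and placeholder identity orientation are assumptions, not the project's actual pipeline); it only illustrates how a 2D detection plus depth yields a translation estimate for the grasp planner.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Detection:
    label: str
    score: float
    bbox: tuple  # (x0, y0, x1, y1) in pixels

@dataclass
class Pose6D:
    position: np.ndarray     # (3,) metres in the common frame
    orientation: np.ndarray  # (4,) unit quaternion (x, y, z, w)

def estimate_pose(detection, depth, intrinsics):
    """Back-project the bbox centre through the depth map to get a
    translation estimate; orientation here is a placeholder identity
    (a real estimator would regress or fit the full rotation)."""
    fx, fy, cx, cy = intrinsics
    u = (detection.bbox[0] + detection.bbox[2]) / 2
    v = (detection.bbox[1] + detection.bbox[3]) / 2
    z = float(depth[int(v), int(u)])
    position = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
    return Pose6D(position, np.array([0.0, 0.0, 0.0, 1.0]))

# Toy frame: flat depth of 0.8 m, one bottle detection.
depth = np.full((480, 640), 0.8)
det = Detection("bottle", 0.94, (300, 200, 340, 260))
pose = estimate_pose(det, depth, (600.0, 600.0, 320.0, 240.0))
```

A full 6-DoF estimator would also recover orientation, e.g. by fitting the object model to the fused point cloud inside the detected region.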
Shared Autonomy Pipeline: Rather than fully autonomous or fully teleoperated operation, the system implements a shared-autonomy architecture. The robot can execute high-level tasks autonomously (reach, grasp, hand off) while the human operator retains the ability to override or guide specific steps. This blend of teleoperation and autonomous execution reduces the operator's cognitive load.
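The override logic at the heart of such an architecture can be reduced to a small arbitration rule: operator input, when present, preempts the autonomous plan. The sketch below is an illustrative simplification (the command strings and `arbitrate` function are invented for this example), not the actual ROS2 control code.

```python
from enum import Enum, auto

class Source(Enum):
    AUTONOMY = auto()
    OPERATOR = auto()

def arbitrate(autonomous_cmd, operator_cmd):
    """Operator input, when present, overrides the autonomous plan;
    otherwise the robot continues its high-level task on its own."""
    if operator_cmd is not None:
        return Source.OPERATOR, operator_cmd
    return Source.AUTONOMY, autonomous_cmd

# Autonomy proposes a grasp step; no operator input, so autonomy proceeds.
src, cmd = arbitrate("grasp(bottle_3)", None)
# Operator nudges the approach; the operator command wins.
src2, cmd2 = arbitrate("grasp(bottle_3)", "jog(+z, 2cm)")
```

Real shared-autonomy systems often blend rather than switch (e.g. mixing operator and planner velocities), but even this binary form captures why intervention is cheap: the operator steps in only at the granularity of individual task steps.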
Embedded Inference: All inference runs on the Jetson Nano. Model quantization and pruning pipelines are being developed to bring latency from the current ~2.0 s/frame prototype to real-time targets.
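As a first step in that direction, PyTorch's dynamic quantization converts linear-layer weights to int8 with a one-line call, which typically shrinks the model and speeds up CPU inference before heavier measures (structured pruning, TensorRT export) are applied. The `TinyHead` module below is a stand-in, not the project's actual detector.

```python
import torch
import torch.nn as nn

class TinyHead(nn.Module):
    """Stand-in for a small classifier head; the real detector is larger."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(128, 64)
        self.fc2 = nn.Linear(64, 8)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyHead().eval()
# Dynamic quantization: weights stored as int8, activations quantized
# on the fly at inference time. No calibration data needed.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
out = qmodel(torch.randn(1, 128))
```

On the Jetson Nano the bigger wins usually come from static quantization or TensorRT engine conversion, since dynamic quantization in PyTorch targets CPU execution; the call above is simply the lowest-effort starting point for the latency work described here.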
The shared-autonomy pipeline has consistently reduced operator intervention by approximately 60% in testbed trials. Pose estimation error remains under 5% across all tested medical object categories. Ongoing work focuses on embedded model optimization for real-time performance.