A 10,000+ frame synthetic Vision+IMU dataset built in Blender, paired with MSCKF and deep fusion implementations achieving under 5% trajectory recovery error.
This project addresses a fundamental bottleneck in VIO research: the scarcity of labeled, synchronized Vision+IMU datasets with known ground truth. A synthetic dataset pipeline was built in Blender, and both classical (MSCKF) and learning-based fusion methods were implemented and benchmarked on it.
Synthetic Dataset Generation: Using Blender's scripting API and OysterSim, a pipeline was created to render photorealistic camera sequences with synchronized IMU trajectories. Over 10,000 frames were generated across diverse lighting conditions, motion profiles, and scene types.
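One way to obtain IMU ground truth synchronized with rendered frames is to differentiate the scripted camera trajectory itself. A minimal sketch of that idea, assuming a position-only trajectory with identity orientation (the hypothetical `imu_from_trajectory` helper is illustrative, not the project's actual pipeline code):

```python
import numpy as np

def imu_from_trajectory(positions, dt, gravity=np.array([0.0, 0.0, -9.81])):
    """Synthesize ideal accelerometer readings from a position trajectory
    by double finite differencing (world frame, identity orientation assumed)."""
    vel = np.gradient(positions, dt, axis=0)   # m/s
    acc = np.gradient(vel, dt, axis=0)         # m/s^2
    # An ideal accelerometer measures specific force: acceleration minus gravity.
    return acc - gravity

# Constant-velocity trajectory at a 200 Hz IMU rate:
# the synthetic accelerometer should read exactly -gravity.
t = np.arange(0.0, 1.0, 0.005)
traj = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
meas = imu_from_trajectory(traj, dt=0.005)
```

A real pipeline would also rotate the specific force into the body frame from the camera's orientation and add bias and noise models; this sketch shows only the synchronization and differentiation step.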
MSCKF Implementation: The Multi-State Constraint Kalman Filter was implemented as the classical VIO baseline. Feature tracks from optical flow are used to impose geometric constraints, bounding trajectory drift without loop closure.
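The key MSCKF step that turns feature tracks into state constraints is marginalizing each feature's 3D position out of the measurement model by projecting onto the left null space of its Jacobian. A minimal numerical sketch of that projection (dimensions and the `nullspace_project` name are illustrative):

```python
import numpy as np

def nullspace_project(r, H_x, H_f):
    """MSCKF-style feature marginalization: project the residual r and the
    state Jacobian H_x onto the left null space of the feature-position
    Jacobian H_f, so the update no longer depends on the feature's 3D point."""
    U, s, _ = np.linalg.svd(H_f, full_matrices=True)
    rank = int(np.sum(s > 1e-10))
    A = U[:, rank:]                 # basis of the left null space of H_f
    return A.T @ r, A.T @ H_x       # reduced residual and Jacobian

# Toy example: 4 stacked 2D reprojection rows, a 3D feature,
# and a 6-dimensional slice of the camera-pose state.
rng = np.random.default_rng(0)
H_f = rng.standard_normal((8, 3))
H_x = rng.standard_normal((8, 6))
r = H_f @ rng.standard_normal(3)    # residual lying entirely in range(H_f)
r0, H0 = nullspace_project(r, H_x, H_f)
```

Because the toy residual lies entirely in the range of `H_f`, the projected residual is numerically zero, confirming that the feature's contribution has been eliminated; the reduced pair `(r0, H0)` is what feeds the standard Kalman update over the pose window.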
Deep Fusion Network: A learning-based approach fuses CNN-extracted visual features with IMU integration windows through a recurrent architecture, learning to weight the two modalities based on motion and lighting conditions.
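The modality-weighting idea can be sketched as a learned scalar gate that blends the two embeddings before the recurrent pose regressor. This is a minimal NumPy illustration of the mechanism only, with hypothetical weights (`w_gate`, `b_gate`), not the project's trained network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(vis_feat, imu_feat, w_gate, b_gate):
    """Soft modality weighting: a gate g in (0, 1), computed from both
    embeddings, blends visual and inertial features. During training the
    gate learns to down-weight vision in textureless or low-light frames."""
    g = sigmoid(w_gate @ np.concatenate([vis_feat, imu_feat]) + b_gate)
    return g * vis_feat + (1.0 - g) * imu_feat

d = 4
vis = np.ones(d)
imu = np.zeros(d)
# Zero weights are a stand-in; real values come from backprop.
w = np.zeros(2 * d)
fused = gated_fuse(vis, imu, w, 0.0)   # gate = 0.5 -> equal blend
```

In the actual network the same gating would sit inside the recurrent cell and operate per time step, so the weighting can react to motion blur or lighting changes frame by frame.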
Both classical MSCKF and the deep fusion network achieved under 5% trajectory recovery error on synthetic benchmarks. The fusion approach improved pose reliability by approximately 20% over vision-only baselines, particularly in high-dynamic-range and textureless scenes where visual features degrade.
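One common way to express "trajectory recovery error" as a percentage is RMS absolute trajectory error normalized by ground-truth path length. A minimal sketch of that metric, assuming aligned estimated and ground-truth position arrays (the `traj_error_pct` helper is illustrative; the project's exact evaluation protocol may differ):

```python
import numpy as np

def traj_error_pct(est, gt):
    """RMS absolute trajectory error as a percentage of the total
    ground-truth path length."""
    ate = np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1)))                 # meters
    path_len = np.sum(np.linalg.norm(np.diff(gt, axis=0), axis=1))          # meters
    return 100.0 * ate / path_len

# Straight 10 m ground-truth path; estimate offset 0.3 m laterally.
gt = np.stack([np.linspace(0, 10, 101), np.zeros(101), np.zeros(101)], axis=1)
est = gt + np.array([0.0, 0.3, 0.0])
err = traj_error_pct(est, gt)   # 100 * 0.3 / 10 = 3.0 %
```

Under this metric, "under 5% error" on a 10 m trajectory corresponds to less than 0.5 m of RMS position deviation.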