Research | arXiv
DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation
Lee 2026-05-28
Jusuk Lee, Seungjae Lee, Jonghun Shin
Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introduce DynaFLIP, a dynamics-aware multimodal pre-training framework that pushes motion understanding ups
Topics
AI, Research