Open-source 7B-parameter vision-language-action model trained on 970K real-robot episodes from the Open X-Embodiment dataset. OpenVLA outperforms Google's RT-2-X by 16.5% in absolute task success rate across generalization benchmarks spanning multiple robot embodiments, while being fully open and reproducible.
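A minimal inference sketch of the kind the project's Hugging Face release supports, assuming the `openvla/openvla-7b` checkpoint, its custom `predict_action` head loaded via `trust_remote_code`, and a CUDA GPU; the image file, instruction text, and `unnorm_key` below are illustrative placeholders:

```python
# Sketch: query OpenVLA for a robot action (assumes the openvla/openvla-7b
# Hugging Face checkpoint; predict_action is a custom method shipped with
# the checkpoint and exposed via trust_remote_code).
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda:0")

# Placeholder camera frame; in practice this comes from the robot's camera.
image = Image.open("third_person_view.jpg")
prompt = "In: What action should the robot take to pick up the cup?\nOut:"

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
# Returns a 7-DoF end-effector action, de-normalized using the action
# statistics of the named training data mixture (a placeholder key here).
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
```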
Academic consortium funded by grants from the NSF, Toyota Research Institute, and Google DeepMind