VLA Systems References

This page contains references and resources for the Vision-Language-Action (VLA) systems covered in Module 4.

Academic Papers

  • Do As I Can, Not As I Say: Grounding Language in Robotic Affordances - Google's SayCan paper on grounding language-model plans in robot affordances
  • RT-1: Robotics Transformer for Real-World Control at Scale - Google Robotics paper on transformer-based robotic control
  • SayTap: Language-guided Quadrupedal Locomotion - Research on translating language commands into quadruped locomotion
  • PaLM-E: An Embodied Multimodal Language Model - Google's work on embodied language models
  • CoRAL: Compositional Robot Ability Learning from Large-Scale Human Demonstrations - Paper on learning robot abilities from human demonstrations
  • Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents - Research on using LLMs for robotic planning
  • Manipulation with Language Models - Study on language-guided manipulation tasks

Books and Surveys

  • Robotics and AI: A Survey of Current Techniques - Comprehensive overview of modern robotics
  • Embodied Intelligence: From Action to Cognition - Book on embodied AI principles
  • Human-Robot Interaction: A Survey - Academic survey of HRI research

Technical Documentation

Online Resources

Tutorials and Courses

  • MIT Introduction to Robotics - Course materials on robot perception and control
  • Stanford CS320: Robotic Manipulation - Course on manipulation with deep learning
  • Berkeley CS287: Advanced Robotics - Graduate course on advanced robotics topics
  • ETH Zurich Robotic Systems Lab - Educational materials on legged locomotion

Tools and Frameworks