Creating Datasets with Imitation Learning
If you can do it, you can teach it.
Dataset = A collection of Episodes
Episode = A single, complete demonstration of a task
Frame = A snapshot in time, containing the camera images, the robot's state, and the action taken
Use your teleoperation skills to create a dataset.
python -m lerobot.record \
# ... robot/teleop args ...
--dataset.repo_id="your-hf-username/my-awesome-dataset" \
--dataset.num_episodes=10
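Conceptually, the recorder loops over episodes and timesteps, capturing an observation and the human's action at each step. A minimal sketch, where `read_teleop_action`, `get_observation`, and `apply_action` are hypothetical stand-ins for the real robot and teleop interfaces:

```python
def read_teleop_action(t):
    # Hypothetical stub: what the human commanded via the leader arm.
    return [0.0, 0.0, float(t)]

def get_observation(t):
    # Hypothetical stub: camera image plus joint state.
    return {"image": None, "state": [float(t)]}

def apply_action(action):
    # Hypothetical stub: send the command to the follower arm.
    pass

def record_dataset(num_episodes, steps_per_episode):
    episodes = []
    for _ in range(num_episodes):
        frames = []
        for t in range(steps_per_episode):
            obs = get_observation(t)        # snapshot the world first...
            action = read_teleop_action(t)  # ...then record what the human did
            apply_action(action)
            frames.append({"observation": obs, "action": action})
        episodes.append(frames)
    return episodes
```

The real script also handles camera synchronization and uploading to the Hub, but the core loop is this: one (observation, action) pair per frame.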
This combines teleoperation with saving data locally and to the Hugging Face Hub.
Your trained model will only be as good as your data.
✅ Consistency: Perform the task the same way each time.
✅ Variety: Show the robot different starting positions for objects.
✅ Clarity: Make sure the object and gripper are clearly visible to the cameras.
✅ Quantity: Aim for 50+ episodes for a simple task. 10 is often not enough.
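For the variety point, it helps to randomize object placement before each episode instead of eyeballing it. A small sketch, with illustrative workspace bounds (the numbers are assumptions, in meters):

```python
import random

# Illustrative workspace bounds -- tune these to your robot's reachable area.
WORKSPACE = {"x": (0.15, 0.35), "y": (-0.10, 0.10)}

def sample_start_position(rng=random):
    """Sample a fresh object start position so episodes don't all look alike."""
    return {axis: rng.uniform(lo, hi) for axis, (lo, hi) in WORKSPACE.items()}
```

Print the sampled position before each episode and place the object there by hand.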
Before you train, look at your data to see if it's good.
python lerobot/scripts/visualize_dataset.py \
--repo-id "your-hf-username/my-awesome-dataset"
This opens a `rerun.io` window where you can scrub through your episodes frame-by-frame. It's an essential debugging tool!
Record & Visualize
Use your high-quality, verified dataset to train a policy.
python lerobot/scripts/train.py \
--dataset.repo_id="your-hf-username/my-awesome-dataset" \
--policy.type=act \
--output_dir=outputs/my-first-training-run
This is **Supervised Learning**: The model learns "Given this image, output this command."
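The same loop can be shown on a toy problem. This fits `action = w * obs + b` by gradient descent on demonstration pairs; it is illustrative only (real policies like ACT are deep networks trained on images), but the "input in, target action out, minimize loss" structure is identical:

```python
# Toy demonstrations: for each observation, the "human" action is 2*obs + 1.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(10)]

w, b, lr = 0.0, 0.0, 0.1
losses = []
for epoch in range(200):
    total = 0.0
    for obs, target in data:
        pred = w * obs + b
        err = pred - target
        total += err * err
        w -= lr * 2 * err * obs  # gradient step on the squared error
        b -= lr * 2 * err
    losses.append(total / len(data))
```

After training, `losses` traces exactly the kind of curve you watch during a real run: high at first, then falling as the model matches the demonstrations.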
Use `--wandb.enable=true` to see live graphs in Weights & Biases.
The most important graph is the loss curve: it should trend steadily downward as training progresses.
Any questions about the imitation learning workflow?