Creating Datasets with Imitation Learning
If you can do it, you can teach it.
Dataset = A collection of Episodes
Episode = A single, complete demonstration of a task
Frame = A snapshot in time, containing the camera images, the robot's state, and the action taken
Use your teleoperation skills to create a dataset.
python -m lerobot.record \
# ... robot/teleop args ...
--dataset.repo_id="your-hf-username/my-awesome-dataset" \
--dataset.num_episodes=10
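Conceptually, the recorder loops over episodes and timesteps, capturing an observation and the human's action at each step. A minimal sketch, where `read_teleop_action`, `get_observation`, and `apply_action` are hypothetical stand-ins for the real robot and teleop interfaces:

```python
def read_teleop_action(t):
    # Hypothetical stub: what the human commanded via the leader arm.
    return [0.0, 0.0, float(t)]

def get_observation(t):
    # Hypothetical stub: camera image plus joint state.
    return {"image": None, "state": [float(t)]}

def apply_action(action):
    # Hypothetical stub: send the command to the follower arm.
    pass

def record_dataset(num_episodes, steps_per_episode):
    episodes = []
    for _ in range(num_episodes):
        frames = []
        for t in range(steps_per_episode):
            obs = get_observation(t)        # snapshot the world first...
            action = read_teleop_action(t)  # ...then record what the human did
            apply_action(action)
            frames.append({"observation": obs, "action": action})
        episodes.append(frames)
    return episodes
```

The real script also handles camera synchronization and uploading to the Hub, but the core loop is this: one (observation, action) pair per frame.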
This combines teleoperation with saving data locally and to the Hugging Face Hub.
Your trained model will only be as good as your data.
✅ Consistency: Perform the task the same way each time.
✅ Variety: Show the robot different starting positions for objects.
✅ Clarity: Make sure the object and gripper are clearly visible to the cameras.
✅ Quantity: Aim for 50+ episodes for a simple task. 10 is often not enough.
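For the variety point, it helps to randomize object placement before each episode instead of eyeballing it. A small sketch, with illustrative workspace bounds (the numbers are assumptions, in meters):

```python
import random

# Illustrative workspace bounds -- tune these to your robot's reachable area.
WORKSPACE = {"x": (0.15, 0.35), "y": (-0.10, 0.10)}

def sample_start_position(rng=random):
    """Sample a fresh object start position so episodes don't all look alike."""
    return {axis: rng.uniform(lo, hi) for axis, (lo, hi) in WORKSPACE.items()}
```

Print the sampled position before each episode and place the object there by hand.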
Before you train, look at your data to see if it's good.
python lerobot/scripts/visualize_dataset.py \
--repo-id "your-hf-username/my-awesome-dataset"
This opens a `rerun.io` window where you can scrub through your episodes frame-by-frame. It's an essential debugging tool!
Record & Visualize
Use your high-quality, verified dataset to train a policy.
python lerobot/scripts/train.py \
--dataset.repo_id="your-hf-username/my-awesome-dataset" \
--policy.type=act \
--output_dir=outputs/my-first-training-run
This is **Supervised Learning**: The model learns "Given this image, output this command."
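The same loop can be shown on a toy problem. This fits `action = w * obs + b` by gradient descent on demonstration pairs; it is illustrative only (real policies like ACT are deep networks trained on images), but the "input in, target action out, minimize loss" structure is identical:

```python
# Toy demonstrations: for each observation, the "human" action is 2*obs + 1.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(10)]

w, b, lr = 0.0, 0.0, 0.1
losses = []
for epoch in range(200):
    total = 0.0
    for obs, target in data:
        pred = w * obs + b
        err = pred - target
        total += err * err
        w -= lr * 2 * err * obs  # gradient step on the squared error
        b -= lr * 2 * err
    losses.append(total / len(data))
```

After training, `losses` traces exactly the kind of curve you watch during a real run: high at first, then falling as the model matches the demonstrations.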
Use `--wandb.enable=true` to see live graphs in Weights & Biases.
The most important graph is the loss curve: it should trend steadily downward as training progresses.
Any questions about the imitation learning workflow?