# Booster Gym

Booster Gym is a reinforcement learning (RL) framework designed for humanoid robot locomotion, developed by [Booster Robotics](https://boosterobotics.com/).

[![real_T1_deploy](https://obs-cdn.boosterobotics.com/rl_deploy_demo_video_v3.gif)](https://obs-cdn.boosterobotics.com/rl_deploy_demo_video.mp4)

## Features

- **Complete Training-to-Deployment Pipeline**: Full support for training, evaluating, and deploying policies in simulation and on real robots.
- **Sim-to-Real Transfer**: Includes effective settings and techniques to minimize the sim-to-real gap and improve policy generalization.
- **Customizable Environments and Algorithms**: Easily modify environments and RL algorithms to suit a wide range of tasks.
- **Out-of-the-Box Booster T1 Support**: Pre-configured for quick setup and deployment on the Booster T1 robot.

## Overview

The framework supports the following stages of the reinforcement learning workflow:

1. **Training**:
   - Train reinforcement learning policies using Isaac Gym with parallelized environments.

2. **Playing**:
   - **In-Simulation Testing**: Evaluate the trained policy in the same environment used for training to ensure it behaves as expected.
   - **Cross-Simulation Testing**: Test the policy in MuJoCo to verify its generalization across different simulators.

3. **Deployment**:
   - **Model Export**: Export the trained policy from `*.pth` to a JIT-optimized `*.pt` format for efficient deployment.
   - **Webots Deployment**: Use the SDK to deploy the model in Webots for final verification in simulation.
   - **Physical Robot Deployment**: Deploy the model to the physical robot using the same Webots deployment script.

## Installation

Follow these steps to set up your environment:

1. Create an environment with Python 3.8:

   ```sh
   $ conda create --name <env_name> python=3.8
   $ conda activate <env_name>
   ```

2. Install PyTorch with CUDA support:

   ```sh
   $ conda install numpy=1.21.6 pytorch=2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
   ```

3. Install Isaac Gym:

   Download Isaac Gym from [NVIDIA's website](https://developer.nvidia.com/isaac-gym/download), then extract and install it:

   ```sh
   $ tar -xzvf IsaacGym_Preview_4_Package.tar.gz
   $ cd isaacgym/python
   $ pip install -e .
   ```

   Configure the environment to handle shared libraries; otherwise the `libpython3.8` shared library cannot be found:

   ```sh
   $ cd $CONDA_PREFIX
   $ mkdir -p ./etc/conda/activate.d
   $ vim ./etc/conda/activate.d/env_vars.sh
   # Add the following lines
   export OLD_LD_LIBRARY_PATH=${LD_LIBRARY_PATH}
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib
   $ mkdir -p ./etc/conda/deactivate.d
   $ vim ./etc/conda/deactivate.d/env_vars.sh
   # Add the following lines
   export LD_LIBRARY_PATH=${OLD_LD_LIBRARY_PATH}
   unset OLD_LD_LIBRARY_PATH
   ```

4. Install Python dependencies:

   ```sh
   $ pip install -r requirements.txt
   ```

## Usage

### 1. Training

To start training a policy, run the following command:

```sh
$ python train.py --task=T1
```

Training logs and saved models will be stored in `logs/<run_name>/`.

#### Configurations

Training settings are loaded from `envs/<task>.yaml`. You can also override config values using command-line arguments:

- `--checkpoint`: Path of the model checkpoint to load (set to `-1` to use the most recent model).
- `--num_envs`: Number of environments to create.
- `--headless`: Run headless, without creating a viewer window.
- `--sim_device`: Device for physics simulation (e.g., `cuda:0`, `cpu`).
- `--rl_device`: Device for the RL algorithm (e.g., `cuda:0`, `cpu`).
- `--seed`: Random seed.
- `--max_iterations`: Maximum number of training iterations.

To add a new task, create a config file in `envs/` and register the environment in `envs/__init__.py`, as sketched below.
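A minimal sketch of what that registration might look like (the module paths and the `task_map` registry name here are assumptions for illustration, not the framework's confirmed API; match whatever convention `envs/__init__.py` actually uses):

```python
# envs/__init__.py -- illustrative sketch only; the real registry in this
# repo may use different names. Each key corresponds to the value passed
# via --task, and each task also needs a matching envs/<task>.yaml config.
from envs.t1 import T1            # existing Booster T1 environment (assumed module path)
from envs.my_task import MyTask   # your new environment class (hypothetical)

task_map = {
    "T1": T1,
    "MyTask": MyTask,  # enables: python train.py --task=MyTask
}
```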
#### Progress Tracking

To visualize training progress with [TensorBoard](https://www.tensorflow.org/tensorboard), run:

```sh
$ tensorboard --logdir logs
```

To use [Weights & Biases](https://wandb.ai/) for tracking, log in first:

```sh
$ wandb login
```

You can disable W&B tracking by setting `use_wandb` to `false` in the config file.

---

### 2. Playing

#### In-Simulation Testing

To test the trained policy in Isaac Gym, run:

```sh
$ python play.py --task=T1 --checkpoint=-1
```

Videos of the evaluation are automatically saved as `videos/<run_name>.mp4`. You can disable video recording by setting `record_video` to `false` in the config file.

#### Cross-Simulation Testing

To test the policy in MuJoCo, run:

```sh
$ python play_mujoco.py --task=T1 --checkpoint=-1
```

---

### 3. Deployment

To deploy a trained policy through the Booster Robotics SDK, in simulation or on the real robot, first export the model:

```sh
$ python export_model.py --task=T1 --checkpoint=-1
```

After exporting the model, follow the steps in [Deploy on Booster Robot](deploy/README.md) to complete the deployment process.
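Since the exported `*.pt` file is a JIT-compiled TorchScript module, one quick way to sanity-check it outside the framework is to load it directly with `torch.jit.load`. This is a minimal sketch; the file path and observation dimension below are placeholder assumptions, not values defined by this repo:

```python
import torch

# Load the exported TorchScript policy.
# The path is a placeholder -- point it at the file export_model.py produced.
policy = torch.jit.load("exported/policy.pt", map_location="cpu")
policy.eval()

# Run a single forward pass on a dummy observation.
# The observation size (47) is a placeholder; use your task's actual obs dim.
obs = torch.zeros(1, 47)
with torch.no_grad():
    actions = policy(obs)

print(actions.shape)  # should match the action dimension of your task
```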