This page provides quick commands and code snippets for common tasks in CostNav.
=== ":whale: Docker (Recommended)"
```bash
# Clone repository
git clone https://github.com/worv-ai/CostNav.git
cd CostNav
# Initialize submodules
git submodule update --init --recursive
# Configure environment
cp .env.example .env
# Edit .env with your settings
# Start Isaac Lab container
docker compose --profile isaac-lab up -d
# Enter container
docker exec -it costnav-isaac-lab bash
```
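When you are done, stop and remove the container with the matching Docker Compose command (standard Compose usage, same profile as above):
```bash
# Stop the Isaac Lab container
docker compose --profile isaac-lab down
```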
=== ":computer: Manual Installation"
```bash
# Install Isaac Lab first (see Isaac Lab docs)
# Install CostNav
cd CostNav
python -m pip install -e costnav_isaaclab/source/costnav_isaaclab
python -m pip install -e ".[dev]"
# Verify installation
python costnav_isaaclab/scripts/list_envs.py
```
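As an extra sanity check, confirm the editable install is importable (a sketch, assuming the package name `costnav_isaaclab` matches the source layout above):
```bash
python -c "import costnav_isaaclab; print(costnav_isaaclab.__file__)"
```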
=== "RL-Games"
```bash
cd costnav_isaaclab
# Train with default settings (vector observations only)
python scripts/rl_games/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--headless
# Train with cameras (RGB-D observations)
python scripts/rl_games/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--headless
```
=== "SKRL"
```bash
cd costnav_isaaclab
# Train with default settings (vector observations only)
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--headless
# Train with cameras (RGB-D observations)
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--headless
```
| Use Case | RL-Games | SKRL |
|---|---|---|
| Resume from checkpoint | `--resume` | `--checkpoint=PATH` |
| More environments (faster) | `--num_envs=128` | `--num_envs=128` |
| With visualization | Remove `--headless` | Remove `--headless` |
| Wandb tracking | `--track` | `--track` |
| SLURM cluster | `sbatch train.sbatch` | `sbatch train.sbatch` |
??? example "Full Command Examples"
=== "RL-Games"
```bash
# Resume from checkpoint
python scripts/rl_games/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--resume
# Train with more environments (faster)
python scripts/rl_games/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--num_envs=128 \
--headless
# Train with visualization (slower, for debugging)
python scripts/rl_games/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras
```
=== "SKRL"
```bash
# Resume from checkpoint
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--checkpoint=logs/skrl/.../checkpoints/best_agent.pt
# Train with more environments (faster)
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--num_envs=128 \
--headless
# Train with visualization (slower, for debugging)
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras
# Train with wandb tracking
python scripts/skrl/train.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--headless \
--track \
--wandb-project-name=costnav
```
=== "RL-Games"
```bash
# Evaluate with metrics
python scripts/rl_games/evaluate.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--checkpoint=logs/rl_games/Template-Costnav-Isaaclab-v2-NavRL/nn/last_checkpoint.pth
# Visualize policy
python scripts/rl_games/play.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--checkpoint=logs/rl_games/Template-Costnav-Isaaclab-v2-NavRL/nn/last_checkpoint.pth
```
=== "SKRL"
```bash
# Evaluate with metrics
python scripts/skrl/evaluate.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--checkpoint=logs/skrl/Template-Costnav-Isaaclab-v2-NavRL/checkpoints/best_agent.pt
# Visualize policy
python scripts/skrl/play.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--checkpoint=logs/skrl/Template-Costnav-Isaaclab-v2-NavRL/checkpoints/best_agent.pt
# Use last checkpoint (instead of best)
python scripts/skrl/evaluate.py \
--task=Template-Costnav-Isaaclab-v2-NavRL \
--enable_cameras \
--use_last_checkpoint
```
| Baseline | Command |
|---|---|
| :zero: Zero agent | `python scripts/zero_agent.py --task=Template-Costnav-Isaaclab-v2-NavRL` |
| :game_die: Random agent | `python scripts/random_agent.py --task=Template-Costnav-Isaaclab-v2-NavRL` |
| :robot: Deterministic | `python scripts/test_controller.py --task=Template-Costnav-Isaaclab-v2-NavRL` |
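For reference, the zero-action baseline boils down to stepping the environment with all-zero actions; the loop below is a rough sketch of that idea (not the actual contents of `scripts/zero_agent.py`, and the action-shape handling is an assumption):
```python
import gymnasium as gym
import torch

# Create a single-environment instance of the navigation task
env = gym.make("Template-Costnav-Isaaclab-v2-NavRL", num_envs=1)
obs, info = env.reset()
for _ in range(1000):
    # All-zero actions: the robot applies no control input
    actions = torch.zeros(env.action_space.shape, dtype=torch.float32)
    obs, reward, terminated, truncated, info = env.step(actions)
```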
=== "RL-Games"
```bash
# Start TensorBoard
tensorboard --logdir costnav_isaaclab/logs/rl_games --port 6006
# Open in browser: http://localhost:6006
```
=== "SKRL"
```bash
# Start TensorBoard
tensorboard --logdir costnav_isaaclab/logs/skrl --port 6006
# Open in browser: http://localhost:6006
```
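If training runs on a remote machine or a SLURM node, forward the TensorBoard port to your workstation with standard SSH port forwarding (hostname and user below are placeholders):
```bash
# Forward remote port 6006 to localhost:6006
ssh -L 6006:localhost:6006 user@remote-host
# Then open http://localhost:6006 locally
```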
| Metric | Description | Target |
|---|---|---|
| `rewards/iter` | Total reward per iteration | :arrow_up: Increasing |
| `Episode/arrive_rate` | Success rate | :arrow_up: > 50% |
| `Episode/collision_rate` | Collision rate | :arrow_down: < 10% |
| `losses/kl` | Policy change magnitude | :wavy_dash: < 0.02 |
| `cost_model/sla_compliance` | SLA compliance rate | :arrow_up: > 70% |
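The same scalars can be read programmatically from the event files, for example to script convergence checks. Below is a sketch using TensorBoard's `EventAccumulator`; the log directory is an assumption based on the paths above, and the tag names follow the table:
```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at a run directory that contains the events file
ea = EventAccumulator("costnav_isaaclab/logs/skrl/Template-Costnav-Isaaclab-v2-NavRL")
ea.Reload()

print("Available scalar tags:", ea.Tags()["scalars"])
# Read a scalar series, e.g. the success rate
events = ea.Scalars("Episode/arrive_rate")
print([(e.step, e.value) for e in events[-5:]])
```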
```bash
# List all environments
python scripts/list_envs.py
# Test controller
python scripts/test_controller.py --task=Template-Costnav-Isaaclab-v2-NavRL
# Test rewards
python scripts/test_v2_rewards.py --task=Template-Costnav-Isaaclab-v2-NavRL
```

```python
# In Python: inspect observations
import gymnasium as gym

env = gym.make("Template-Costnav-Isaaclab-v2-NavRL", num_envs=1)
obs, info = env.reset()
# Check observation shape
print(f"Observation shape: {obs['policy'].shape}")
# Check observation values
print(f"Observation min: {obs['policy'].min()}")
print(f"Observation max: {obs['policy'].max()}")
print(f"Observation mean: {obs['policy'].mean()}")
```
Enable reward printing by adding a term to the rewards config:
```python
# In RewardsCfg (costnav_isaaclab_env_cfg.py)
print_rewards = RewTerm(
    func=mdp.print_rewards,
    weight=0.0,
    params={"print_every_n_steps": 10},
)
```
Edit `costnav_isaaclab_env_cfg.py`:
```python
@configclass
class RewardsCfg:
    # Increase arrival reward
    arrived_reward = RewTerm(
        func=loc_mdp.is_terminated_term,
        weight=30000.0,  # Changed from 20000.0
        params={"term_keys": "arrive"},
    )
    # Increase collision penalty
    collision_penalty = RewTerm(
        func=loc_mdp.is_terminated_term,
        weight=-500.0,  # Changed from -200.0
        params={"term_keys": "collision"},
    )
```
Edit `costnav_isaaclab_env_cfg.py`:
```python
@configclass
class ObservationsCfg:
    @configclass
    class PolicyCfg(ObsGroup):
        # Add new observation
        robot_height = ObsTerm(func=mdp.base_pos_z)
        # Modify existing observation
        pose_command = ObsTerm(
            func=mdp.pose_command_2d,
            params={"command_name": "pose_command"},
            scale=10.0,  # Changed from 5.0
        )
```
Edit `coco_robot_cfg.py`:
```python
# Change velocity limits
max_velocity = 6.0  # Changed from 4.0
# Change steering limits
max_steering_angle = 50 * torch.pi / 180  # Changed from 40°
```
```bash
cd costnav_isaaclab/source/costnav_isaaclab/costnav_isaaclab/tasks/manager_based/costnav_isaaclab_v2_NavRL
# Generate with visualization
python find_safe_positions.py --visualize_raycasts
# Generate without visualization (faster)
python find_safe_positions.py
# Validate existing positions
python safe_area_validator.py
# Check for collisions
python check_impulse.py
```
??? solution "Solution"
```bash
# Ensure Isaac Lab is in the Python path
export PYTHONPATH=/path/to/isaac-lab/source:$PYTHONPATH
# Or use the compatibility layer (already included in CostNav)
```
??? solution "Solution"
```bash
# Reduce number of environments
python scripts/rl_games/train.py --task=... --num_envs=32
# Disable cameras
python scripts/rl_games/train.py --task=...  # Remove --enable_cameras
# Use smaller image resolution (edit config)
```
??? solution "Solution"
```bash
# Test reward function
python scripts/test_v2_rewards.py --task=Template-Costnav-Isaaclab-v2-NavRL
# Check for division by zero in reward functions
# Check observation normalization is enabled
```
??? solution "Solution"
1. Verify the reward function: `python scripts/test_v2_rewards.py`
2. Check that observations are informative, not constant (see the sketch after this list)
3. Try a simpler task first (v0 or v1)
4. Reduce the learning rate or increase the minibatch size
5. Check for NaN/Inf in the logs
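One way to check item 2 is to roll out a few random steps and look at the per-dimension spread of the policy observations: a dimension whose standard deviation stays near zero is likely uninformative. This is a rough sketch, assuming the Gymnasium API shown earlier, torch-tensor observations, and that sampled actions can be converted to a tensor directly:
```python
import gymnasium as gym
import torch

env = gym.make("Template-Costnav-Isaaclab-v2-NavRL", num_envs=1)
obs, info = env.reset()
samples = [obs["policy"]]
for _ in range(50):
    # Random actions, converted to a tensor on the same device as the observations
    actions = torch.as_tensor(env.action_space.sample(), device=obs["policy"].device)
    obs, reward, terminated, truncated, info = env.step(actions)
    samples.append(obs["policy"])

stacked = torch.cat(samples, dim=0)  # shape: (steps, obs_dim)
per_dim_std = stacked.std(dim=0)
flat_dims = (per_dim_std < 1e-6).nonzero(as_tuple=True)[0]
print(f"Near-constant observation dimensions: {flat_dims.tolist()}")
```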
```python
# In mdp/rewards.py
def my_custom_reward(env: ManagerBasedRLEnv) -> torch.Tensor:
    """Custom reward function."""
    robot = env.scene["robot"]
    # Get robot velocity
    velocity = robot.data.root_lin_vel_b[:, 0]
    # Reward forward motion
    reward = velocity.clamp(min=0.0)
    return reward
```

```python
# In costnav_isaaclab_env_cfg.py
@configclass
class RewardsCfg:
    my_reward = RewTerm(func=mdp.my_custom_reward, weight=1.0)
```
```python
# In mdp/observations.py
def my_custom_observation(env: ManagerBasedEnv) -> torch.Tensor:
    """Custom observation function."""
    robot = env.scene["robot"]
    # Get robot height
    height = robot.data.root_pos_w[:, 2]
    return height.unsqueeze(-1)
```

```python
# In costnav_isaaclab_env_cfg.py
@configclass
class ObservationsCfg:
    @configclass
    class PolicyCfg(ObsGroup):
        robot_height = ObsTerm(func=mdp.my_custom_observation)
```
```python
# In mdp/terminations.py
def my_custom_termination(env: ManagerBasedRLEnv) -> torch.Tensor:
    """Custom termination condition."""
    robot = env.scene["robot"]
    # Terminate if the robot is too high
    height = robot.data.root_pos_w[:, 2]
    too_high = height > 2.0
    return too_high
```

```python
# In costnav_isaaclab_env_cfg.py
@configclass
class TerminationsCfg:
    my_termination = DoneTerm(func=mdp.my_custom_termination)
```
| Resource | Link |
|---|---|
| :octocat: GitHub | [github.com/worv-ai/CostNav](https://github.com/worv-ai/CostNav) |
| :book: Documentation | [worv-ai.github.io/CostNav](https://worv-ai.github.io/CostNav) |
| :green_book: Isaac Lab | [isaac-sim.github.io/IsaacLab](https://isaac-sim.github.io/IsaacLab) |
| :robot: Isaac Sim | [developer.nvidia.com/isaac-sim](https://developer.nvidia.com/isaac-sim) |