CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents

Haebin Seong, Sungmin Kim, Minchan Kim, Yongjun Cho, Myunchul Joe, Suhwan Choi, Jaeyoon Jung, Jiyong Youn, Yoonshik Kim, Samwoo Seong, Yubeen Park, Youngjae Yu, Yunsung Lee
WoRV Team, Maum.AI
To be presented at CES 2026

CostNav is the first navigation benchmark that evaluates robots the way businesses actually evaluate them: by profit per run. Instead of just measuring success rates, CostNav asks: How much did that navigation cost? How much revenue did it generate? When will this system become profitable?

Abstract

Autonomous navigation research has made remarkable technical progress, yet a critical gap remains: we optimize for success rates and path efficiency, but not for economic viability. CostNav addresses this by introducing the first navigation benchmark that evaluates robots through the lens of profitability—the metric that actually matters for commercial deployment.

CostNav models the complete economic lifecycle of delivery robots: upfront hardware and training costs, per-delivery expenses (energy, maintenance, crashes), and revenue generation constrained by real-world service-level agreements. By grounding evaluation in actual industry data—delivery pricing, energy rates, hardware costs—CostNav reveals that a robot with 80% success using cheap sensors might be more profitable than one with 95% success using expensive LiDAR.

Our framework enables systematic comparison of fundamentally different approaches: classical planning with expensive sensors vs. learning-based methods with cameras, on-device vs. cloud inference, and traditional training vs. cost-aware reinforcement learning. CostNav bridges the gap between impressive research demos and sustainable businesses, providing researchers and engineers with data-driven answers to deployment decisions that directly impact commercial viability.

The Problem: Optimizing for the Wrong Things

Traditional navigation benchmarks celebrate 95% task completion rates and optimized path efficiency. But these metrics don't answer the questions that keep startup founders awake at night:

  • Should I spend $8,000 on LiDAR sensors with classical planning, or $400 on RGB-D cameras with a learning-based approach?
  • How many deliveries until I break even?
  • What's my actual cost per delivery when I factor in energy, crashes, and sensor degradation?

Academic benchmarks don't answer these questions. CostNav does.
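As a rough illustration of the break-even question above, the sketch below divides an upfront cost by the expected profit per attempted delivery. It is not CostNav's cost model: the function name, the delivery price, and the per-attempt cost are hypothetical placeholders, and only the sensor prices and success rates are taken from the text.

```python
import math


def deliveries_to_break_even(upfront_cost, price, cost_per_attempt, success_rate):
    """Attempted deliveries needed to recover the upfront investment.

    Simplifying assumption: revenue is earned only on successful
    deliveries, while the running cost is paid on every attempt.
    """
    expected_profit = success_rate * price - cost_per_attempt
    if expected_profit <= 0:
        return math.inf  # loses money on average, so it never breaks even
    return math.ceil(upfront_cost / expected_profit)


# Sensor prices and success rates come from the text; the delivery price
# and per-attempt cost are made-up placeholders.
lidar_runs = deliveries_to_break_even(8000, price=5.0, cost_per_attempt=1.0, success_rate=0.95)
camera_runs = deliveries_to_break_even(400, price=5.0, cost_per_attempt=1.0, success_rate=0.80)
print(lidar_runs, camera_runs)
```

The sketch leaves out the rest of the upfront investment (robot hardware, training, data collection), which a fuller comparison would add to `upfront_cost`.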

The Complete Economic Picture

CostNav models the entire economic lifecycle of a delivery robot:

Before the robot even starts

Hardware costs, sensor investments, training expenses, data collection—all the upfront investments that need to be recovered.

During every delivery

Energy consumption from motors and sensors, battery degradation from charge cycles, maintenance costs from wear and tear, crash damage from collisions.

Revenue that actually matters

Not just "did it deliver?" but "did it deliver within the service-level agreement (SLA)?" In the real world, a delivery that takes 35 minutes when you promised 30 gets refunded. A delivery that arrives on time but spoils the food because of aggressive driving gets refunded too. Together, timing and quality constraints determine whether a delivery has actually created economic value.

All of this is grounded in real-world data: actual delivery service pricing, industry energy rates, hardware costs from commercial robots. This isn't theoretical—it's what real companies face every day.
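To make the per-delivery accounting concrete, here is a minimal sketch of profit for a single run: revenue counts only when the delivery arrives, meets the SLA, and keeps the cargo intact, while costs are paid regardless. The class and field names, the grouping of costs, and all dollar amounts are assumptions for illustration and do not reflect CostNav's actual implementation. The 30-minute SLA and the 35-minute late delivery echo the example in the text.

```python
from dataclasses import dataclass


@dataclass
class DeliveryOutcome:
    """Hypothetical record of one delivery attempt."""
    delivered: bool
    minutes_taken: float
    cargo_intact: bool
    energy_cost: float
    maintenance_cost: float  # assumed to include battery degradation and wear
    crash_cost: float


def delivery_profit(outcome, price, sla_minutes=30.0):
    """Profit of one run: SLA-gated revenue minus the costs of the attempt."""
    earned = (
        outcome.delivered
        and outcome.minutes_taken <= sla_minutes
        and outcome.cargo_intact
    )
    revenue = price if earned else 0.0  # late or spoiled deliveries are refunded
    costs = outcome.energy_cost + outcome.maintenance_cost + outcome.crash_cost
    return revenue - costs


# Placeholder costs; only the 30- and 35-minute figures come from the text.
on_time = DeliveryOutcome(True, 28.0, True, 0.25, 0.5, 0.0)
late = DeliveryOutcome(True, 35.0, True, 0.25, 0.5, 0.0)
spoiled = DeliveryOutcome(True, 28.0, False, 0.25, 0.5, 0.0)

for name, outcome in [("on_time", on_time), ("late", late), ("spoiled", spoiled)]:
    print(name, delivery_profit(outcome, price=5.0))
```

In this sketch the late and spoiled runs incur the same costs as the on-time one but earn nothing, which is why a success-rate metric alone can overstate economic value.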

What We're Building

Our initial release establishes a learning-based navigation baseline in realistic urban environments. But this is just the beginning.

Comparing Fundamentally Different Approaches

  • Classical rule-based planning with expensive sensors vs. learning-based methods with cheap cameras
  • On-device inference vs. cloud-based computation
  • Traditional training vs. cost-aware reinforcement learning that directly optimizes for profit
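One way to picture a cost-aware reward is a per-step signal expressed in currency rather than in task-specific shaping terms. The sketch below is only an assumption about what such a reward could look like. The function name, its parameters, and the example prices are hypothetical and are not taken from CostNav.

```python
def cost_aware_reward(
    energy_used_kwh,
    collided,
    delivered_on_time,
    *,
    energy_price_per_kwh,
    collision_cost,
    delivery_price,
):
    """Per-step reward in currency units (illustrative assumption only)."""
    reward = -energy_used_kwh * energy_price_per_kwh  # energy is paid every step
    if collided:
        reward -= collision_cost  # repair cost charged at the moment of the crash
    if delivered_on_time:
        reward += delivery_price  # revenue arrives only on an SLA-compliant delivery
    return reward


# Placeholder prices chosen for illustration.
prices = dict(energy_price_per_kwh=0.5, collision_cost=20.0, delivery_price=5.0)
print(cost_aware_reward(0.5, False, False, **prices))  # ordinary step
print(cost_aware_reward(0.5, True, False, **prices))   # step with a collision
print(cost_aware_reward(0.5, False, True, **prices))   # final step of an on-time delivery
```

Summing this reward over an episode gives that episode's profit under the stated assumptions, so maximizing return and maximizing profit per run coincide.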

Testing in Challenging Scenarios

  • Dense crowds where collision avoidance becomes critical
  • Nighttime conditions where sensor choices matter
  • Adverse weather that tests robustness
  • Outdated maps, as robots routinely encounter in real-world deployment

Answering Questions That Matter

  • Which navigation approach maximizes profit, not just performance?
  • How do hardware choices affect break-even time?
  • What's the true cost of collisions beyond just counting them?
  • When does investing in better sensors pay for itself?

Why This Matters

For Researchers

CostNav lets you optimize for what actually matters in deployment. Explore cost-aware reward functions, evaluate trade-offs between sensor cost and performance, and publish work that directly translates to commercial value.

For Startup Founders & Engineers

CostNav gives you data-driven answers to deployment decisions. No more guessing whether expensive sensors are worth it—you'll see the break-even analysis. No more wondering if cloud inference pays for itself—you'll see the profit margins.

For the Future of Autonomous Systems

CostNav bridges the gap between impressive demos and sustainable businesses. A robot that's technically impressive but economically unviable won't change the world. A robot that's profitable at scale will.

The Vision

Imagine a world where navigation research papers include a "profitability" section alongside accuracy metrics. Where we optimize for dollars per delivery, not just success rates. Where choosing between navigation approaches is guided by break-even analysis, not just technical performance.

That's the world CostNav is building.

We're not saying traditional metrics don't matter. They absolutely do, but they're incomplete.

What's Next

We're releasing everything: the benchmark framework, cost models validated against industry data, simulation environment, evaluation code, and our baseline results. We want the community to build on this.

Coming Soon

  • Comprehensive comparison of rule-based vs. learning-based navigation economics
  • Cloud vs. edge inference trade-off analysis
  • Imitation learning, with the wage cost of human annotation factored in
  • Cost-aware RL training that directly optimizes profit
  • Diverse maps and robots, reflecting the wide range of choices available in the real world
  • Expanded scenarios testing robustness under challenging conditions
  • Open challenges for the community to beat our baselines

The autonomous navigation field has made incredible technical progress.
Now it's time to make it economically viable.

It's time to talk about money. It's time for CostNav.

Get Involved

CostNav will be presented at CES 2026. Technical report, benchmark, code, and models are available now with continual updates planned.

The current pre-release version includes our initial implementations for simulation, task design, training, evaluation, and, most importantly, metrics. We'll keep improving it, so watch for updates!

BibTeX

@article{seong2025costnav,
  title={CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents},
  author={Seong, Haebin and Kim, Sungmin and Kim, Minchan and Cho, Yongjun and Joe, Myunchul and Choi, Suhwan and Jung, Jaeyoon and Youn, Jiyong and Kim, Yoonshik and Seong, Samwoo and others},
  journal={arXiv preprint arXiv:2511.20216},
  year={2025}
}