# CodeClash
**Repository Path**: github_zoo/CodeClash
## Basic Information
- **Project Name**: CodeClash
- **Description**: Benchmarking Goal-Oriented Software Engineering
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-17
- **Last Updated**: 2026-03-25
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## 👋 Overview
CodeClash is a benchmark for evaluating AI systems on **goal-oriented software engineering**.
Today's AI coding evals are *task*-oriented (e.g., HumanEval, SWE-bench).
Models are given explicit instructions.
We then verify correctness with unit tests.
But building software is fundamentally driven by goals ("improve user retention", "reduce costs", "increase revenue").
Reaching our goals via code is a self-directed, iterative, and often competitive process.
To capture this dynamism of real software development, we introduce CodeClash!
Check out our [arXiv paper](https://arxiv.org/abs/2511.00839) and [website](https://codeclash.ai/) for the full details!
## 🏎️ Quick Start
### Prerequisites
- **Python 3.11+**
- **[uv](https://docs.astral.sh/uv/)** - Fast Python package manager
- **Docker** - For running games in containers
- **Git**
### Installation
```bash
# Clone the repository
git clone https://github.com/CodeClash-ai/CodeClash.git
cd CodeClash
# Install uv (if you haven't already)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies and create virtual environment
uv sync --extra dev
# Set up your environment variables
cp .env.example .env # Then edit .env with your GITHUB_TOKEN
# Run a test battle
uv run python main.py configs/test/battlesnake.yaml
```
> [!TIP]
> CodeClash requires Docker to create execution environments. It was developed and tested on Ubuntu 22.04.4 LTS.
> The same instructions should work on macOS. If they don't, check out [#81](https://github.com/CodeClash-ai/CodeClash/issues/81) for an alternative solution.
Alternatively, you can install with pip (not recommended):
```bash
pip install -e '.[dev]'
python main.py configs/test/battlesnake.yaml
```
Once this works, you should be set up to run a real tournament!
To run *Claude Sonnet 4.5* against *o3* in a *BattleSnake* tournament with *5 rounds* and *1000 competition simulations* per round, run:
```bash
uv run python main.py configs/examples/BattleSnake__claude-sonnet-4-5-20250929__o3__r5__s1000.yaml
```
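The example config filename above appears to encode the tournament settings (arena, players, `r5` for 5 rounds, `s1000` for 1000 simulations). The following is a minimal sketch of reading those fields back out of such a name. The `__`-separated layout is inferred from this one filename only, and `TournamentSpec` and `parse_config_name` are hypothetical helpers, not part of CodeClash; the YAML file itself remains the source of truth.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class TournamentSpec:
    """Hypothetical container for the fields encoded in a config filename."""
    arena: str
    players: List[str]
    rounds: int
    simulations: int


def _parse_tag(tag: str, prefix: str) -> int:
    """Parse a tag such as 'r5' or 's1000' into its integer value."""
    if not (tag.startswith(prefix) and tag[len(prefix):].isdigit()):
        raise ValueError(f"expected '{prefix}<number>', got {tag!r}")
    return int(tag[len(prefix):])


def parse_config_name(path: str) -> TournamentSpec:
    """Split '<arena>__<player>__...__r<rounds>__s<sims>.yaml' into fields.

    Assumption: this layout is inferred from the single example filename
    in the README and may not hold for every config in the repository.
    """
    parts = Path(path).stem.split("__")
    if len(parts) < 4:
        raise ValueError(f"unexpected config name: {path!r}")
    arena, *players, rounds_tag, sims_tag = parts
    return TournamentSpec(
        arena=arena,
        players=players,
        rounds=_parse_tag(rounds_tag, "r"),
        simulations=_parse_tag(sims_tag, "s"),
    )


spec = parse_config_name(
    "configs/examples/BattleSnake__claude-sonnet-4-5-20250929__o3__r5__s1000.yaml"
)
print(spec)
```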
## ⚔️ How It Works
In CodeClash, two or more LM agents compete in a **code arena** over the course of a multi-round tournament.
Throughout the tournament, each agent iteratively improves its own codebase to achieve a high-level, competitive objective (e.g., accumulate resources or survive the longest).
Each round consists of two phases:
* Edit phase: LM agents make whatever changes they want to their codebase.
* Competition phase: The modified codebases are pitted against each other in the arena.
Critically, *LMs don't play the game directly*.
Their code serves as their competitive proxy.
The winner is the LM agent who wins the most rounds.
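The round structure described above can be sketched as a simple loop. This is a toy illustration under stated assumptions, not CodeClash's actual implementation: `run_tournament`, the `EditFn` and `ArenaFn` types, and the example agents are all hypothetical, a codebase is modeled as a plain dict of filename to source text, and the arena is reduced to a function that names one winner per round.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical types: a codebase maps filenames to source text.
Codebase = Dict[str, str]
EditFn = Callable[[Codebase, int], Codebase]    # agent's edit phase
ArenaFn = Callable[[Dict[str, Codebase]], str]  # returns the round winner's name


def run_tournament(agents: Dict[str, EditFn], arena: ArenaFn, rounds: int) -> str:
    """Run a multi-round tournament and return the agent with the most round wins."""
    codebases: Dict[str, Codebase] = {name: {} for name in agents}
    round_wins: Counter = Counter()

    for round_idx in range(rounds):
        # Edit phase: each agent changes only its own codebase.
        for name, edit in agents.items():
            codebases[name] = edit(codebases[name], round_idx)

        # Competition phase: the codebases, not the agents, compete.
        winner = arena(codebases)
        round_wins[winner] += 1

    # Tie-breaking is not specified in the README; this sketch simply
    # takes whichever top-scoring agent Counter.most_common lists first.
    return round_wins.most_common(1)[0][0]


# Toy agents and arena, purely for illustration.
def steady(codebase: Codebase, round_idx: int) -> Codebase:
    return {**codebase, "bot.py": "x" * (round_idx + 1)}


def lazy(codebase: Codebase, round_idx: int) -> Codebase:
    return {**codebase, "bot.py": "x"}


def longest_code_wins(codebases: Dict[str, Codebase]) -> str:
    return max(codebases, key=lambda name: len(codebases[name]["bot.py"]))


champion = run_tournament({"lazy": lazy, "steady": steady}, longest_code_wins, rounds=3)
print(champion)
```

In a real arena the competition phase would run many simulations per round inside a container; the stand-in arena function here only keeps the control flow visible.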
## 🚀 Get Involved
- Check out our [docs](https://docs.codeclash.ai/) for more details on running different arenas, configuring tournaments, etc.
- Explore [2000+ tournaments](https://viewer.codeclash.ai/) via our viewer.
- See our [contribution guide](CONTRIBUTING.md) for what we're excited about!
- Have a big idea? Open an issue, and let's turn it into an [insight](https://codeclash.ai/insights/)!
## 💫 Contributions
We're actively working on several follow-ups!
Check out the [Contributing Guide](CONTRIBUTING.md) for more.
Contacts: [John Yang](https://john-b-yang.github.io/), [Kilian Lieret](https://lieret.net)
(Email: [johnby@stanford.edu](mailto:johnby@stanford.edu), [kl5675@princeton.edu](mailto:kl5675@princeton.edu))
## 🪪 License
MIT. Check `LICENSE` for more information.
## ✍️ Citation
```bibtex
@misc{yang2025codeclashbenchmarkinggoalorientedsoftware,
title={CodeClash: Benchmarking Goal-Oriented Software Engineering},
author={John Yang and Kilian Lieret and Joyce Yang and Carlos E. Jimenez and Ofir Press and Ludwig Schmidt and Diyi Yang},
year={2025},
eprint={2511.00839},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2511.00839},
}
```
## 📕 Our Other Projects