@Software-Engineering-Arena

Software Engineering Arena

We are building the best leaderboards for software engineering agents with transparent, community-driven evaluations.

Software Engineering Arena is an open-source initiative to transparently evaluate and track AI assistants across real-world software engineering tasks. We provide interactive platforms, tracking systems, and novel metrics to advance the field of AI-assisted software development.

"The easier it is to verify a solution, the faster an AI system can learn to master the task." - Alperen Keles (@alpaylan), Andrej Karpathy (@karpathy), Jason Wei (@jasonwei20)

Our mission: We believe any evaluable task can eventually be automated with high-quality AI systems. We accelerate this transformation in software engineering by developing benchmarks and leaderboards that rigorously evaluate AI capabilities.

We welcome collaboration from research labs, independent contributors, and the broader SE community!

⚔️ Arena-Based Tracking Suite

Evaluate AI assistants through pairwise comparisons in user-oriented software engineering scenarios:

Evaluate foundation models through pairwise comparisons in multi-round conversational workflows with repository-aware context and transparent leaderboards.
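
As a rough illustration of how pairwise comparisons can be turned into a leaderboard, here is a minimal sketch that uses a standard Elo update. The rating scheme, the function name `elo_leaderboard`, the vote format, and the constants `k` and `base` are all assumptions made for this example; the text above does not say which ranking method the arena actually uses.

```python
from collections import defaultdict


def elo_leaderboard(votes, k=32.0, base=1000.0):
    """Rank models from pairwise votes with a plain Elo update.

    votes: iterable of (model_a, model_b, outcome) tuples, where outcome
    is "a" (model_a preferred), "b" (model_b preferred), or "tie".
    This vote format is hypothetical, chosen only for illustration.
    """
    ratings = defaultdict(lambda: base)
    score_for_a = {"a": 1.0, "b": 0.0, "tie": 0.5}

    for model_a, model_b, outcome in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a under the Elo logistic model.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        actual_a = score_for_a[outcome]
        # Symmetric update: whatever model_a gains, model_b loses.
        delta = k * (actual_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta

    # Highest rating first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
```

For example, `elo_leaderboard([("model-x", "model-y", "a")])` places `model-x` first. Elo depends on the order of the votes, so a production leaderboard might prefer an order-independent fit such as Bradley-Terry; that choice is left open here.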

📊 GitHub-Based Tracking Suite

Evaluate AI assistants through their actual GitHub activity:

Track assistants via the issue-tracking ecosystem: bug reports, feature requests, outstanding issue resolution, community discussions, question answering, and polls.

Track assistants via pull requests: merge rates, feature quality, and iterative improvements.

Track assistants via code reviews: issue identification, feedback timeliness, and collaborative atmosphere.

Track assistants via product releases: release activity, version publishing, and real-world deployment patterns.

Track assistants via wiki documentation: documentation contributions, wiki page edits, and knowledge base maintenance.

Track assistants via team management: membership events, collaboration patterns, and team organization activities.
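
To make one of the metrics named above concrete, the following sketch computes a merge rate from a list of pull request records. It assumes each record is a dictionary with a `state` field and a `merged_at` field that is empty for unmerged pull requests, and it assumes the rate is measured over closed pull requests only. Both the record shape and that definition are assumptions for illustration; the trackers may define the metric differently.

```python
def merge_rate(pull_requests):
    """Return merged / closed for a list of pull request records.

    Each record is assumed (hypothetically) to look like
    {"state": "closed", "merged_at": "2024-01-01T00:00:00Z"}.
    Open pull requests are ignored because their outcome is undecided.
    Returns None when there are no closed pull requests.
    """
    closed = [pr for pr in pull_requests if pr.get("state") == "closed"]
    if not closed:
        return None
    # A closed pull request counts as merged when merged_at is set.
    merged = sum(1 for pr in closed if pr.get("merged_at"))
    return merged / len(closed)
```

Returning `None` instead of `0.0` for an assistant with no closed pull requests keeps "no data yet" distinct from "nothing was ever merged".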

📄 License

All projects under Software Engineering Arena are licensed under the Apache 2.0 License. Data that we collect and open-source is released under the same license.

Popular repositories

  1. SWE-Model-Arena (Public)

    Compare models pairwise via multi‑round conversational evaluations for SE tasks.

    Python 13

  2. SWE-PR (Public)

    Track AI coding assistants by GitHub pull requests

    Python 12

  3. SWE-Issue (Public)

    Track AI coding assistants by GitHub issues

    Python 11

  4. SWE-Review (Public)

    Track AI coding assistants by GitHub reviews

    Python 8

  5. SWE-Release (Public)

    Track AI coding assistants by releases

    Python 8

  6. SWE-Wiki (Public)

    Track AI coding assistants by wikis

    Python 8

