Roadmap

The Mathlib Initiative Year 1 Roadmap

Our roadmap for the first year of Mathlib Initiative operations from October 2025 - September 2026. For background on our mission and approach, see our about page.

Vision

Current State

Mathlib has over 1.9 million lines of formally verified mathematics, written by over 500 contributors.

Long-term Goals

The Initiative supports Mathlib's growth toward:

  • A comprehensive library of formal mathematics across all levels

  • A thriving ecosystem of large-scale collaborative mathematical projects

  • Enhanced documentation and educational resources lowering barriers to entry

  • AI integration supporting automated reasoning and mathematical research

The Initiative operates within the existing Mathlib community structure, working with current maintainers, reviewers, and contributors to address scaling challenges.

Year 1 Objectives

By September 2026, the Initiative will:

  • Establish responsive review cycles with under one week response time for 90% of review rounds

  • Create infrastructure for an ecosystem of downstream mathematical libraries

  • Organize and expand documentation to support new contributors

  • Create tools for AI training data extraction and integration

Deliverables

D1. Rapid Pull Request Triage and Review

Description: The review queue bottleneck is the primary constraint on Mathlib's growth. Currently circa 300 pull requests wait in backlog with median wait times around two weeks, limiting the pace of mathematical formalization.

Strategy: We will establish a professional editorial team of research mathematicians with proven Mathlib expertise to process the review queue systematically. This team will work within the existing maintainer structure, focusing on triage, initial review, and routing to appropriate expert reviewers. Complementary automation improvements will reduce manual review overhead.

Deliverables: We will achieve review response times under one week for 90% of review cycles by September 2026. Specific outputs include: enhanced queueboard v2.0 with advanced filtering and assignment features; automated PR staleness detection and closure system; reviewer training documentation and onboarding process; and quarterly review queue performance reports tracking: (1) median review response time, (2) percentage of review cycles completed within 1/2/4 weeks, (3) review queue size trends, published with historical trend analysis.

Notes on triage and review cycle: Triage involves assessing each PR for scope, feasibility, and technical direction, then assigning it to the appropriate expert reviewer. This is separate from detailed technical review. Review response time is measured as calendar time from when contributors submit a PR or remove the awaiting-author label (indicating completed feedback processing) until they receive the next review response.

D2. Coordinate Ecosystem Dependencies

Description: Mathlib's role as a foundational library requires solving the dependency synchronization problem: when multiple projects depend on Mathlib, they must coordinate versions to avoid dependency conflicts. Currently, projects update Mathlib independently, making it impossible for downstream libraries to depend on multiple Mathlib-dependent projects simultaneously.

Strategy: We will design and prototype a dependency coordination system informed by community expertise, then iterate based on early feedback to establish effective synchronization standards across the ecosystem. This includes building automated synchronization tools and ensuring Mathlib's testing infrastructure remains available for downstream projects.

Deliverables: Community-agreed standards for ecosystem dependency synchronization; prototype tooling for coordinated dependency updates across projects; ecosystem health monitor tracking downstream project status and Mathlib impact; integration with Reservoir for dependency coordination; and complete availability of Mathlib's testing infrastructure as reusable components for downstream projects.

D3. Documentation and Educational Material

Description: Mathlib currently benefits from the existence of numerous community-created materials (Natural Number Game, API documentation, doc strings & library notes). Implementing a systematic organization of these, will greatly enhance the experience of both new and existing contributors as well as educational adoption.

Strategy: We will classify existing mathlib documentation (tutorials, how-to guides, reference, explanation) and identify gaps. Priority areas include tactics documentation and theory folder overviews. Separately, we will develop modular educational components for standard mathematics curricula.

Deliverables: Enhanced Mathlib documentation with comprehensive navigation and organization; revamp the community website docs; complete tactics reference documentation for Mathlib tactics; auto-grader infrastructure for educational contexts; and modular educational components for Linear Algebra and Analysis.

D4. Curated AI Training Sets and Hooks for AI Contributions

Description: Mathlib's role in mathematical reasoning AI development requires systematic data extraction and integration capabilities. Lean projects contain rich goal state and proof trace information beyond source code alone.

Strategy: We will develop comprehensive data extraction pipelines capturing: syntax trees, declaration signatures and types, goal states, and tactic scripts; and establish regular publication schedules. Integration hooks will prepare the ecosystem for AI-assisted contributions while maintaining quality standards.

Deliverables: Monthly publication of complete Mathlib dataset to HuggingFace including goal states, tactic traces, and type inference data; extraction pipeline supporting appropriately licensed downstream projects; and documentation for dataset structure.

Tracking Progress

Milestones and Metrics

Q4 2025 (Oct–Dec):

  • Editorial team hiring in progress

  • Automated review queue statistics in place

  • Dependency coordination requirements gathered and system design completed, hiring in progress

  • Documentation audit and taxonomy implementation done

  • Community website docs refreshed

Q1 2026 (Jan–Mar):

  • Review response times reduced to <10 days for 75% of review cycles

  • Dependency coordination prototype developed and initial testing complete

  • Mathlib testing infrastructure components fully available for downstream projects

  • Tactics documentation 50% complete

  • AI data extraction pipeline design underway

Q2 2026 (Apr–Jun):

  • Review response times under one week achieved for 90% of review cycles

  • Dependency coordination prototype deployed and community feedback collected

  • Ecosystem health monitor deployed tracking downstream project status and Mathlib impact

  • Educational components for 2 subjects created

  • Monthly HuggingFace dataset publication established

Q3 2026 (Jul–Sep):

  • Review response times under one week maintained consistently for 90% of review cycles

  • Dependency coordination prototype iterated based on community feedback and health monitor data

  • Standards for ecosystem dependency synchronization established

  • Tactics documentation completed

  • Auto-grader created

  • Documentation of HuggingFace dataset structure

Quarterly Reviews

Progress against these metrics will be assessed quarterly with detailed reports published on the Mathlib Initiative website. Community feedback sessions will be held each quarter to adjust priorities based on evolving needs and opportunities.

Implementation

This roadmap will be implemented by our dedicated team working in close coordination with the Mathlib community. To support this work and help accelerate the future of formal mathematics, learn about funding opportunities and partnerships.