Open Research · Lean 4 · Formal Mathematics

A Graph of
Formal Mathematics

An analysis of the Lean 4 Mathlib proof library, turning the library into a queryable graph.

Explore Results ↓ View Source on GitHub
738K
Mathematical declarations
387K
Proof dependency edges
43K
Docstrings harvested
6,056
Named theorems identified
20+
Mathematical areas covered

What is Mathlib Graph?

Mathlib is a library of formally verified mathematics, written in Lean 4. Proof formalization libraries like mathlib provide an exciting array of new possibilities, including having a centralized locus of verified mathematical knowledge, training data for LLMs and proof formalizers. These developments bring to the forefront old questions in a new format regarding what mathematics is and how it is practiced and taught. Fortunately, I think that libraries of formalized theorems can help us gain empirical evidence towards answering these questions more satisfactorily. Mathlib Graph is a very small step into what I believe will be a field of increasing importance, namely mathematical metaanalysis. This field analyzes the structure of mathematics as a whole, similar to how scientific metaanalysis synthesizes many relevant studies and allows one to see larger trends which aren't apparent from the ground.

🔗

Proof Dependency Graph

Each theorem that uses another creates a directed edge. We construct a DAG of 387K edges.

📊

Statistical Analysis

We compute in-degree, out-degree, hub scores, longest dependency chains and other stats, then correlate these with proof complexity metrics.

🏷️

Named Theorem Detection

We identify theorems named after mathematicians using two independent signals: docstring bold-markup and camelCase identifier decomposition against a curated name list.

Methods

Data Pipeline

In case you want to play around with the data, this section describes the back-end.

🤗
HuggingFace
mathlib-types
713K rows, Parquet
+
🗃️
LeanDojo
Zenodo corpus.jsonl
premises → edges
+
🐙
GitHub Raw
Docstring extraction
at commit hash
🦆
DuckDB Graph
mathlib.db
738K decl · 387K edges
We construct the full transitive dependency graph across all of Mathlib, join it with docstrings and source positions from GitHub, and run statistical analyses comparing named vs. unnamed theorems.

Graph Metrics

For each theorem we compute: in-degree (how many proofs cite it), out-degree (how many lemmas it uses), hub score (in × out), BFS depth, and longest dependency chain via Kahn's topological DP.

Named Theorem Detection

Two independent signals: (1) doc bold **...** markup in docstrings, (2) id camelCase/underscore decomposition against ~90 mathematician names. Combined: 6,056 named theorems identified.

Docstring Harvesting

We fetch raw Lean source files from GitHub at a fixed commit hash, parse /-- ... -/ docstring blocks using source line positions from the lean_source table. 43,480 docstrings collected across all of Mathlib.

Statistical Testing

Mann-Whitney U tests (non-parametric) with rank-biserial correlation as effect size. Compares named vs. unnamed theorems across in-degree, out-degree, tactic step count, and code line count.


Most Cited Theorems in Mathlib

In-degree (the number of other proofs that directly depend on a theorem) is the most direct proxy for mathematical importance within a formal library. In the future, this analysis can be expanded to perform more sophisticated statistical analyses and make large AI generated theorem libraries human interpretable.

mul_comm (commutativity of multiplication) is cited by 1,462 other proofs. The top 5 are all basic ring axioms, appearing more often than any named theorem like Cauchy's or Fermat's.

Declarations cited by the most other proofs. Includes all kinds (theorems, defs, instances).

#NameKindAreaIn-degree
1mul_commtheoremAlgebra1,462
2mul_assoctheoremAlgebra1,267
3mul_onetheoremAlgebra1,171
4Set.rangedefData1,055
5one_multheoremAlgebra1,015
6add_communknownAlgebra950
7le_antisymmtheoremOrder814
8Fintype.carddefData810
9LE.le.transunknownOrder699
10CategoryTheory.Category.assocunknownCategoryTheory695
11Set.univdefData692
12MulZeroClass.mul_zerounknownAlgebra644
13zero_addunknownAlgebra634
14add_zerounknownAlgebra616
15SetdefData599
16le_rfltheoremOrder598
17sub_eq_add_negunknownAlgebra582
18le_transtheoremOrder565
19eq_or_netheoremLogic559
20MulZeroClass.zero_mulunknownAlgebra530
21Set.mem_setOf_eqtheoremData518
22div_eq_mul_invtheoremAlgebra493
23smul_eq_mullemmaAlgebra486
24add_assocunknownAlgebra444
25algebraMapdefAlgebra437
26one_smullemmaAlgebra435
27LT.lt.leunknownOrder416
28ENNReal.ofRealdefData416
29CategoryTheory.Category.comp_idunknownCategoryTheory406
30le_of_lttheoremOrder379

Primitive concepts are declarations with no outgoing dependency edges. (This is partly determined by how sensitive the dependency generator is, determining line by line dependency for term mode proofs was not that easy and could be made to detect more dependencies.)

#NameKindAreaIn-degree
1Eq.symmtheoremPrelude1,730
2Iff.rfltheoremCore1,413
3Function.InjectivetheoremData1,268
4Set.rangedefData1,055
5add_communknownAlgebra950
6Fintype.carddefData810
7Or.inltheoremPrelude802
8Function.comp_applytheoremCore801
9Or.inrtheoremPrelude760
10eq_commtheoremCore701
11LE.le.transunknownOrder699
12CategoryTheory.Category.assocunknownCategoryTheory695
13Set.univdefData692
14congr_argunknownLogic680
15Iff.mptheoremCore655
16MulZeroClass.mul_zerounknownAlgebra644
17zero_addunknownAlgebra634
18NetheoremCore627
19add_zerounknownAlgebra616
20SetdefData599

644,656 of 738,240 declarations (87%) are primitive (omega, simp, decide)

Cross-module dependency traffic analyzes which dependencies between different areas of math. Algebra is the dominant sink: Analysis, RingTheory, LinearAlgebra, and NumberTheory all flow heavily into it.

Analysis→Algebra
11,866
RingTheory→Algebra
11,555
LinearAlg→Algebra
6,840
NumberTh→Algebra
6,168
Analysis→Data
5,095
Algebra→Data
5,017
MeasureTheory→Data
5,009
Topology→Data
4,550
Analysis→Topology
4,534
Analysis→Order
4,226
MeasureTheory→Order
4,061
Topology→Order
4,055
MeasureTheory→Algebra
3,635
NumberTh→Data
3,262
Data→Algebra
3,253
Results · Named Theorems

Which Named Theorems Matter Most?

We identify theorems named after mathematicians using two independent signals, then measure whether they are structurally more important than unnamed results.

de Morgan's laws (not_and_or) top the named-theorem rankings with an in-degree of 113, cited by 113 other proofs. They're followed by the Squeeze theorem (18), Monotone Convergence (16), and the Binomial theorem (16).

Statistical Effect Sizes: Named vs. Unnamed Theorems

Rank-biserial correlation r from Mann-Whitney U tests. Positive = named theorems score higher. All results significant at p < 0.001.

+0.51
In-degree (citations)
Large effect · p < 0.001
+0.38
Out-degree (dependencies)
Medium effect · p < 0.001
+0.46
Tactic steps
Large effect · p < 0.001
+0.44
Code lines
Large effect · p < 0.001

Top Named Theorems by In-degree (docstring signal)

#Common NameLean IdentifierAreaIn-degree
1de Morgan's lawsnot_and_orLogic113
2Squeeze theoremtendsto_of_tendsto_of_tendsto_of_le_of_leTopology18
3Monotone convergence / Beppo-LeviMeasureTheory.lintegral_iSupMeasureTheory16
4Binomial theoremadd_powData16
5Lipschitz continuousLipschitzWithTopology16
6Krull's theoremIdeal.exists_le_maximalRingTheory15
7Squeeze theorem (variant)tendsto_of_tendsto_of_tendsto_of_le_of_le'Topology12
8Bézout's lemmaNat.gcd_eq_gcd_abData11
9Extreme value theoremIsCompact.exists_isMaxOnTopology10
10Cantor's theoremCardinal.cantorSetTheory9
11Integration by partsintervalIntegral.integral_mul_deriv_eq_deriv_mulMeasureTheory8
12Dominated convergenceMeasureTheory.tendsto_integral_of_dominated_convergenceMeasureTheory7
13Rearrangement inequalitymul_add_mul_le_mul_add_mulAlgebra6
14Hölder inequalityNNReal.inner_le_Lp_mul_LqAnalysis6

Named Theorem Counts by Area

Analysis
270
MeasureTheory
147
Algebra
79
Topology
75
Combinatorics
67
NumberTheory
43
Probability
32
RingTheory
31
GroupTheory
28
LinearAlgebra
27

The Shape of Mathematical Knowledge

Despite hundreds of transitive dependencies, proof chains in Mathlib are surprisingly shallow. Even the most complex theorems have longest chains of only 10-15 steps (this is partly due to the limitation in detecting dependency between very simple lemmas).

🌊

Shallow Chains

PadicInt.norm_int_lt_one_iff_dvd has 195 transitive dependencies yet its longest chain is only 12 steps. Formal proofs are wide, not deep (again partly due to difficulty detecting simple lemma dependence).

🌳

87% Primitive Leaves

644,656 declarations have no outgoing edges

🧲

Algebra is the Attractor

Analysis, RingTheory, LinearAlgebra, and NumberTheory all flow into Algebra at 3,000-12,000 edges each.

Broader Context

Mathematics in the Age of Formal Verification

The essay below sketches some implications that automated formal verification might have.

📝

Essay: Framing the New Mathematical Paradigm

Formal proof libraries like Mathlib are not just verification tools, they have broader implications about how people think about and relate to mathematics. This essay has some preliminary thoughts.

Read Full Essay →