The fund scoring accuracy benchmark for AI-powered LP intelligence.
NUVC publishes this benchmark because trust in AI fund scoring requires measurable proof — not marketing claims. LP Bench tests both quantitative accuracy and qualitative narrative quality, the Science + Art balance made verifiable.
Six independent dimensions covering both the quantitative signal layer and the qualitative narrative layer, because sound capital allocation requires both.
Same fund deck, 5 independent runs. Standard deviation across runs measures LLM scoring stability.
Marcus Aurelius (Batch Screener) produces consistent 6-dimension scores across repeated runs. Cross-check layer eliminates outlier LLM responses.
NUVC vintage benchmarks vs Cambridge Associates global VC return data. Tests whether NUVC benchmark priors are calibrated to real market outcomes.
AU/NZ market-specific priors from State of Funding 2022–2025 + Cambridge Associates global quartile alignment. Recalibrated quarterly.
Among funds scoring ≥ 7.0 (high conviction), what % outperform the median TVPI at the 3-year mark? Tests whether the score has predictive validity.
Measured on committed funds in the NUVC LP network with 3+ year track records. Score ≥ 7.0 = top quartile conviction. Validated against actual fund performance data.
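The predictive validity metric reduces to a simple hit rate. The sketch below is illustrative only: the fund records, scores, and TVPI figures are hypothetical, not NUVC data.

```python
from statistics import median

# Hypothetical records: (nuvc_score, tvpi_at_3y) for committed funds
# with 3+ year track records. All numbers are made up for illustration.
funds = [
    (7.8, 1.9), (7.2, 1.6), (7.5, 1.2), (6.1, 1.1),
    (5.4, 0.9), (7.1, 1.7), (4.9, 1.0), (6.8, 1.4),
]

def outperformance_rate(funds, threshold=7.0):
    """% of high-conviction funds (score >= threshold) beating the median TVPI."""
    median_tvpi = median(tvpi for _, tvpi in funds)
    high = [tvpi for score, tvpi in funds if score >= threshold]
    if not high:
        return 0.0
    return 100 * sum(tvpi > median_tvpi for tvpi in high) / len(high)

rate = outperformance_rate(funds)
```

The median is taken over the whole cohort, not just the high-conviction subset, so the metric asks whether the score separates winners from the overall population.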
For FOs that committed to funds recommended by NUVC matching, what % matched within their stated thesis alignment threshold (≥ 80%)?
Bidirectional mandate matching (FO ↔ GP). Tested on Tier 1 FO cohort with explicit mandate configs. Includes sector, fund size, vintage, and geography gates.
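The hard gates (sector, fund size, vintage, geography) can be sketched as a simple predicate that must pass before thesis-alignment scoring applies. Field names and mandate shapes below are illustrative assumptions, not the actual NUVC mandate config schema.

```python
# Hypothetical sketch of hard mandate gates, assuming dict-shaped
# FO mandates and GP fund profiles (field names are illustrative).
def passes_gates(mandate, fund):
    """All four gates must pass before thesis-alignment scoring applies."""
    return (
        fund["sector"] in mandate["sectors"]
        and mandate["min_size_m"] <= fund["size_m"] <= mandate["max_size_m"]
        and fund["vintage"] >= mandate["earliest_vintage"]
        and fund["geo"] in mandate["geos"]
    )

mandate = {"sectors": {"b2b_saas", "fintech"}, "min_size_m": 20,
           "max_size_m": 150, "earliest_vintage": 2021, "geos": {"AU", "NZ"}}
fund = {"sector": "fintech", "size_m": 60, "vintage": 2023, "geo": "AU"}
ok = passes_gates(mandate, fund)
```

Gates are binary filters; the ≥ 80% thesis-alignment threshold is then measured only on funds that clear all four.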
% of factual claims in AI-generated DD reports that could not be grounded in source documents (fund deck, DDQ, or track record). Measured by Hypatia (claim verifier).
Hypatia cross-references every DD claim against extracted source document data. Ungrounded claims are flagged with amber confidence badges in the live product, not silently accepted.
Inter-rater agreement between NUVC qualitative fund narrative and independent LP analyst panel (n=12) across 50 fund assessments.
Science + Art principle applied to measurement: quantitative score accuracy is necessary but not sufficient. Qualitative narrative must also agree with expert LP judgment. Tested on team dynamics, thesis authenticity, and GP communication quality dimensions.
Most AI fund scoring tools publish only quantitative accuracy metrics — consistency and calibration. These are necessary, but not sufficient.
LP decisions rely equally on qualitative judgment: is this GP's thesis authentic? Does the team dynamic suggest longevity? Is the communication style right for a long-term LP relationship?
LP Bench is the only fund scoring benchmark that measures qualitative narrative agreement alongside quantitative accuracy — because capital allocation that ignores either dimension misses what actually matters.
Open methodology. LP Bench is recalibrated quarterly as new committed fund outcome data becomes available.
2,285 funds from the NUVC fund library: AU/NZ emerging managers, global VC, PE, and private credit. A subset of 312 funds with verified 3-year outcome data is used for predictive validity testing.
Each fund document submitted 5 times with identical inputs. Score std deviation measured across runs. Cross-check layer validated for LLM outlier suppression.
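The consistency protocol is a per-dimension standard deviation across repeated runs. A minimal Python sketch, assuming six named dimensions (the names and scores below are illustrative, not actual NUVC output):

```python
from statistics import pstdev

# Hypothetical scores: 5 independent runs of the same fund deck,
# each producing 6 dimension scores (dimension names are illustrative).
runs = [
    {"team": 7.2, "thesis": 6.8, "track_record": 7.5, "portfolio": 6.9, "terms": 7.0, "market": 6.5},
    {"team": 7.1, "thesis": 6.9, "track_record": 7.4, "portfolio": 7.0, "terms": 7.0, "market": 6.6},
    {"team": 7.3, "thesis": 6.7, "track_record": 7.5, "portfolio": 6.8, "terms": 7.1, "market": 6.4},
    {"team": 7.2, "thesis": 6.8, "track_record": 7.6, "portfolio": 6.9, "terms": 7.0, "market": 6.5},
    {"team": 7.1, "thesis": 6.9, "track_record": 7.4, "portfolio": 7.0, "terms": 6.9, "market": 6.6},
]

def run_stability(runs):
    """Per-dimension population std deviation across repeated runs."""
    return {d: pstdev(r[d] for r in runs) for d in runs[0]}

stability = run_stability(runs)
# Lower std deviation per dimension = more consistent scoring across runs.
```

A low per-dimension std deviation is what "consistent scores across repeated runs" means operationally.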
NUVC benchmark priors compared to Cambridge Associates global VC quartile data (Q1/Q2/Q3 TVPI and IRR by vintage year) and AU State of Funding 2022–2025 market data.
Hypatia claim verifier run across 200 DD reports (sampled across fund types). Every factual claim cross-referenced against fund deck, DDQ, and track record source documents. Ungrounded claims counted.
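Conceptually, the hallucination metric is the share of claims with no supporting evidence in any source document. The sketch below uses a naive substring match as a stand-in; the actual Hypatia verifier's matching logic is not described here and is certainly more sophisticated.

```python
# Hypothetical sketch: count DD-report claims that cannot be grounded
# in any source document. The substring check is a deliberate
# simplification standing in for real claim-evidence matching.
def ungrounded_rate(claims, sources):
    """Return (% of ungrounded claims, list of flagged claims)."""
    flagged = [c for c in claims
               if not any(c.lower() in s.lower() for s in sources)]
    return 100 * len(flagged) / len(claims), flagged

claims = [
    "fund i returned 2.1x tvpi",
    "the gp has 12 years of operating experience",
]
sources = [
    "Track record: Fund I returned 2.1x TVPI as of Q4.",
    "DDQ: the partners bring complementary backgrounds.",
]
rate, flagged = ungrounded_rate(claims, sources)
```

Flagged claims correspond to the amber confidence badges in the live product: surfaced, not silently accepted.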
12 experienced LPs independently assessed 50 fund profiles across 4 qualitative dimensions. NUVC narrative assessments scored against panel consensus using Cohen's kappa.
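Cohen's kappa measures agreement between two raters beyond what chance alone would produce. A self-contained sketch with hypothetical labels (the verdict categories and values below are illustrative):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two raters over the same categorical items."""
    assert len(a) == len(b) and a
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n         # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[l] * cb[l] for l in set(a) | set(b)) / (n * n)  # chance agreement
    return 1.0 if pe == 1 else (po - pe) / (1 - pe)

# Hypothetical labels: NUVC narrative verdict vs LP panel consensus
# on one qualitative dimension (categories are made up for illustration).
nuvc  = ["strong", "strong", "mixed", "weak", "strong", "mixed"]
panel = ["strong", "mixed",  "mixed", "weak", "strong", "mixed"]
kappa = cohens_kappa(nuvc, panel)
```

Kappa of 1.0 is perfect agreement, 0.0 is chance-level; correcting for chance is what makes it stricter than raw percent agreement.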
Scores are not frozen: as new committed fund outcome data becomes available, LP Bench is recalibrated each quarter, so the benchmark improves as the platform learns.
Legal AI tools like Harvey benchmark document drafting speed and citation accuracy. Capital allocation intelligence requires a different standard.
| Dimension | Legal AI | NUVC LP Bench |
|---|---|---|
| Quantitative accuracy | Citation accuracy | Score consistency + vintage calibration |
| Qualitative accuracy | Not measured | LP panel inter-rater agreement (Cohen's κ) |
| Predictive validity | N/A (legal drafts) | Score ≥ 7.0 → TVPI outperformance rate |
| Hallucination testing | Legal citation verification | Fund claim grounding in source documents |
| Network effects | None | Benchmark improves as committed fund outcomes accrue |
| Update cadence | Ad hoc | Quarterly recalibration |
LP Bench is how we hold ourselves accountable. Every score NUVC produces — for every fund, every FO, every capital allocation decision — is measured against this standard.