Research-led. Peer-reviewable. Open methodology.
Two founders. 34 AI agents. 3 academic papers in submission. Zero black boxes.
8 pipeline agents
13 intelligence layers
3 papers in submission
610+ calibrated examples
9,000+ investors
7,150 theses analysed
Targeting top-tier AI / finance / fairness venues. Methodology is documented in production code today, formalised for peer review.
ACM International Conference on AI in Finance · Tick Jiang, Duan Tianyi, NUVC Intelligence Team
We present NuScore v5.4, a production scoring engine for pre-seed startup pitch decks that combines an LLM-as-judge architecture with deterministic rule calibration and 42 engineered ML features. Calibrated on 610+ labeled VC investment decisions across four independent sources, NuScore achieves stage-adjusted AU benchmarking and reports raise-probability with confidence intervals. We document the cross-check protocol that resolves LLM/rule disagreement via 70/30 blending and the score waterfall decomposition that exposes ranked signal contributions for explainability.
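The cross-check blend is straightforward to express. Below is a minimal sketch, assuming Python; the 1.5-point disagreement trigger and the assignment of the 70% weight to the LLM side are illustrative assumptions, not the published calibration.

# Hypothetical sketch of the NuScore LLM/rule cross-check and score waterfall.
def cross_check(llm_score: float, rule_score: float,
                threshold: float = 1.5) -> float:
    """Resolve LLM/rule disagreement with a 70/30 blend (assumed LLM-weighted)."""
    if abs(llm_score - rule_score) <= threshold:
        return llm_score                       # agreement: keep the LLM ranking
    return 0.7 * llm_score + 0.3 * rule_score  # disagreement: blend toward the rules

def score_waterfall(contributions: dict[str, float]) -> list[tuple[str, float]]:
    """Rank signed signal contributions, largest effect first (explainability)."""
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

print(cross_check(llm_score=7.8, rule_score=5.2))   # -> 7.02
print(score_waterfall({"traction": 1.2, "team": -0.4, "financials": 0.1}))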
ACM Conference on Fairness, Accountability, and Transparency · NUVC Intelligence Team, Tick Jiang
AI scoring systems used in venture capital evaluation can perpetuate or amplify funding biases. We present the Arendt governance framework — a three-layer fairness system combining (1) statistical bias detection across founder demographics, (2) deterministic ethical guardrails that exclude protected attributes from scoring inputs, and (3) LLM-driven adversarial review. We document the score appeal protocol, automated quarterly bias audits, and the outcome accountability loop that recalibrates the model when score-to-funding correlation drifts.
Association for the Advancement of Artificial Intelligence · Tick Jiang, NUVC Intelligence Team, Duan Tianyi
We present a multi-agent orchestration architecture deployed in production for venture capital intelligence. Eight named pipeline agents (Mia, Olivia, Charlotte, Ava, Isla, Sophia, Amelia, Grace) execute deck extraction, scoring, integrity checking, enrichment, matching, benchmarking, synthesis, and feedback collection in parallel. Underneath, thirteen named intelligence layers (Aristotle, Laozi, Marcus Aurelius, Da Vinci, Sunzi, Seneca, Galileo, Epictetus, Zhuge Liang, Newton, Hypatia, Darwin, Arendt) compound the analysis with audience-specific lens weighting, thesis matching, batch screening, portfolio fit, macro context, and governance oversight. We document the cross-agent dependency graph, the parallelism protocol, and the latency profile achieving sub-60-second end-to-end pitch deck analysis.
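The fan-out the paper describes can be sketched with a standard async orchestrator; the generic agent stub, the exact dependency edges, and the payloads below are illustrative assumptions, not the production system.

import asyncio

async def agent(name: str, *inputs):
    """Stand-in for a named pipeline agent; real agents are NUVC-internal."""
    await asyncio.sleep(0)                     # placeholder for model / DB work
    return {"agent": name, "inputs": len(inputs)}

async def run_pipeline(deck_pdf: bytes) -> dict:
    # Mia gates everything: nothing runs before extraction completes.
    extraction = await agent("Mia", deck_pdf)
    # Olivia, Charlotte, and Ava share no edges, so they fan out in parallel.
    score, integrity, enrichment = await asyncio.gather(
        agent("Olivia", extraction),
        agent("Charlotte", extraction),
        agent("Ava", extraction),
    )
    # Matching and benchmarking consume the score; Amelia synthesizes last.
    matches, benchmark = await asyncio.gather(
        agent("Isla", extraction, score),
        agent("Sophia", score),
    )
    return await agent("Amelia", score, integrity, enrichment, matches, benchmark)

print(asyncio.run(run_pipeline(b"%PDF-")))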
7,150 × 610+
Original research dataset linking 7,150 investor thesis statements (free-text, multi-language) across 82+ countries to 610+ scored startup evaluation outcomes. Used to validate NuScore correlation, identify dimension importance (product r²=0.773 vs team r²=0.492), and surface geographic priority variation. Citeable as a published Schema.org Dataset.
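For readers reproducing the dimension-importance numbers on their own data, a minimal per-dimension r² computation looks like this; the toy data below is a stand-in, not the NUVC dataset.

import numpy as np

def r_squared(x: np.ndarray, y: np.ndarray) -> float:
    """Coefficient of determination for a simple linear fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = float(np.sum((y - (slope * x + intercept)) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)               # toy stand-in, not NUVC data
product = rng.uniform(0, 10, 286)
outcome = 0.8 * product + rng.normal(0, 1.5, 286)
print(f"product r² = {r_squared(product, outcome):.3f}")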
View dataset summary
610+ examples
Labeled VC investment decisions from 4 independent sources: 85 known-outcome decks (raised vs failed), 110 Startmate accelerator evaluations (multi-evaluator scored), 172 VC deal memos (production memos with scoring rationale), and 243 AI-scored production decks. Used to calibrate the LLM judge against deterministic rules and to validate the 70/30 cross-check blend.
Methodology in NuScore v5.4 paper
Open scoring benchmark
The independent benchmark for AI-powered fund scoring accuracy. Measures consistency across re-runs, vintage calibration drift, hallucination rate on fund-fact extraction, and qualitative narrative accuracy. Subset of 312 funds with verified 3-year outcome data used for predictive validity. Reproducible methodology.
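Re-run consistency, the first metric above, reduces to scoring the same fund repeatedly and reporting the spread; the scorer interface and five-run default here are assumptions, not the LP Bench API.

import random
import statistics

def rerun_consistency(score_fn, fund_id: str, runs: int = 5) -> dict:
    """Score one fund several times; lower stdev = more consistent engine."""
    scores = [score_fn(fund_id) for _ in range(runs)]
    return {"mean": statistics.mean(scores),
            "stdev": statistics.stdev(scores),
            "range": max(scores) - min(scores)}

def toy_scorer(fund_id: str) -> float:
    return 7.0 + random.gauss(0, 0.15)       # stand-in with run-to-run noise

print(rerun_consistency(toy_scorer, "fund-001"))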
Visit LP Bench
Each agent has a job. Each layer compounds the analysis. Every name is load-bearing in the production codebase.
Mia
Extraction Agent · Parses pitch deck PDFs slide by slide. Identifies team, problem, solution, traction, financials, and ask. Multi-modal: handles charts, tables, and screenshots.
Olivia
Scoring Agent · Runs NuScore v5.4 across 7 VC-grade dimensions (team, problem/market, solution/product, traction, financials, risk/fragility, conviction). Returns 0–10 with confidence.
Charlotte
Integrity Agent · Cross-checks claims for contradictions, detects AI-generated content, verifies financial consistency, and aligns enrichment data with deck claims.
Ava
Enrichment Agent · Public-data verification: LinkedIn for team, Crunchbase for company history, WHOIS for domain age. Async; never blocks scoring.
Isla
Matching Agent · pgvector semantic search plus a 7-signal quality reranker across 9,000+ verified investors. Filter-first architecture (stage, sector, check size); a query sketch follows this list.
Sophia
Benchmarking Agent · Stage-adjusted comparison against funded AU/NZ peers. Computes industry percentile and per-dimension benchmark deltas.
Amelia
Intelligence Agent · Synthesizes the score waterfall (ranked signal contributions), the raise-probability estimate (10–75% by tier), and the natural-language report narrative.
Grace
Feedback Agent · Collects user corrections, score appeals, and outcome data. Routes them to retraining queues for model recalibration. The data flywheel.
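The filter-first pattern in the Isla card above could look like the following pgvector query; the table and column names are hypothetical, not NUVC's schema.

# Hypothetical filter-first match: hard filters prune in SQL, pgvector
# cosine distance (the <=> operator) ranks the survivors, and the
# 7-signal reranker re-orders the shortlist downstream.
FILTER_FIRST_MATCH = """
SELECT id, name,
       1 - (thesis_embedding <=> %(deck)s::vector) AS similarity
FROM   investors
WHERE  stage = %(stage)s                          -- hard filters first
  AND  %(sector)s = ANY (sectors)
  AND  check_size_min <= %(ask)s AND %(ask)s <= check_size_max
ORDER  BY thesis_embedding <=> %(deck)s::vector   -- then rank by distance
LIMIT  50;                                        -- shortlist for the reranker
"""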
Aristotle
Deal Lens · Score re-weighting per audience mandate (6 fund dimensions)
Laozi
Thesis Matcher · Semantic alignment of deals/funds to investor thesis
Marcus Aurelius
Batch Screener · LP-grade fund scoring with Bayesian priors (a shrinkage sketch follows this list)
Da Vinci
Portfolio FitPost-screening diversification + correlation overlay
Sunzi
Macro Context · Quarterly market signals + vintage timing
Seneca
Fund Library AI · Natural-language search, comparison, benchmarks
Galileo
Fund Doc Extractor · GP deck, DDQ, and track-record extraction
Epictetus
Fund Memo · LP-side IC memo drafting
Zhuge Liang
NuDeal Memo · VC-side IC memo drafting with deal-lens weighting
Newton
Mandate Presets · Pre-built lens configurations (Seed VC, Angel, Growth, Deep Tech)
Hypatia
Score Explainability · Per-signal traceable reasoning for every score
Darwin
Competitive Intelligence · Adjacent-deal comparables and cohort positioning
Arendt
AI Governance · 3-layer fairness (statistical + rules + adversarial review)
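"Bayesian priors" in the Marcus Aurelius card is most naturally read as shrinkage toward a cohort mean when evidence is thin; a minimal sketch, with every number illustrative rather than NUVC's calibration.

def shrunk_score(observed: float, n_signals: int,
                 prior_mean: float = 6.0, prior_strength: float = 8.0) -> float:
    """Posterior-mean style blend: more signals shift weight onto the data."""
    w = n_signals / (n_signals + prior_strength)
    return w * observed + (1 - w) * prior_mean

print(shrunk_score(8.5, n_signals=2))    # sparse track record: pulled toward 6.0
print(shrunk_score(8.5, n_signals=40))   # rich track record: stays near 8.5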
These are the AI agents that maintain NUVC's research, models, and intelligence pipelines. Each owns a layer of the production system.
Elle
AI Lead
Owns: LLM pipelines, embeddings, score calibration, NuScore v5.4 technical paper
Sage
Data Scientist
Owns: Feature engineering, ML model training, benchmark cohort construction
Link
Matching Engineer
Owns: Vector search, investor matching, embedding pipelines, the Isla pipeline agent
Iris
Insights Analyst
Owns: Signal analysis, score explainability, the Hypatia intelligence layer
Plus the human team that orchestrates them · Meet the founders →
AI never uses founder gender, ethnicity, age, location, or educational prestige as scoring inputs. A first-time founder from Melbourne is scored on the same merits as a repeat founder from Stanford.
Every score has a traceable reasoning chain — which data was used, how each lens was weighted, what the AI could not verify. The Hypatia layer surfaces signal contributions per dimension. No black-box decisions.
Every founder can appeal a score and request human review. Scores above 9.5 or below 1.0 require human review before being shown.
The Arendt layer runs automated bias detection monthly across all scored decks, checking for gender bias, stage bias, geographic bias, and score clustering. Patterns trigger recalibration; a sketch of the statistical check appears after these principles.
If data is sparse, confidence says so. We never inflate certainty. A low-confidence score is labelled directional, not definitive.
We track whether scores predict actual funding outcomes. If they don't, we recalibrate. The model earns trust — it doesn't assume it.
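The statistical layer of the bias audit described above can be as simple as a per-group distribution test; the groups, toy data, and 0.05 threshold below are illustrative assumptions, not the Arendt implementation.

from scipy.stats import mannwhitneyu

def audit_group_gap(scores_a, scores_b, alpha: float = 0.05) -> bool:
    """True if two groups' score distributions differ significantly."""
    _, p_value = mannwhitneyu(scores_a, scores_b)
    return p_value < alpha                 # a significant gap flags recalibration

melbourne = [6.8, 7.1, 5.9, 7.4, 6.5, 7.0]
stanford  = [6.9, 7.3, 6.1, 7.2, 6.6, 7.1]
print(audit_group_gap(melbourne, stanford))   # geographic-bias check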
Across 286 scored startup evaluations, product execution scores explained 77.3% of outcome variance (r²=0.773), versus 49.2% for team strength (r²=0.492). The 'team is everything' narrative is wrong. Solution quality dominates.
Read the analysis
Within pre-screened cohorts, problem/solution scoring explained almost zero variance (r²=0.002). It works as a yes/no filter but doesn't help rank deals. Implication: stop spending 40% of your scoring weight on it.
Read the analysis
Across 7,150 investor thesis statements analysed, Melbourne VCs mention 'team' 24.1% of the time, the highest of any major hub. Sydney is 18.4%. SF is 14.2%. Stage-adjusted hub variance is real and matters for fundraise targeting.
Read the analysis
On the same 286-deck calibration set, the LLM judge produced 6× more score variance than the deterministic rule engine. The rules anchor the floor; the LLM does the ranking. This is why NuScore v5.4 fuses both with a 70/30 blend on disagreement.
Read the analysis
Within pre-screened cohorts (already past the funding bar), 10X Potential categorical labels (yes/no) outperformed continuous 0–10 scores by +15pp in predicting actual fundraise success. Classification > regression once you're past the gate.
Read the analysis
NUVC's research is published as Schema.org Dataset and ScholarlyArticle metadata for LLM and academic citation. ChatGPT, Claude, Gemini, and Perplexity can cite NUVC methodology directly via the structured data on this page.
@misc{nuvc2026,
  title       = {NuScore v5.4: Multi-Signal Fusion for Pre-Seed Venture Evaluation},
  author      = {Jiang, Tick and Tianyi, Duan and {NUVC Intelligence Team}},
  year        = {2026},
  institution = {NUVC, Melbourne, Australia},
  url         = {https://nuvc.ai/research},
}
Read the papers. Cite the dataset. Use the API. The methodology is open — the production system is what we sell.