UK Company Financials
Available for Licensing
An ML-ready dataset of 3.7 million UK company filings across approximately 3.5 million companies, parsed from Companies House iXBRL accounts and normalised across the FRS 102 and FRS 105 taxonomies.
Clean, verified, provenance-tracked datasets and AI benchmarks for structured finance, company filings, and machine-readable financial infrastructure.
These projects are designed around a simple thesis: models are increasingly commoditised, but clean, verified, provenance-rich financial data is scarce. The focus is not generic dashboards. It is analyst-grade data products that can be used by humans, LLMs, and downstream data pipelines.
A curated structured-finance map of 551 UK securitisation SPVs and non-bank lenders, grouped by shelf, sponsor, and asset class, with charges-register intelligence converted into analyst-ready fields.
A free, open-source Claude / Cowork plugin with a local SEC EDGAR data connector that turns US structured-finance filings into analyst-ready output — searching registered ABS and CMBS deals, pulling prospectuses and investor reports, and running loan-level analysis on Form ABS-EE data. Contributed back to Anthropic’s financial-services repository.
Can frontier LLMs read UK company accounts? A 1,000-question verified benchmark and five-model evaluation comparing proprietary and open-weight systems. The core finding: LLMs can read the accounts well; the defensible asset is the clean, verified data layer.
Dataset distribution profile for published samples, gated datasets, and machine-readable data products.