Data and methodology

Trust infrastructure starts with data provenance.

Compliance buyers care about provenance. This page explains where data comes from, how status is labelled, and how Thesmios avoids pretending that every match is a verified fact.

Sources

Corporate

Free

Companies House, Charity Commission, Insolvency Service, London Gazette

Regulatory

Free

FCA Register, SRA, GMC, NMC, HMRC MLR

Sanctions

Free

OFSI, OFAC, EU, UN, OpenSanctions

Courts and tribunals

Free

BAILII, Employment Tribunal judgments, civil court records where available

Media and public web

Free

Google News, Bing News, public social content where the subject has consented

Premium planned

Planned

ComplyAdvantage, OpenCorporates, Land Registry, LexisNexis

Verification levels

Verified and self-declared data must never look the same.

Source-verified

Checked against an authoritative register, API, issuer, or provider.

Self-declared

Provided by the subject, labelled clearly, and optionally escalated for review.

Public-source match

Matched from public records with match strength, source link, and review status.
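
One way to keep the three levels from ever looking the same is to make the level a required, typed field on every claim and to render it in the label. The sketch below is illustrative only: the `Claim` class, its field names, and the 0.0 to 1.0 match-strength scale are assumptions, not Thesmios's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VerificationLevel(Enum):
    SOURCE_VERIFIED = "Source-verified"
    SELF_DECLARED = "Self-declared"
    PUBLIC_SOURCE_MATCH = "Public-source match"


@dataclass(frozen=True)
class Claim:
    """Hypothetical record: every claim carries its verification level."""
    statement: str
    level: VerificationLevel
    source_url: Optional[str] = None         # assumed field name
    match_strength: Optional[float] = None   # assumed scale: 0.0 to 1.0
    review_status: Optional[str] = None      # e.g. "pending", "reviewed"

    def __post_init__(self) -> None:
        # A source-verified claim must point at the source it was checked against.
        if self.level is VerificationLevel.SOURCE_VERIFIED and not self.source_url:
            raise ValueError("source-verified claims need a source")
        # A public-source match must carry strength, source link, and review status.
        if self.level is VerificationLevel.PUBLIC_SOURCE_MATCH:
            missing = (
                self.source_url is None
                or self.match_strength is None
                or self.review_status is None
            )
            if missing:
                raise ValueError(
                    "public-source matches need strength, source link, and review status"
                )


def render_label(claim: Claim) -> str:
    """Prefix the statement with its level so the level is always visible."""
    return f"[{claim.level.value}] {claim.statement}"
```

With this shape, a claim cannot be built as source-verified without a source, and a self-declared claim always renders with its own label, for example `render_label(Claim("Registered director", VerificationLevel.SELF_DECLARED))`.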

Entity resolution

False positives are a product problem, not a footnote.

Phase 1 uses deterministic matching and search-based entity resolution with reviewer-visible match factors. Phase 2 adds specialist entity resolution tooling and stronger deduplication.

The target is not to hide uncertainty. The target is to show match strength, source quality, review status, and whether the subject has confirmed, disputed, or contextualised the finding.
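
As a rough illustration of deterministic matching with reviewer-visible match factors, the sketch below compares a subject with a public record field by field and returns the list of factors that agreed, rather than a bare yes or no. The field names, the rule thresholds, and the strength labels (`strong`, `possible`, `weak`, `none`) are assumptions made for the example, not the production rules.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict, List


def normalise(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


@dataclass
class MatchResult:
    factors: List[str] = field(default_factory=list)  # shown to the reviewer
    strength: str = "none"                            # assumed labels
    # Subject's response to the finding: "none", "confirmed",
    # "disputed", or "contextualised".
    subject_response: str = "none"


def match(subject: Dict[str, str], record: Dict[str, str]) -> MatchResult:
    """Deterministic, rule-based comparison; the same inputs always give the same result."""
    result = MatchResult()
    if normalise(subject["name"]) == normalise(record["name"]):
        result.factors.append("name")
    # Hypothetical secondary identifiers; compared only when the subject has them.
    for key in ("date_of_birth", "postcode"):
        if subject.get(key) and subject.get(key) == record.get(key):
            result.factors.append(key)

    # Illustrative thresholds: a name match plus corroborating identifiers.
    if "name" in result.factors and len(result.factors) >= 3:
        result.strength = "strong"
    elif "name" in result.factors and len(result.factors) == 2:
        result.strength = "possible"
    elif result.factors:
        result.strength = "weak"
    return result
```

Because the result keeps the individual factors, a reviewer can see that a finding rests on, say, name and date of birth but not postcode, and the subject's response can be recorded alongside it instead of overwriting the match.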

Refresh cadence

Static credentials

Refreshed on expiry, change, or user update.

Public lists

Refreshed according to source cadence and risk tier.

Monitoring

Daily or event-driven where supported by source and plan.
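
The three refresh rules above can be sketched as small policy functions. Everything concrete here is assumed for illustration: the risk-tier names, the interval values, and the choice to refresh no more often than the source itself publishes.

```python
from datetime import datetime, timedelta

# Hypothetical risk tiers and intervals; real values depend on plan and policy.
RISK_TIER_INTERVALS = {
    "high": timedelta(days=1),
    "standard": timedelta(days=7),
    "low": timedelta(days=30),
}


def credential_needs_refresh(
    expires_at: datetime,
    now: datetime,
    changed: bool = False,
    user_updated: bool = False,
) -> bool:
    """Static credentials: refresh on expiry, change, or user update."""
    return changed or user_updated or now >= expires_at


def next_public_list_refresh(
    last_refreshed: datetime,
    source_interval: timedelta,
    risk_tier: str,
) -> datetime:
    """Public lists: combine source cadence and risk tier.

    Assumption: polling faster than the source publishes adds nothing,
    so the longer of the two intervals wins.
    """
    interval = max(source_interval, RISK_TIER_INTERVALS[risk_tier])
    return last_refreshed + interval


def monitoring_mode(source_supports_events: bool, plan_includes_events: bool) -> str:
    """Monitoring: event-driven only where both source and plan support it."""
    if source_supports_events and plan_includes_events:
        return "event-driven"
    return "daily"
```

For example, a list published daily but attached to a `standard` tier subject would next refresh seven days after the last run under these assumed values.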