Source in.
Clean data out.

Castor Catalyst reads source data and turns it into structured, audit-ready records. A human reviewer approves each value before it enters the clinical record.

 
 
How Catalyst works

Three steps. One pipeline.

Catalyst sits between your source documents and your EDC. It handles the typing, the mapping, and the validation, so your team handles the judgment calls.

Source arrives: a document is uploaded, de-identified, and marked ready for processing in Catalyst.

Choose your workflow

Two paths into Catalyst. Same pipeline. Same audit trail.

The right entry point depends on where your source lives, where your sites are, and the level of site involvement.

Global / upload any document

Site Upload

Sites upload source documents directly to Catalyst from any country, in any language. The AI extracts. Your team approves.

Use this if you’re running
  • A global trial with sites outside the US
  • Paper-heavy studies, scanned PROs, or lab panels (from sites, participants, or a central lab)
  • A paper-source migration program
  • A study where sites stay in their existing workflow
  • Highly unique studies requiring heavy site involvement, where traditional records fall short

US / EHR retrieval

Direct-to-Patient

Catalyst pulls records straight from a connected EHR via HIPAA release, TEFCA, or FHIR API. No site uploads. No fax chase.

Use this if you’re running
  • A US chart review or registry
  • A post-market surveillance study
  • An external control arm or real-world evidence program
  • A study where site burden has to stay near zero

Running both? Catalyst handles direct-to-patient retrieval and site uploads side by side, with one audit trail. The hybrid model removes the back-and-forth of chasing external physicians for records and then transcribing those PDFs by hand.

Security and compliance

Compliance isn't a feature. It's the spec.

Every source document Catalyst ingests is held in the Source Vault, an encrypted store that is logically segregated from your EDC. Your data is never sent to external LLM providers for training. Source is destroyed 30 days post-study with a written certificate.

01

Source Vault

Where Catalyst stores your original source documents. Encrypted at rest and in transit. Role-based access. Full audit trail. No bulk export. Logically segregated from your EDC environment.

02

PII handling

Identifiers are detected and isolated before storage. Only pseudonymized structured data flows into the CRF.

03

External LLMs never train on your data

Your data and de-identified derivatives never reach third-party AI models.

21 CFR Part 11

EU MDR

HIPAA BAA

GxP

FHIR R4

AES-256

ICF AI disclosure

30-day source purge

Peer reviewed

The Castor platform, rated by the people who use it.

Recognized by independent analysts

Rated strongly for customer loyalty in ISR’s 2025 eCOA/ePRO Benchmarking.

The questions others asked

Castor Catalyst is an AI-powered source data extraction layer for clinical trials. Catalyst reads source data from paper forms, PDFs, scans, CSVs, EHR exports, and FHIR feeds, extracts structured fields with confidence scores, applies your EDC’s edit checks, and lands clean, audit-ready data directly into the Castor EDC. Your team reviews and approves each extracted value before it enters the clinical record, or Castor’s medically trained staff can run the review as a service.
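The flow described above (extracted fields with confidence scores, edit checks, then human approval) can be pictured with a minimal sketch. This is an illustration under stated assumptions, not Catalyst's implementation: the names `ExtractedField`, `range_check`, `prepare_for_review`, and `approved_values`, the 0.9 confidence threshold, and the example field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ExtractedField:
    """One value pulled from a source document (hypothetical shape)."""
    name: str
    value: float
    confidence: float                      # assumed scale: 0.0 to 1.0
    flags: List[str] = field(default_factory=list)
    approved: bool = False                 # set only by a human reviewer


# An edit check returns an error message, or None when the value passes.
EditCheck = Callable[[float], Optional[str]]


def range_check(low: float, high: float) -> EditCheck:
    """Build a simple range edit check (illustrative only)."""
    def check(value: float) -> Optional[str]:
        if low <= value <= high:
            return None
        return f"value {value} outside {low}-{high}"
    return check


def prepare_for_review(fields: List[ExtractedField],
                       checks: Dict[str, EditCheck],
                       min_confidence: float = 0.9) -> List[ExtractedField]:
    """Flag low-confidence or failing values and list flagged fields first.

    Every field still goes to the reviewer; flags only set the order.
    """
    for f in fields:
        if f.confidence < min_confidence:
            f.flags.append("low confidence")
        check = checks.get(f.name)
        if check is not None:
            message = check(f.value)
            if message is not None:
                f.flags.append(message)
    # False sorts before True, so flagged fields come first (stable sort).
    return sorted(fields, key=lambda f: len(f.flags) == 0)


def approved_values(fields: List[ExtractedField]) -> Dict[str, float]:
    """Only reviewer-approved values are released to the clinical record."""
    return {f.name: f.value for f in fields if f.approved}


fields = [
    ExtractedField("systolic_bp", 120.0, 0.98),
    ExtractedField("heart_rate", 400.0, 0.95),
    ExtractedField("weight_kg", 70.0, 0.60),
]
checks = {"systolic_bp": range_check(60, 250), "heart_rate": range_check(20, 250)}
queue = prepare_for_review(fields, checks)
```

In this sketch nothing reaches the record until a reviewer sets `approved`, which mirrors the human-in-the-loop step described above. The flags only decide which values the reviewer sees first.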

Catalyst supports two workflows. Site Upload lets sites upload source documents directly and is available globally. Direct-to-Patient, offered as a service, uses HIPAA releases and Patient Right of Access to retrieve EHR data via FHIR APIs or TEFCA, and is currently available only in the US. Both workflows use the same AI pipeline and produce the same audit trail.

Castor Catalyst is built into the Castor electronic data capture system and complies with 21 CFR Part 11, HIPAA, EU MDR, and GxP. Source documents are stored in a logically segregated Source Vault and purged 30 days after study close.

Source documents land in a logically segregated, encrypted Source Vault. Identifiers are detected and isolated before storage. Only pseudonymized structured data flows into your EDC. Source is purged 30 days after study close with a written destruction certificate.
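As a rough illustration of isolating identifiers so that only pseudonymized data moves downstream, here is a minimal sketch. It assumes a fixed list of identifier fields and a keyed hash (HMAC-SHA256) as the pseudonym. The source does not describe how Catalyst detects identifiers or derives pseudonyms, so `IDENTIFIER_FIELDS`, `pseudonym`, `split_record`, and the field names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Tuple

# Hypothetical list of identifying fields. Real identifier detection in
# free-text source documents is considerably harder than a fixed list.
IDENTIFIER_FIELDS = {"name", "date_of_birth", "mrn"}


def pseudonym(subject_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a keyed hash."""
    digest = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    return "SUBJ-" + digest.hexdigest()[:12]


def split_record(record: Dict[str, object],
                 secret_key: bytes) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Separate identifiers from clinical data.

    Returns (identifiers, clinical). Only `clinical`, keyed by a pseudonym,
    would be passed on. `identifiers` would stay in the segregated store.
    """
    identifiers = {k: v for k, v in record.items() if k in IDENTIFIER_FIELDS}
    clinical = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
    clinical["subject"] = pseudonym(str(record["mrn"]), secret_key)
    return identifiers, clinical


record = {
    "name": "Jane Doe",
    "date_of_birth": "1980-01-01",
    "mrn": "12345",
    "hemoglobin_g_dl": 13.2,
}
identifiers, clinical = split_record(record, secret_key=b"example-key-only")
```

A keyed hash gives the same subject the same pseudonym across documents without exposing the underlying identifier, provided the key stays secret. Other schemes, such as a random ID with a lookup table, would serve the same purpose.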

Two options, depending on where your source lives, where your sites are, and the level of site involvement. Site Upload works globally. Sites in any country, any language, upload source documents directly. It fits paper-heavy studies, scanned PROs, lab panels, international trials, and highly unique studies requiring heavy site involvement where traditional records fall short. Direct-to-Patient is US-only because it depends on HIPAA release and TEFCA or FHIR endpoints unique to the US market. It fits US chart review, post-market registries, and EHR data retrieval at scale. Both routes use the same AI pipeline and produce the same audit trail.

From source document to verified eCRF.

Walk through Catalyst on one of your real protocols. We’ll bring a worked example, sample CTA language, and a per-site cost estimate.

What you'll see

  • Live extraction on your protocol's source
  • Role-based views for site, monitor and sponsor
  • Visual audit trail with bounding boxes
  • Sample CTA addendum for legal and IRB
