The problem Sangrag Ganguli, MD, put on the table wasn’t abstract. “Anywhere from ten to forty five minutes of chart review per patient entry,” he said. “And this is a pretty big time commitment.” As General Surgery Resident and inaugural AIS fellow through the SAGES HPB solid organ committee at the University of Chicago Medicine, Sangrag oversees two national minimally invasive surgery registries, aMILES and aMIPS, spanning 21 centers across North America. At any given time, his team runs around 10 active research projects, most requiring someone to open an EMR, trace through operative reports and discharge summaries, identify the relevant variables, and enter them by hand.
The May 6 SAGES x Castor Catalyst webinar, hosted by Derk Arts, MD, PhD, CEO of Castor, was about what changes when that work looks different.
What the session covered
The hepatobiliary database Sangrag’s team maintains is granular. Preoperative data includes patient demographics, frailty index, and tumor imaging. Intraoperative fields capture surgical approach (laparoscopic, robotic, open), conversion status, blood loss, and pathology margins. Postoperative data requires prospective follow-up to record complications, readmissions, and cancer recurrence. Each entry means cross-referencing the operative report, clinic notes, discharge summary, and imaging records, then translating all of it into structured CRF variables. “It’s very careful, detail-oriented work,” Sangrag said, “but it’s also extremely time consuming.”
Variability compounds the problem. Different reviewers interpret the same documentation differently. The same reviewer introduces inconsistencies across a long patient list, especially when balancing chart review against clinical responsibilities. This is a data quality problem as much as an efficiency one.
Derk walked through Castor Catalyst in a live interface. The workflow: upload a surgical source document, Catalyst auto-de-identifies PHI, extracts structured data from free text, and presents each result with a source citation pointing back to its exact location in the document. The reviewer accepts, overrides, or adjusts, then submits into the electronic data capture system. “You basically go from manually entering it one by one to reviewing it in bulk,” Derk said, “but always being able to see where it found the information.”
Sangrag’s reaction was immediate. He walked through the variables Catalyst would handle in his own registries: liver segments resected, surgical approach, conversion status, blood loss, anesthesia type, and positioning.
“I think that’s where this platform really shines… not only giving you the data, but also showing you where that data was gotten from.”
— Sangrag Ganguli, MD, SAGES x Castor Catalyst webinar, May 6, 2026
The more substantial point came at the close. It was addressed directly to surgical trainees watching the session.
“I think the important point is that this automation really doesn’t replace you. [I] think this really elevates the role of the trainee that frees up your time to do more high value work like study design, learn how to do your own stats or refine your manuscript writing skills and so on. And those are the skills that really define academic development and your impact within research, not how well you can extract patient information from the EMR.”
— Sangrag Ganguli, MD
What’s in the full recording
The 35-minute recording includes several exchanges not covered here:
- Sangrag walks through the complete variable set the hepatobiliary database collects, from preoperative labs to long-term cancer recurrence follow-up, giving a precise picture of what 10 to 45 minutes per chart actually means in practice 09:33
- A direct exchange about trust in AI: Sangrag rates himself as a beginner in AI adoption and walks through exactly which research tasks he would and wouldn’t rely on AI for today 13:14
- The live de-identification step: Catalyst draws redaction boxes on a surgical PDF in real time before extraction begins, and the interaction between the document view and the review interface shows what accepting or overriding a specific extracted value actually looks like 23:26
- How a new extraction pipeline is calibrated from five example files, a subject matter expert voice-over, and synthetic training data, with performance benchmarked against a gold-standard reference set 27:19
- The custom domain architecture: how a hepatobiliary surgical report domain built for aMILES can be copied across future studies without rebuilding the pipeline from scratch 34:05
The recording is 35 minutes. It’s built for PIs managing multi-center surgical registries, research residents, and anyone responsible for the data quality of a study that still runs on manual chart review.
Questions about this session
Answers drawn from the session transcript. Production performance data sourced from Castor internal results — confirm with Derk’s team before publishing.