We can imagine our health as a jigsaw, with each individual piece representing a different aspect of our medical history. These pieces might include blood test results, X-ray images or the notes taken by a doctor as we describe our symptoms. These jigsaw pieces are ultimately recorded and stored in electronic health records (or EHRs). EHRs are a valuable resource: they provide an overview of someone’s health, and they have the potential to allow clinicians and researchers to unlock new medical insights. However, there’s a fly in the ointment – not all the pieces in such records fit together correctly, and they may not completely capture the required information. Some clinical events are incompletely documented, others do not align with related records, and some are missing entirely. This data quality problem was tackled by Dr. Hanieh Razzaghi of the Children’s Hospital of Philadelphia, and her colleagues, in their innovative work on the PRESERVE study, a research project exploring chronic kidney disease in children (the PRESERVE study itself was led by Drs. Michelle Denburg and Christopher Forrest). Using EHRs from 15 different hospitals across the United States, the team aimed to understand how various treatments could slow the progression of chronic kidney disease. First, however, they had to make sure that the data they were relying on were accurate, reliable, and suitable for the required complex analyses.
When researchers use EHRs to gain medical insights, they are not starting with information that was originally collected for scientific purposes. These records were created to help doctors diagnose and treat individual patients, not to answer research questions. As a result, the data might have gaps or quirks. Imagine trying to study a population’s eating habits but only receiving records from diners who forgot to order dessert. Without the full story, it’s easy to draw the wrong conclusions.
In medical research, these kinds of errors can have serious consequences. Misclassification of a patient’s condition, missing data points, or inconsistencies in how tests are reported can skew results and, worse, lead to incorrect recommendations for patient care. Recognizing this, Dr. Razzaghi’s team set out to systematically assess and improve the quality of the data they were using for the PRESERVE study.
The PRESERVE study is a large-scale investigation focused on children with chronic kidney disease, a condition that can lead to kidney failure and other serious health problems. The study examines whether controlling high blood pressure, a key risk factor in kidney disease, can slow down the decline in kidney function. To do this, researchers needed high-quality data on everything from blood pressure readings and kidney function tests to the medications children received and how often they visited specialists.
However, combining EHR data from 15 hospitals presented a major challenge. Each hospital used slightly different systems, recorded information in unique ways, and sometimes left out important details. To address these issues, the research team employed a rigorous approach called Study-Specific Data Quality Assessment (SSDQA).
The SSDQA framework was grounded in a theoretical model of data quality, so that the checks it produces are reproducible beyond this single use case. Applying the framework in this study involved two rounds of detailed testing. In the first round, the researchers ran checks on high-level summaries of the data. This helped them spot broad problems, such as missing test results or inconsistent coding for procedures. For instance, they found that some hospitals didn’t record certain kidney function tests at all, while others used different codes for the same procedures.
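To make the idea of a high-level check concrete, here is a minimal sketch of how one might flag a hospital whose summary counts suggest a test is barely being recorded. This is not the SSDQA code itself; the site names, counts and threshold are invented for illustration.

```python
# Illustrative sketch (not the PRESERVE/SSDQA implementation):
# a high-level check comparing how completely each site records
# a given lab test. All numbers below are invented.
site_summaries = {
    # site: (patients in cohort, patients with >=1 serum creatinine result)
    "site_A": (1200, 1140),
    "site_B": (950, 910),
    "site_C": (800, 35),   # suspiciously low -> should be flagged
}

def flag_low_completeness(summaries, min_fraction=0.5):
    """Return sites whose recorded fraction falls below a threshold."""
    flagged = []
    for site, (n_total, n_with_lab) in summaries.items():
        fraction = n_with_lab / n_total
        if fraction < min_fraction:
            flagged.append((site, round(fraction, 3)))
    return flagged

print(flag_low_completeness(site_summaries))  # → [('site_C', 0.044)]
```

A flag like this does not prove the site has a problem; it simply tells the study team where to look first, which mirrors how summary-level checks direct attention before any row-level investigation.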
In one striking example, the team noticed significant gaps in the recording of a key kidney function test, serum cystatin C, which is crucial for understanding how well a patient’s kidneys are working. Because this test wasn’t available consistently across hospitals, the researchers had to adjust their plans and use another measure, serum creatinine, which was more widely reported but less precise.
This round of testing also uncovered problems at two institutions with identifying patient visits to nephrologists. Because the issue was caught early, it could be remediated and those institutions’ data included in the study. Had it been identified later, or without this systematic process, the data from these institutions would have risked introducing substantial bias into the results.
The second round of testing focused on row-level data, which includes individual patient records. This deeper dive revealed subtler issues, such as anomalies in how frequently certain tests were performed or whether important diagnoses were missing entirely. For example, one hospital recorded abnormally high rates of specific lab results, likely due to technical errors in how the data was extracted from their system.
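The row-level stage described above can be sketched in a few lines: checks that walk individual records looking for duplicates or implausible values, the kinds of anomalies that extraction errors tend to introduce. The records, test name and plausibility bounds below are invented for illustration, not taken from the study.

```python
# Illustrative sketch (invented data, not the study's actual checks):
# a row-level pass that flags duplicate lab rows and values outside
# an assumed plausible range.
rows = [
    {"patient": "p1", "test": "serum_creatinine", "date": "2021-03-01", "value": 0.6},
    {"patient": "p1", "test": "serum_creatinine", "date": "2021-03-01", "value": 0.6},   # duplicate
    {"patient": "p2", "test": "serum_creatinine", "date": "2021-04-10", "value": 85.0},  # implausible
]

PLAUSIBLE = {"serum_creatinine": (0.1, 15.0)}  # assumed mg/dL bounds

def row_level_checks(rows):
    """Return (row index, issue) pairs for duplicates and out-of-range values."""
    seen, issues = set(), []
    for i, r in enumerate(rows):
        key = (r["patient"], r["test"], r["date"], r["value"])
        if key in seen:
            issues.append((i, "duplicate row"))
        seen.add(key)
        lo, hi = PLAUSIBLE[r["test"]]
        if not (lo <= r["value"] <= hi):
            issues.append((i, "value out of plausible range"))
    return issues

print(row_level_checks(rows))  # → [(1, 'duplicate row'), (2, 'value out of plausible range')]
```

In practice, a flagged row is a prompt for a conversation with the source hospital’s data team, since only they can tell whether it reflects a real clinical event or an extraction artefact.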
Dr. Razzaghi and her colleagues identified and resolved over 270 data quality issues across the two rounds of testing. These ranged from missing blood pressure measurements to inconsistencies in how dialysis procedures were recorded. By working closely with each hospital’s data teams, they were able to correct many of these problems, ensuring that valuable patient data wasn’t unnecessarily excluded from, or misinterpreted by, the study.
Some of the most significant improvements were in completeness. At one participating institution, the percentage of patients with valid urine protein test results, a key indicator of potential kidney damage, rose from less than 5% to over 70% after data issues were addressed. The researchers also enhanced data accuracy: for instance, they corrected errors in how patient heights were recorded, which had affected the calculation of kidney function through a measure called estimated glomerular filtration rate. They also helped the analytics team decide how to obtain more precise data about dialysis, relying on the United States Renal Data System instead of the source EHR records, because the complex segmentation of dialysis care was inconsistently captured across institutions.
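The link between recorded height and estimated glomerular filtration rate can be illustrated with the widely used “bedside Schwartz” pediatric equation, in which the estimate depends linearly on height. The sketch below uses that standard formula with invented patient values; the article does not state which eGFR equation the study used, so this is illustrative only.

```python
# Illustrative sketch of why a height-recording error matters:
# the bedside Schwartz pediatric eGFR estimate is proportional to
# height, so a miscoded height shifts eGFR by the same factor.
# Formula: eGFR ≈ 0.413 * height_cm / serum_creatinine_mg_dL.
# Patient values below are invented.
def bedside_schwartz_egfr(height_cm, serum_creatinine_mg_dl):
    """Estimated GFR in mL/min/1.73 m^2 (bedside Schwartz)."""
    return 0.413 * height_cm / serum_creatinine_mg_dl

correct = bedside_schwartz_egfr(140.0, 0.8)   # height recorded in cm
miscoded = bedside_schwartz_egfr(1.40, 0.8)   # same height stored in metres
print(round(correct, 1), round(miscoded, 1))  # → 72.3 0.7
```

A unit mix-up like the one shown turns a healthy-looking kidney function estimate into one suggesting near-total failure, which is why correcting height records was worth the effort.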
These efforts not only improved the quality of the PRESERVE study but also highlighted broader issues in how EHR data is used for research. Dr. Razzaghi’s work is a powerful reminder that good research starts with good data. By systematically identifying and fixing data problems, the team ensured that the PRESERVE study could provide meaningful insights into how to better care for children with chronic kidney disease.
However, the benefits of their approach extend far beyond this single study. The tools and methods developed by Dr. Razzaghi’s team can be applied to other research projects, helping to improve the quality of medical studies worldwide. And by collaborating with hospital data teams, they’ve also set the stage for more accurate and reliable EHR systems in the future.