Bridging the Data Gap in Diabetes Care: Prioleau’s DiaTrend Dataset Offers a New Lens on Type 1 Diabetes Management

In an era where data-driven innovation is reshaping healthcare, there is a powerful new tool accelerating breakthroughs in diabetes research: the DiaTrend Dataset. Developed by Dr. Temiloluwa Prioleau, a new Assistant Professor in the Department of Computer Science at Emory University joining us from Dartmouth College, the dataset includes over 27,000 days of continuous glucose monitor (CGM) data and more than 8,000 days of insulin pump data from individuals with type 1 diabetes. Now publicly available, DiaTrend offers researchers a rare window into real-world diabetes management.
“We created DiaTrend in response to a critical need for high-quality, open datasets in the diabetes space,” said Prioleau. “Despite the rapid advances in diabetes technology, there’s been a real bottleneck in access to clinical-grade data that researchers and developers can use to build robust decision-support tools.”
A Vision Rooted in Equity and Innovation
DiaTrend was created not only to provide raw data but also to inspire a new wave of collaborative, reproducible research in diabetes. It supports a wide range of tasks—from predicting adverse blood glucose events to uncovering behavioral trends and designing patient-centered tools.
“We envision DiaTrend as a foundational resource for researchers to tackle both well-established and emerging challenges in diabetes care,” said Prioleau. “Our hope is that it sparks collaboration, enables reproducible research, and accelerates the development of personalized, tech-enabled care strategies.”
The dataset already supports over 70 researchers across the globe, all of whom have shared their intended data use publicly to promote transparency and community learning.
Behind the Dataset: Rich in Data, Real-World Limitations
DiaTrend includes an average of 510 days of CGM data per participant, recorded every five minutes using devices like the Dexcom G6—resulting in up to 150,000 glucose data points per person. Insulin pump data covers about 152 days per participant. Demographic and clinical information, such as age, gender, race, and hemoglobin A1C levels, is also included.
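As a rough check on those figures, the short sketch below works out how many readings a five-minute sampling interval yields over the average 510 days of CGM wear. It assumes continuous recording with no sensor gaps, which real-world data rarely achieves, and the function names are illustrative rather than part of any DiaTrend tooling.

```python
# Back-of-the-envelope check of the CGM figures quoted in the article.
# Assumption: uninterrupted recording (no sensor warm-up periods or dropouts).

MINUTES_PER_DAY = 24 * 60
SAMPLE_INTERVAL_MIN = 5   # one reading every five minutes (from the article)
AVG_CGM_DAYS = 510        # average CGM days per participant (from the article)


def readings_per_day(interval_min: int = SAMPLE_INTERVAL_MIN) -> int:
    """Number of CGM readings in one full day at the given interval."""
    return MINUTES_PER_DAY // interval_min


def expected_readings(days: int, interval_min: int = SAMPLE_INTERVAL_MIN) -> int:
    """Upper bound on readings over `days` of uninterrupted wear."""
    return days * readings_per_day(interval_min)


if __name__ == "__main__":
    print(readings_per_day())               # 288
    print(expected_readings(AVG_CGM_DAYS))  # 146880
```

At 288 readings per day, 510 days gives 146,880 readings, which is in line with the article's figure of up to 150,000 glucose data points per person.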
While the dataset offers a wealth of high-frequency health data, it currently lacks contextual information such as physical activity or diet, a gap Dr. Prioleau is determined to address in future efforts. “Understanding the impact of what someone ate, how much they exercised, or how they slept is critical for precision diabetes care,” she said.
Prioleau is candid about the limitations of the DiaTrend dataset. Most participants are non-Hispanic White and female, and older adults are underrepresented. In some cases, the CGM and insulin pump data are also not perfectly aligned in time.
“We recognize the limitations in representation and data synchronization,” she said. “But DiaTrend is only the beginning. Our vision is to expand the dataset to reflect a more diverse population and to include additional forms of wearable data. Equity and inclusivity are central to our next steps.”
Moving Forward: A Catalyst for Change
DiaTrend was derived from two larger studies focused on digital self-management tools for young adults with type 1 diabetes. Participants were selected based on their use of CGM and insulin pump technologies, and their device data was downloaded retrospectively using platforms like Tidepool and Glooko.
“Our inclusion criteria are rooted in the need to work with high-quality, dense data from individuals who rely on advanced technology,” said Prioleau. “But we’re committed to broadening that scope as we grow.”
Looking ahead, Dr. Prioleau and her team envision DiaTrend as an important resource for data science and technology innovators working to build next-generation solutions that help patients better manage their condition and prevent serious complications. She also wants to join forces with related efforts in the field.
Dr. Prioleau and her team recently worked on Glucose-ML – a collection of 10 public diabetes datasets comprising glucose data from people with type 1 diabetes, type 2 diabetes, and prediabetes – to further accelerate data-driven research in the field. This work is currently under review, but the preprint is available at https://arxiv.org/abs/2507.14077.
“At the end of the day, our goal is simple,” Prioleau emphasized. “We want the DiaTrend and Glucose-ML datasets to support research and development of technology that can help people living with or at risk of diabetes.”