Research Statement

My research lies at the intersection of applied data science and the social and behavioral sciences. I focus on developing machine learning methods to support systematic reviews, with a particular emphasis on explainability, active learning, and reproducibility. By combining large-scale simulations, open science practices, and FAIR data standards, I aim to build tools that make evidence synthesis more transparent, efficient, and accessible to researchers across domains.

Research Areas

Applied Data Science

I develop algorithms and models for complex datasets to predict outcomes and generate new insights.

Machine Learning

I build models that improve prediction, classification, and decision-making in natural language processing and other areas.

Open Science

In my work, I promote transparency, reproducibility, and FAIR data practices in scientific workflows.

Generative AI

I work with generative AI models, focusing on making them reproducible, consistent, and fair. Because these systems can easily produce low-quality output, my work emphasizes careful, minimal application rather than overuse.

Natural Language Processing

I use NLP techniques to analyze text, particularly scientific text.

Systematic Review Automation

I work on tools that use active learning to accelerate and improve evidence synthesis.

Research Impact

My work has been widely adopted in the research community: the software I contributed to has been downloaded over half a million times, and my contributions to open datasets have surpassed four million downloads. These efforts not only advance open science but also directly improve the performance of machine learning methods for systematic reviewing. By increasing the accuracy and recall with which relevant records are identified, my work helps researchers across many fields, including medicine, where identifying every critical study can meaningfully strengthen the evidence base.