Hello, I’m Xuan Lu, a health researcher and data analyst in the Columbia University Department of Medicine. I completed my MPH in Epidemiology and Applied Biostatistics at the Columbia University Mailman School of Public Health in 2024 and hold a B.S. in Psychology from The Pennsylvania State University. This page summarizes my academic and professional experience. If you have any questions, feel free to email me.
Current Role — Data Analyst, Columbia University Department of Medicine (2024–Present)
Since completing my MPH, I have been working as a data analyst with Dr. Max O’Donnell and Dr. Matthew Cummings, supporting multiple international clinical research projects focused on MDR-TB, HIV, and sepsis. This role has given me the opportunity to lead end-to-end analyses — from study design and pipeline development through to manuscript preparation and publication.
Some highlights of this work: I conceived the study design and led a project team that applied large language model and natural language processing techniques to de-identified counseling session notes, uncovering latent barriers and protective factors of MDR-TB/HIV treatment adherence (manuscript under review at PLOS One). I also developed an automated operational monitoring and data quality assurance pipeline that reduced manual screening time from full-day cycles to under 20 minutes. Across the team’s projects, I have contributed to analyses spanning machine learning, survival analysis, multi-omics, and bioinformatics on 970+ patients and multiple international cohorts.
Across 2025 and 2026, I have contributed to 21 publications: 10 peer-reviewed journal articles, plus preprints and conference abstracts. A full list is available on my Google Scholar page.
Education
MPH, Columbia University Mailman School of Public Health (2022–2024) Major: Epidemiology and Applied Biostatistics | GPA: 4.0/4.0
My MPH training built a rigorous foundation across epidemiological methods, observational study design, biostatistics, and data management. My thesis — Identifying the Association between Neighborhood Crime Risk and Depression among High School Students in LA County, California — applied multilevel regression methods to the Happiness and Health Study dataset under the supervision of Dr. Katherine Keyes.
B.S., Pennsylvania State University (2018–2021) Major: Psychology (Life Sciences option); Minors: Special Education, Spanish | GPA: 3.7/4.0
My undergraduate training introduced me to the empirical study of human behavior, clinical assessment, and research methodology. I graduated with honors and was awarded the Ron and Sandie Musoleno Scholarship and the William and Estelle Turney Scholarship in the College of Education. My senior thesis, which examined the impact of environmental stimuli on gender-based language, was presented at the Psi Chi Penn State Undergraduate Research Conference (2021).
Graduate Teaching
During my MPH program, I served as a Graduate Teaching Assistant at Columbia Mailman School of Public Health across three courses: Research Methods & Applications (Quantitative), Introduction to Biostatistics, and Analysis of Categorical Data. My responsibilities included leading lab sessions, holding office hours, supporting lecture delivery, and grading.
Early Research Endeavors
Before coming to Columbia, I worked in clinical mental health research. At the Peking University Institute of Mental Health, I served as a research assistant under the mentorship of Dr. Weihua Yue. My work there included conducting structured clinical interviews and psychometric assessments (MINI, HAM-D/HAM-A, MADRS, MATRICS CCB) for patients with major depressive disorder, generalized anxiety disorder, panic disorder, and schizophrenia. I also contributed to causal inference analyses using Mendelian randomization to identify genetic variants associated with antipsychotic-induced weight gain, and supported predictive modeling for MDD treatment prognosis. This experience anchored my commitment to data-driven approaches in mental health research.
Applied Public Health Experience
During my practicum with Dr. Arash Alaei at the Institute for International Health and Education, I worked with Tajikistan’s national HIV database to characterize the impact of COVID-19 on linkage to care among people living with HIV. The dataset comprised over 15,000 observations and 280 variables. This experience reinforced the importance of data quality and careful analytical judgment in the context of real-world policymaking.