Project 1: Glioblastoma Multiforme (GBM) Market Analysis (repository link)
This project examines the structure of the Glioblastoma Multiforme (GBM) market and existing treatment gaps using information collected from GBM patients. The data were provided by Michael Allen Company. For this project, I used R version 4.3.1 and the R packages dplyr, summarytools, forcats, tidyverse, psych, broom, knitr, and ggplot2.
You can access the project report Here.
Project 2: NYC Zip Code Level Population Changes (repository link)
This project examines NYC ZIP code-level population changes using USPS Change of Address (COA) data. You can access the project report HERE.
Project 3: Visualization and EDA (repository link)
This project is divided into three smaller projects. It uses “The Instacart Online Grocery Shopping Dataset 2017” (instacart), “Behavioral Risk Factors Surveillance System for Selected Metropolitan Area Risk Trends (SMART) for 2002-2010” (smart), and accelerometer data collected on 250 participants in the NHANES study (accel).
For a more detailed description of each project, please visit:
Project 4: Data Cleaning SOP (repository link)
This project is divided into three smaller projects. It uses the “FiveThirtyEight” data (538), the “Mr. Trash Wheel” data (trashwheel), and a dataset collected in an observational study to understand the trajectory of Alzheimer’s disease (AD) biomarkers (amyloid).
For a more detailed description of each project, please visit:
Project 5: Flexdashboard
For this project, I created a flexdashboard using a random sample of 500 observations from the instacart dataset. Click HERE to view.
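The sampling step can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the code behind the dashboard; the function name sample_rows, the seed value, and the stand-in row list are all hypothetical.

```python
import random


def sample_rows(rows, n=500, seed=1):
    """Draw n rows without replacement; the fixed seed makes the draw repeatable."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return rng.sample(rows, n)


# Hypothetical stand-in for the dataset: a list of row identifiers.
all_rows = list(range(10_000))
subset = sample_rows(all_rows)
print(len(subset))
```

Fixing the seed keeps the dashboard's underlying sample stable between renders, which makes the plots reproducible.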
Project 6: Investigating the Association Between Depression and Hypertension (repository link)
This is a group project for the Application of Epidemiological Research Methods class. Using NHANES 2017-2018 data, we analyzed the crude association between the binary exposure, Major Depressive Disorder, and the binary outcome, Hypertension, using bivariate logistic regression and the Pearson chi-square test. We also included potential confounding variables to assess the adjusted association using multivariable logistic regression and the Mantel-Haenszel chi-square test. We calculated odds ratios and their 95% confidence intervals for both the crude and adjusted associations.
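As a rough illustration of the quantities described above, the sketch below computes a crude odds ratio with a 95% Wald confidence interval from a 2x2 table, plus a Mantel-Haenszel pooled odds ratio across strata of a confounder. It is written in Python rather than SAS and is not the project's pipeline. The pooled odds ratio is a related stratified estimator, not the Mantel-Haenszel chi-square test itself, and all counts are made up for the example, not NHANES results.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald CI for a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    z = 1.96 gives an approximate 95% interval.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over (a, b, c, d) tables, one per stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts, for illustration only.
crude_or, lower, upper = odds_ratio_ci(20, 80, 10, 90)
pooled_or = mantel_haenszel_or([(10, 40, 5, 45), (10, 40, 5, 45)])
print(f"crude OR = {crude_or:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"Mantel-Haenszel pooled OR = {pooled_or:.2f}")
```

Comparing the crude and pooled estimates is one simple way to see whether a stratifying variable is confounding the association.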
I was responsible for conceptualizing and formalizing the research question, conducting the literature review, and SAS programming for data cleaning, analysis, and error-checking throughout the coding and analysis process.
You may access the final project abstract HERE and the SAS coding pipeline HERE.