Registrar of Voters Office, County of Santa Clara · San Jose, CA
Public Service
To be updated...
Data Scientist I
03/2026 – Present
W74 · Remote
Data Science · Energy Consulting · Building Energy Consumption · Python · Time-series · XGBoost · Random Forest · Deep Neural Network · LSTM · Transformer · Sklearn · PyTorch · GitHub Codespace · Claude CLT
Provided statistical and deep learning expertise to inform clients' decisions on energy consumption.
Computational Researcher
07/2025 – Present
the Gladstone Institutes · San Francisco, CA
R Package Dev · Differential Gene Expression Analysis · Variant Effect Prediction · Multi-omics · scRNA-seq · scATAC-seq · R · Python · Permutation Testing · Pseudobulk · Case-control · Transformer · devtools · Seurat · ArchR · tidyverse · ggplot2 · AlphaGenome · Synapse
Pioneered the development of an R package (permuteDE) with a permutation testing-based pipeline to reduce false positives in differential gene expression analysis in Alzheimer's disease.
Collected and integrated eight massive-scale public scRNA-seq datasets from Synapse to demonstrate the package's usability.
Applied AlphaGenome to study how MAPT gene variants might alter splicing patterns.
Removed doublets and performed dimensionality reduction and clustering on scATAC-seq data.
Communicated complex statistical findings to diverse audiences through clear, effective data visualizations.
Student Life Assistant
02/2024 – 01/2025
Stanford Online High School · Remote
Administrative Support · Schedule Planning · Event Coordination · Student Records Management · Presentation Slide Creation · Google Sheet/Excel · Google Slide/PowerPoint
Supported event planning, including meeting agendas, student activities, and arrival/departure bus schedules.
Designed engaging slides for weekly themed class meetings.
Maintained student records in Google Sheets, using data entry and Excel functions with strong attention to detail.
Led discussions, graded homework and assignments, held office hours, and bridged communication between professors and students.
Maintained course materials and student records in Canvas and Gradescope.
Student Food Assistant
02/2022 – 05/2022
Cornell Dining · Ithaca, NY
Customer Service · Dietary Protocol
Replenished food in the Morrison dining hall.
Made pizzas and chow mein to feed hungry students.
Learned about dietary guidelines and protocols.
Data Analyst Project Intern
07/2021 – 09/2021
Tencent · Remote
Machine Learning Prediction · E-commerce Sales Data · Python · Exploratory Data Analysis · Regression · Data Visualization · numpy · pandas · Sklearn · matplotlib · Beautiful Soup
Scraped e-commerce sales data with Python and built linear regression models via Sklearn to forecast sales trends across product categories.
Delivered evidence-based marketing reports with matplotlib visualizations.
Project Highlight
Computational Researcher · the Gladstone Institutes (the Corces Lab) · 07/2025 – Present
permuteDE: Permutation Testing for Differential Gene Expression Analysis
R Package Dev · Case-control Statistical Testing · Differential Gene Expression Analysis · sc/snRNA-seq · R · Pseudobulk · Permutation Testing · Seurat · BPCells · edgeR · DESeq2 · tidyverse · ggplot2
Differential expression analyses are susceptible to false positives. permuteDE uses permutation testing to identify which comparisons have a higher number of significant differentially expressed features than would be expected by chance.
Developed the permuteDE R package, overseeing maintenance, debugging, and feature development.
Collected and integrated eight large-scale public snRNA-seq datasets from Synapse, performing rigorous data cleaning, metadata curation, and structuring data into Seurat objects and BPCells matrices.
Applied the permuteDE pipeline across eight datasets to demonstrate its real-world utility.
Note: Currently we are finalizing the package and analyses; permuteDE will be released upon manuscript publication.
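The permutation idea described above can be illustrated with a minimal Python sketch. This is not the permuteDE implementation (the package is written in R and not yet released); the function names, the Welch-style t statistic used as a stand-in for an edgeR/DESeq2 test, and the `threshold` cutoff are all assumptions made for illustration. The sketch counts how many features look significant under the real case/control labels, then asks how often shuffled labels produce at least as many.

```python
import random
from statistics import mean, stdev


def count_significant(expr, labels, threshold=2.0):
    """Count features whose case-vs-control t statistic exceeds `threshold`.

    `expr` is a list of features, each a list of per-sample values.
    `labels` holds 1 for case and 0 for control, one entry per sample.
    The t statistic is a hypothetical stand-in for a real DE test.
    """
    n_sig = 0
    for feature in expr:
        case = [v for v, g in zip(feature, labels) if g == 1]
        ctrl = [v for v, g in zip(feature, labels) if g == 0]
        se = (stdev(case) ** 2 / len(case) + stdev(ctrl) ** 2 / len(ctrl)) ** 0.5
        if se > 0 and abs(mean(case) - mean(ctrl)) / se > threshold:
            n_sig += 1
    return n_sig


def permutation_p_value(expr, labels, n_perm=200, threshold=2.0, seed=0):
    """Return (observed count, empirical p-value) for the number of hits.

    The p-value is the share of label shuffles that yield at least as many
    significant features as the real labels, with a +1 correction so it is
    never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_significant(expr, labels, threshold)
    shuffled = list(labels)
    n_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps group sizes, breaks the label link
        if count_significant(expr, shuffled, threshold) >= observed:
            n_as_extreme += 1
    return observed, (n_as_extreme + 1) / (n_perm + 1)
```

A small p-value suggests the comparison yields more significant features than chance labeling would; a large one flags a comparison whose hits may be mostly false positives.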
Graduate Student Researcher · Stanford University (the Kundaje Lab) · 04/2024 – 06/2025
Deep Learning + Single-Cell Multi-Omics to Study Schizophrenia & Bipolar Disorder
As part of the PsychENCODE project in Prof. Anshul Kundaje's lab, I led an analysis of scRNA-seq and scATAC-seq multiome data from three brain regions, investigating the cell type-specific chromatin accessibility and gene expression patterns underlying schizophrenia and bipolar disorder to better understand their pathogenesis.
Preprocessed high-dimensional multiome data through quality control, dimensionality reduction, clustering, doublet detection, marker identification, and cell type annotation.
Performed differential gene expression & peak accessibility analyses (pseudobulk: Wald test; single-cell: Wilcoxon + MAST) across multiple cell types.
Trained cell type-specific ChromBPNet models to identify causal genomic variants and their downstream effects on chromatin accessibility and gene regulation.
Graduate Student Researcher · UCSF (the Graff Lab) · 10/2023 – 06/2025
Black–White Metabolomic Disparities in Prostate Cancer
Black men face higher prostate cancer incidence and mortality than White men. This project leverages LC-MS metabolomics on a cohort of 34 men (17 Black, 17 White) from the IRONMAN Registry to characterize metabolites associated with such disparities. I presented this project as my Master's thesis.
Applied t-test, PCA, PLS-DA, random forest, logistic & linear regression, chemical similarity enrichment analysis, pathway analysis, and WGCNA to LC-MS metabolomics data, characterizing metabolites associated with Black-White prostate cancer disparities.
Interpreted complex statistical findings and communicated them through clear scientific writing and effective data visualizations; defended before Stanford Epidemiology faculty and students.
This project visualized the interplay between life expectancy (measured by central death rates) and economic development, healthcare access, energy production, and consumption patterns.
Developed interactive data visualization dashboards (global heatmaps, cross-country correlation analyses, and time-evolving bubble charts) using JavaScript D3.
Class Project · Stanford University · 03/2024 – 06/2024
[Hypothetical] Integrating Single-Cell Multi-omics to Investigate Alzheimer’s Disease
NIH Grant Proposal Practice · Literature Review · Team Collaboration · Google Doc
In this project, I practiced NIH-style grant proposal writing by developing a theoretical proposal that uses factor analysis to integrate single-cell multi-omics data and investigate the molecular mechanisms underlying Alzheimer's disease.
Drafted the Specific Aims section of a conceptual NIH grant proposal and presented the multi-omics factor analysis plan to ~30 professors and students.
Class Project · Stanford University · 10/2023 – 12/2023
Subarachnoid Hemorrhage Covariate Association and Risk Modeling
Data Management · Disease Modeling · Case-control Statistical Testing · Clinical and Demographic Data · SAS · t-test · chi-square · Univariate and multivariable logistic regression · Mantel–Haenszel test · SAS OnDemand
This project evaluated associations between multiple exposures (age [categorical and continuous], sex, race, and smoking) and subarachnoid hemorrhage (SAH), and assessed alcohol as an effect modifier of the smoking–SAH relationship.
Cleaned and managed case-control datasets in SAS OnDemand: resolved duplicate records, standardized missing values, and created derived variables.
Conducted descriptive analyses (chi-square tests and t-tests) and applied univariate and multivariable logistic regression to estimate crude and adjusted odds ratios for smoking.
Performed stratified analysis using Mantel–Haenszel testing to adjust for alcohol use as a confounder.
Generated summary tables and data visualizations for reporting.
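The Mantel–Haenszel step above can be sketched as follows. This is a minimal Python illustration of the pooled odds ratio, not the SAS code used in the project; the function name `mantel_haenszel_or` and the (a, b, c, d) table layout are assumptions made for the example, and no project data is reproduced here.

```python
def mantel_haenszel_or(strata):
    """Pooled (Mantel-Haenszel) odds ratio across strata.

    Each stratum is a tuple (a, b, c, d) of counts from one 2x2 table:
      a = exposed cases,    b = exposed controls,
      c = unexposed cases,  d = unexposed controls.
    Within a single stratum the odds ratio is (a * d) / (b * c).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator
```

With one stratum per level of the stratifying variable (for example, alcohol use yes/no), the pooled estimate can be compared with the crude odds ratio; a noticeable gap between the two suggests confounding by that variable.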
Nominated as Honorable Mention for Best UI for Hack Challenge Spring 2022.
This project developed an iOS app that simplifies booking a study room in Cornell's libraries and study spaces.
Programmatically developed a Cornell library study room booking iOS app using UIKit, Auto Layout, Navigation, UITableView & UICollectionView, MVC, Delegation, and Animation.
Implemented networking against the backend API using Alamofire: GET all libraries and available rooms, POST new reservations, UPDATE reservation history, and DELETE reservations.
Implemented UI with designers and collaborated with backend teammates for backend requests.
Led discussions, graded homework and exams, held office hours, and proofread problem sets before release.
Grader
BTRY 3080: Probability Models and Inference
Cornell University
08/2022 – 08/2022
Graded homework and exams.
Teaching Assistant (Summer)
Introductory Biology & Physics I
JNC Study Abroad Platform
07/2022 – 08/2022
Led discussions, graded homework and assignments, held office hours, and bridged communication between professors and students.
Teaching Assistant
BIOMG 2801: Laboratory in Genetics and Genomics
Cornell University
01/2021 – 05/2021
CRISPR/Cas9 knockout work on fruit flies; guided 20 students in mutation analysis on the UCSC Genome Browser.
Publications
[1] Shin, J.‡, Brady, E.‡, Chen, C., Lauderdale, K., Agrawal, A., Zhang, Y., Jiang, X., Nambiar, P., Herbert, J., Mallen, D., Ly, K., Guo, Z., Sant, C., Thomas, R., Miller, S., Cobos, I., Palop, J. APOE4 and Aβ synergize to drive neuronal network dysfunction and lysosomal-ER proteostasis dysregulation in the preclinical stages of Alzheimer's disease, Nature Neuroscience, 2025. Submitted.
[2] Poster session presenter and first author, Causal effect of type II diabetes on prostate cancer in the East Asian population: A two-sample Mendelian randomization study, AACR Special Conference: Aging and Cancer, 2022. Published.