Zhongyi (James) Guo 过仲懿🌈



Get In Touch

outreach at gzy1s dot me

M+P-AMA-DG1J (text me to set up a call)

Skill Set


🤖 Machine Learning

🧠 Deep Learning

👨🏻‍💻 Data Science

📊 Statistical Modeling

📝 NIH Proposal Writing

📄 Shell Scripting

Certifications


SAS

Coursera Badges & Courses

DeepLearning.AI

IBM AI Engineering

Neural Networks with PyTorch (In progress)

Stanford

Writing in the Sciences (In progress)

Languages


  • English
  • Mandarin

Log


11/20/2024: Updated research experience. I will update project experience and work experience shortly.

11/12/2024: Minor updates applied to the certification section.

08/12/2024: I redesigned my website. Now I'm slowly migrating project experience and work experience over.

About Me


I am James Guo 过仲懿. I am from Wuxi (无锡人🇨🇳) and am now based in the United States 🇺🇸.

My Research Interests:

  • AI & Stats + Omics
  • Quantitative Finance (curious and exploring)

My Personal Life:

  • laid-back bodybuilding
  • laid-back butterfly swimming
  • laid-back gymnastics

Please do not hesitate to contact me. Thanks!

Education



Publications


[1] Guo, Z.‡, Chen, D.‡, Stopsack, K. H., Soule, P., Ajit, D., Ramamoorthy, P., Hoffmann, T. J., Chan, J. M., Mucci, L. A., Graff, R. E. (2025). Metabolomic Disparities Between Black and Non-Hispanic White Men with Metastatic Hormone-Sensitive Prostate Cancer: A Pilot Study. (Manuscript in preparation)

[2] Poster session presenter and first author, Causal effect of type II diabetes on prostate cancer in the East Asian population: A two-sample Mendelian randomization study, AACR Special Conference: Aging and Cancer, 2022. published ↗

‡ indicates co-first authorship.

Experience


Research

Deep Learning + Single-Cell Multiomics to Study Schizophrenia and Bipolar Disorder
Graduate Researcher @ Stanford 04/2024 - Present

I was very fortunate to join the PsychENCODE project in Prof. Anshul Kundaje's lab. In this project, I analyzed scRNA-seq and scATAC-seq multiome data from PsychENCODE, covering three brain regions, to investigate the pathogenesis of schizophrenia (SCZ) and bipolar disorder (BPD).

Since chromatin accessibility and gene regulation are highly cell-type-specific, this project analyzes them at single-cell resolution. The overarching goal is to understand how variations in chromatin accessibility and gene expression across cell types, relative to controls, contribute to the pathogenesis of SCZ and BPD.

Here is what I have done:

  • Pre-processed the single-cell multiome dataset: quality control, dimensionality reduction, reference-based annotation, cell clustering, doublet removal, and marker gene detection.
  • Conducted differential gene expression and peak accessibility analyses (pseudobulk: DESeq2; single-cell: Wilcoxon rank-sum test and MAST) across cell types by sex and disease.
  • Creatively visualized and validated differential patterns for all detected genes and peaks, presenting findings to collaborators.

Here is what I will do:

  • Perform gene ontology analysis to elucidate the biological pathways associated with the identified differentially expressed genes.
  • Cross-reference cell-type-specific ChromBPNet outputs to address two key questions:
    • Which disease variants are causal for SCZ and BPD?
    • How do mutations impact chromatin accessibility and, consequently, gene regulation?

Representative packages I have used in this project:

  • R: Seurat, Signac, Azimuth, DoubletFinder, EnsDb, SingleCellExperiment, DESeq2
  • Python: CellBender, chrombpnet

Computationally Investigate Black-White Metabolomic Disparities in Prostate Cancer
Graduate Researcher @ UCSF 10/2023 - Present
Download my presentation slides: ppt ↗ || pdf ↗

Black men experience significantly higher incidence and mortality rates of prostate cancer compared to White men. This project aims to investigate this racial disparity through LC-MS metabolomics.

The cohort included 17 Black men and 17 White men in the United States, from the IRONMAN Registry ↗.

Here is what I have done:

  • Discovered key contributors through chemical similarity enrichment analysis (ChemRICH) by designing and implementing three methods: sub-pathway information, correlation modules, and predicted Medical Subject Headings (MeSH) classes.
  • Found corroborative contributing metabolites using Principal Component Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA), Random Forest, Support Vector Machine (SVM), logistic regression, quantitative set enrichment, and pathway analysis using MetaboAnalystR.
  • Identified upregulated/downregulated compounds influenced by aging through differential expression analysis using limma, after network construction and module detection using weighted gene co-expression network analysis (WGCNA), motivating research into aging's role.

The manuscript is in preparation. It will also serve as the thesis for my master's degree.

Representative R packages: tidyverse, ggplot2, ChemRICH, WGCNA, MetaboAnalystR, limma

Teaching

Beta Tester and Teaching Assistant
INFO 2950 Introduction to Data Science, Cornell University 01/2023 - 05/2023
  • Led discussions, graded homework and exams, held office hours, and proofread assignments and solutions before release.
Grader
BTRY 3080 Probability Models and Inference, Cornell University 08/2022 - 12/2022
  • Graded homework and exams.
Teaching Assistant (Summer)
(1) Introductory Biology, and (2) Physics I, JNC Study Abroad Platform 07/2022 - 08/2022
  • Led discussions, graded homework and exams, held office hours, and communicated between professors and students.
Teaching Assistant
BIOMG 2801 Laboratory in Genetics and Genomics, Cornell University 01/2021 - 05/2021
  • Created and stabilized knockout mutations in a target gene in fruit flies using CRISPR/Cas9.
  • Assisted with designing and cloning primers with sgRNA and guided 20 students in analyzing mutations vs. wildtype on the UCSC Genome Browser and in locating sgRNA transgenes.
  • Directed me to computational work (sorry let's just be real)

Class Projects

Slowly updating...
Use Deep Learning to Study Differential Gene Expression Patterns in Sickle Cell Anemia Ischemic Stroke

to be updated...

Work


Gladstone Institute at UCSF
Bioinformatician, Gladstone Institute at UCSF 07/2025 - No idea
  • To be updated...

Match Group
Mobile (iOS) Development Intern (Remote), Match Group 07/2022 - 08/2022
  • Revised the Enhanced Interests feature using SwiftUI, allowing users to personalize tags with text and emojis, enabling better expression of personalities.
  • Improved code efficiency by replacing UIKit with SwiftUI and streamlined A/B testing.

Tencent
Data Analyst Project Intern (Remote), Tencent 07/2021 - 09/2021
  • Extracted sales statistics from e-commerce platforms using web scraping in Python.
  • Advised on marketing strategy with an evidence-based report: built linear regression models with scikit-learn to forecast sales trends from customer shopping patterns across multiple product categories, and presented the results with visualizations created in Matplotlib.

Technical Summary


Programming Languages

Markup/Styling Languages

Integrated Development Environments (IDEs)

Command-Line Tools

Version Control

Visual Display