Hi! I am a final-year PhD candidate working at the intersection of machine learning, computational biology, and cheminformatics.
My research broadly focuses on making ML models more trustworthy and robust in the face of unreliable data, using synthetic data, data augmentation, multimodal contrastive learning, and reinforcement learning. Unlike natural language, where large, high-quality datasets are abundant, biology and chemistry often suffer from data scarcity and noise, which makes developing methods that perform well in low-data and/or imperfect-data regimes especially critical. Most recently, I’ve become particularly interested in leveraging these techniques to build large reasoning models for biology and chemistry. You can find all of my existing publications and select talks on my Google Scholar page.
Prior to starting my PhD, I completed my BS in Chemical and Biomolecular Engineering at UC Berkeley’s College of Chemistry. While at Berkeley, I took several courses in data science and machine learning, which catalyzed my jump to working on problems in AI4Science when I started graduate school in the Department of Chemical and Biological Engineering at Northwestern University. At Northwestern, I am fortunate to be co-advised by two brilliant scientists, Linda Broadbelt and Keith Tyo. I am also a member of the Center for Synthetic Biology at Northwestern and of the new pathway development group at the Joint BioEnergy Institute at Lawrence Berkeley National Laboratory.
09/2025: Moved back to Chicago, IL from Cambridge, MA to enter the final year of my PhD.
08/2025: Completed my research internship at Tatta Bio in Cambridge, MA. At Tatta, I had the opportunity to work in a small team of 6 and explore several projects at the intersection of machine learning, genomics, and cheminformatics.
07/2025: Our work on the de novo and machine learning-aided design of biosynthetic pathways with polyketide synthase megaenzymes is published in Nature Communications!
06/2025: Moved to Cambridge, MA to start my role as an ML intern at Tatta Bio.
03/2025: Presented my work on the de novo and machine learning-aided design of biosynthetic pathways with polyketide synthase megaenzymes at the DOE Biological Systems Science Division (BSSD) PI meeting in Washington, DC.
01/2025: Presented my work on the de novo design of biosynthetic pathways with polyketide synthase megaenzymes at the Society for Industrial Microbiology and Biotechnology (SIMB) Annual Meeting, Natural Product Discovery in the Genomic Era in San Diego, CA.
11/2024: Our work on synthetically generating infeasible enzymatic reactions to augment known, feasible reactions and training supervised classifiers on the resulting dataset is published in the Royal Society of Chemistry’s Molecular Systems Design and Engineering journal!
10/2024: Gave a talk on my work on harnessing the biosynthetic machinery of natural products for the design of novel biosynthetic pathways to high-value commodity chemicals at the American Institute of Chemical Engineers (AIChE) annual meeting in San Diego, CA.
08/2024: Moved back to Chicago, IL from Emeryville, CA to enter the fourth year of my PhD.
01/2024: Moved to Emeryville, CA to start my role as a visiting computational biology researcher at the Joint BioEnergy Institute.
12/2023: Our opinion piece on the development of retrosynthesis software that can merge enzymatic and synthetic organic chemistry is published in Current Opinion in Biotechnology.
10/2023: I’ve passed my qualifying exams and am officially a PhD candidate!