Research


Research vision - signal detection in complex cellular environments

My research is driven by the following fundamental questions: what are the molecular mechanisms by which biomolecules perform specific cellular functions, and how can we design and discover novel therapeutics based on these insights? To answer these questions, I develop imaging and image analysis methods for cryo-electron microscopy/tomography (cryo-EM/ET) to quantitatively model cells, cellular compartments, and biological pathways. My research plans aim to revolutionize biomolecular imaging and image processing by integrating techniques from the fields of statistics and computer science.

In the past decade, the development of direct electron detectors (DEDs) and image processing methods has led to exponential growth in the number of biomolecular structures solved by EM. In single-particle cryo-EM, purified biomolecules are dispersed onto a support grid and rapidly frozen to form a thin layer of vitreous ice. The grid is then imaged in an electron microscope to record movies that are called “micrographs”. Thousands of individual particle images showing different views can be picked from noisy micrographs, aligned, and averaged to reconstruct the high-resolution 3D structure of the macromolecule. In addition to single-particle cryo-EM, thinned cellular samples can be prepared by focused ion beam (FIB) milling and imaged via cryo-ET to reconstruct the 3D organization of cellular environments and study the structure of biomolecules in situ. Since the so-called resolution revolution in 2013, cryo-EM has been used to determine the structures of many biomolecules that are not tractable with other methods.
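To illustrate why averaging many particle images helps, here is a minimal toy sketch in Python. It is not a cryo-EM pipeline: it assumes the images are already perfectly aligned, uses a made-up 64-pixel 1D "particle", and adds independent Gaussian noise. The function names (`noisy_copy`, `average_images`, `rms_error`) and all parameter values are hypothetical choices for illustration only.

```python
import random

def noisy_copy(signal, sigma, rng):
    """Return the signal with independent Gaussian noise added to each pixel."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def average_images(images):
    """Pixel-wise mean of a stack of equally sized, pre-aligned images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def rms_error(estimate, signal):
    """Root-mean-square difference between an estimate and the true signal."""
    return (sum((e - s) ** 2 for e, s in zip(estimate, signal)) / len(signal)) ** 0.5

rng = random.Random(0)                      # fixed seed for reproducibility
signal = [0.0, 1.0, 0.0, -1.0] * 16         # toy 64-pixel "particle" (assumption)
sigma = 5.0                                 # noise much stronger than the signal

single = noisy_copy(signal, sigma, rng)
stack = [noisy_copy(signal, sigma, rng) for _ in range(400)]
avg = average_images(stack)

err_single = rms_error(single, signal)      # roughly sigma
err_avg = rms_error(avg, signal)            # roughly sigma / sqrt(400)
print(f"single image error: {err_single:.2f}, averaged error: {err_avg:.2f}")
```

Under these assumptions the noise in the average falls roughly as the square root of the number of images, which is why thousands of particles are needed. In real data the alignment itself must be estimated from the noisy images, which is the hard part this sketch leaves out.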

Advancements in microscope hardware and image processing methods are required to improve signal-to-noise ratio and data throughput. My research combines numerical analysis and deep learning algorithms to improve cryo-EM/ET data processing workflows. Specifically, building on the high-resolution 2D template matching (2DTM) approach developed in my postdoctoral lab, I will leverage the expanding repository of high-resolution structures and AlphaFold predictions as prior information to detect more challenging targets in images of diverse types of specimens.

Vision of research

My research lies at the interface of structural biology, visual proteomics, and machine learning. By accurately modeling the conformational states of biomolecules, we establish prior knowledge of their localization within cells, thereby deepening our understanding of their functions and advancing structure-based drug discovery.

Past and current research projects

1. Improved cryo-EM reconstruction of sub-50 kDa complexes using 2D template matching (2DTM)

In my ongoing research (manuscript in preparation), I explore the lower molecular weight limit of single-particle cryo-EM through a combination of theoretical and experimental approaches. By applying scattering theory, I recently determined an updated lower molecular weight limit of ~20 kDa for 300 keV electrons. I also applied the newly developed 2DTM p-value (described in the following project) and found that 2DTM improves the alignment of targets smaller than 50 kDa and reconstructs the cofactor-binding site at higher resolution. We envision that this method will enable the study of smaller drug-binding complexes and extend structure-based drug design to new targets.

2. Robust target detection in cryo-EM images using high-resolution 2D template matching (2DTM)

Detecting smaller targets by 2DTM remains difficult because the detected signal depends on the molecular weight of the target. Moreover, low-resolution contrast can be a reliable indicator of a target, but it is down-weighted in the current 2DTM workflow. I addressed these challenges by developing a new statistical metric for 2DTM, the 2DTM p-value, that improves the detection of several previously challenging targets, including a 193 kDa clathrin monomer. We envision that the 2DTM p-value will be useful for detecting targets of 50 kDa and smaller. We also believe that the p-value increases our ability to reliably detect rare targets, which might require a higher detection threshold to reduce the chance of false positives.
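The general idea behind template matching can be sketched with a toy 1D example in Python. This is a generic matched-filter illustration, not the 2DTM implementation and not the 2DTM p-value: it assumes white Gaussian noise, a made-up 8-sample template, and a simple z-score of the cross-correlation values. All names and numbers below are hypothetical.

```python
import random
import statistics

def cross_correlate(image, template):
    """Slide the template along the image and return the raw correlation at each offset."""
    m = len(template)
    return [sum(image[i + j] * template[j] for j in range(m))
            for i in range(len(image) - m + 1)]

def z_scores(values):
    """Standardize values by their own mean and (population) standard deviation."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

rng = random.Random(1)                              # fixed seed for reproducibility
template = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 2.0, -2.0]   # toy template (assumption)
true_pos = 120                                      # where the target is hidden

# Noisy 1D "micrograph" with one scaled copy of the template embedded in it.
image = [rng.gauss(0.0, 1.0) for _ in range(300)]
for j, t in enumerate(template):
    image[true_pos + j] += 3.0 * t

scores = z_scores(cross_correlate(image, template))
best = max(range(len(scores)), key=scores.__getitem__)
print(f"best offset: {best}, z-score: {scores[best]:.1f}")
```

In this sketch the expected peak height grows with the template's energy (the sum of its squared samples), which is a loose analogue of why smaller targets, with less scattering mass, are harder to detect above a fixed threshold.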

Publication:

  
Kexin Zhang, Pilar Cossio, Aaditya V. Rangan, Bronwyn A. Lucas, Nikolaus Grigorieff*.

A New Statistical Metric for Robust Target Detection in Cryo-EM Using 2D Template Matching

IUCrJ (2025).

3. In situ cryo-EM single particle classification

2DTM can be used to detect different 60S ribosome intermediates in images of FIB-milled yeast cells. I developed a maximum likelihood method that probabilistically models the identities of individual targets detected by 2DTM with multiple templates. This method allowed us to model the spatial distribution of different molecular populations in the cell and study the ribosome biogenesis pathway. This study was the first to show that 2DTM can be used for in situ single particle classification without the need for 3D reconstruction.
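As a rough illustration of probabilistic classification with multiple templates, the following Python sketch treats the problem as a simple mixture model. It is not the published method: it assumes that each detected particle already comes with a log-likelihood under each template, and it estimates the population fractions with a standard expectation-maximization loop. The function names and the toy numbers are hypothetical.

```python
import math

def posteriors(log_liks, priors):
    """Posterior probability of each class for one particle.

    log_liks: log-likelihood of the particle under each template (assumed given).
    priors:   current estimate of the population fraction of each class.
    """
    logs = [ll + math.log(p) for ll, p in zip(log_liks, priors)]
    top = max(logs)                                  # subtract max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

def fit_fractions(all_log_liks, n_iter=50):
    """Maximum likelihood estimate of class fractions via EM, starting from uniform."""
    k = len(all_log_liks[0])
    priors = [1.0 / k] * k
    for _ in range(n_iter):
        resp = [posteriors(ll, priors) for ll in all_log_liks]          # E-step
        priors = [sum(r[j] for r in resp) / len(resp) for j in range(k)]  # M-step
    return priors

# Toy data: three particles favor template 0, one favors template 1.
toy = [[0.0, -4.0], [0.0, -4.0], [0.0, -4.0], [-4.0, 0.0]]
fractions = fit_fractions(toy)
assignments = [posteriors(ll, fractions) for ll in toy]
print(fractions)
```

The per-particle posteriors in `assignments` give a soft class membership for each detection. Under this kind of model they could be combined with particle coordinates to describe where each population sits in the cell.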

Publication:

  
Bronwyn A. Lucas, Kexin Zhang, Sarah Loerch, Nikolaus Grigorieff.

In situ single particle classification reveals distinct 60S maturation intermediates in cells

eLife (2022).

4. Multiscale modeling of RNA structures using NMR chemical shifts

The central dogma of molecular biology states that genetic information is stored in DNA and passed to proteins by RNA. The proteins then carry out the cellular functions encoded by the genetic information in DNA. Thus, for a long time, RNA was considered to be merely an intermediate carrier of genetic information. However, it was discovered that only about 2% of the human genome is translated into proteins, and many of the remaining transcripts are thought to be functional non-coding RNAs (ncRNAs). To carry out biological functions, some ncRNAs may sample different conformational states and fluctuate between a ground state and transient states, depending on environmental conditions. The structures of these transient RNA states provide significant information regarding their function.

Solution-state NMR has been the primary technique for RNA structure determination, and NMR-derived chemical shifts are considered structural “fingerprints” of RNA conformational state(s). My PhD thesis aimed to develop computational methods to accurately model the structures (secondary structures in particular) of RNA conformational states, including sparsely populated transient states, based on their chemical shift signatures.

Accurately determining the structure of an RNA is the first step in studying its spatiotemporal properties. To address this challenge, I developed three computational frameworks - CS-Fold, CS-BME, and CS-Annotate - that utilize readily accessible NMR chemical shifts to achieve the following objectives: guiding de novo RNA (secondary) structure prediction, probabilistically modeling the conformational landscape of RNA ensembles, and evaluating the quality of RNA structural models. These tools incorporate a variety of machine learning techniques.

Publication:

  
Kexin Zhang, Aaron T. Frank*.

Conditional Prediction of Ribonucleic Acid Secondary Structure Using Chemical Shifts

The Journal of Physical Chemistry B (2019).

Kexin Zhang, Aaron T. Frank*.

Probabilistic Modeling of RNA Ensembles Using NMR Chemical Shifts

The Journal of Physical Chemistry B (2021).

Kexin Zhang, Kyrillos Abdallah, Pujan Ajmera, Kyle Finos, Andrew Looka, Joseph Mekhael, Aaron T. Frank*.

CS-Annotate: A Tool for Using NMR Chemical Shifts to Annotate RNA Structure

Journal of Chemical Information and Modeling (2021).