Researchers at UChicago are developing cutting-edge techniques to study gene expression in thousands of individual cells at a time, seeking to answer questions about biology and disease.
Understanding how our bodies translate the genetic code provided by our DNA into the protein molecules that keep the human machine running is an important part of not only understanding human biology but also treating and preventing a wide variety of diseases, including cancer.
The central dogma of biology is that genetic information in our cells is permanently encoded in DNA, much like a recipe is written in a cookbook. That information is transcribed into RNA, an intermediate molecule that can be thought of as the raw ingredients needed to cook a meal. The information carried by the RNA molecules is then translated by the cell’s molecular machinery into proteins, the final product and main action-drivers that keep cells alive and functional.
At each step of the process, external factors can affect the pathway, so not every bit of information encoded by the DNA gets turned into exactly what one might expect in the final protein. And while the amount of DNA in a cell remains constant, differences in how the DNA and RNA are processed can lead to different amounts of protein. Looking only at how a gene is encoded in the DNA doesn’t always provide much useful information on the activity, function, or effects of the final protein product.
Using tools that measure parts of this process can give scientists a snapshot of the biological activity in a cell or within tissues or organs at a particular moment in time; they can use this information to understand how different diseases, conditions, and drugs affect biological function. One such method is known as RNA sequencing, or RNA-Seq. RNA-Seq uses high-throughput technology to closely examine the type and amount of RNA in a given biological sample, and can even give insights into the different ways genes and RNA expression have been modified.
RNA-Seq has been used to help pick apart how gene expression changes in disease states, identify pathways that can be targeted with drugs, and even make genetic diagnoses. And now, this technology is becoming more powerful still, thanks to the development of more advanced techniques, including single-cell RNA-Seq.
Most RNA-Seq technology uses whole tissue samples, so the RNA from a given piece of tumor or skin is muddled together, and scientists can’t tell which RNA belongs to which cell type. Single-cell RNA-Seq uses microfluidics to isolate individual cells from a tissue sample and examine the RNA within each one. It’s not possible to gather all of the genetic information from every single cell, but researchers combine the data generated by single-cell RNA-Seq with data analysis techniques to see how groups of cells cluster together.
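As a rough illustration of how cells can be grouped by their expression profiles, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the gene counts are invented, the normalization step is one common convention, and plain k-means stands in for whatever clustering method a real analysis would use.

```python
import math

def log_normalize(counts, scale=10_000):
    """Rescale one cell's counts to a common total, then log-transform."""
    total = sum(counts)
    return [math.log1p(scale * c / total) for c in counts]

def kmeans(points, start_centroids, n_iter=10):
    """Plain k-means with caller-supplied starting centroids (deterministic)."""
    centroids = [list(c) for c in start_centroids]
    labels = []
    for _ in range(n_iter):
        # Assign each cell to its nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])),
            )
            for pt in points
        ]
        # Move each centroid to the mean profile of its assigned cells.
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical raw counts: one row per cell, one column per gene.
cells = [
    [90, 5, 5],     # mostly the first gene
    [180, 12, 8],   # same profile, sequenced more deeply
    [4, 50, 46],    # mostly the second and third genes
    [10, 95, 95],   # same profile, sequenced more deeply
]

profiles = [log_normalize(c) for c in cells]
labels = kmeans(profiles, start_centroids=[profiles[0], profiles[2]])
# labels -> [0, 0, 1, 1]
```

Normalizing first means cells are grouped by which genes dominate their RNA rather than by how much RNA happened to be captured from each cell.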
Single-cell RNA-Seq is especially useful for studying complex tissues, such as brain tissue or cancerous tumors. Researchers such as Anindita “Oni” Basu, PhD, a world expert in developing single-cell RNA-Seq technologies, have been able to use these platforms to dive deeply into the different cell types and gene expression patterns contributing to disease. In a recent paper, Basu, an Assistant Professor of Genetic Medicine at UChicago, and her team used a single-cell RNA-Seq method known as Drop-seq to pick apart the microenvironment of metastatic ovarian cancer tumors, hoping to find targets for better immunotherapies for the disease. By sorting individual cells into new categories based on differences in gene expression, the researchers identified two new groups of immune cells that indicated different immune responses to the cancerous tumors. Understanding the gene expression profiles within the tumors of individual patients can help clinicians determine which treatments have the best chance of success, and can aid in the development of personalized medicine, so physicians can provide an individually tailored therapy regimen.
Investigators like Basu continue to develop even more advanced single-cell analysis techniques, increasing the speed and accuracy of the technology, to gain new insights into questions like how many different subtypes of brain cells exist, or how microbial communities in the gut or on the skin are composed.
But this technology is still new, and has limitations. A common challenge is that genetic information is inevitably lost when tissue is prepared for analysis; not every cell will survive the processing, and some will not contain enough high-quality RNA for proper analysis. So how do scientists deal with these challenges?
On one hand, scientists must simply accept that biology is complex, and so biological data will always be imperfect. At the same time, updated statistical modeling approaches can help take those imperfections into account and provide a better idea of what the noisy biological data are saying.
A recent paper by Abhishek Sarkar and Matthew Stephens, PhD, Professor of Statistics and Human Genetics at UChicago, outlined suggestions for how to think about these models in ways that reduce confusion over single-cell RNA sequencing analysis. One key point the authors make is that there is a distinction between the biological variability inherent in a sample and the technical variability in the data generated by the processing and analytical techniques. Being explicit about the type of variability, they argued, is important for understanding analysis results.
Sarkar and Stephens recommend separating single-cell RNA-Seq models into two categories: models that describe the variation introduced by the measurement process, and models that describe the true biological variation in expression levels. This approach would help reduce confusion about how to interpret blank spots in the data, where no RNA from a gene is detected in a cell, and make it easier to understand gene expression variation between cells. Furthermore, they recommend a number of simple modeling starting points to support these kinds of analyses. Discussions like these can help researchers better align their research approaches and ensure that there are common guidelines for interpreting data.
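To make the distinction between the two kinds of model concrete, here is a minimal Python sketch. It is not taken from the paper: it assumes a Poisson model for the measurement step and a Gamma model for cell-to-cell expression variation, and the function names and numbers are hypothetical. It computes how often a gene that is genuinely expressed would still show up as a zero.

```python
import math

def p_zero_measurement_only(depth, expression):
    """P(count == 0) when counts ~ Poisson(depth * expression).

    `expression` is the gene's true share of the cell's RNA; the only
    randomness here comes from the measurement (sampling) step.
    """
    return math.exp(-depth * expression)

def p_zero_with_expression_model(depth, mean_expression, dispersion):
    """P(count == 0) when true expression also varies between cells.

    Assumes expression ~ Gamma with the given mean and a squared
    coefficient of variation equal to `dispersion`, which makes the
    observed counts negative binomial.
    """
    mean_count = depth * mean_expression
    return (1.0 + dispersion * mean_count) ** (-1.0 / dispersion)

depth = 1000        # hypothetical: RNA molecules sampled per cell
expression = 0.001  # hypothetical: gene's share of the cell's RNA

# Every cell expresses the gene at the same level.
p_same = p_zero_measurement_only(depth, expression)

# Same average level, but expression differs from cell to cell.
p_varied = p_zero_with_expression_model(depth, expression, dispersion=1.0)
```

In this toy setting, about 37% of cells record a zero through sampling alone, even though every cell expresses the gene. Adding biological variation between cells raises that to 50%. Keeping the two sources separate shows how much of the blank data each one explains.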
As our single-cell RNA sequencing technologies continue to develop, so too will our analytical techniques. Other related techniques include single-cell DNA methylome sequencing, which allows researchers to study epigenetic differences between individual cells, as well as single-cell genome sequencing for studying individual organisms within a microbial community or examining differences between cancerous cells. These approaches, and others like them, can provide deep insights into the unique biology of cell populations, which can in turn help us better understand the human body and support the development of new therapies for human diseases.
By Alison Caldwell, PhD