Although the human genome has been sequenced and most protein-coding genes are thought to have been identified, researchers still have much to learn about it. One of the most critical open questions is how gene expression is regulated in different types of cells. The human genome contains about 20,000 protein-coding genes; a handful are expressed in many cells, while others are active only in small subsets of specific cell types. For example, insulin is expressed in pancreatic beta cells but not in places like the retina, where its expression would cause problems.
If we are going to understand human genes well enough to treat the many diseases that are related to genetics, we have to fully understand how gene expression is regulated. Some of that regulation involves genetic elements that do not code for protein: some may encode RNA molecules that can influence gene expression, for example, while others may affect the structure of DNA.
Because DNA is a huge molecule, it has to be carefully packaged by proteins. The resulting protein-DNA complex, known as chromatin, leaves some regions of the molecule more accessible and others less so. To transcribe a gene, the cell's transcriptional machinery has to be able to reach it. Thus, the physical structure of DNA can influence how genes are expressed.
There is mounting evidence that non-coding regions of DNA play an important role in the development of multiple diseases, including autoimmune diseases, Alzheimer's, and diabetes, noted study co-author Kyle J. Gaulton, Ph.D., assistant professor in the Department of Pediatrics at University of California San Diego School of Medicine.
Scientists have been using computational tools to reveal sequence variations in non-coding regions of DNA that contribute to disease. These variants seem to disrupt the function of genetic elements that help control gene expression in cell types that are relevant to disease, explained co-first study author Kai Zhang, Ph.D., a postdoctoral fellow in the Department of Cellular and Molecular Medicine. "A major barrier to unlocking the function of noncoding risk variants, however, is the lack of cell-type-specific maps of transcriptional regulatory elements in the human genome."
The study has created such maps, revealing regions of chromatin accessibility, where the active portions of the genome can be reached and regulated. It has done so for a variety of cell types, creating an atlas of cis-regulatory elements. The atlas can now be used to learn more about how genes are regulated in different cell types, and how that regulation goes awry in disease.
Sources: University of California San Diego, Cell