UCSD Jacobs School of Engineering

Hard-to-study mutations implicated in the expression of genes associated with schizophrenia and more

San Diego, Calif., Nov. 5, 2019 -- Hard-to-study mutations in the human genome, called short tandem repeats, known as STRs or microsatellites, are implicated in the expression of genes associated with complex traits including schizophrenia, inflammatory bowel disease and even height and intelligence.

That’s the conclusion of a study published in the Nov. 1 issue of Nature Genetics by a team of researchers at the University of California San Diego. They were led by Melissa Gymrek, a UC San Diego professor of computer science and medicine, and Alon Goren, a UC San Diego professor of medicine. 

Short tandem repeats are sequences of one to six of DNA’s basic components, called nucleotides, that repeat over and over again, sometimes hundreds or thousands of times.
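
To make the definition concrete, here is a minimal Python sketch that counts the longest run of back-to-back copies of a short motif in a DNA string. The function name and the toy sequence are made up for illustration; real STR genotyping from sequencing reads is far more involved and is not what this shows.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    # Try every start position; simple rather than fast.
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Step through the sequence one motif length at a time.
        while sequence.startswith(motif, pos):
            count += 1
            pos += len(motif)
        best = max(best, count)
    return best


# Hypothetical toy sequence: five CAG units flanked by other bases.
toy_sequence = "TT" + "CAG" * 5 + "GA"
cag_copies = longest_repeat_run(toy_sequence, "CAG")
```

Here `cag_copies` is the number of consecutive CAG units in the toy string, the same kind of count that distinguishes one person's version of a repeat from another's.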

These mutations have already been implicated in about 30 conditions. The best known is perhaps Huntington’s Disease, which causes the progressive breakdown of nerve cells in the brain. About 30,000 people suffer from the condition in the United States. These people all have more than 40 copies of a specific repeat, known as the CAG trinucleotide. The more copies they have, the sooner they are affected by the disease and the more severe it is.

But until now, mostly due to lack of proper datasets, genome-wide studies of the effects of short tandem repeats on gene expression had only found limited connections. 

In this study, by leveraging whole genome sequencing and expression data for 17 tissues from the Genotype-Tissue Expression Project (GTEx), the team identified short tandem repeats for which the number of repeated units affects the expression of nearby genes. Researchers named these eSTRs, for expression-associated short tandem repeats. They found more than 28,000 eSTRs in the genome, which can be explored at http://webstr.gymreklab.com/. The website allows users to interactively explore eSTR results as well as additional information for each STR, including mutation rates and genetic variation across different populations.
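
The basic idea of linking repeat counts to gene expression can be sketched as a simple straight-line fit. The Python example below is an illustration only, not the study's actual pipeline: the function name is hypothetical, and the repeat counts and expression values are invented toy numbers.

```python
def slope_and_correlation(repeat_counts, expression):
    """Least-squares slope and Pearson correlation of expression vs. repeat count."""
    n = len(repeat_counts)
    mean_x = sum(repeat_counts) / n
    mean_y = sum(expression) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in repeat_counts)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(repeat_counts, expression))
    slope = sxy / sxx
    correlation = sxy / (sxx * syy) ** 0.5
    return slope, correlation


# Invented toy data: one repeat count and one expression value per person.
toy_repeat_counts = [10, 11, 12, 13, 14]
toy_expression = [2.0, 2.5, 3.0, 3.5, 4.0]
slope, correlation = slope_and_correlation(toy_repeat_counts, toy_expression)
```

A slope clearly different from zero would suggest that expression changes with the number of repeats. A real analysis would also need to account for factors such as ancestry, tissue and nearby genetic variants.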

“Overall, our results support the hypothesis that these mutations contribute to a range of human phenotypes and will serve as a valuable resource for future studies of complex traits,” Gymrek said.

The group then used statistical methods to measure the probability that each of these effects is significant. By doing so, they identified hundreds of eSTRs that are responsible for effects previously found by whole genome analysis studies. The study results implicate specific repeat mutations in traits including height, schizophrenia, inflammatory bowel disease and intelligence.

“The study is a significant step towards understanding the ways cells interpret the differences in the number of occurring repeats to change the activity of genes,” said Goren. “For instance, higher number of such repeats can modify the ability of activating proteins to bind to the genome and induce the expression of the gene.”

The impact of short tandem repeat variation on gene expression
