New tool reveals DNA structures that influence disease



Disruption of certain DNA structures – called topologically associating domains, or TADs – is linked with the development of disease, including some cancers. With its newly created algorithm that quickly locates and helps elucidate the complex functions of TADs, an international team of researchers is making it easier to study these important structures and help prevent disease.

“On your DNA you have genes and regulatory elements – such as promoters and enhancers – that control gene expression, but these two things can be far away from each other,” said Qunhua Li, associate professor of statistics at Penn State. “Similar to a dresser drawer that keeps your clothes organized and available for use, TADs bring genes together with their regulatory elements, which enables them to begin the process of gene expression.”



Researchers found that the more densely coiled the DNA is inside TADs (as in the right image compared with the left), the greater the gene expression, likely because more genes are brought into contact with their regulatory elements. TADs are regions of the genome within which DNA sequences physically interact with each other more frequently than with sequences outside the TAD.

IMAGE: Qunhua Li

Gene expression is the process by which the information encoded in DNA gives rise to observable traits.

According to Ross Hardison, T. Ming Chu Professor of Biochemistry and Molecular Biology at Penn State, disruption of the boundaries that form TADs can expose genes to the wrong regulatory elements and lead to aberrant gene expression that can result in the initiation of cancer, for example.

“This algorithm will help us better understand how these important structures function to prevent disease, which can take us one step further toward finding solutions,” he said.

Called OnTAD, the team’s computational algorithm rapidly identifies the locations of TADs in the genome and enables examination of their internal architectures, which are important for understanding their biological functions. The researchers describe their work today (Dec. 17) in Genome Biology.

OnTAD stands for Optimized Nested TAD caller. According to Hardison, the “nesting,” or hierarchy, of DNA interactions is analogous to the different levels of organization in a city.

“Think about New York City, with its boroughs, neighborhoods within boroughs, and street locations within neighborhoods. Each level of organization is nested within a higher level,” he explained. “Just like you are more likely to interact with someone on the same street rather than someone in another borough, DNA interactions are more frequent within the inner-most nested TADs. This is important because interactions among DNA segments – such as genes and enhancers – are needed for proper gene regulation. The OnTAD algorithm rapidly and efficiently reveals these levels of organization in DNA interactions.”
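To make the nesting idea concrete, here is a minimal Python sketch on an invented toy contact matrix. It is not the OnTAD algorithm. The matrix values, the bin ranges, and the function names (build_toy_matrix, mean_contact) are assumptions made purely for illustration. The sketch shows only the qualitative pattern Hardison describes: average contact frequency is highest within the innermost nested domain, lower within the enclosing domain, and lowest across the whole region.

```python
import numpy as np


def build_toy_matrix(n_bins: int = 12) -> np.ndarray:
    """Build a hypothetical symmetric contact matrix with one nested domain.

    All values are invented for illustration; real Hi-C data are far noisier.
    """
    matrix = np.full((n_bins, n_bins), 1.0)  # sparse background contacts
    matrix[0:8, 0:8] = 5.0                   # outer domain: bins 0-7
    matrix[2:6, 2:6] = 20.0                  # inner domain nested in it: bins 2-5
    return matrix


def mean_contact(matrix: np.ndarray, start: int, end: int) -> float:
    """Mean contact frequency among distinct bins in the half-open range [start, end)."""
    block = matrix[start:end, start:end]
    off_diagonal = ~np.eye(end - start, dtype=bool)  # ignore self-contacts
    return float(block[off_diagonal].mean())


toy = build_toy_matrix()
inner_mean = mean_contact(toy, 2, 6)    # innermost nested domain
outer_mean = mean_contact(toy, 0, 8)    # enclosing domain
region_mean = mean_contact(toy, 0, 12)  # whole toy region

print(f"inner domain:  {inner_mean:.2f}")
print(f"outer domain:  {outer_mean:.2f}")
print(f"whole region:  {region_mean:.2f}")
```

In the city analogy, the inner domain plays the role of a street, the outer domain a neighborhood, and the whole region a borough. A real TAD caller has to infer these boundaries from noisy data rather than being given them as it is here.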

He added that by working within this hierarchical view of DNA interactions, he and his colleagues learned that the more densely coiled the DNA is inside TADs, the greater the gene expression, likely because more genes are brought into contact with their regulatory elements.

“As we better understand how DNA interactions function in normal gene regulation, we can be better prepared to uncover how mutations in DNA can alter those interactions that can lead to incorrect gene expression and influence the development of cancers and other diseases.”

Li noted that existing methods have focused solely on identifying the locations of TADs, with little investigation of how the hierarchical organization inside TADs functions in gene regulation.

In addition to revealing increased gene expression in hierarchical TADs, OnTAD showed that hierarchical TADs are characterized by more active epigenetic states. Epigenetic processes control cell memory and identity; for example, they ensure that kidney cells behave as kidney cells and not as liver cells.

“These results demonstrate that OnTAD is a powerful tool for revealing different levels of DNA organization across a genome,” said Li. “It should facilitate improved investigations into the roles of this organization in gene regulation.”

Other authors on the paper include former Penn State graduate students Lin An, computational biologist at CAMP4 Therapeutics, and Tao Yang, associate scientist at Regeneron Pharmaceuticals, and current graduate student Guanjue Xiang. Also on the paper are Jiahao Yang, undergraduate student at Tsinghua University, China; Johannes Nuebler, postdoctoral fellow at Massachusetts Institute of Technology; and former Penn State Associate Professor of Statistics Yu Zhang, quantitative researcher at Two Sigma.

The National Institutes of Health supported this research.
