Paper Title

Neural network facilitated ab initio derivation of linear formula: A case study on formulating the relationship between DNA motifs and gene expression

Authors

Liu, Chengyu; Wang, Wei

Abstract

Developing models with high interpretability, and even deriving formulas that quantify relationships between biological data, is an emerging need. Here we propose a framework for ab initio derivation of sequence motifs and a linear formula, using a new approach based on an interpretable neural network model called the contextual regression model. We showed that this linear model could predict gene expression levels from promoter sequences with performance comparable to that of deep neural network models. We uncovered a list of 300 motifs with important regulatory roles in gene expression and showed that they also contributed significantly to cell-type-specific gene expression in 154 diverse cell types. This work illustrates the possibility of deriving formulas to represent biological laws that may not be easily elucidated. (https://github.com/Wang-lab-UCSD/Motif_Finding_Contextual_Regression)
