May 2019 bioRxiv Bioinformatics Highlights

Updated: 2023-06-22 04:59:08

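As of last month, it has been a full year since 生信人 launched its monthly bioRxiv bioinformatics highlights column. About a year ago, in this column, we covered an article by London-based science journalist Tom Sheldon published in Nature's World View section. Sheldon argued that because preprints are not peer reviewed, they may contain more errors than formally published papers, and that those errors could be spread and amplified through preprints. He therefore called on the research community to tighten restrictions on how preprints are released.
A year on, Nature officially set the record straight on preprints last month. On May 15, Nature published an Editorial formally stating that Nature and its sister journals support preprints.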
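In fact, Nature commented on preprints as early as 1997, although at the time preprints were used mainly in physics. Now Nature's editors consider it time to voice support for preprints, a format that lets authors claim priority of discovery, receive feedback from peers, and show research progress quickly.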
By making early research findings accessible quickly and easily, preprints allow researchers to claim priority of discovery, receive community input and demonstrate evidence of progress for funders and others.
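The Editorial also says that Nature has updated its position on two points that were previously somewhat ambiguous. First, authors may choose the copyright licence for their preprint, and that choice does not affect how the submission is reviewed; note, however, that the licence chosen may limit how the findings can be shared and disseminated. Second, authors may have their preprint findings covered in the media, but should make clear that the results have not been peer reviewed. Nature's influence is beyond doubt. Still, it is worth remembering that the long-established journal Genetics was probably the first biology journal to declare public support for preprints.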
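More than five years on, the preprint community, both its users and its servers, has grown rapidly. This is evident from a survey published in eLife last month of the more than 37,000 preprints posted to bioRxiv since its launch 【1,2】. Preprints have come this far thanks to the efforts of countless pioneers, and also thanks to the voices of their critics. Their future depends on the joint effort of the academic community.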
1. 【Bioinformatics】It's finally here: Google brings deep learning to gene function annotation, claiming large gains in prediction accuracy and speed
Using Deep Learning to Annotate the Protein Universe
Understanding the relationship between amino acid sequence and protein function is a long-standing problem in molecular biology with far-reaching scientific implications. Despite six decades of progress, state-of-the-art techniques cannot annotate 1/3 of microbial protein sequences, hampering our ability to exploit sequences collected from diverse organisms. To address this, we report a deep learning model that learns the relationship between unaligned amino acid sequences and their functional classification across all 17929 families of the Pfam database. Using the Pfam seed sequences we establish a rigorous benchmark assessment and find a dilated convolutional model that reduces the error of both BLASTp and pHMMs by a factor of nine. Using 80% of the full Pfam database we train a protein family predictor that is more accurate and over 200 times faster than BLASTp, while learning sequence features it was not trained on such as structural disorder and transmembrane helices. Our model co-locates sequences from unseen families in embedding space, allowing sequences from novel families to be accurately annotated. The results suggest deep learning models will be a core component of future protein function prediction tools.
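BTW: the preprint drew wide attention online as soon as it was posted, including a good deal of skepticism. Professor Lars Juhl Jensen of the University of Copenhagen, Denmark, said that in selecting the test set the Google team ignored the evolutionary relatedness of proteins belonging to the same family.
Sean Eddy, the author of HMMER, expressed a similar view and added that the paper seriously misrepresents the speed of his software.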
2. 【Bioinformatics】Ra, a de novo assembler for large genomes
Yet another de novo genome assembler (CC-BY-NC 4.0)
Advances in sequencing technologies have pushed the limits of genome assemblies beyond imagination. The sheer amount of long read data that is being generated enables the assembly for even the largest and most complex organism for which efficient algorithms are needed. We present a new tool, called Ra, for de novo genome assembly of long uncorrected reads. It is a fast and memory friendly assembler based on sequence classification and assembly graphs, developed with large genomes in mind. It is freely available at /lbcbsci/ra.
3. 【Bioinformatics】John Storey (Princeton University): how much sequencing depth does an RNA-seq differential expression experiment need to reach sufficient statistical power?
Determining sufficient sequencing depth in RNA-Seq differential expression studies (CC-BY-ND 4.0)
RNA-Seq studies require a sufficient read depth to detect biologically important genes. Sequencing below this threshold will reduce statistical power while sequencing above will provide only marginal improvements in power and incur unnecessary sequencing costs. Although existing methodologies can help assess whether there is sufficient read depth, they are unable to guide how many additional reads should be sequenced to reach this threshold. We provide a new method called superSeq that models the relationship between statistical power and read depth. We apply the superSeq framework to 393 RNA-Seq experiments (1,021 total contrasts) in the Expression Atlas and find the model accurately predicts the increase in statistical power gained by increasing the read depth. Based on our analysis, we find that most published studies (> 70%) are undersequenced, i.e., their statistical power can be improved by increasing the sequencing read depth. In addition, the extent of saturation is highly dependent on statistical methodology: only 9.5%, 29.5%, and 26.6% of contrasts are saturated when using DESeq2, edgeR, and limma, respectively. Finally, we also find that there is no clear minimum per-transcript read depth to guarantee saturation for an entire technology. Therefore, our framework not only delineates key differences among methods and their impact on determining saturation, but will also be needed even as technology improves and the read depth of experiments increases. Researchers can thus use superSeq to calculate the read depth to achieve required statistical power while avoiding unnecessary sequencing costs.
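4. 【Evolution】Suhua Shi's group (Sun Yat-sen University): using mangroves as an example, how much of the genome can be freely exchanged between species?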
Genes and the species concept - How much of the genomes can be exchanged? (CC-BY-NC-ND 4.0)
In the biological species concept, much of the genomes cannot be exchanged between species [1,2]. In the modern genic view, species are distinct as long as genes that delineate the morphological, ecological and reproductive differences remain distinct [2]. The rest (or the bulk) of the genomes should be freely interchangeable. The core of the species concept therefore demands finding out the full potential of introgressions between species. In a survey of two closely related mangrove species (Rhizophora mucronata and R. stylosa) on the coasts of the western Pacific and Indian oceans, we found that the genomes are well delineated in allopatry, echoing their morphological and ecological divergence. The two species are sympatric/parapatric in the Daintree River area of northeastern Australia. In sympatry, their genomes harbor 7,700 and 3,100 introgression blocks, respectively, with each block averaging about 3-4 Kb. The fine-grained and strongly-penetrant introgressions suggest that each species must have evolved many differentially-adaptive (and, hence, non-introgressable) genes that contribute to speciation. We identify 30 such genes, seven of which are about flower development, within small genomic islets with a mean size of 1.4 Kb. In sympatry, the species-specific genomic islets account for only a small fraction (< 15%) of the genomes while the rest appears
