Bioinformatics
The SOL Genomics Network. A Comparative Resource for Solanaceae Biology and Beyond¹
Lukas A. Mueller*, Teri H. Solow, Nicolas Taylor, Beth Skwarecki, Robert Buels, John Binns, Chenwei Lin, Mark H. Wright, Robert Ahrens, Ying Wang, Evan V. Herbst, Emil R. Keyder, Naama Menda, Dani Zamir, and Steven D. Tanksley
Department of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853 (L.A.M., T.H.S., N.T., B.S., R.B., J.B., C.L., M.H.W., R.A., Y.W., E.V.H., E.R.K., S.D.T.); and Department of Field Crops, Vegetables, and Genetics, Faculty of Agriculture, Hebrew University of Jerusalem, Jerusalem, Israel
The SOL Genomics Network (SGN; ll.edu) is a rapidly evolving comparative resource for the plants of the Solanaceae family, which includes important crop and model plants such as potato (Solanum tuberosum), eggplant (Solanum melongena), pepper (Capsicum annuum), and tomato (Solanum lycopersicum). The aim of SGN is to relate these species to one another using a comparative genomics approach and to tie them to the other dicots through the fully sequenced genome of Arabidopsis (Arabidopsis thaliana). SGN currently houses map and marker data for Solanaceae species, a large expressed sequence tag collection with computationally derived unigene sets, an extensive database of phenotypic information for a mutagenized tomato population, and associated tools such as real-time quantitative trait loci. Recently, the International Solanaceae Project (SOL) was formed as an umbrella organization for Solanaceae research in over 30 countries to address important questions in plant biology. The first cornerstone of the SOL project is the sequencing of the entire euchromatic portion of the tomato genome. SGN is collaborating with other bioinformatics centers in building the bioinformatics infrastructure for the tomato sequencing project and implementing the bioinformatics strategy of the larger SOL project. The overarching goal of SGN is to make information available in an intuitive comparative format, thereby facilitating a systems approach to investigations into the basis of adaptation and phenotypic diversity in the Solanaceae family, other species in the Asterid clade such as coffee (Coffea arabica), Rubiaceae, and beyond.
The SOL Genomics Network (SGN; sgn.cornell.edu) is a genomics information resource for the Solanaceae family and related families in the Asterid clade, with the aim of building a comparative bioinformatics platform for answering questions about adaptation, evolution, development, defense, biochemistry, and other facets of this clade. To date, SGN's efforts have focused primarily on four areas: (1) cataloging and maintaining genetic maps and markers of the Solanaceae species; (2) disseminating sequence information for the different species of Solanaceae, mostly in the form of expressed sequence tags (ESTs), for which SGN generates and publishes unigene builds; (3) cataloging and publishing phenotypic information; and (4) assembling, analyzing, and publishing data from the recently commenced sequencing of the tomato (Solanum lycopersicum) genome. Unlike many other plant resources on the Web, which often focus on a single plant species, such as the Arabidopsis Information Resource (TAIR; www.arabidopsis.org) on Arabidopsis (Arabidopsis thaliana; Rhee et al., 2003) or maizeGDB () on Zea mays (Lawrence et al., 2004), SGN has always had a strong comparative bias because of its wider scope. With the advent of the tomato genome sequence, this framework will be leveraged to place the tomato genome in the center of comparisons, thus tying together all the other Solanaceae species and relating the sequences to the other known plant genomes, such as Arabidopsis, Medicago, and rice (Oryza sativa).
The Solanaceae have held great interest for many researchers, breeders, and consumers for a long time. Indeed, the Solanaceae family is composed of more than 3,000 species, including the tuber-bearing potato (Solanum tuberosum), a number of fruit-bearing vegetables (tomato, eggplant [Solanum melongena], and peppers [Capsicum annuum]), ornamental plants (petunias [Petunia hybrida], Nicotiana), plants with edible leaves (Solanum aethiopicum, Solanum macrocarpon), and medicinal plants (Datura, Capsicum; Knapp, 2002). The Solanaceae are the third most important plant taxon economically, the most valuable in terms of vegetable crops, and the most variable of crop species in terms of agricultural utility. In addition to their role as important food sources, many solanaceous species have a role as scientific model plants, such as tomato and pepper for the study of fruit development (Gray et al., 1992; Fray and Grierson, 1993; Hamilton et al., 1995; Brummell and Harpster, 2001; Alexander and Grierson, 2002; Adams-Phillips et al., 2004; Giovannoni, 2004; Tanksley, 2004), potato for tuber development (Prat et al., 1990; Fernie and Willmitzer, 2001), petunia for the analysis of anthocyanin pigments, and tomato and tobacco (Nicotiana tabacum) for plant defense (Bogdanove and Martin, 2000; Gebhardt and Valkonen, 2001; Li et al., 2001; Pedley and Martin, 2003). The Solanaceae genomes have undergone relatively few genome rearrangements and duplications and therefore have very similar gene content and order. This exceptionally high level of conservation of genome organization at the macro and micro levels makes this family a model to explore the basis of phenotypic diversity and adaptation to natural and agricultural environments. Recognizing the unique features of the Solanaceae, the International Solanaceae Project (SOL) was launched, setting research goals for the next 10 years. The SOL project addresses two key questions: (1) How can a common set of genes/proteins give rise to the wide range of morphologically and ecologically distinct organisms that occupy our planet? (2) How can a deeper understanding of the genetic basis of plant diversity be harnessed to better meet the needs of society in an environmentally friendly and sustainable manner?
To meaningfully analyze the gene-to-phenotype relationships, a large amount of sequencing information is necessary. The most cost-effective way to get sufficient sequence information to address the SOL questions is to sequence a high-quality reference genome and then map sequences from other genomes onto the reference sequence. Hence, the first cornerstone of the SOL project is the sequencing of the full euchromatic portion of the tomato genome by an international consortium of 10 countries. Concomitantly, SOL will build a bioinformatics platform that allows intuitive and unrestricted access for researchers, and integrates information from all Solanaceae research into a one-stop shop on the Web that will ultimately allow approaching Solanaceae biology from a systems biology perspective. In collaboration with other bioinformatics centers involved in SOL, SGN is actively building this infrastructure, which will be distributed in nature. It will rely on bioMOBY (Wilkinson and Links, 2002) and other technologies to implement a virtual online center of information for the Solanaceae.
¹ This work was supported by the National Science Foundation (grant nos. 0116076, 9872617, 975866, and 0421634) for the SGN and the tomato sequencing project.
* Corresponding author; e-mail lam87@cornell.edu; fax 607–255–6683.
/cgi/doi/10.1104/pp.105.060707.
OVERVIEW OF THE DATABASE AND WEB SITE
Like most other plant databases, SGN can be accessed through an easy-to-use Web interface. The SGN homepage was recently revamped to improve the usability of the site. It now contains an intuitively organized Getting Started section that provides links to the major features and Web pages, such as data overview pages, search pages, maps and markers, resources, etc. In the lower part of the screen is a section providing links to related sites of interest, such as the Tomato Expression Database (TED; Fei et al., 2004) and the Tomato Genomics Resource Center (TGRC) germplasm collection at the University of California, Davis (tgrc.ucdavis.edu). To make the site more interesting and accessible to casual or new users, the entry page also contains a News section that lists new features on the Web site and news from the community, an image of the week, a link to a recent article of interest, and a profile of a lab involved in Solanaceae research.
All pages on SGN contain a toolbar at the top with links to the most frequently used sections of the site for easy navigation. The toolbar consists of the SGN logo, which is also a link to the SGN homepage, a quick search function that searches the SGN databases and Web pages, and a menu bar with pull-down menus providing quick links to specific pages grouped by menu topic. Search pages for several types of data are available, such as searches for markers, unigenes, expressed sequence tags (ESTs), EST libraries, bacterial artificial chromosomes (BACs), and profiles of registered SGN users. Using the marker search, markers can be queried by name, map, organism, map position, and whether a marker has associated information such as an overgo probe. Using the BAC search, BACs can be searched by name, including wild cards, presence of end sequence, and matches to overgo probes. Currently, about 75,000 BACs have been end sequenced from a HindIII library (Budiman et al., 2000) and 10,000 BACs have been end sequenced from a newly generated MboI library. The numbers will grow to approximately 200,000 total BACs for a nominal 400,000 BAC end reads over the next few months. The Maps and Markers section links to the different maps available on SGN, which are rendered using the newly developed Comparative Viewer that allows visualizing the maps in an interactive comparative format (see below). The tools available on SGN include the SGN Web BLAST, an identifier conversion tool, the intron finder tool (recently developed at SGN), the Genes That Make Tomatoes database (Menda et al., 2004), the real-time quantitative trait loci tool (Gur et al., 2004), and the bulk download tool pages, which provide utilities to download partial and complete datasets for further analysis by the user. For the download of partial datasets, lists of identifiers (for unigenes, microarray spots, or ESTs) can be entered to download associated information. Entire datasets can be downloaded from the SGN FTP server (ftp://ftp.sgn.cornell.edu), including tomato genome data such as BAC end sequences and BLAST databases, full BAC sequences, and supporting data.
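The BAC search criteria described above (name with wild cards, presence of an end sequence, matches to overgo probes) can be illustrated with a minimal Python sketch. This is not SGN's implementation, which the text says is written mostly in Perl against MySQL; the `Bac` record, its field names, the `search_bacs` function, and the example clone names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Bac:
    """Hypothetical BAC record; field names are illustrative, not the SGN schema."""
    name: str
    has_end_sequence: bool
    overgo_matches: tuple = ()


def search_bacs(bacs, name_pattern="*", require_end_sequence=False,
                require_overgo_match=False):
    """Return BACs whose name matches a shell-style wildcard pattern and
    that pass the optional end-sequence and overgo-probe filters."""
    hits = []
    for bac in bacs:
        if not fnmatchcase(bac.name, name_pattern):
            continue
        if require_end_sequence and not bac.has_end_sequence:
            continue
        if require_overgo_match and not bac.overgo_matches:
            continue
        hits.append(bac)
    return hits


# Made-up example records (placeholder names, not real clone identifiers).
EXAMPLE_BACS = [
    Bac("LIB_A_0001A01", True, ("probe_17",)),
    Bac("LIB_A_0001A02", True),
    Bac("LIB_B_0001A01", False),
]
```

Calling `search_bacs(EXAMPLE_BACS, "LIB_A_*")` would return the two records from the first placeholder library; adding `require_overgo_match=True` narrows the result to clones that have at least one probe match.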
On SGN, all information is freely accessible to all users. A login system exists only for the purpose of a user-managed database of Solanaceae researchers and for submission of EST sequences, and is required for sequencing centers that participate in the tomato sequencing program to update the BAC status information in the SGN database. In the near future, login will also allow users to comment on data objects, such as markers and unigenes, and to make user-contributed annotations to genes. In summary, SGN strives to operate under the following guiding principles: (1) all data should be accessible without restrictions; (2) original data should be stored wherever possible (chromatograms, assembly files, gel images, etc.) to ensure complete reproducibility; (3) all data should be attributed to the submitters and data generators; (4) all annotations should be carried out using standard vocabularies and annotation guidelines; (5) free and open-source software is used where possible and SGN-developed software is made available to all as open source; and (6) all the data are loaded into interconnected SGN relational databases such that, ultimately, a systems approach to Solanaceae biology becomes possible.
SGN SITE ARCHITECTURE AND IMPLEMENTATION
The SGN database consists of a number of interrelated relational databases implemented in MySQL (). Most software is written in Perl. The Web site uses the Apache (www. ) Web server with the mod_perl integrated Perl interpreter. In keeping with the philosophy of open systems and open-source software, all servers and most development machines run the Debian distribution of the GNU/Linux operating system. More information on the database schemas, software, and setup at SGN can be found on the SGN Web site (ll.edu).
SGN DATA AND TOOLS
SGN Solanaceae Unigene Builds
As no full genome sequence of a representative Solanaceae species is yet available, much of the existing sequence data on SGN consists of EST datasets for Solanaceae species. However, as the tomato sequencing project begins to bear fruit, SGN's focus will change more to genomic sequence data. From the EST datasets and other known transcript sequences, unigenes are assembled in an effort to approximate the transcriptome set of each organism. SGN currently produces unigene builds for Solanaceae species that have EST sequences available with associated chromatograms. Unigene builds are available for tomato, potato, pepper, eggplant, and petunia (see Table I). A Web interface is also available for submitting new sequence datasets.
In contrast to many other unigene assembly methods, the SGN custom unigene assembly pipeline starts at the level of the raw chromatogram in order to apply the same quality standards to all data, thereby increasing the consistency and overall quality of the builds. The assembly pipeline, which is tightly integrated with the SGN database, works as follows. First, the chromatograms are base called with phred and the raw sequences are loaded into our database. Next, the sequences are processed to determine a high-quality region excluding low-quality or cloning vector sequences. Then, Escherichia coli or lambda phage contamination is detected with an automated National Center for Biotechnology Information (NCBI) BLAST search, and contaminated sequences are flagged in the database. Inserts that contain sequences matching the multiple cloning site of the vector are flagged as chimeric. A second chimera screen is also applied that attempts to align the ends of a read with any Arabidopsis coding sequences. If the two ends match unrelated Arabidopsis genes, the sequence is flagged as chimeric. Flagged sequences are not used in subsequent unigene builds. The unigene assembly proceeds through a custom preclustering program, which generates clusters that are fed into the cap3 program (Huang and Madan, 1999). cap3 is run with the following parameter settings: -e 5000 -p 90 -d 10000 -b 60. The overlap identity is set to a stringent 90% (default 75%, option -p), and the effects of some options are minimized by setting them to the maximum allowed values. The parameters have been fine tuned over a period of years and comparisons to tomato mRNAs (that were not part of the input sequences) indicate that the unigenes are of high quality. Table I summarizes the unigene builds currently available from SGN. The unigene builds also serve as the basis for a number of analyses, including prediction of coding regions, Interpro domains, Markov clustering into gene families, and development of simple sequence repeat and conserved ortholog set (COS) markers. Some of the results are available from the Web interface and some can be downloaded from the FTP site.
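Two steps of the pipeline described above lend themselves to a short Python sketch: the second chimera screen and the cap3 invocation. The function names are hypothetical, and treating "unrelated genes" as "different gene identifiers" is a simplifying assumption; the SGN pipeline itself is not shown in the text. The cap3 option values are the ones quoted in the paragraph.

```python
def is_chimeric(left_end_hit, right_end_hit):
    """Sketch of the second chimera screen: flag a read when its two ends
    match different Arabidopsis genes.

    Each argument is the identifier of the best-matching Arabidopsis coding
    sequence for that end of the read, or None when the end has no match.
    Equating "unrelated" with "different identifiers" is an assumption.
    """
    if left_end_hit is None or right_end_hit is None:
        return False
    return left_end_hit != right_end_hit


def cap3_command(fasta_path):
    """Build the cap3 argument list using the parameter settings from the text."""
    return ["cap3", fasta_path,
            "-e", "5000", "-p", "90", "-d", "10000", "-b", "60"]


def usable_reads(reads):
    """Keep only reads that were not flagged, since flagged sequences are
    excluded from subsequent unigene builds.

    `reads` is an iterable of (read_id, left_end_hit, right_end_hit) tuples.
    """
    return [read_id for read_id, left, right in reads
            if not is_chimeric(left, right)]
```

The list returned by `cap3_command` could be passed to `subprocess.run` for each precluster, provided a cap3 executable is available on the system.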
Markers, Maps, and the SGN Comparative Viewer
The SGN database currently houses six maps, the Tomato E×PEN2000 (Fulton et al., 2002), the Tomato E×PEN1992 (Tanksley et al., 1992), the Tomato E×HIR, the Tomato E×PIMP, the Potato T×B (Tanksley et al., 1992), and the Eggplant L×M (Doganlar et al., 2002; Table II). A classic and an integrated pepper map will soon be added to the database, and some maps, such as the pepper SNU2 map (Lee et al., 2004), are available as static images only. For the nonstatic maps, all marker and mapping data are stored, searchable, and browsable in the SGN database. The maps and markers are a good example of how SGN attempts to integrate and present information intuitively in a comparative format. The SGN Comparative Viewer accesses the SGN database information and displays the maps as shown in Figure 1. By default, the Comparative Viewer shows a reference chromosome on the left-hand side of the screen with the option of showing additional information tracks, such as a physical map, inbred lines (currently, the Zamir lines can be viewed), or a centiMorgan ruler. Because there are usually many more markers than can be displayed on a full-chromosome map representation, only a selection of approximately a dozen markers is shown by name; the other markers are shown as tick marks only until the view is zoomed in. Clicking a point on the chromosome will show an enlarged view of the selected region on the right, giving all markers by name within the enlarged interval. A zoom level button adjusts the size of the interval so that a convenient number of markers can be displayed in the zoomed-in section. All marker labels are linked to their corresponding marker detail pages. When markers appear on more than one map, comparative views of two maps can be displayed. A pull-down menu shows all other chromosomes from other maps in the database that have markers in common with this one. Choosing a chromosome from this menu adds that chromosome to the display alongside the reference chromosome and draws connecting lines between markers that are present on both maps, facilitating comparison between different mapping populations or different species. Many of the common markers in the database are COS markers, which were developed for the purpose of comparative mapping (Fulton et al., 2002) and which are currently being expanded with a new set of markers called COSII (F. Wu and S.D. Tanksley, unpublished data). To get a bird's eye view of how two different maps relate to each other, the View Entire Comparative Map link provides a view of the reference map chromosomes vertically displayed on the left and the comparison map on the right, with all chromosomal connections shown between the two maps. A powerful and user-friendly marker search is also available, allowing searching for markers by type, chromosome, position, map, species, and other criteria. Used together, SGN's Comparative Viewer and marker search capabilities give the biological community a powerful resource for comparing genetic maps of species in our database.

Table I. A snapshot of the SGN unigene builds
Currently, SGN has unigene builds for tomato, potato, pepper, petunia, and eggplant.

Build     Species Included                                            No. ESTs  No. Unigenes  No. Singletons  No. Contigs
Tomato    Solanum lycopersicum, Solanum pennellii, Solanum hirsutum   184,860   30,576        11,160          19,416
Potato    Solanum tuberosum                                           97,425    24,931        9,117           15,814
Pepper    Capsicum annuum                                             20,738    9,554         6,618           2,936
Petunia   Petunia hybrida                                             11,479    5,135         3,476           1,659
Eggplant  Solanum melongena                                           3,181     1,841         1,224           617

Table II. SGN maps and markers
Maps for pepper and coffee will be added to the site in the near future.

Species   Parents                                  Map Name     No. Markers
Tomato    S. lycopersicum × S. pennellii           F2-2000      1,575
          S. lycopersicum × S. pennellii           E×PEN1992    553
          S. lycopersicum × S. habrochaites        E×HIR1997    135
          S. lycopersicum × S. pimpinellifolium    E×PIMP2001   139
Potato    S. tuberosum × S. berthaultii            T×B1992      178
Eggplant  S. linnaeanum × S. melongena             L×M2002      220

Figure 1. The SGN Comparative Viewer. The viewer shows the relationships between different genetic maps stored in the SGN database and allows users to visualize other information associated with the maps, such as rulers, inbred lines, and the tomato physical map. A zooming function lets the user examine the maps in detail. A second map can be displayed on the right-hand side of the screen; the relationships between markers on the two maps are shown with lines. The interactive Comparative Viewer is controlled with the toolbar visible at the bottom of the screen.
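The comparative view rests on a simple idea: two chromosomes can be connected wherever the same marker has been placed on both maps. The following Python sketch shows that logic under stated assumptions. Each chromosome is modeled as a plain dictionary from marker name to centiMorgan position, and the function names and marker names are hypothetical; how the Comparative Viewer actually queries and renders this is not described in the text.

```python
def connecting_lines(reference_map, comparison_map):
    """Pair up the positions of markers present on both chromosomes.

    Each argument is a dict of marker name -> position in centiMorgans.
    Returns (marker, reference_position, comparison_position) tuples sorted
    by position on the reference chromosome.
    """
    shared = reference_map.keys() & comparison_map.keys()
    return sorted(
        ((m, reference_map[m], comparison_map[m]) for m in shared),
        key=lambda line: (line[1], line[0]),
    )


def chromosomes_in_common(reference_map, other_chromosomes):
    """Names of chromosomes sharing at least one marker with the reference,
    a sketch of what could populate the pull-down menu."""
    return sorted(
        name for name, markers in other_chromosomes.items()
        if reference_map.keys() & markers.keys()
    )
```

With a reference chromosome `{"m1": 0.0, "m2": 12.5, "m3": 40.0}` and a comparison chromosome `{"m3": 35.0, "m1": 2.0, "m9": 50.0}`, only the made-up markers `m1` and `m3` would be joined by lines, and a chromosome with no shared markers would not be offered for comparison.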
Phenotypic Information
SGN houses a collection of phenotypic information in the Genes That Make Tomatoes database describing a mutant population of 13,000 Solanum lycopersicum M2 families of the M82 variety. This comprehensive mutant population is a useful basic resource for exploring gene function. The mutants were generated using ethyl methanesulfonate and fast-neutron mutagenesis, and the plants were visually phenotyped in the field and then categorized into a morphological catalog encompassing 15 primary and 48 secondary categories. Currently 3,417 mutations have been cataloged; among them are most of the previously described phenotypes from the monogenic mutant collection of the TGRC, plus over 1,000 new mutants with multiple alleles per locus. The phenotypic database indicates that most mutations fall into more than a single category (they are pleiotropic), with some organs (e.g. leaves) more prone to alterations than others. All data and images can be searched and accessed through SGN. The mutants were generated and phenotyped and the database is administered by the Zamir lab at the Hebrew University in Jerusalem, Israel.
Another tool is real-time quantitative trait loci, which presents the results of eight independent phenotyping experiments on an isogenic inbred line population and allows viewing correlations online in real time. A tighter integration of phenotypic data with the molecular and mapping data on SGN is planned for the future.
TOMATO SEQUENCING PROJECT
The tomato sequencing project was initiated in 2004 by a consortium of 10 countries, with each of the following countries sequencing one chromosome: Korea (chromosome 2), China (chromosome 3), Great Britain (chromosome 4), India (chromosome 5), The Netherlands (chromosome 6), France (chromosome 7), Japan (chromosome 8), Spain (chromosome 9), and Italy (chromosome 12). The United States will sequence three chromosomes (chromosomes 1, 10, and 11; see Fig. 2). The tomato genome is composed of approximately 950 Mb of DNA, more than 75% of which is heterochromatin and largely devoid of genes. The majority of genes are found in long contiguous stretches of gene-dense euchromatin located on the distal portions of each chromosome arm. The sequencing strategy is to sequence a minimal tiling path of BAC clones through the approximately 220 Mb of euchromatin. The starting points for sequencing the genome will be 1,500 anchor points, where the physical map has been linked to the genetic map using overgo probes. The results of the overgo analysis are available on SGN (ll.edu). In addition, SGN is involved in setting up part of the infrastructure for the tomato sequencing project, such as a BAC registry, so that the status of each BAC in the sequencing pipeline can be tracked by users and sequencers alike. The finished BAC sequences will be deposited in both GenBank and SGN. On SGN, annotations will be included that will be viewable online, based on GBrowse (Stein et al., 2002). The tomato sequence will provide the resource for mapping sequences of other species onto it and, in conjunction with the comparative maps, for generating virtual sequences for other Solanaceae species and comparing the features of the Solanaceae genomes with other sequenced species, such as Arabidopsis, rice, and Medicago. SGN will produce a unified interface for viewing this information. More details on the SOL project and the tomato sequencing project, including the SOL white paper and the tomato sequencing standards document, can be found on SGN at ll.edu/solanaceae-project.
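The figures quoted for the sequencing project can be cross-checked with a few lines of Python. The sketch below only restates numbers from the text (chromosome assignments, approximately 950 Mb genome, approximately 220 Mb euchromatin, 1,500 anchor points); the constant and function names are hypothetical, and the average anchor spacing is a back-of-the-envelope figure that assumes anchors are spread evenly, which the text does not claim.

```python
# Chromosome assignments as listed in the text.
CHROMOSOME_ASSIGNMENTS = {
    "United States": [1, 10, 11],
    "Korea": [2],
    "China": [3],
    "Great Britain": [4],
    "India": [5],
    "The Netherlands": [6],
    "France": [7],
    "Japan": [8],
    "Spain": [9],
    "Italy": [12],
}

GENOME_MB = 950        # approximate genome size from the text
EUCHROMATIN_MB = 220   # approximate euchromatin targeted for sequencing
ANCHOR_POINTS = 1500   # starting points linking physical and genetic maps


def assigned_chromosomes(assignments=CHROMOSOME_ASSIGNMENTS):
    """Flatten the per-country lists into one sorted list of chromosomes."""
    return sorted(c for chroms in assignments.values() for c in chroms)


def euchromatin_fraction(genome_mb=GENOME_MB, euchromatin_mb=EUCHROMATIN_MB):
    """Fraction of the genome covered by the euchromatic target."""
    return euchromatin_mb / genome_mb


def mean_anchor_spacing_kb(euchromatin_mb=EUCHROMATIN_MB, anchors=ANCHOR_POINTS):
    """Average euchromatin per anchor point in kb, assuming even spacing."""
    return euchromatin_mb * 1000 / anchors
```

The euchromatic target works out to roughly 23% of the genome, which is consistent with the statement that more than 75% is heterochromatin, and the 12 chromosomes are each assigned exactly once across the 10 countries.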
CURATIONAL ACTIVITIES AT SGN
To date, the curational activities have focused on maps and markers, and the functional annotation,