An allometric model for mapping seed development in plants
Zhongwen Huang, Chunfa Tong, Wenhao Bo, Xiaoming Pang, Zhong Wang, Jichen Xu, Junyi Gai and Rongling Wu
Abstract
Despite a tremendous effort to map quantitative trait loci (QTLs) responsible for agriculturally and biologically important traits in plants, our understanding of how a QTL governs the developmental process of plant seeds remains elusive. In this article, we address this issue by describing a model for functional mapping of seed development through the incorporation of the relationship between vegetative and reproductive growth. The time difference of reproductive from vegetative growth is described by Reeve and Huxley’s allometric equation. Thus, the implementation of this equation into the framework of functional mapping allows dynamic QTLs for seed development to be identified more precisely. By estimating and testing mathematical parameters that define Reeve and Huxley’s allometric equations of seed growth, the dynamic pattern of the genetic effects of the QTLs identified can be analyzed. We used the model to analyze a soybean data set, leading to the detection of QTLs that control the growth of seed dry weight. Three dynamic QTLs, located in two different linkage groups, were detected to affect growth curves of seed dry weight. The QTLs detected may be used to improve seed yield with marker-assisted selection by altering the pattern of seed development in the hope of achieving a maximum seed size at harvest time.
Keywords: allometry; functional mapping; quantitative trait loci (QTL); developmental trait
INTRODUCTION
Although traditional breeding strategies based on phenotypic selection have substantially contributed to improvement in plant yield, quality and resistance, their further use has proved to be limited for selecting superior varieties [1–3]. This is mainly because many traits are complex, controlled by polygenes and their interactions with developmental signals. In the past 2 decades, tremendous developments in molecular marker technologies and statistical models have given rise to the revolution of genetic analysis approaches with which any phenotypic trait can be
Zhongwen Huang is an Associate Professor of crop breeding at Henan Institute of Science and Technology. His research focuses on QTL mapping of quantitative traits in crop breeding.
Chunfa Tong is a Professor of quantitative genetics at Nanjing Forestry University. His research interest is in genetic mapping and modeling of complex traits.
Wenhao Bo is a post-doctoral researcher in forest genetics and tree breeding in the Center for Computational Biology at Beijing Forestry University. His research focuses on the genetic mapping of quantitative traits in Populus.
Xiaoming Pang is an Associate Professor of Tree Breeding at Beijing Forestry University. His research focuses on the utilization of biotechnologies to study population genetic diversity in fruit trees.
Zhong Wang is a Research Associate at the Pennsylvania State University. He was a visiting scholar at Beijing Forestry University when this study was conducted. He writes computer software for statistical genetic models.
Jichen Xu is a Professor of plant molecular genetics at Beijing Forestry University. His research focuses on gene cloning and gene function analysis in forest trees.
Junyi Gai is a Professor of crop breeding at Nanjing Agricultural University and Director of the National Center for Soybean Improvement and the National Key Laboratory for Crop Genetics and Germplasm Enhancement. He has long-standing experience in crop breeding and has brought molecular genetic techniques to soybean improvement.
Rongling Wu is a Professor of Biostatistics and Bioinformatics and the Director of the Center for Statistical Genetics at The Pennsylvania State University. He founded the Center for Computational Biology at Beijing Forestry University. He is interested in quantitative and statistical genetics.
Corresponding authors. Rongling Wu, Center for Statistical Genetics, Pennsylvania State University, Hershey, PA 17033, USA. Tel: +0017175312037; Fax: +0017175310480; E-mail: rwu@phs.psu.edu; Junyi Gai, National Center for Soybean Improvement/National Key Laboratory for Crop Genetics and Germplasm Enhancement/Soybean Research Institute, Key Laboratory of Biology and Genetic Improvement of Soybean of the Ministry of Agriculture, Nanjing Agricultural University, Nanjing 210095, China. Tel: +862584395405; Fax: +862584395405; E-mail: sri@njau.edu
BRIEFINGS IN BIOINFORMATICS. VOL 15. NO 4. 562–570 doi:10.1093/bib/bbt019
Advance Access published on 29 March 2013
© The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.
at Michigan State University on September 2, 2014
dissected into its underlying genetic components, known as quantitative trait loci (QTLs), with DNA-based linkage maps. With those identified QTLs, the improvement of economically important traits in soybeans can be made more efficient and effective.
There has been a wealth of literature on the construction of genetic linkage maps and detection of QTLs for different traits in plants [4–6]. Because seed traits are among the most important traits, their genetic mapping has received considerable attention. For example, 94 QTLs related to seed weight have been reported in soybeans [7–12], three of which have been confirmed [13,14]. Despite these efforts, most studies ignore dynamic and developmental changes implicated in seed growth. More recently, Teng et al. [6] performed QTL mapping for seed weight by measuring its developmental behavior, but they used a traditional mapping approach based on individual time points. A novel statistical method for mapping dynamic traits, called functional mapping, has been developed in the literature [15–22]. Functional mapping implements mathematical aspects of biological principles to describe the changes of gene actions and interactions triggered by QTLs during trait development. Practical applications of functional mapping can be found for diameter and rooting ability in poplars [15,23], programmed cell death in rice [24], plant height in soybeans [25] and body mass growth in mice [26]. The QTLs detected in these examples display different temporal patterns in governing the formation and expression of a trait during development.
The earlier work on functional mapping dealt with the dynamic growth of vegetative traits. Because reproductive growth may be initiated at different stages of vegetative growth, functional mapping for seed development needs to adjust for time differences of reproductive from vegetative growth. Thus, to better describe the developmental trajectories of seed traits, we capitalized on Reeve and Huxley’s [27] allometric equation, which allows the initial value of a trait not to go through the origin. A multivariate normal mixture model was formulated with a correlation matrix structured by an autoregressive model of order one. Maximum likelihood estimates (MLEs) of unknown parameters in the model were obtained by implementing the EM algorithm and the Nelder–Mead simplex method. QTLs detected with a biologically meaningful functional mapping model should have the potential to improve plant seed yield with marker-assisted selection.
STATISTICAL MODEL
We modified the original statistical model for functional mapping to characterize QTLs for seed development by incorporating Reeve and Huxley’s [27] allometric equation. Evolved from the widely used allometry equation, $y = \alpha x^{\beta}$ [28,29], by adding an additional parameter, $\gamma$, Reeve and Huxley’s equation is written as

$$y = \alpha (x - \gamma)^{\beta} \tag{1}$$

where $y$ is a biological dependent variable, $x$ is the body mass, $\alpha$ is a constant parameter and $\beta$ is a power component. The additional parameter $\gamma$ in Reeve and Huxley’s equation is considered as the point in ontogeny where the development of $y$ begins relative to $x$ [30].
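As an illustration, Equation (1) can be evaluated with a few lines of Python. This is a minimal sketch rather than the authors' software: the function name is hypothetical, and returning zero while the body mass has not yet passed the offset parameter is our assumption (the equation itself is only well defined for a non-negative base when the power is not an integer).

```python
def reeve_huxley(x, alpha, beta, gamma):
    """Evaluate y = alpha * (x - gamma) ** beta (Equation 1).

    x     : body mass (whole-plant biomass in this article)
    alpha : constant (scale) parameter
    beta  : power component
    gamma : point in ontogeny at which y starts to develop relative to x

    Assumption (not stated in the paper): y is taken to be 0 while x <= gamma,
    i.e. before the trait has started to develop.
    """
    if x <= gamma:
        return 0.0
    return alpha * (x - gamma) ** beta


# With gamma = 0 the equation reduces to the simple allometry y = alpha * x ** beta.
print(reeve_huxley(4.0, alpha=2.0, beta=0.5, gamma=0.0))
# A positive gamma delays the start of the curve along the x axis.
print(reeve_huxley(5.0, alpha=2.0, beta=2.0, gamma=1.0))
```

Setting the offset to zero recovers the two-parameter allometry, which is why a test on the offset alone can isolate the delay of reproductive growth.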
Seeds develop from sexually mature plants. In practice, it is possible to investigate the time to initiate seeds. Because there is considerable variation in the timing of seed formation, we used Reeve and Huxley’s equation to capture this variation through parameter $\gamma$. Genetic mapping relies on a segregating population, such as the $F_2$, backcross, doubled haploids or recombinant inbred lines (RIL). Suppose we have constructed a high-density linkage map for a mapping population using molecular markers. An RIL population allows its individual progeny to be replicated genotypically. By planting multiple replicates for each RIL in a randomized block design, we can measure its whole-plant biomass and seed biomass at a series of time points by destructive sampling. For RIL $i$, $T_i$ time points, which can be either evenly or unevenly spaced, are measured. In an RIL population, there are two homozygous genotypes, QQ (1) and qq (2), at a QTL, with allele Q derived from one parent and allele q derived from the second parent. For RIL $i$, the phenotypic value of its seed biomass, $y_{ij}$, at time $t_{ij}$ ($j = 1, \ldots, T_i$), affected by a QTL, can be expressed by the non-linear regression model

$$y_{ij} = \sum_{k=1}^{2} z_{ik}\, \alpha_k (x_{ij} - \gamma_k)^{\beta_k} + e_{ij} \tag{2}$$

where $x_{ij}$ is the whole-plant biomass of RIL $i$ at time $t_{ij}$; $z_{ik}$ is an indicator variable with $z_{ik} = 1$ if this line has QTL genotype $k$ and $z_{ik} = 0$ otherwise; $\alpha_k$, $\beta_k$ and $\gamma_k$ are unknown parameters that correspond to QTL genotype $k$; and $e_{ij}$ is a residual error assumed to be normally distributed with mean 0 and variance $\sigma_j^2$. The two QTL genotypes have different growth curves of seeds if the parameter set ($\alpha_k$, $\beta_k$ and $\gamma_k$) is genotype-dependent. Thus, by testing how the parameters differ between the two genotypes, we can identify the pattern of QTL effects on seed development.
If we have $n$ RILs, the likelihood of unknown parameters given phenotypic ($y$) and marker data ($M$) can be expressed, in terms of a mixture model, as

$$L(\Theta \mid y, M) = \prod_{i=1}^{n} \left[ \omega_{1|i} f_1(y_i, \mu_{i1}) + \omega_{2|i} f_2(y_i, \mu_{i2}) \right] \tag{3}$$

where $\Theta$ contains unknown model parameters to be estimated; $\omega_{k|i}$ is the conditional probability of QTL genotype $k$, conditional on the genotypes of the two flanking markers of RIL $i$, for which we refer to Wu et al. [31]; and $f_k(y_i, \mu_{ik})$ is a multivariate normal density function of RIL $i$ that carries QTL genotype $k$, expressed as

$$f_k(y_i, \mu_{ik}) = \frac{1}{(2\pi)^{T_i/2}} |\Sigma_i|^{-1/2} \exp\left[ -\frac{1}{2} (y_i - \mu_{ik})' \Sigma_i^{-1} (y_i - \mu_{ik}) \right] \tag{4}$$

where $y_i = (y_{i1}, \ldots, y_{iT_i})$ is the phenotypic vector of RIL $i$ measured at $T_i$ time points; $\mu_{ik} = (\mu_{ik1}, \ldots, \mu_{ikT_i})$ is the mean vector of QTL genotype $k$ for RIL $i$ measured at $T_i$ time points; and $\Sigma_i$ is the residual covariance matrix of RIL $i$ with $T_i$ repeated measurements.
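In practice the product in Equation (3) is evaluated on the log scale to avoid numerical underflow. The sketch below assumes that the genotype-specific log-densities of Equation (4) have already been computed for every RIL; the function and argument names are illustrative and do not come from the paper.

```python
import math


def mixture_log_likelihood(omega, logf):
    """Log of the mixture likelihood in Equation (3).

    omega[i][k] : conditional probability of QTL genotype k for RIL i,
                  given its flanking markers
    logf[i][k]  : log f_k(y_i, mu_ik), the log-density of Equation (4)
    """
    total = 0.0
    for omega_i, logf_i in zip(omega, logf):
        # Log-sum-exp trick: factor out the largest log-density so that
        # the exponentials stay in a safe numerical range.
        m = max(logf_i)
        mix = sum(w * math.exp(lf - m) for w, lf in zip(omega_i, logf_i))
        total += m + math.log(mix)
    return total


# Two hypothetical RILs and two QTL genotypes (densities chosen for illustration).
omega = [[0.5, 0.5], [1.0, 0.0]]
logf = [[math.log(0.2), math.log(0.6)], [math.log(0.3), math.log(0.9)]]
print(mixture_log_likelihood(omega, logf))
```

The second RIL in the example has a conditional probability of 1 for genotype 1, which is the situation at a marker position where the QTL genotype is known from the marker genotype.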
For a longitudinal covariance matrix $\Sigma_i$, it is suggested that an appropriate statistical model be used to model its structure. A number of models are available to model the covariance structure within the functional mapping framework [32,33]. In a real example of QTL mapping using an RIL population of soybeans (Figure 1), it appears that time-varying variability in seed developmental trajectories can be modeled by a simple autoregressive model of order one [AR(1)] for the covariance structure. As an example, here we use the AR(1) model by assuming that variance and covariance are stationary, expressed as

$$\Sigma_i = \sigma^2 \begin{bmatrix} 1 & \rho^{t_{i2}-t_{i1}} & \cdots & \rho^{t_{iT_i}-t_{i1}} \\ \rho^{t_{i2}-t_{i1}} & 1 & \cdots & \rho^{t_{iT_i}-t_{i2}} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{t_{iT_i}-t_{i1}} & \rho^{t_{iT_i}-t_{i2}} & \cdots & 1 \end{bmatrix} \tag{5}$$

where $\sigma^2$ is the variance and $\rho$ is the proportion parameter with which the correlation decays with time lag. For growth data like those in this example, other parametric approaches for covariance structure can be used, such as structured antedependence models that relax the assumptions of variance stationarity and covariance stationarity [34].
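To make the AR(1) structure concrete, the following Python sketch builds the covariance matrix of Equation (5) for unevenly spaced time points and evaluates the log of the density in Equation (4). It is an illustration under stated assumptions, not the authors' code: function names are hypothetical, time points are assumed to be strictly increasing, and the correlation parameter is assumed to lie strictly between 0 and 1. The log-density uses the Markov property of this correlation structure (each residual depends on the past only through the previous one), a standard result that avoids inverting the matrix; the paper does not state which computation was used.

```python
import math


def ar1_covariance(times, rho, sigma2):
    """Covariance matrix of Equation (5): sigma2 * rho ** |t_j - t_k|."""
    return [[sigma2 * rho ** abs(tj - tk) for tk in times] for tj in times]


def ar1_logpdf(y, mu, times, rho, sigma2):
    """Log multivariate normal density (Equation 4) under the AR(1) covariance.

    Factorizes the joint density into the marginal of the first residual and
    the conditionals of each later residual given the previous one.
    """
    resid = [yj - mj for yj, mj in zip(y, mu)]
    # First time point: marginal normal with variance sigma2.
    logp = -0.5 * (math.log(2.0 * math.pi * sigma2) + resid[0] ** 2 / sigma2)
    for j in range(1, len(resid)):
        phi = rho ** (times[j] - times[j - 1])   # correlation with previous point
        cond_var = sigma2 * (1.0 - phi ** 2)     # conditional variance
        dev = resid[j] - phi * resid[j - 1]      # deviation from conditional mean
        logp += -0.5 * (math.log(2.0 * math.pi * cond_var) + dev ** 2 / cond_var)
    return logp


# Three measurements at unevenly spaced times (hypothetical values).
print(ar1_covariance([0, 1, 3], rho=0.5, sigma2=2.0))
print(ar1_logpdf([1.0, 2.0, 2.5], [0.5, 1.0, 2.0], [0, 1, 3], rho=0.5, sigma2=2.0))
```

Because the exponent is the actual time lag rather than the index lag, the same code handles RILs measured at four to eight time points with irregular spacing.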
Likelihood Equation (3) contains three types of unknowns: the QTL location, described by $\omega_{k|i}$; the genotype-specific curve parameters ($\alpha_k$, $\beta_k$, $\gamma_k$); and the covariance-structuring parameters ($\rho$, $\sigma^2$). To obtain the MLEs of the unknown parameters, we used a hybrid of the EM algorithm and the Nelder–Mead simplex method. The simplex method is a direct-search optimization method for non-linear functions in low dimensions, and it does not need any derivative information.
The hypothesis to test whether there exists a QTL affecting seed growth curves at a specific genomic position can be formulated as

$$H_0: \alpha_1 = \alpha_2,\ \beta_1 = \beta_2,\ \gamma_1 = \gamma_2 \quad \text{versus} \quad H_1: \text{at least one of the equalities above does not hold} \tag{6}$$

where $H_0$ corresponds to the reduced model and $H_1$ corresponds to the full model. The log-likelihood ratio of the full model over the reduced model is applied to test these hypotheses:

$$\mathrm{LR} = -2 \log \left[ \frac{L_0(\tilde{\Theta})}{L_1(\hat{\Theta})} \right] \tag{7}$$

where $\tilde{\Theta}$ and $\hat{\Theta}$ denote the MLEs of the unknown parameters under $H_0$ and $H_1$, respectively. Permutation tests were performed to determine the genome-wide critical threshold [35]; a QTL is asserted to exist at a chromosomal position if a peak of the LR profile exceeds this threshold.

Each plant experiences reproductive behavior only when it reaches a particular size through vegetative growth. Parameter $\gamma$ can describe the time delay of reproductive behavior relative to vegetative growth. Whether this time delay is controlled by the QTL can be tested by the hypotheses

$$H_0: \gamma_1 = \gamma_2 \quad \text{versus} \quad H_1: \gamma_1 \neq \gamma_2 \tag{8}$$

The rejection of the null hypothesis implies that the QTL has a significant effect on the amount of vegetative growth needed to initiate seeds.
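The test statistic of Equation (7) and an empirical permutation threshold can be computed as in the Python sketch below. The names are hypothetical, and the threshold rule is an assumption on our part: it follows the common practice of recording the genome-wide maximum LR from each permuted data set and taking an upper quantile of those maxima, which the text does not describe in detail.

```python
import math


def likelihood_ratio(loglik_h0, loglik_h1):
    """LR = -2 * log(L0 / L1) (Equation 7), from the two maximized log-likelihoods."""
    return -2.0 * (loglik_h0 - loglik_h1)


def permutation_threshold(max_lr_per_permutation, alpha=0.01):
    """Empirical (1 - alpha) quantile of the per-permutation maximum LR values.

    Assumption: each entry is the largest LR found anywhere along the genome
    after the phenotypes were shuffled relative to the marker data.
    """
    ordered = sorted(max_lr_per_permutation)
    # Smallest order statistic with at least a (1 - alpha) share of values at
    # or below it; the tiny offset guards against floating-point round-up.
    rank = math.ceil((1.0 - alpha) * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]


print(likelihood_ratio(loglik_h0=-120.0, loglik_h1=-100.0))
print(permutation_threshold([3.0, 1.0, 4.0, 2.0], alpha=0.25))
```

A position is then declared to harbor a QTL when its observed LR exceeds the returned threshold; the same function applies to the test of Equation (8) if the permutation maxima are recomputed for that statistic.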
APPLICATION
Plant materials
An RIL population of soybean derived from the cross between cultivars Kefeng No.1 and Nannong 1138-2 was used to validate the model for seed-development mapping. The population consists of 184 RILs, whose first linkage map, constructed from 452 markers, was published by Zhang et al. [36]. This map was recently updated by adding some new SSR markers and dropping some unreliable markers. The new map contains 834 molecular markers covering a length of 2308 cM in 24 linkage groups, with an average genetic distance of 2.85 cM between adjacent markers.
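The reported average spacing can be checked with a short calculation. The sketch below assumes, as one plausible reading, that the average is taken over intervals between adjacent markers within linkage groups, so that the number of intervals equals the number of markers minus the number of linkage groups; the function name is illustrative.

```python
def mean_marker_spacing(map_length_cm, n_markers, n_linkage_groups):
    """Average distance between adjacent markers on the same linkage group.

    Each linkage group with m markers contributes m - 1 intervals, so the
    whole map has n_markers - n_linkage_groups intervals.
    """
    n_intervals = n_markers - n_linkage_groups
    return map_length_cm / n_intervals


# Map statistics quoted in the text: 2308 cM, 834 markers, 24 linkage groups.
print(round(mean_marker_spacing(2308, 834, 24), 2))
```

Under this reading the calculation reproduces the 2.85 cM quoted above, whereas dividing the map length by the marker count alone would give a slightly smaller value.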
In 2006, the RILs and their parents were planted in a 14 × 14 simple lattice design with two replications at the National Center of Soybean Experiment, Jiangsu, Nanjing Agricultural University, China. Each RIL was planted in a 4 × 2.5 m² plot with five rows spaced 0.5 m apart. The lattice design was used to reduce field experimental error. Ten plants in the second row of the plot for each RIL were randomly selected for measuring seed dry weight at multiple times throughout the growing season. Starting on 20 August 2006, 100 seeds were sampled from a single plant once a week to measure their dry weights and calculate the average dry weight per seed, until the seeds stopped growing. Four to eight repeated measurements were taken for the RILs studied.
RESULTS
Figure 1 illustrates the plot of seed biomass over time for 184 RILs and the two original parents. The two parents showed substantially different growth patterns; parent Nannong 1138-2 forms seeds much later than parent Kefeng No.1, but the former displays a much greater rate of seed growth than the latter. There is great variability in seed growth trajectories among different RILs. Our field observation showed that some RILs formed seeds as early as early August, whereas others did not generate seeds until early September. It is interesting to see that early types stopped seed growth in mid-September, when late types had just started seed formation. Also, there is a considerable amount of variation in seed growth rate, with some RILs similar to parent Nannong 1138-2 and others similar to parent Kefeng No.1. Such substantial variation in the timing of seed formation and seed growth rate suggests that this mapping population provides excellent material for functional mapping of seed development.
Figure 1: Plots of seed biomass over time for 184 RILs in soybeans. The plots of the two parents, Kefeng No.1 (P1) and Nannong 1138-2 (P2), are indicated by thick lines.

By analyzing this data set of seed growth, we detected three significant QTLs on linkage groups B1 and O, as evidenced by their LR values beyond the genome-wide threshold determined by permutation tests (Figure 2). On linkage group B1, two QTLs are claimed because their peaks are well separated (>30 cM) from each other. A summary of the estimates of QTL locations, genotype-specific curve parameters and covariance-structuring parameters is listed in Table 1, where the standard errors of each estimate obtained by re-sampling are also given. It seems that all the parameters can be estimated with reasonable precision, although the time of reproductive delay is more difficult to estimate. The estimated curve parameters were used to illustrate the developmental trajectories of seed biomass for each genotype at each QTL detected. As shown in Figure 3, the two genotypes perform differently in the timing of seed formation and growth rate. At the two QTLs detected on linkage group B1, the time to form seeds was much earlier for the genotypes with alleles inherited from parent Kefeng No.1 than for those with alleles from parent Nannong 1138-2. This is consistent with the parental difference, as parent Kefeng No.1 is an early genotype, whereas parent Nannong 1138-2 is a late genotype. However, at these two QTLs, the small parent Kefeng No.1 contributes favorable alleles for increasing seed size, leading to larger seeds for the progeny carrying the Kefeng No.1 alleles than for those carrying the Nannong 1138-2 alleles. It is interesting to note that the time of reproductive delay differs significantly (P < 0.001) between the two genotypes at each of the three QTLs detected, as
Figure 2: The log-likelihood ratios (LR) of the full model (there is a QTL) over the reduced model (there is no QTL) at every 2 cM along the genetic linkage map composed of 950 molecular markers. Tick marks on the ceiling of each panel represent the positions of molecular markers on each linkage group. The dashed horizontal line indicates the genome-wide threshold value for asserting the existence of a QTL at the significance level of 0.01.
Table 1: Maximum likelihood estimates of seed biomass trajectory parameters and their standard errors (in brackets) for different genotypes at each of the QTLs detected in an RIL population of soybeans

Parameter            QTL A                 QTL B                 QTL C
Linkage group        B1                    B1                    O
Marker interval      Satt426–GMKF080       GMKF082c–GMKF168b     GNE035–Sat 231
$\log(\alpha_1)$     3.703 (1.161)         4.685 (1.265)         -15.445 (4.725)
$\beta_1$            1.790 (0.304)         2.055 (0.319)         4.621 (1.006)
$\gamma_1$           -0.211 (2.672)        -1.783 (2.879)        -14.193 (9.086)
$\log(\alpha_2)$     -14.713 (5.496)       -12.222 (3.877)       -3.937 (1.158)
$\beta_2$            4.535 (1.189)         4.103 (0.888)         1.897 (0.303)
$\gamma_2$           -10.753 (9.894)       -3.680 (6.362)        -0.051 (2.543)
$\rho$               0.991 (0.001)         0.993 (0.001)         0.992 (0.001)
$\sigma^2$           0.773 (0.097)         0.931 (0.122)         0.831 (0.109)
LR                   41.54                 45.61                 39.99