Estimation of extreme values from sampled time series

A. Naess a,*, O. Gaidai b
a Centre for Ships and Ocean Structures & Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
b Centre for Ships and Ocean Structures, Norwegian University of Science and Technology, Trondheim, Norway
Article history:
Received 30 November 2007
Received in revised form 23 June 2008
Accepted 30 June 2008
Available online 22 August 2008

Keywords:
Extreme value estimation
Sampled time series
Approximation by conditioning
Mean exceedance rate
Monte Carlo simulation
Abstract

The paper focuses on the development of a method for extreme value estimation based on sampled time series. It is limited to the case when the extreme values asymptotically follow the Gumbel distribution. The method is designed to account for statistical dependence between the data points in a rational way. This avoids the problem of declustering of data to ensure independence, which is a common problem for the peaks-over-threshold method. The goal has been to establish an accurate method for predicting extreme wind speeds based on recorded data. The method will be demonstrated by application to both synthetic and real data. From a practical point of view, it seems to perform better than the POT and Gumbel methods, and it is applicable to nonstationary time series.

© 2008 Elsevier Ltd. All rights reserved.
1. Introduction
Extreme value statistics, even in applications, have very often been based on asymptotic results. This is done either by assuming that the epochal extremes, for example yearly extreme wind speeds, are distributed according to the generalized (asymptotic) extreme value distribution with unknown parameters to be estimated on the basis of the observed data, or by assuming that the exceedances above high thresholds follow a generalized (asymptotic) Pareto distribution with parameters to be estimated from the data, see [1]. The major problem with both of these approaches is that the asymptotic extreme value theory itself cannot be used in practice to decide to what extent it is applicable to the observed data. Hence, the assumption that an asymptotic extreme value distribution is the appropriate distribution for the observed data is based more or less on faith or convenience.

In an effort to ameliorate this situation, we have developed an approach to this problem that is less restrictive and more flexible than the ones based on asymptotic theory. In particular, it has the capability to capture the subasymptotic behaviour of the data, which seems to be of some importance for accurate prediction. However, the proposed approach in its present form has one limitation: it is restricted to cases where the Gumbel distribution is the appropriate asymptotic extreme value distribution.
2. Cascade of conditioning approximations
Consider a stochastic process Z(t), which has been observed over a time interval, (0, T) say. Assume that the values X_1, …, X_N, which have been derived from the observed process, are allocated to the discrete times t_1, …, t_N in (0, T). These could simply be the observed values of Z(t) at each t_j, j = 1, …, N, or they could be average values or peak values over smaller time intervals centered at the t_j's. Our goal in this paper is to accurately determine the distribution function of the extreme value M_N = max{X_j; j = 1, …, N}. Specifically, we want to estimate P(g) = Prob(M_N ≤ g) accurately for large values of g. An underlying premise for the development in this paper is that a rational approach to the study of the extreme values of the sampled time series is to consider exceedances of the individual random variables X_j above given thresholds, as in classical extreme value theory. The alternative approach of considering exceedances by upcrossing of given thresholds by a continuous stochastic process has been developed in [2,3]. The approach taken in this paper would seem to be the appropriate way to deal with recorded data time series of, for example, the daily largest wind speeds observed at a given location.

In the following we outline possible approaches for practical implementation of a cascade of approximations based on conditioning, where the first one is a Markov-like approximation in the sense that it is a one-step memory approximation. This approximation concept is described in [4,5].
From the definition of P(g) it follows that
0167-4730/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.strusafe.2008.06.021
* Corresponding author. E-mail address: u.no (A. Naess).
Structural Safety 31 (2009) 325–334
P(g) = Prob{X_1 ≤ g, …, X_N ≤ g}
     = Prob{X_N ≤ g | X_1 ≤ g, …, X_{N−1} ≤ g} · Prob{X_1 ≤ g, …, X_{N−1} ≤ g}
     = ∏_{j=2}^{N} Prob{X_j ≤ g | X_1 ≤ g, …, X_{j−1} ≤ g} · P(X_1 ≤ g).   (1)

In general, the variables X_j are statistically dependent. Hence, instead of assuming that all the X_j are statistically independent, which leads to the classical approximation
P(g) ≈ ∏_{j=1}^{N} P(X_j ≤ g),   (2)
the following Markov-like, or one-step memory, assumption will to a certain extent account for dependence between the X_j:

Prob{X_j ≤ g | X_1 ≤ g, …, X_{j−1} ≤ g} ≈ Prob{X_j ≤ g | X_{j−1} ≤ g}   (3)

for 2 ≤ j ≤ N. This can be extended to

Prob{X_j ≤ g | X_1 ≤ g, …, X_{j−1} ≤ g} ≈ Prob{X_j ≤ g | X_{j−2} ≤ g, X_{j−1} ≤ g}   (4)

for 3 ≤ j ≤ N, and so on.
Eqs. (3) and (4) represent refinements of the independence assumption. One would expect such approximations to be increasingly able to capture the statistical dependence between neighboring data in the time series. As will be seen in the examples in the following section, P(g) computed using Eq. (4) is quite close to the value obtained using Eq. (3). This indicates that in practice, Eq. (3) is often able to capture the effect of statistical dependence in wind speed data with good accuracy. However, there is no noticeable increase of numerical effort in using Eq. (4), or its further refinements obtained by including three or more preceding peaks.
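To illustrate the gain from the one-step memory assumption, the following sketch (hypothetical AR(1) data; numpy assumed) compares the independence approximation (2) and the Markov-like approximation (3), with all probabilities estimated empirically, against a direct Monte Carlo estimate of P(g):

```python
import numpy as np

rng = np.random.default_rng(0)
R, N, phi, g = 20000, 50, 0.7, 2.0

# R realizations of a stationary AR(1) process X_j = phi*X_{j-1} + noise
X = np.empty((R, N))
X[:, 0] = rng.normal(0, 1, R)
for j in range(1, N):
    X[:, j] = phi * X[:, j - 1] + rng.normal(0, np.sqrt(1 - phi**2), R)

# direct Monte Carlo estimate of P(g) = Prob{max_j X_j <= g}
P_exact = np.mean(X.max(axis=1) <= g)

# independence approximation, Eq. (2): product of marginal probabilities
p1 = np.mean(X <= g, axis=0)                        # P(X_j <= g) for each j
P_indep = np.prod(p1)

# one-step memory approximation, Eq. (3):
# P(g) ~ P(X_1 <= g) * prod_j P(X_j <= g | X_{j-1} <= g)
both = np.mean((X[:, 1:] <= g) & (X[:, :-1] <= g), axis=0)  # p_{2j}(g)
cond = both / p1[:-1]                               # P(X_j <= g | X_{j-1} <= g)
P_markov = p1[0] * np.prod(cond)
```

For positively correlated data the exceedances cluster, so the independence approximation underestimates P(g), while the one-step memory approximation lies much closer to the direct estimate.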
Combining Eq. (1) with Eq. (3), the following relation is obtained:

P(g) ≈ [∏_{j=2}^{N} p_{2j}(g)] / [∏_{j=2}^{N−1} p_{1j}(g)],   (5)

where we have introduced the notation p_{kj}(g) = Prob{X_{j−k+1} ≤ g, …, X_j ≤ g} for j ≥ k.
It is of interest to compare the values for P(g) obtained by using Eq. (5) with those given by Eq. (2). Now, Eq. (2) can be rewritten in the form

P(g) ≈ ∏_{j=1}^{N} (1 − α_{1j}(g)),   (6)

where

α_{1j}(g) = Prob{X_j > g} = 1 − p_{1j}(g).   (7)

Then

P(g) ≈ P_1(g) = exp(−∑_{j=1}^{N} α_{1j}(g)).   (8)
Alternatively, Eq. (5) gives

P(g) ≈ ∏_{j=2}^{N} (1 − α_{2j}(g)) · p_{1N}(g),   (9)

where α_{kj}(g) = 1 − p_{kj}(g)/p_{k−1,j−1}(g) for j ≥ k ≥ 2. That is,

α_{kj}(g) = Prob{X_j > g | X_{j−k+1} ≤ g, …, X_{j−1} ≤ g}   (10)

denotes the exceedance probability conditional on k − 1 previous non-exceedances. From Eq. (9) it is obtained that

P(g) ≈ P_2(g) = exp(−∑_{j=2}^{N} α_{2j}(g) − α_{1N}(g)),   (11)

since p_{1N}(g) ≈ exp(−α_{1N}(g)).
Conditioning on the two previous observations X_{j−2}, X_{j−1} preceding X_j gives

P(g) ≈ P_3(g) = exp(−∑_{j=3}^{N} α_{3j}(g) − α_{2N}(g) − α_{1N}(g)),   (12)

while conditioning on three prior observations leads to the equation

P(g) ≈ P_4(g) = exp(−∑_{j=4}^{N} α_{4j}(g) − α_{3N}(g) − α_{2N}(g) − α_{1N}(g)),   (13)

and so on. Therefore, extreme value prediction by the conditioning approach described above reduces to estimation of (combinations of) the α_{kj}(g) functions. For most practical applications N ≫ 1, so that (for k ≥ 2)

P_k(g) ≈ exp(−∑_{j=k}^{N} α_{kj}(g)).   (14)
Going back to Eq. (8) and the definition of α_{1j}(g), it follows that ∑_{j=1}^{N} α_{1j}(g) is equal to the expected number of exceedances of the threshold g during the time interval (0, T). Eq. (8) therefore expresses the approximation that the stream of exceedance events constitutes a (non-stationary) Poisson process. This opens for an understanding of Eq. (11) and subsequent approximations by interpreting the expressions ∑_{j=k}^{N} α_{kj}(g) + α_{k−1,N}(g) + … + α_{1N}(g) ≈ ∑_{j=k}^{N} α_{kj}(g) as the expected effective number of (independent) exceedances provided by conditioning on k − 1 previous observations.
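The Poisson interpretation of Eq. (8) is easy to verify numerically. In the sketch below (independent synthetic data for a clean check; numpy assumed), the expected number of exceedances ∑_j α_{1j}(g) is estimated empirically, and exp of its negative is compared with a direct estimate of P(g):

```python
import numpy as np

rng = np.random.default_rng(1)
R, N, g = 50000, 100, 2.5

X = rng.normal(size=(R, N))          # independent data, so Eq. (8) is accurate

# expected number of exceedances of g in one series = sum_j alpha_1j(g)
alpha1 = np.mean(X > g, axis=0)      # alpha_1j(g) = Prob{X_j > g}
mean_exceed = alpha1.sum()

# Poisson approximation of Eq. (8)
P1 = np.exp(-mean_exceed)

# direct Monte Carlo estimate of P(g) = Prob{max_j X_j <= g}
P_direct = np.mean(X.max(axis=1) <= g)
```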
3. Empirical estimation of the mean exceedance rates
It is expedient to introduce the concept of (conditional) average exceedance rates (AER) as follows:

ε_k(g) = (1/(N − k + 1)) ∑_{j=k}^{N} α_{kj}(g),   k = 1, 2, ….   (15)
In practice there are typically two scenarios for the underlying process Z(t). Either we may consider it to be a stationary, or, in fact, even an ergodic process. The alternative is to view Z(t) as a process that depends on certain parameters whose variation in time may be modelled as an ergodic process in its own right. For each set of values of the parameters, the premise is that Z(t) can be modelled as an ergodic process. This would be the scenario that can be used to model long-term statistics [6].

For both these scenarios, the empirical estimation of the conditional AER ε_k(g) proceeds in a completely analogous way by counting the total number of favourable incidents, that is, exceedances combined with the requisite number of preceding non-exceedances, for the total data time series, and then finally dividing by N − k + 1 ≈ N. This can be shown for the long-term situation by using a similar analysis as in [6].
A few more details on the numerical estimation of ε_k(g) for k ≥ 2 are useful. We start by introducing the following random functions:

A_{kj}(g) = 1{X_j > g, X_{j−1} ≤ g, …, X_{j−k+1} ≤ g},   j = k, …, N,  k = 2, 3, …,   (16)

and

B_{kj}(g) = 1{X_{j−1} ≤ g, …, X_{j−k+1} ≤ g},   j = k, …, N,  k = 2, …,   (17)
where 1{A} denotes the indicator function of some event A. Then

α_{kj}(g) = E[A_{kj}(g)] / E[B_{kj}(g)],   j = k, …, N,  k = 2, …,   (18)

where E[·] denotes the expectation operator. Assuming an ergodic process, then obviously ε_k(g) = α_{kk}(g) = … = α_{kN}(g), and it may be assumed that for the time series at hand,

ε_k(g) = lim_{N→∞} [∑_{j=k}^{N} A_{kj}(g) / ∑_{j=k}^{N} B_{kj}(g)].   (19)
Clearly, lim_{g→∞} ∑_{j=k}^{N} B_{kj}(g) = N − k + 1 ≈ N. Hence, lim_{g→∞} ε̃_k(g)/ε_k(g) = 1, where

ε̃_k(g) = lim_{N→∞} [∑_{j=k}^{N} A_{kj}(g) / (N − k + 1)].   (20)
In the following we shall use ε̃_k(g) instead of ε_k(g) for k ≥ 2. The advantage of using the modified conditional AER function ε̃_k(g) for k ≥ 2 is that it is easier to use for non-stationary or long-term statistics than ε_k(g). Since our focus is on the values of the AER at the extreme levels, we may use any function that provides correct estimates of the AER function at those levels.

For both stationary and non-stationary time series, the sample estimate of ε̃_k(g) would be
ε̂_k(g) = (1/R) ∑_{r=1}^{R} ε̂_k^{(r)}(g),   (21)

where R is the number of realizations (samples), and

ε̂_k^{(r)}(g) = ∑_{j=k}^{N} A_{kj}^{(r)}(g) / (N − k + 1),   (22)

where the index (r) refers to realization no. r.
It is of interest to note which events are actually counted in the calculation of the various ε̂_k(g), k ≥ 2. Let us start with ε̂_2(g). It follows from the definition of ε̃_2(g) that ε̃_2(g)(N − 1) can be interpreted as the expected number of exceedances above the level g satisfying the condition that an exceedance is counted only if it is immediately preceded by a non-exceedance. A reinterpretation of this is that ε̂_2(g)(N − 1) equals the average number of clumps of exceedances above g for the realizations considered, where a clump of exceedances is defined as a maximal number of consecutive exceedances above g. In general, ε̂_k(g)(N − 1) then equals the average number of clumps of exceedances above g separated by at least k − 1 non-exceedances.
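The clump-counting interpretation can be checked directly. The sketch below (a hypothetical AR(1) series; numpy assumed) counts the events A_{2j} of Eq. (16) and compares them with an independent count of clumps of consecutive exceedances; the two differ only by a possible clump starting at the very first data point:

```python
import numpy as np

rng = np.random.default_rng(2)
N, g = 5000, 1.5

# a single realization with serial dependence (hypothetical AR(1) sketch)
x = np.empty(N)
x[0] = rng.normal()
for j in range(1, N):
    x[j] = 0.8 * x[j - 1] + 0.6 * rng.normal()

exc = x > g                                    # exceedance indicators
# A_2j(g): exceedance at j immediately preceded by a non-exceedance at j-1
A2 = exc[1:] & ~exc[:-1]
count_A2 = int(A2.sum())
eps2_hat = count_A2 / (N - 1)                  # sample estimate of the k = 2 AER

# direct clump count: number of maximal runs of consecutive exceedances
clumps, prev = 0, False
for e in exc:
    if e and not prev:
        clumps += 1
    prev = bool(e)
```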
Now, let us look at the problem of estimating a confidence interval for ε̃_k(g). The sample standard deviation ŝ_k(g) can be estimated by the standard formula

ŝ_k(g)² = (1/(R − 1)) ∑_{r=1}^{R} (ε̂_k^{(r)}(g) − ε̂_k(g))².   (23)

Assuming that the realizations are independent, for a suitable number R, e.g. R ≥ 20, Eq. (23) leads to a good approximation of the 95% confidence interval CI = (CI⁻(g), CI⁺(g)) for the value ε̃_k(g), where

CI±(g) = ε̂_k(g) ± 1.96 ŝ_k(g)/√R.   (24)
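A minimal sketch of the sample estimates (21)–(24) for k = 2, using R synthetic realizations (i.i.d. Gaussian data for illustration only; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
R, N, g, k = 25, 2000, 2.0, 2

# R independent realizations (i.i.d. data as a simple illustration)
X = rng.normal(size=(R, N))

# per-realization AER estimates, Eq. (22), for k = 2:
# exceedances immediately preceded by a non-exceedance
A2 = (X[:, 1:] > g) & (X[:, :-1] <= g)
eps_r = A2.sum(axis=1) / (N - k + 1)

eps_hat = eps_r.mean()                           # Eq. (21)
s_hat = eps_r.std(ddof=1)                        # Eq. (23)
ci_lo = eps_hat - 1.96 * s_hat / np.sqrt(R)      # Eq. (24)
ci_hi = eps_hat + 1.96 * s_hat / np.sqrt(R)
```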
The approach to extreme value prediction presented in this paper derives from an assumption about the sampled time series to be used as a basis for prediction. This assumption derives from an underlying premise concerning the relevant asymptotic extreme value distribution, which is assumed here to be of Gumbel type. The implication of this assumption for the possible sub-asymptotic functional forms of ε̃_k(g) cannot easily be decided. However, using the asymptotic form as a guide, it is assumed that the behaviour of the mean exceedance rate in the tail is dominated by a function of the form exp{−a(g − b)^c} (g ≥ g_1 ≥ b), where a, b and c are suitable constants, and g_1 is an appropriately chosen tail level. Hence, it will be assumed that

ε̃_k(g) ≈ q_k(g) exp{−a_k(g − b_k)^{c_k}},   g ≥ g_1,   (25)

where the function q_k(g) is slowly varying compared with the exponential function exp{−a_k(g − b_k)^{c_k}}, and a_k, b_k, and c_k are suitable constants that in general will depend on k. Note that the value c_k = 1 corresponds to the asymptotic case. From Eq. (25) it follows that

−log|log(ε̃_k(g)/q_k(g))| ≈ −c_k log(g − b_k) − log(a_k).   (26)
Therefore, under the assumptions made, a plot of −log|log(ε̃_k(g)/q_k(g))| versus log(g − b_k) will exhibit an almost perfectly linear tail behaviour.

It is realized that if the function q_k(g) could be replaced by a constant value, q_k say, we would immediately be in a position to apply a linear extrapolation strategy. In general, q_k(g) is not constant, but its variation in the tail region is usually sufficiently slow to allow for its replacement by a constant. A similar approach was successfully used by the authors for mean up-crossing rate estimation in extreme value analyses of the response processes of different dynamic systems, cf. [2,3]. Details and examples are discussed later in this paper.
Since the linearity of the plotting procedure described above depends on an appropriate choice of the parameters (b_k, q_k), it is important to discuss this issue in some detail.

First, we cut from consideration the very tail of the data, where the uncertainty is high. As a practical procedure we suggest neglecting data points where the relative confidence band width is greater than some constant d, that is,

1.96 ŝ_k(g) / (√R · ε̃_k(g)) > d,   (27)

where the value chosen for d depends on the actual 'roughness' of the data tail, but would typically lie in the interval (0.5, 1). Next, we come to the estimation of the predicted response level and its 95% confidence interval.
First, the tail marker g_1 is identified from visual inspection of the log plot (g, log ε̃_k(g)). The value chosen for g_1 corresponds to the beginning of regular tail behaviour in a sense to be discussed below. Next, initial estimates for b and q are found by the procedure to linearize the tail on the transformed scale. Occasionally, some care should be exercised if an optimization algorithm is used to determine b and q. It may happen that the optimization is almost ill defined in practice, in the sense that the algorithm may have difficulties locating an optimal (b, q) pair. It has been observed that in such cases the optimal value of b may, for example, seem to be at a very large negative value. It can be shown that a good approximation would then often be obtained by putting b = 0 and c = 1. This can be partly explained by noting that in the special case c = 1, there is no unique optimal pair of parameters (b, q). In fact, an infinity of (b, q)-pairs exist that result in exactly the same AER. The initial values of the parameters a and c would generally be determined from the initial 'optimal' straight line −cx − log a (x = log(g − b)) approximating the data tail on the transformed plot (26).
Instead of doing the optimization directly on the loglog–log plot, which is appealing in the sense that it involves only two parameters, a more robust optimization may in fact be obtained by doing it on the log plot, even if the optimization must then be carried out with respect to all four parameters a, b, c, q. Our experience has been that the Levenberg–Marquardt least squares optimization method is well suited for the task [7]. The mean square error function to be minimized is written as
∑_{j=1}^{N} w_j [log ε̂_k(g_j) − log q + a(g_j − b)^c]²,   (28)

where w_j = (log CI⁺(g_j) − log CI⁻(g_j))⁻² denotes a weight factor that puts more emphasis on the more reliable data points. The choice of weight factor is, of course, to some extent arbitrary, and if it is considered more appropriate to put a stronger emphasis on the larger data, this can simply be achieved by replacing the exponent −2 by, for example, −1 in the definition of w_j.
The practical approach currently adopted is to get a first idea of the values of the parameters a, b, c, q by inspecting the loglog–log plot. These values may then be used as starting values for the Levenberg–Marquardt algorithm. For estimation of the confidence interval of the predicted return value provided by the optimal curve, the empirical confidence band is reanchored to the optimal curve. The range of fitted curves that stay within the reanchored confidence band determines an optimized confidence interval for the predicted return value. As a final point, it has been observed that the predicted return value is not very sensitive to the choice of g_1.
To offer a comparison of the predictions obtained by the method proposed in this paper with those obtained by other methods, we shall use the predictions given by the two methods that seem to be most favored by practitioners, the Gumbel method and the peaks-over-threshold (POT) method.
4. The Gumbel method
The Gumbel method is based on recording epochal extreme values and fitting these values to a corresponding Gumbel distribution. By assuming that the recorded extreme value data are Gumbel distributed, representing the obtained data set of extreme values in a Gumbel probability plot should ideally result in a straight line. In practice, one cannot expect this to happen, but on the premise that the data follow a Gumbel distribution, a straight line can be fitted to the data. Due to its simplicity, a popular method for fitting this straight line is the method of moments. That is, writing the Gumbel distribution of the extreme value M_N as

Prob(M_N ≤ g) = exp{−exp(−a(g − b))},   (29)

it is known that the parameters a and b are related to the mean value m_M and standard deviation σ_M of M_N as follows: b = m_M − 0.57722 a⁻¹ and a = 1.28255/σ_M [8]. The estimates of m_M and σ_M obtained from the available sample therefore provide estimates of a and b, which leads to the fitted Gumbel distribution by the moment method.
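A minimal sketch of the moment fit (29) on synthetic Gumbel data (all parameter values hypothetical; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
a_true, b_true = 1.3, 10.0
# synthetic epochal extremes via inverse-CDF sampling of Eq. (29):
# u = exp(-exp(-a(g-b)))  =>  g = b - log(-log(u)) / a
u = rng.uniform(size=5000)
m = b_true - np.log(-np.log(u)) / a_true

# method-of-moments fit using the relations below Eq. (29)
a_hat = 1.28255 / m.std(ddof=1)
b_hat = m.mean() - 0.57722 / a_hat

# e.g. the 90% fractile (alpha = 0.1) of the fitted distribution
alpha = 0.1
g_frac = b_hat - np.log(-np.log(1 - alpha)) / a_hat
```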
Typically, a specified fractile value of the fitted Gumbel distribution is then extracted and used in a design consideration. To be specific, let us assume that the requested fractile value is the 100(1 − α)% fractile, where α is usually a small number, for example α = 0.1. To quantify the uncertainty associated with the obtained 100(1 − α)% fractile value based on a sample of size Ñ, the 95% confidence interval of this value is often used. A good estimate of this confidence interval can be obtained by using a parametric bootstrapping method [9,10]. In our context, this simply means that the initial sample of Ñ extreme values is assumed to have been generated from an underlying Gumbel distribution, whose parameters are, of course, unknown. If this Gumbel distribution had been known, it could have been used to generate a large number of (independent) samples of size Ñ. For each sample, a new Gumbel distribution would be fitted and the corresponding 100(1 − α)% fractile value identified. If the number of samples were large enough, a very accurate estimate of the 95% confidence interval on the 100(1 − α)% fractile value based on a sample of size Ñ could be found. Since the true parameter values of the underlying Gumbel distribution are unknown, they are replaced by the estimated values obtained from the initial sample. This fitted Gumbel distribution is then used as described above to provide an approximate 95% confidence interval. Note that the assumption that the initial Ñ extreme values are actually generated, to a good approximation, from a Gumbel distribution cannot easily be verified in general, which is a drawback of this method.
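The parametric bootstrap described above can be sketched as follows (synthetic initial sample; the sample size and parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)

def gumbel_fit(sample):
    # method-of-moments estimates of a and b, cf. Eq. (29)
    a = 1.28255 / sample.std(ddof=1)
    b = sample.mean() - 0.57722 / a
    return a, b

def gumbel_sample(a, b, n, rng):
    # inverse-CDF sampling from Eq. (29)
    return b - np.log(-np.log(rng.uniform(size=n))) / a

def fractile(a, b, alpha):
    # the 100(1 - alpha)% fractile of the fitted Gumbel distribution
    return b - np.log(-np.log(1 - alpha)) / a

# initial sample of N_tilde epochal extremes (synthetic stand-in)
N_tilde, alpha = 30, 0.1
data = gumbel_sample(1.0, 5.0, N_tilde, rng)
a0, b0 = gumbel_fit(data)
point = fractile(a0, b0, alpha)

# parametric bootstrap: resample from the *fitted* Gumbel, refit, re-extract
boot = np.empty(2000)
for i in range(2000):
    s = gumbel_sample(a0, b0, N_tilde, rng)
    boot[i] = fractile(*gumbel_fit(s), alpha)

ci = (np.quantile(boot, 0.025), np.quantile(boot, 0.975))
```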
Compared with the POT method, the Gumbel method would also seem to use much less of the information available in the data. This may explain why the POT method has become increasingly popular over the past years, but the Gumbel method is still widely used in practice.
5. The peaks-over-threshold method

5.1. The generalized Pareto distribution
The POT method is based on what is called the generalized Pareto (GP) distribution (defined below) in the following manner: it has been shown [11] that asymptotically, the excess values above a high level will follow a GP distribution if and only if the parent distribution belongs to the domain of attraction of one of the extreme value distributions. The assumption of a Poisson process model for the exceedance times, combined with GP distributed excesses, can be shown to lead to the generalized extreme value (GEV) distribution for the corresponding extremes, see below. The expression for the GP distribution is

G(y) = G(y; a, c) = Prob(Y ≤ y) = 1 − (1 + c·y/a)₊^{−1/c}.   (30)

Here a > 0 is a scale parameter, c (−1 < c < 1) determines the shape of the distribution, and (z)₊ = max(0, z).
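A direct transcription of Eqs. (30) and (31), with the (z)₊ truncation and the c → 0 limit handled explicitly (a small numerical threshold stands in for the exact limit):

```python
import math

def gp_cdf(y, a, c):
    """Generalized Pareto CDF, Eq. (30); Eq. (31) in the limit c -> 0."""
    if y < 0:
        return 0.0
    if abs(c) < 1e-12:                  # limiting exponential form, Eq. (31)
        return 1.0 - math.exp(-y / a)
    z = 1.0 + c * y / a
    if z <= 0.0:                        # (z)_+ = max(0, z): for c < 0 the
        return 1.0                      # support ends at y = -a/c
    return 1.0 - z ** (-1.0 / c)
```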
The asymptotic result referred to above implies that Eq. (30) can be used to represent the conditional cumulative distribution function of the excess Y = X − u of the observed variate X over the threshold u, given that X > u, for u sufficiently large [11]. The cases c > 0, c = 0 and c < 0 correspond to the Fréchet (Type II), Gumbel (Type I), and reverse Weibull (Type III) domains of attraction, respectively; see below.

For c = 0, which corresponds to the Gumbel extreme value distribution, the expression between the parentheses in Eq. (30) is understood in a limiting sense as

G(y) = G(y; a, 0) = exp(−y/a).   (31)

5.2. Return periods
The return period R of a given wind speed, in years, is defined as the inverse of the probability that the specified wind speed will be exceeded in any one year. If λ denotes the mean exceedance rate of the threshold u per year (i.e., the average number of data points above the threshold u per year), then the return period R of the value of X corresponding to the level x_R = u + y is given by the relation

R = 1 / (λ · Prob(Y > y)).   (32)

Hence, it follows that

Prob(Y ≤ y) = 1 − 1/(λR).   (33)

Invoking Eq. (30) for c ≠ 0 leads to the result

x_R = u − a[1 − (λR)^c]/c.   (34)

Similarly, for c = 0, it is found that

x_R = u + a ln(λR),   (35)

where u is the threshold used in the estimation of c and a.
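Eqs. (34) and (35) can be transcribed directly; a small numerical threshold stands in for the exact c = 0 case:

```python
import math

def return_level(u, a, c, lam, R):
    """x_R from Eqs. (34)/(35): the level with return period R (years);
    lam is the mean number of threshold exceedances per year."""
    if abs(c) < 1e-12:
        return u + a * math.log(lam * R)        # Eq. (35), c = 0
    return u - a * (1.0 - (lam * R) ** c) / c   # Eq. (34), c != 0
```

Substituting x_R − u for y in Eq. (30) recovers Prob(Y ≤ y) = 1 − 1/(λR), which is the consistency check used below.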
5.3. The de Haan estimators

Let n denote the total number of data points, while the number of observations above the threshold value u is denoted by k. The threshold u then represents the (k + 1)th highest data point(s). An estimate for λ is λ̂ = k/n_yr, where n_yr denotes the length of the record in years. The highest, second highest, …, kth highest and (k + 1)th highest variates are denoted by X*_n, X*_{n−1}, …, X*_{n−k+1} and X*_{n−k} = u, respectively.
The parameter estimators proposed by de Haan [12] are based on the following two quantities:

H_{k,n} = (1/k) ∑_{i=0}^{k−1} {ln(X*_{n−i}) − ln(X*_{n−k})}   (36)

and

H^{(2)}_{k,n} = (1/k) ∑_{i=0}^{k−1} {ln(X*_{n−i}) − ln(X*_{n−k})}².   (37)
Estimators for a and c are then given by the relations

â = q X*_{n−k} H_{k,n} = q u H_{k,n}   (38)

and

ĉ = H_{k,n} + 1 − (1/2){1 − (H_{k,n})²/H^{(2)}_{k,n}}⁻¹,   (39)

where q = 1 if ĉ ≥ 0, while q = 1 − ĉ if ĉ < 0.
Subject to general conditions on the underlying probability law, de Haan [12] showed that â → a and ĉ → c as n → ∞ (in probability).
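Eqs. (36)–(39) can be transcribed as follows; the sketch is then sanity-checked on synthetic Pareto data with a known tail index (sample sizes and seed are hypothetical; numpy assumed):

```python
import numpy as np

def de_haan(x, k):
    """de Haan estimators of the GP parameters a and c, Eqs. (36)-(39).
    x: data array; k: number of observations above the threshold."""
    xs = np.sort(x)[::-1]              # descending order statistics X*_n, X*_{n-1}, ...
    u = xs[k]                          # threshold: the (k+1)th highest data point
    logs = np.log(xs[:k]) - np.log(u)
    H1 = logs.mean()                   # Eq. (36)
    H2 = (logs ** 2).mean()            # Eq. (37)
    c_hat = H1 + 1.0 - 0.5 / (1.0 - H1 ** 2 / H2)   # Eq. (39)
    q = 1.0 if c_hat >= 0 else 1.0 - c_hat
    a_hat = q * u * H1                 # Eq. (38)
    return a_hat, c_hat, u

# sanity check on classical Pareto data with tail index c = 0.5
x = 1.0 + np.random.default_rng(7).pareto(2.0, 200000)
a_hat, c_hat, u = de_haan(x, 2000)
```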
The Hill estimators [13] are closely related to the de Haan estimators. Their application to problems similar to those discussed in this paper was investigated by Naess and Clausen [14]. The conclusion was that the Hill estimators lead to results quite similar to those provided by the de Haan estimators. Since the Hill estimators require considerably higher numerical effort than the de Haan estimators, we have chosen to exclude them from the present study. Moment estimators also represent an alternative for the estimation of the distribution parameters, and apparently they may provide more accurate results than the de Haan estimators [15]. In the present paper we have chosen to apply the de Haan estimators as these are commonly known and extensively used. However, it may be noted that questions have been raised about the applicability of these estimators for limited sets of data [16].
6. Synthetic data

In this section we illustrate the philosophy of tail linearization and also the 95% CI estimation. We consider 20 years of synthetic wind speed data, amounting to 2000 data points, which is not much for detailed statistics. However, this case may represent a real situation where nothing but a limited data sample is available. In such a case it is crucial to provide extreme value estimates utilizing all the available data. As we shall see, the tail extrapolation technique performs significantly better than asymptotic methods such as POT or Gumbel.
The extreme value statistics will first be analyzed by application to synthetic data [15], for which the exact extreme values can be calculated. In particular, it is assumed that the underlying (normalized) stochastic process Z(t) is stationary and Gaussian with mean value zero and standard deviation equal to one. It is also assumed that the mean zero up-crossing rate ν⁺(0) is such that the product ν⁺(0)T = 10³, where T = 1 year, which seems to be typical for the wind speed process. Using the Poisson assumption, the distribution of the yearly extreme value of Z(t) is then calculated by the formula

F_{1yr}(g) = exp{−ν⁺(g)T} = exp(−10³ exp(−g²/2)),   (40)

where T = 1 year, ν⁺(g) is the mean up-crossing rate of the level g per year, and g is the scaled wind speed. The 100-year return period value g_{100yr} is then calculated from the relation F_{1yr}(g_{100yr}) = 1 − 1/100, which gives g_{100yr} = 4.80.
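The quoted return level follows from inverting Eq. (40) in closed form:

```python
import math

# invert F_1yr(g) = exp(-1e3 * exp(-g^2/2)) = 1 - 1/100 for g_100yr:
# exp(-1e3*exp(-g^2/2)) = p  =>  g = sqrt(2 * ln(1e3 / (-ln p)))
target = 1.0 - 1.0 / 100.0
g_100yr = math.sqrt(2.0 * math.log(1.0e3 / (-math.log(target))))
```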
The Monte Carlo simulated data used for the synthetic example are generated based on the observation that the peak events extracted from measurements of the wind speed process are usually separated by 3–4 days. This is done to obtain approximately independent data, cf. [17]. In accordance with this, peak event data are generated from the extreme value distribution

F_{3d}(g) = exp(−q exp(−g²/2)),   (41)

where q = ν⁺(0)T = 10, which corresponds to approximately T = 3.65 days.
Since the data (i.e., the T = 3.65 days maxima) are independent, ε_k(g) is independent of k. Therefore we put k = 1. Fig. 1 presents a plot of the average exceedance rate (AER) ε̂_1 for 20 years of simulated data. Since we have 100 discrete time steps in one year, the data amount to 2000 data points. The transformed plot (26) is a loglog–log plot of −log|log ε̂_1(g) − d| versus log(g − b), where d = −log q. The parameters q and b were found by optimization. From the simulated data the predicted 100-year return level is g̃_{100yr} = 4.58, while the exact value is g_{100yr} = 4.80, see Fig. 1. Estimation of the 95% confidence interval gives (4.44, 5.02). The transformed plot is shown in Fig. 2. Fig. 3 presents the analytical solution for ε_1 on the transformed scale (26).

Fig. 4 presents POT predictions for different threshold numbers based on de Haan estimators. Fig. 5 shows the parametrically bootstrapped PDF of the POT prediction for threshold number n = 140 based on de Haan estimators. The predicted value is 4.56, while the 95% confidence interval is (4.04, 5.00). The same data set as in Figs. 1 and 3 was used. It is seen that POT tends to underestimate the exact value, which is 4.80. A clear advantage of the AER method compared to POT is seen when comparing the 95% confidence intervals.

In order to get more insight into the AER and POT statistics, 10 independent 20-year MC simulations were done. Table 1 compares the predicted values and confidence intervals. It must be mentioned that, when analyzing many data samples, POT as implemented here generally underestimates the 100-year return level.