Statistical Analysis of Wind Power Forecast Error
Hans Bludszuweit, José Antonio Domínguez-Navarro, Member, IEEE, and Andrés Llombart
Abstract—Wind power forecast error has usually been assumed to have a near-Gaussian distribution. With a simple statistical analysis, it can be shown that this is not valid. To obtain a more appropriate probability density function (pdf) of the wind power forecast error, an indirect algorithm based on the Beta pdf is proposed. Measured one-year time series from two different wind farms are used to generate the forecast data. Three different forecast scenarios are simulated based on the persistence approach. This makes the results comparable to other forecast methods. It is found that the forecast error pdf has a variable kurtosis ranging from 3 (like the Gaussian) to over 10, and therefore it can be categorized as fat-tailed. A new approximation function for the parameters of the Beta pdf is proposed because results from former publications could not be confirmed. In addition, a linear approximation is developed to describe the relationship between the persistence forecast and the related mean measured power. An energy storage system (ESS), which reduces the forecast error and smooths the wind power output, is considered. Results for this case show the usefulness of the proposed forecast error pdf for finding the optimum rated ESS power.
Index Terms—Error analysis, forecasting, wind, wind power generation.
I. INTRODUCTION
THE increasing penetration of wind power in the electricity grids creates the need for a better understanding of the wind power forecast error and, if possible, its reduction. Obviously, with rising penetration levels, the impact of wind generation on the electrical energy system must be taken into account [1]. Although embedded wind generation can be positive for a weak grid [2], it can also lead to instabilities [3]. To avoid such instabilities, the system reserve has to be increased [4] or, sometimes, significant amounts of wind energy must be dumped [5]. However, the forecast error has not only technical but also economic aspects. Unforeseen energy generation will cause additional costs, especially when wind energy is traded in short-term energy markets [6]–[8]. In [6] it is stated that prediction error costs can reach as much as 10% of the total income from generated energy.
An option to increase the system reserve (or spinning reserve) to deal with the forecast error may be an energy storage system (ESS). Placing such an installation near the wind generator has a number of very beneficial effects on the energy system. In [9] a good introduction is given to possible applications of energy storage in relation to high wind penetration in weak grids. In this context, ESS can be categorized, as proposed in [9], by their typical run times, applications, and energy sizes, as shown in Table I.
Manuscript received August 7, 2007; revised December 7, 2007. This work was supported by the CIRCE foundation. Paper no. TPWRS-00558-2007.
The authors are with the Electrical Engineering Department, University of Zaragoza, and also with the CIRCE foundation, associated with it, Zaragoza, Spain (e-mail: hblud@unizar.es; jadona@unizar.es; llombart@unizar.es).
Digital Object Identifier 10.1109/TPWRS.2008.922526
TABLE I
TYPICAL RUN TIMES AND APPLICATIONS OF ENERGY STORAGE SYSTEMS [9]

The typical runtime is obtained by dividing the energy capacity by the rated power, which is usually in the megawatt range. The benefits of energy storage in the range of seconds are shown in [10], while in [11] the economic benefits of a water storage (hours of runtime) combined with wind generation are shown. In [12] a method is presented to schedule energy storage for wind power plants in the electricity markets using time-step simulation for the ESS sizing. A probabilistic method for storage sizing is proposed in [13], applying spectral analysis of the intermittent renewable generation.
In this work, the forecast error pdf is studied because it can give important information for short-term trading in energy markets or even for the optimal sizing of rated ESS power.
In the literature we can find very few works about wind power forecast error distributions. The relationship between the statistical behavior of the forecast error and ESS sizing is mentioned in [4] and [6]–[8], but little emphasis is given to the actual wind power forecast error pdf. In [7] histograms from measured time series are used for the forecast error description, and a standard distribution is assumed in [4] and [11]. In [14] the uncertainty of wind power forecasts is quantified using the well-known distribution of wind speed forecasts in conjunction with the power curve of the wind energy conversion system (WECS).
The difficulty in finding a proper definition for the forecast error pdf lies in the great variety of its shape depending on the forecast horizon and method. It is shown in this paper that the error pdf is fat-tailed with variable kurtosis. Therefore, it cannot be modeled with the normal distribution.
In the present study, an alternative method based on [15] is adopted to obtain a more suitable forecast error pdf. The method consists in dividing the forecast into 50 power classes or bins and modeling the distribution of measured power within each forecast bin with the Beta pdf. The higher number of 50 bins, compared to only four bins in [15], leads to new findings for the parameters of the Beta pdf. Given the power pdf associated with each bin, the forecast error pdf can be obtained. It is found that the Beta distribution gives reasonably accurate results. However, further investigation of alternative distributions may improve the results.
The paper is organized as follows. In Section II the steps of the proposed method are explained and in Section III results for the forecast error pdf and ESS sizing are presented. The obtained forecast error pdf is used to optimize the rated ESS power under given constraints, and it is found that, in some cases, it can be more than 80% lower than the installed wind power.

Fig. 1. Wind power time series (thin line) with persistence forecast (bold line), forecast delay k, and prediction time interval T (T = k = 15 min).
II. METHODOLOGY
In Section II-A, a method is shown for simulating different forecast qualities or scenarios. In Section II-B, the forecast error is found to be fat-tailed with variable kurtosis and some possible distributions are discussed. The application of the Beta pdf is developed in Section II-C, where a new, nonlinear function is proposed for the approximation of the parameters standard deviation $\sigma$ and mean value $\mu$. In Section II-D the relationship between the persistence forecast and the mean measured power related to this forecast is discussed, and the findings in [16] are verified. In Section II-E the calculation of the total forecast error distribution is explained, and finally in Section II-F the procedure to obtain the energy loss as a function of nominal ESS power is shown.
A. Modeling Forecast Scenarios Using Persistence
We have analyzed several one-year time series of generated wind power from two sites. The two datasets will be called A and B. In dataset A, 10-min means from 32 different wind generators from the same wind farm were aggregated, so that a database of around 1 million values was available for statistical analysis. In dataset B, 15-min means were available for three spatially very close wind farms, with around 100 000 data values.

A huge variety of wind power forecast methods exists, but every method always has to be compared with the simplest one: the persistence forecast. Therefore, this simple approach was chosen to simulate three different scenarios of forecast quality. This allows us to investigate the changes in the error distribution with changing forecast scenario. For a better understanding of the persistence method used here, in Fig. 1 a measured wind power time series (thin line) and the persistence forecast (step function, bold line) are depicted for a dataset of 1-second mean wind power. The figure shows the instant $t$ when the forecast is made and the forecast time interval of width $T$ for which the mean wind power is predicted. This time interval has to start later than $t$, so that a forecast delay $k$ has to be defined. This delay describes the time gap between the instant when the forecast is made and the beginning of $T$. In short-term energy markets, $k$ is termed market closure delay [8].
In Fig. 1, the mean wind power obtained from the time interval [0, 15] is the forecast for the interval [30, 45]. Therefore, it can be stated that the persistence forecast in this case is 2T time-shifted relative to the interval where the mean value was calculated. According to the nomenclature proposed in [17], the persistence forecast can be written as

$$\hat{P}_P(t+k \mid t) = \frac{1}{n}\sum_{i=0}^{n-1} P(t - i\,\Delta t) \tag{1}$$

where $\hat{P}_P(t+k \mid t)$ is the wind power forecast for time $t+k$ made at time origin $t$, $k$ the prediction horizon, $T$ the prediction interval length (here $T = k$), $P(t - i\,\Delta t)$ the measured wind power for time $t$ and the previous time steps within $T$, $n$ the number of time steps within $T$, and $\Delta t$ the time step length of the measured time series ($T = n\,\Delta t$).

In the following, three forecast scenarios are defined. This way, the impact of the forecast model is simulated without using a specific forecast model. If a forecast model is determined, it can be classified according to the scenarios.
The worst-case scenario, termed scenario 2, is based on a persistence forecast with $k = T$. The forecast is only updated once for each prediction interval, so the forecast value is constant during the whole prediction interval $T$, as shown in Fig. 1. This is the worst case because no forecast model should perform worse than this.

Scenario 1 simulates an intermediate case. The calculated mean value of each interval is shifted in time by the width of the prediction interval $T$. This is equal to a persistence forecast with $k = 0$, which means an important improvement in comparison with scenario 2.

The best-case scenario, termed scenario 0, is obtained by assigning the measured mean value in the interval $T$ as forecast for the same interval ($k = -T$). This way, a perfect forecast of the mean value for each interval is simulated. The error only consists of the power fluctuations within the interval $T$.
The normalized prediction error $\varepsilon$ is calculated as the difference between two time series of the same length. Following [17], it can be written as follows:

$$\varepsilon(t+k \mid t) = P(t+k) - \hat{P}(t+k \mid t) \tag{2}$$

where $k$ is the prediction delay, $\hat{P}(t+k \mid t)$ is the wind power forecast for time $t+k$ for the prediction made at the origin $t$, and $P(t+k)$ is the measured wind power.
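The three persistence scenarios and the resulting error series can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `shift` argument (the number of prediction intervals between the averaged interval and the forecast interval, taken here as 2, 1, and 0 for the worst, intermediate, and best case), and the toy data are assumptions made for illustration. The error is taken as measured minus forecast power, both in p.u.

```python
from statistics import mean


def interval_means(power, n):
    """Mean of each consecutive block of n samples (one block per interval T)."""
    blocks = len(power) // n
    return [mean(power[j * n:(j + 1) * n]) for j in range(blocks)]


def persistence_forecast(power, n, shift):
    """Forecast for interval j is the measured mean of interval j - shift.

    Assumed mapping: shift = 2 worst case, 1 intermediate, 0 best case.
    Returns one forecast value per sample; None where no past interval exists.
    """
    means = interval_means(power, n)
    forecast = []
    for j in range(len(means)):
        value = means[j - shift] if j - shift >= 0 else None
        forecast.extend([value] * n)
    return forecast


def forecast_error(power, forecast):
    """Error = measured - forecast, skipping samples without a forecast."""
    return [p - f for p, f in zip(power, forecast) if f is not None]


# Toy series in p.u. with n = 2 samples per prediction interval (hypothetical data).
power = [0.1, 0.3, 0.2, 0.4, 0.6, 0.8]
for shift in (0, 1, 2):
    errors = forecast_error(power, persistence_forecast(power, 2, shift))
    print(shift, errors)
```

With `shift = 0` the errors contain only the fluctuations around each interval mean, which is why they sum to zero within every interval.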
B. Evaluating Possible Distributions of the Forecast Error

As will be shown, the tail of the error pdf is of special interest for the ESS sizing. Therefore, kurtosis was chosen as the statistical parameter to evaluate the tail of the studied pdf. The kurtosis $\gamma$ of the distribution of a zero-mean random variable $x$ is defined via

$$\gamma = \frac{E\{x^4\}}{\sigma^4} \tag{3}$$

where $E\{\cdot\}$ denotes the expectation operator and $\sigma$ the standard deviation.

BLUDSZUWEIT et al.: STATISTICAL ANALYSIS OF WIND POWER FORECAST ERROR 985

Fig. 2. Comparison of a histogram of 24-h forecast error data (kurtosis 4.8) with Gaussian and Laplace pdf having the same standard deviation as the forecast error.
If the kurtosis $\gamma$ is larger than 3 (the value associated with the normal distribution), the distribution is called leptokurtic or fat-tailed. By calculating the kurtosis of the forecast error data, it can be shown that the normal pdf is not appropriate. In fact, persistence forecast errors show values of $\gamma$ above 10 (forecast below 1 h) and even below 3 (beyond 24 h), which means that the observed distributions in general are leptokurtic.

In Fig. 2 the histogram of an example of 24-h forecast data with a kurtosis of 4.8 is shown. In the same plot the Gaussian and Laplace pdf are depicted, which have the same standard deviation as the sample data. The Laplace pdf (with $\gamma = 6$) was chosen as an example of fat-tailed distributions. As expected, the tail of the forecast error pdf is situated between the Gaussian and the Laplace pdf, being $3 < \gamma < 6$.
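As a quick numerical check of the two reference values, the kurtosis of synthetic Gaussian and Laplace samples can be estimated with a few lines of Python. This is only a sketch: the sample size, the seed, and the function name `kurtosis` are arbitrary choices, and the Laplace samples are generated as the difference of two independent exponential variates.

```python
import random


def kurtosis(samples):
    """Sample kurtosis: fourth central moment divided by the squared variance."""
    n = len(samples)
    m = sum(samples) / n
    var = sum((x - m) ** 2 for x in samples) / n
    fourth = sum((x - m) ** 4 for x in samples) / n
    return fourth / var ** 2


rng = random.Random(0)
gauss = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
# The difference of two independent Exp(1) variates follows a Laplace distribution.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(200_000)]

print("Gaussian kurtosis estimate:", kurtosis(gauss))
print("Laplace kurtosis estimate:", kurtosis(laplace))
```

The estimates fluctuate around 3 and 6 respectively; the Laplace estimate converges more slowly because the estimator depends on high-order moments.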
As mentioned above, the kurtosis of the error ranges from below 3 to above 10, indicating a considerable variation across the different forecast horizons. The two distributions discussed above each have a fixed kurtosis. There are well-known distributions with variable kurtosis, but up to now no distribution other than the normal has been proposed to model the forecast error. Therefore, in this paper an indirect approach using the Beta distribution is applied, based on assumptions made in [15], as described in Section II-C.

The kurtosis $\gamma$ of the Beta pdf is variable and is calculated as follows (see [18]):

$$\gamma = \frac{3(\alpha+\beta+1)\left[2(\alpha+\beta)^2 + \alpha\beta(\alpha+\beta-6)\right]}{\alpha\beta(\alpha+\beta+2)(\alpha+\beta+3)} \tag{4}$$

where $\alpha$ and $\beta$ are the parameters of the Beta pdf.

Besides its variable kurtosis, the advantage of the Beta pdf lies in its simplicity (it is defined by only two parameters) and in the fact that its values are limited to the interval [0, 1], as is the normalized wind power.
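To illustrate how the kurtosis of the Beta pdf varies with its two parameters, the standard closed-form expression for the (non-excess) kurtosis of a Beta distribution can be evaluated directly. The function name and the example parameter values below are assumptions chosen for illustration.

```python
def beta_kurtosis(alpha, beta):
    """Kurtosis (Gaussian = 3) of a Beta(alpha, beta) distribution."""
    s = alpha + beta
    numerator = 3.0 * (s + 1.0) * (2.0 * s ** 2 + alpha * beta * (s - 6.0))
    denominator = alpha * beta * (s + 2.0) * (s + 3.0)
    return numerator / denominator


# Beta(1, 1) is the uniform distribution on [0, 1], which is thin-tailed.
print(beta_kurtosis(1.0, 1.0))
# A strongly skewed parameter pair gives a kurtosis well above 3.
print(beta_kurtosis(0.5, 5.0))
```

Symmetric parameter pairs with large values approach the Gaussian value of 3 from below, while skewed pairs such as those typical for low forecast power bins can exceed it considerably.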
C. Approximation of a Beta Pdf for Each Power Forecast Bin

The indirect approach adopted to obtain the error pdf is based on [15]. First, forecast results are sorted into power classes or bins, and the distribution of the measured power within each forecast bin is assumed to follow a Beta pdf. Knowing the pdf associated with each forecast bin, the error pdf of this bin can be calculated. Adding up the error pdfs of all bins, the overall forecast error pdf can be obtained.

Fig. 3. Approximated Beta pdf of measured wind power for nine different forecast power classes.

Therefore, the forecast time series are generated as described in Section II-A and every forecast is assigned to a measured wind power value. Then the data pairs [measured, forecast] are sorted by the forecast values and assigned to forecast power bins. The bin width must be chosen depending on the number of data available. In this case, 50 bins with a bin width of 0.02 p.u. seem to be a good choice, but if the database is smaller, the number of bins must be reduced.
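The sorting of [measured, forecast] pairs into forecast power bins can be sketched as follows. This is a minimal illustration rather than the procedure used by the authors: the function names and the sample pairs are hypothetical, and the bins are assumed to be equally wide over [0, 1] p.u. with zero-based indices.

```python
from statistics import mean, pstdev


def assign_bins(pairs, n_bins=50):
    """Group measured values (p.u.) by the bin of their forecast value (p.u.)."""
    bins = [[] for _ in range(n_bins)]
    for measured, forecast in pairs:
        # A forecast of exactly 1.0 p.u. is placed in the last bin.
        index = min(int(forecast * n_bins), n_bins - 1)
        bins[index].append(measured)
    return bins


def bin_moments(bins):
    """Mean and standard deviation of measured power for each non-empty bin."""
    return {i: (mean(values), pstdev(values)) for i, values in enumerate(bins) if values}


# Hypothetical (measured, forecast) pairs in p.u.
pairs = [(0.05, 0.01), (0.15, 0.015), (0.9, 1.0), (0.5, 0.51)]
bins = assign_bins(pairs)
print(bin_moments(bins))
```

The per-bin mean and standard deviation returned by `bin_moments` are the quantities from which the Beta parameters of each bin can later be derived.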
It is assumed that all measured values of one forecast bin have the same forecast $\hat{p}_i$, which is the mean value of all the calculated forecasts corresponding to bin $i$. The distribution within bin $i$ is called $f_i(p)$. The Beta pdf is given by the following equations:

$$f(p) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)} \tag{5}$$

$$B(\alpha,\beta) = \int_0^1 p^{\alpha-1}(1-p)^{\beta-1}\,dp \tag{6}$$

where $p$ is the measured wind power per unit (p.u.), $\alpha$ and $\beta$ are the parameters, and $B(\alpha,\beta)$ is the Beta function.

As can be seen in (6), the Beta function $B(\alpha,\beta)$ is simply the integral of the term in the numerator of (5), which normalizes the integral of $f(p)$ in the interval [0, 1]. Therefore, in [15] the following form was chosen to represent the Beta pdf:

$$f(p) = N\,p^{\alpha-1}(1-p)^{\beta-1} \tag{7}$$

where $N$ is the normalization factor $1/B(\alpha,\beta)$.
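In Fig. 3 an example is given for a possible result of approximated wind power pdfs $f_i(p)$ for nine different predicted power classes with mean forecast $\hat{p}_i$. As mentioned in [6], the parameters $\alpha$ and $\beta$ are related to the parameters $\sigma^2$ (variance) and $\mu$ (mean). The following equations show the relationships:

$$\mu = \frac{\alpha}{\alpha+\beta} \tag{8}$$

$$\sigma^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \tag{9}$$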
986 IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 23, NO. 3, AUGUST 2008
Fig. 4. Comparison of the histogram of measured wind power (dots) with Beta pdf (bold solid) and normal pdf (solid); forecast power bin: 5 of 50 (forecast: 0.1 p.u.).
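Equations (8) and (9) can be reversed to calculate $\alpha$ and $\beta$ directly from the mean $\mu$ and the standard deviation $\sigma$, as shown in

$$\alpha = \mu\left[\frac{\mu(1-\mu)}{\sigma^2} - 1\right] \tag{10}$$

$$\beta = (1-\mu)\left[\frac{\mu(1-\mu)}{\sigma^2} - 1\right] \tag{11}$$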
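The conversion between the Beta parameters and the mean and standard deviation of a bin can be checked with a short round-trip sketch. It uses the standard moment relations of the Beta distribution; the function names and the example values (mean 0.1 p.u., standard deviation 0.1 p.u.) are assumptions for illustration only.

```python
def beta_params_from_moments(mu, sigma):
    """Beta parameters (alpha, beta) matching a given mean and standard deviation."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    common = mu * (1.0 - mu) / sigma ** 2 - 1.0
    if common <= 0.0:
        # Positive parameters require sigma^2 < mu * (1 - mu).
        raise ValueError("sigma is too large for this mean")
    return mu * common, (1.0 - mu) * common


def beta_moments(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    s = alpha + beta
    mu = alpha / s
    variance = alpha * beta / (s ** 2 * (s + 1.0))
    return mu, variance ** 0.5


alpha, beta = beta_params_from_moments(0.1, 0.1)
print(alpha, beta)
print(beta_moments(alpha, beta))
```

The explicit check on `common` makes visible that not every pair of mean and standard deviation corresponds to a valid Beta pdf, which matters near 0 and 1 p.u.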
Next we show the advantages of using the Beta pdf instead of the Gaussian pdf for approximating the forecast error distribution. As an example, in Fig. 4 the histogram of one forecast power bin is compared to the fitted Beta pdf and the corresponding Gaussian.

The distribution of data in this example bin is very fat-tailed, with a kurtosis of around 17. The fitted Beta pdf reaches a kurtosis of around 5, which is significantly better than the normal pdf (kurtosis of 3). In Fig. 4 this can be appreciated for wind powers above 0.5 p.u., where the normal pdf tends to zero much faster than the Beta pdf. However, both tend to zero much faster than the actual measured data. It should be mentioned that the Gaussian and Beta distributions in this example have the same values of mean $\mu$ and standard deviation $\sigma$ as the measured data.

The previous example showed that the tail of the wind power distribution cannot always be modeled perfectly with the Beta pdf. Consequently, the purpose of this paper is to evaluate its performance, as no better pdf has been proposed yet in the literature.
After the approximation process, for every forecast bin $i$, parameter pairs $(\alpha_i, \beta_i)$ and $(\sigma_i, \mu_i)$ are obtained. As in [15], the pair $(\sigma, \mu)$ is chosen for the representation of the results. In Fig. 5 examples for 1-h and 48-h forecasts are shown in comparison with the curves published in [6] and [15]. A nonlinear behavior was observed in all the investigated forecast scenarios and prediction horizons, while in [6] and [15] a linear relationship between $\sigma$ and $\mu$ is assumed. This nonlinearity can be explained with the power curve of the WECS. As shown in [14], the first-order derivative of the power curve is directly proportional to the uncertainty of the power forecast. At zero and nominal power (1 p.u.) the derivative is near zero, while at around 0.5 p.u. it passes through a maximum. The same can be observed for the standard deviation in Fig. 5.

Fig. 5. Two examples for the approximation of parameter pairs ($\sigma$; $\mu$) from persistence forecast together with curves from Bofinger [15] and Fabbri [6]; dataset A, T 20 forecast.

The difference in results may be caused by the different procedures used to obtain the data. In [15], the forecast horizon is 48 h with $T = 24$ h. Forecasts of one-day mean values lead to a statistical database of only 365 values. The bias correction performed on this data reduces the data further to 100 values. As this is a very small amount of data, the linear approximation was based on only four, non-equally distributed forecast bins. In [6] no information is given about how the linear approximation was obtained from the measured data.

In the present work, 50 bins with equal bin widths were used. So possibly, the nonlinear behavior was not revealed in [15] because of the lack of power bin resolution.

The higher standard deviation of the presented examples is due to the power fluctuations within the forecast interval, which were not taken into account in [15]. Considering, for example, a 1-h forecast, in the present work the six corresponding 10-min measured values of this interval were used to obtain the standard deviation, while in [15] every forecast interval only has one forecast associated with the measured mean value.
The following equation is proposed as a new approximation function:

(12)

where the two coefficients are the approximation parameters.
The curves named “Betafit” in Fig. 5 show two examples of this approximation. The special form of the function is due to the fact that the parameter $\alpha$ cannot be negative. From (10) follows (13), which shows clearly that the standard deviation $\sigma$ must be zero at the boundaries of the interval $\mu \in [0, 1]$:

$$\sigma^2 < \mu(1-\mu) \tag{13}$$

Therefore, it is practical to use the term $\mu(1-\mu)$ as a factor in the polynomial approximation of $\sigma$.
The analysis of different prediction horizons up to 30 days has shown that good results can be obtained with only two polynomial coefficients. One coefficient has a negative sign, so that both approximation parameters take positive values. In Fig. 6 the coefficients are shown as a function of the prediction interval $T$ for the three forecast scenarios. It can be observed that the two datasets show quite different behavior, although the global tendency is similar. This can be explained by the power curves as described in [14]. Dataset A was taken from constant-speed WECS, while dataset B was taken from variable-speed WECS.

Fig. 6. Approximation parameters for fitting function $\sigma = f(\mu)$ as a function of prediction interval T and the three forecast scenarios (T 20, T 21, T 22) for datasets A and B.
It is very interesting that the coefficients in dataset B are almost identical. Equal values of the two parameters mean that the $\sigma(\mu)$ curve is symmetric. If the symmetric case is assumed, the approximation function can be simplified as expressed in (14). The only remaining parameter can be calculated as the square root of the former parameters

(14)

Further investigation with more datasets is needed to evaluate whether the assumption of a symmetric $\sigma(\mu)$ curve is valid for variable-speed wind turbines in general or whether the case of dataset B is an exception.

D. Analysis of the Relationship Between Mean Power Forecast and Measured Power
In [6] and [15], the standard deviation $\sigma$ within a forecast bin is represented as a function of the mean predicted power $\hat{p}$ of this bin. It must be noted that the values can only be interpreted directly as parameters of the Beta pdf if $\hat{p}$ is equal to the mean measured power $\mu$ in this bin. Indeed, in [6] and [15] $\mu = \hat{p}$ is assumed, but without emphasizing that the parameter $\mu$ of the Beta pdf must be derived from the measured data, and not from the forecast data. In [15] a simple neural network is applied to correct the initial forecast with the aim of approaching $\mu = \hat{p}$, while in [6] it is only stated that the forecast will be “centered in the mean value.” However, it will be shown now that this assumption is not valid for the persistence forecast.

If a linear relationship between $\mu$ and $\hat{p}$ is considered and it is further assumed that all curves cross the point of the long-term mean $\bar{P}$, the fitting function $\mu(\hat{p})$ can be written as in

$$\mu = m\,\hat{p} + (1-m)\,\bar{P} \tag{15}$$

where $\hat{p}$ is the mean forecast within a forecast bin, $m$ is the slope of the linear fit, and $\bar{P}$ is the long-term mean.
As shown in Fig. 7, for short forecast horizons up to 1 h, the assumption $\mu = \hat{p}$ (mean measured power equal to mean forecast) is almost true, but not when the forecast horizon becomes larger.
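A linear fit that is forced through the point of the long-term mean has only one free parameter, the slope. The following sketch estimates it by ordinary least squares on the deviations from the long-term mean. The fitting method, the function name, and the sample values are assumptions, since the text does not specify how the fit was computed.

```python
def slope_through_mean(forecast_means, measured_means, long_term_mean):
    """Least-squares slope m of a line through (long_term_mean, long_term_mean).

    Model: measured = m * forecast + (1 - m) * long_term_mean.
    Requires at least one forecast value different from the long-term mean.
    """
    x = [f - long_term_mean for f in forecast_means]
    y = [mu - long_term_mean for mu in measured_means]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical bin means (p.u.) generated with slope 0.4 and long-term mean 0.3.
forecasts = [0.1, 0.5, 0.9]
measured = [0.22, 0.38, 0.54]
print(slope_through_mean(forecasts, measured, 0.3))
```

A slope near one means the bin forecast can be used directly as the mean of the measured power; a slope near zero means the measured mean stays close to the long-term mean whatever the forecast.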
Fig. 7. Linear approximation of the mean forecast and measured mean within a forecast bin; examples for T = 1 h and 336 h (14 days) and T 22 scenario.
As can be seen in Fig. 7, the slope $m$ of the linear fit depends strongly on the forecast horizon $T$. While the linearity is maintained, the slope of the curve tends to zero for very large values of $T$. In fact, for the 14-day forecast, the mean value of measured power is almost constant for all forecasts and equal to the long-term mean (in this case the total mean of the entire dataset).

In Fig. 8 the linear approximations following (15) for several prediction intervals $T$ are shown for the example of scenario 2 using dataset A.
In other words, the long-term mean value becomes the best forecast for large forecast horizons. Therefore, in [16] a modified persistence model is proposed as a new reference (cited also in [17] and [19]), which takes this effect into account. This new reference can be written as

$$\hat{P}_{NR}(t+k \mid t) = a\,\hat{P}_P(t+k \mid t) + (1-a)\,\bar{P} \tag{16}$$

where $\hat{P}_{NR}$ is the new reference forecast, $a$ is the correlation coefficient between $P(t)$ and $P(t+k)$, $\hat{P}_P$ is the persistence forecast, and $\bar{P}$ is the long-term mean.
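The new reference forecast is a weighted mix of the persistence forecast and the long-term mean, with the correlation coefficient as the weight. A minimal sketch follows; the function names are hypothetical and the autocorrelation is estimated with a simple biased estimator, which is one of several possible choices.

```python
def autocorrelation(series, lag):
    """Biased sample autocorrelation of a series at the given lag (in time steps)."""
    n = len(series)
    m = sum(series) / n
    variance_sum = sum((x - m) ** 2 for x in series)
    covariance_sum = sum((series[i] - m) * (series[i + lag] - m) for i in range(n - lag))
    return covariance_sum / variance_sum


def new_reference(persistence, a, long_term_mean):
    """Blend of persistence forecast and long-term mean weighted by correlation a."""
    return a * persistence + (1.0 - a) * long_term_mean


# Hypothetical values: persistence forecast 0.8 p.u., long-term mean 0.3 p.u.
for a in (1.0, 0.5, 0.0):
    print(a, new_reference(0.8, a, 0.3))
```

For a correlation of one the forecast equals pure persistence, and for a correlation of zero it falls back to the long-term mean, which matches the behavior described for very long horizons.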
Equations (15) and (16) have the same structure of a linear function, with the only difference that the slope in (15) is $m$ and in (16) is the autocorrelation coefficient $a$. Using (15), slopes are calculated for all considered prediction intervals $T$. In Fig. 9, $a$ and $m$ for the two datasets and forecast scenario 1 are shown. Results for scenario 2 are similar, and $m$ for scenario 0 must obviously be unity.

Fig. 8. Linear approximation for measured mean as a function of mean forecast within a forecast bin for several prediction intervals T; T 22 scenario, dataset A.

Fig. 9. Comparison of the autocorrelation coefficient “a” with slope “m” of the linear fit as a function of prediction interval T; forecast scenario T 21, datasets A and B.
Fig. 9 confirms the similarity between $a$ and $m$, but some differences can also be found. While for small values of $T$ the slope $m$ is almost equal to the autocorrelation coefficient $a$, it is greater than $a$ for higher values of $T$. This can be explained by the persistence forecast approach chosen in this work. For large values of $T$, the forecast is the average value over a large time interval. The autocorrelation does not take this averaging into account, while $m$ includes it. In dataset A, $m$ remains almost equal to $a$ over a wider range of $T$ than in dataset B.

This can be due to the smaller database of dataset B (ten times smaller), which makes the calculation of $m$ less reliable. But it is possible that this effect is also due to the individual behavior of the wind data, as, for example, the symmetry of $\sigma(\mu)$. Interestingly, the autocorrelation of datasets A and B is very similar, which shows that it is less sensitive to the specific properties of the data available.

E. Obtaining the Forecast Error Pdf From the Beta Distributions
Knowing the pdf of the measured wind power within every forecast bin, the error distribution of each bin can be obtained easily by subtracting the mean predicted power of the bin from the amplitude axis of the pdf. The result for nine of the 50 bins chosen in this case can be seen in Fig. 10.

Fig. 10. Superposed forecast errors for nine different forecast power bins for a 1-h T 22 forecast using dataset B.

Fig. 11. Comparison of the frequencies in each forecast bin. “10 min” is the histogram (0.01 bin width) of the measured wind power of dataset A (10-min means); “24 h” is the histogram of the persistence forecast for a forecast time interval T = 24 h.
Once the pdf of the forecast error is known for all bins, the total forecast error pdf can be calculated by adding up the pdfs of all bins. The pdf for each bin is normalized to 100%, so the sum must be weighted. The weight of each bin is equal to the probability that a forecast belongs to it. Weighting the sum is necessary because the probability that a forecast belongs to the 0.1-p.u. bin may be ten times higher than it is for the 0.9-p.u. bin. The weighted sum can be written as in

$$f_\varepsilon(\varepsilon) = \sum_{i=1}^{N} w(i)\, f_{\varepsilon,i}(\varepsilon) \tag{17}$$

where $f_\varepsilon$ is the total forecast error pdf, $i$ the bin number, $N$ the total number of bins, $w(i)$ the forecast power bin weight function, and $f_{\varepsilon,i}$ the forecast error pdf in bin $i$.
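The weighted sum over the bins can be sketched as follows, assuming that every per-bin error pdf has already been evaluated on one common error grid and that the weights are taken as relative frequencies of the forecasts per bin. The function name and the two-bin toy input are hypothetical.

```python
def total_error_pdf(bin_pdfs, bin_counts):
    """Weighted sum of per-bin error pdfs sampled on a common error grid.

    bin_pdfs: one list of pdf values per bin, all of equal length.
    bin_counts: number of forecasts falling into each bin (histogram counts).
    """
    total = sum(bin_counts)
    weights = [count / total for count in bin_counts]
    n_points = len(bin_pdfs[0])
    return [
        sum(w * pdf[j] for w, pdf in zip(weights, bin_pdfs))
        for j in range(n_points)
    ]


# Toy example: two bins on a two-point grid, the first bin three times more frequent.
pdfs = [[1.0, 0.0], [0.0, 1.0]]
counts = [3, 1]
print(total_error_pdf(pdfs, counts))
```

Because the weights sum to one, the result stays normalized whenever each per-bin pdf is normalized on the same grid.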
The weight function $w(i)$, representing the weight of each bin, is a histogram with $N$ bins of long-term forecast data. The histogram depends strongly on the forecast interval length $T$, as shown in Fig. 11. The frequency values of 10-min and 24-h forecast data for the bins from 0.8 to 1 p.u. differ by a factor of 10 or more.
It should be noted that the histogram obtained from the persistence forecast is representative for any forecast method, because