AN ECOLOGICAL ASSESSMENT OF THE
UNITED STATES MID-ATLANTIC REGION USING RANK FREQUENCY DISTRIBUTIONS BASED ON
WATERSHED QUINTILES
By G. P. Patil, C. Taillie, and Ryan Vraney
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
[Invited paper to appear in Community Ecology, 2002]
Technical Report Number 2001-1102
TECHNICAL REPORTS AND REPRINTS SERIES
December 2001
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
G. P. Patil
Distinguished Professor and Director Tel: (814)865-9442 Fax: (814)865-1278
Email: gpp@stat.psu.edu www.stat.psu.edu/~gpp
An Ecological Assessment of the United States Mid-Atlantic Region
Using Rank Frequency Distributions Based on Watershed Quintiles
G. P. Patil, C. Taillie, and Ryan Vraney
Center for Statistical Ecology and Environmental Statistics
Pennsylvania State University
University Park, PA, USA
Abstract
When working with raw data for multiple environmental indicators, it can be difficult to assess quality or ‘health’ because of the large number of indicators and inconsistencies among the indicators. By grouping the raw data into rankings, the data become more manageable and more comprehensible. We do not, however, want to lose information as a result of the groupings. It is possible to assess the quality of grouping options graphically by seeing if the resulting assessments of ‘health’ are concordant with the raw data. This can be done through the use of CDF-index values, cumulative distribution function plots, parallel coordinates plots, and scatterplots. A major purpose of this paper is to present approaches and the graphics for comparison and prioritization based on quintiles used, in this case, for ecological assessment of a large region.
Keywords: Comparison and prioritization; Ecological assessment; Multiple watershed indicators; Parallel coordinates plots; Quintiles, septiles, nine-tiles; Rank distribution function plots; Rank frequency distribution; Triangle plots
Introduction
The EPA has compiled a large body of ecological data regarding a five-state subsection of the United States in its An Ecological Assessment of the United States Mid-Atlantic Region (Jones et al., 1998). The region comprises Delaware, Maryland, Pennsylvania, Virginia, and West Virginia, together with the District of Columbia. The region is also partitioned into 114 major watersheds, which are used as spatial units for the data. This paper examines the watershed indicator data that was compiled to assess the region. This is a list of 33 indicators, each a measure of the health of the watershed. Values of the 33 indicators are available for each of the 114 watersheds (Jones et al., Table A1, pp. 96-101; this table can also be found at
v/maia/html/la-tablea1.html). The indicator names are abbreviated in Table A1. Figure 1 in the appendix lists the full names of the indicators and gives a brief description of each (see the Ecological Assessment Atlas for full details).
Each indicator was then divided into quintile ranks, which represent 20% divisions in the data. The top 20% of the watersheds for each indicator were given a rank of 1, the next 20% a rank of 2, and so on down to the bottom 20% being given a rank of 5. Based on the indicator ranks, we would like to determine which watersheds are the healthiest, and which are in the most need of assistance. In order to do this, we are not limited to looking at the quintiles, but we can also divide the raw data into septiles (divisions of seven) and nine-tiles (divisions of nine). We will attempt to assess the watersheds using these divisions as well.
The indicators have not been weighted in any manner. Therefore, each indicator has the same value as every other indicator. We are not trying to assess the data based on what we think is the most important indicator of the health of a watershed, but rather by collecting multiple indicators for each watershed and assessing the watersheds based on the equal importance of the indicators. A major purpose of this paper is to present approaches and the graphics for comparison and prioritization based on watershed percentiles.
Analysis
We begin with the raw data that is listed in the EPA’s An Ecological Assessment of the United States Mid-Atlantic Region. The data has already been divided into quintile ranks. We have also divided the data into septile and nine-tile ranks. Each septile represents 14.29% of the watersheds for that particular indicator, with the top seventh receiving a rank of one and so on down to the worst seventh receiving a rank of seven. Similarly, the top nine-tile represents the best 11.11% of the watersheds for that indicator.
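The rank assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the EPA or the authors: the function name `ktile_ranks`, the `higher_is_better` flag, the tie-breaking rule, and the sample values are all assumptions introduced here. In particular, the source does not say how ties or uneven group sizes were handled, and which direction counts as "best" depends on the indicator.

```python
def ktile_ranks(values, k, higher_is_better=True):
    """Assign ranks 1..k to values; rank 1 marks the best 1/k of the items.

    Assumptions (not from the source): ties are broken by input order,
    and group sizes are made as equal as possible when k does not
    divide the number of items.
    """
    n = len(values)
    # Order item positions from best to worst.
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    ranks = [0] * n
    for position, i in enumerate(order):
        # Positions in [0, n/k) get rank 1, [n/k, 2n/k) get rank 2, and so on.
        ranks[i] = position * k // n + 1
    return ranks


# Hypothetical indicator values for ten watersheds (illustrative only).
values = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
quintile_ranks = ktile_ranks(values, k=5)
```

Calling the same function with `k=7` or `k=9` gives septile or nine-tile ranks. With 114 watersheds, neither 5, 7, nor 9 divides the count evenly, so under this sketch the groups differ in size by at most one watershed.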
Our next step is to find the cumulative distributions of the different ranks for each watershed. We will use these distribution functions as overall assessments of the quality or ‘health’ of the corresponding watersheds. We want to find the cumulative distributions for all three ranking options: quintiles, septiles, and nine-tiles. Once we have found the cumulative functions, we can get a brief overview of them by looking at the plots of the functions.
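To make the cumulative distribution concrete: for one watershed, F(x) is the fraction of its indicator ranks that are less than or equal to x. The sketch below computes F(1), ..., F(k) from a list of ranks. The function name `rank_cdf` and the example ranks are hypothetical and are not taken from the EPA data.

```python
from collections import Counter


def rank_cdf(ranks, k):
    """Return [F(1), ..., F(k)], where F(x) is the fraction of a
    watershed's indicator ranks that are less than or equal to x."""
    counts = Counter(ranks)
    n = len(ranks)
    cdf = []
    running = 0
    for x in range(1, k + 1):
        running += counts[x]  # Counter returns 0 for ranks that never occur
        cdf.append(running / n)
    return cdf


# Hypothetical quintile ranks for one watershed on ten indicators.
example_ranks = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5]
example_cdf = rank_cdf(example_ranks, k=5)
```

A watershed with many rank 1 and rank 2 scores has a CDF that rises quickly toward 1, so its curve lies above that of a watershed whose ranks are concentrated at 4 and 5.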
If we look at the plots of the cumulative distribution functions of all 114 watersheds (Figures 2-4), we notice that as the number of ranks increases, so does the spread of the graph. The plot for the quintiles is much more compact than the septile plot, which in turn is more compact than the nine-tile plot. What we would like to know is whether this variability affects comparisons of individual watersheds.
Figures 2-4: Cumulative distribution functions of 114 watersheds with respect to rank occurrence for the quintile, septile, and nine-tile analyses
Now that we have the cumulative distribution functions, we will proceed by taking the sum of the values of the cumulative distribution function at each of the ranks. Thus, for quintiles:
Σ[F(x)] = F(1) + F(2) + F(3) + F(4) + F(5)
septiles:
Σ[F(x)] = F(1) + F(2) + F(3) + F(4) + F(5) + F(6) + F(7)
and nine-tiles:
Σ[F(x)] = F(1) + F(2) + F(3) + F(4) + F(5) + F(6) + F(7) + F(8) + F(9)
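The three sums above share one form, so a single function with the number of ranks k as a parameter covers all of them. The following is a minimal sketch under that reading. The function name `cdf_sum` and the two example watersheds are hypothetical, introduced only for illustration.

```python
def cdf_sum(ranks, k):
    """Return F(1) + ... + F(k) for one watershed's list of ranks (each in 1..k)."""
    n = len(ranks)
    total = 0.0
    for x in range(1, k + 1):
        # F(x): fraction of this watershed's ranks that are <= x.
        total += sum(1 for r in ranks if r <= x) / n
    return total


# Hypothetical quintile ranks for two watersheds (illustrative only).
mostly_good = [1, 1, 1, 2, 2, 3]
mostly_poor = [3, 4, 4, 5, 5, 5]

good_sum = cdf_sum(mostly_good, k=5)
poor_sum = cdf_sum(mostly_poor, k=5)
```

As a side observation that is not stated in the source, this sum equals k + 1 minus the watershed's mean rank, because a rank r is counted in F(r), F(r+1), ..., F(k), that is, in k - r + 1 of the terms. The watershed with the better (lower) ranks therefore receives the larger sum.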
We refer to each of these sums as a CDF-index value. Larger CDF-index values indicate better watershed health. This is the case because a watershed with a large number of rank 1 scores will have a larger CDF-index value than a watershed with a small number of rank 1 scores. Alternatively, the CDF-index (minus 0.5) equals the area under the graph of the CDF in Figures 2-4, and higher graphs indicate better health. Thus, the CDF-index provides an objective way of combining multiple indicators into a single composite index of watershed health. After