function [h,p,stats] = chi2gof(x,varargin)
%CHI2GOF Chi-square goodness-of-fit test.
% CHI2GOF performs a chi-square goodness-of-fit test for discrete or
% continuous distributions. The test is performed by grouping the data into
% bins, calculating the observed and expected counts for those bins, and
% computing the chi-square test statistic SUM((O-E).^2./E), where O is the
% observed counts and E is the expected counts. This test statistic has an
% approximate chi-square distribution when the counts are sufficiently
% large.
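%
% As a small worked illustration of the statistic (the counts below are
% made up for this note and are not produced by CHI2GOF):
% O = [18 22 20]; E = [20 20 20];
% chi2stat = sum((O-E).^2 ./ E) % (4 + 4 + 0)/20 = 0.4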
%
% Bins in either tail with an expected count less than 5 are pooled with
% neighboring bins until the count in each extreme bin is at least 5. If
% bins remain in the interior with counts less than 5, CHI2GOF displays a
% warning. In that case, you should use fewer bins, or provide bin
% centers or edges, to increase the expected counts in all bins.
%
% H = CHI2GOF(X) performs a chi-square goodness-of-fit test that the data in
% the vector X are a random sample from a normal distribution with mean and
% variance estimated from X. The result is H=0 if the null hypothesis (that
% X is a random sample from a normal distribution) cannot be rejected at the
% 5% significance level, or H=1 if the null hypothesis can be rejected at
% the 5% level. CHI2GOF uses NBINS=10 bins, and compares the test statistic
% to a chi-square distribution with NBINS-3 degrees of freedom, to take into
% account that two parameters were estimated.
%
% [H,P] = CHI2GOF(...) also returns the p-value P. The P value is the
% probability of observing the given result, or one more extreme, by
% chance if the null hypothesis is true. If there are not enough degrees
% of freedom to carry out the test, P is NaN.
%
% [H,P,STATS] = CHI2GOF(...) also returns a STATS structure with the
% following fields:
% 'chi2stat' Chi-square statistic
% 'df' Degrees of freedom
% 'edges' Vector of bin edges after pooling
% 'O' Observed count in each bin
% 'E' Expected count in each bin
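%
% A hedged sketch of how these fields relate to P, assuming the usual
% upper-tail chi-square probability (an illustration only, not necessarily
% the exact internal computation; x is any data vector):
% [h,p,st] = chi2gof(x);
% pCheck = 1 - chi2cdf(st.chi2stat,st.df) % expected to agree closely with p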
%
% [...] = CHI2GOF(X,'NAME1',VALUE1,'NAME2',VALUE2,...) specifies
% optional argument name/value pairs chosen from the following list.
% Argument names are case insensitive and partial matches are allowed.
%
% The following options control the initial binning of the data before
% pooling. You should not specify more than one of the options.
%
% Name Value
% 'nbins' The number of bins to use. Default is 10.
% 'ctrs' A vector of bin centers.
% 'edges' A vector of bin edges.
%
% The following options determine the null distribution for the test. You
% should not specify both 'cdf' and 'expected'.
%
% Name Value
% 'cdf' A fully specified cumulative distribution function. This
% can be a ProbDist object, a function handle, or a function
% name. The function must take X values as its only argument.
% Alternatively, you may provide a cell array whose first
% element is a function name or handle, and whose later
% elements are parameter values, one per cell. The function
% must take X values as its first argument, and other
% parameters as later arguments.
% 'expected' A vector with one element per bin specifying the
% expected counts for each bin.
% 'nparams' The number of estimated parameters; used to adjust
% the degrees of freedom to be NBINS-1-NPARAMS, where
% NBINS is the number of bins.
%
% If your 'cdf' or 'expected' input depends on estimated parameters, you
% should use the 'nparams' parameter to ensure that the degrees of freedom
% for the test are correct. Otherwise the default 'nparams' value is:
%
% 'cdf' is a ProbDist object: the number of estimated parameters
% 'cdf' is a function: 0
% 'cdf' is a cell array: the number of parameters in the array
% 'expected' is specified: 0
%
% The following options control other aspects of the test.
%
% Name Value
% 'emin' The minimum allowed expected value for a bin; any bin
% in either tail having an expected value less than this
% amount is pooled with a neighboring bin. Use the
% value 0 to prevent pooling. Default is 5.
% 'frequency' A vector of the same length as X containing the
% frequency of the corresponding X values.
% 'alpha' An ALPHA value such that the hypothesis is rejected
% if P<ALPHA. Default is 0.05.
%
%
% Examples:
%
% % Three equivalent ways to test against an unspecified normal
% % distribution (i.e., with estimated parameters)
% x = normrnd(50,5,100,1);
% [h,p] = chi2gof(x)
% [h,p] = chi2gof(x,'cdf',@(z)normcdf(z,mean(x),std(x)),'nparams',2)
% [h,p] = chi2gof(x,'cdf',{@normcdf,mean(x),std(x)})
%
% % Test against standard normal (mean 0, standard deviation 1)
% x = randn(100,1);
% [h,p] = chi2gof(x,'cdf',@normcdf)
%
% % Test against the standard uniform
% x = rand(100,1);
% n = length(x);
% edges = linspace(0,1,11);
% expectedCounts = n * diff(edges);
% [h,p,st] = chi2gof(x,'edges',edges,'expected',expectedCounts)
%
% % Test against the Poisson distribution by specifying observed and
% % expected counts
% bins = 0:5; obsCounts = [6 16 10 12 4 2]; n = sum(obsCounts);
% lambdaHat = sum(bins.*obsCounts) / n;
% expCounts = n * poisspdf(bins,lambdaHat);
% [h,p,st] = chi2gof(bins,'ctrs',bins,'frequency',obsCounts, ...
% 'expected',expCounts,'nparams',1)
%
% See also CROSSTAB, CHI2CDF, KSTEST, LILLIETEST.
% Copyright 2005-2009 The MathWorks, Inc.
% $Revision: 1.1.8.4 $ $Date: 2011/07/20 00:08:15 $
narginchk(1,inf);
if ~isvector(x) || ~isreal(x)
    error(message('stats:chi2gof:NotVector'));
end
% Process optional arguments and do error checking
okargs = {'nbins' 'ctrs' 'edges' 'cdf' 'expected' 'nparams' ...
'emin' 'frequency' 'alpha'};
defaults = {[] [] [] [] [] [] ...
5 [] 0.05};
[nbins,ctrs,edges,cdfspec,expected,nparams,emin,freq,alpha] = ...
    internal.stats.parseArgs(okargs,defaults,varargin{:});
errorcheck(x,nbins,ctrs,edges,cdfspec,expected,nparams,emin,freq,alpha);
% Get bins and observed counts. This will also perform error checking on
% the nbins, ctrs, and edges inputs.
x = x(:);
if isempty(freq)
    freq = ones(size(x));
else
    freq = freq(:);
end
t = isnan(freq) | isnan(x);
if any(t)
    x(t) = [];
    freq(t) = [];
end
if ~isempty(ctrs)
    [Obs,edges] = statgetbins(x,freq,'ctrs',ctrs);
elseif ~isempty(edges)
    [Obs,edges] = statgetbins(x,freq,'edges',edges);
else
    if isempty(nbins)
        if isempty(expected)
            nbins = 10; % default number of bins
        else
            nbins = length(expected); % implied by expected value vector
        end
    end
    [Obs,edges] = statgetbins(x,freq,'nbins',nbins);
end
Obs = Obs(:);
nbins = length(Obs);
% Get expected counts
cdfargs = {};
if ~isempty(expected)
    % Get them from the input argument, if any
    if ~isvector(expected) || numel(expected)~=nbins
        error(message('stats:chi2gof:BadExpected', nbins));
    end