Package ‘Ball’
January 3, 2018
Type Package
Title Statistical Inference and Sure Independence Screening via Ball
Statistics
Version 1.0.0
Date 2017-12-18
Author XueQin Wang, WenLiang Pan, HePing Zhang, Hongtu Zhu, Yuan Tian, WeiNan Xiao, ChengFeng Liu, Jin Zhu
Maintainer Jin Zhu <zhuj37@mail2.sysu.edu>
Description Hypothesis tests and sure independence screening (SIS) procedure based on ball statistics, including ball divergence, ball covariance, and ball correlation,
are developed to analyze complex data. The ball divergence and ball covariance based distribution-free
tests are implemented to examine equality of multivariate distributions and independence
between random vectors of arbitrary dimensions. Furthermore, a generic non-parametric SIS
procedure based on ball correlation and all of its variants are implemented to tackle the
challenge in the context of ultra high dimensional data.
License GPL-3
RoxygenNote 6.0.1
Depends R (>= 2.10), gam, survival
Imports utils
Suggests knitr, rmarkdown
VignetteBuilder knitr
NeedsCompilation yes
Encoding UTF-8
LazyData true
/Mamba413/Ball
/Mamba413/Ball/issues
Repository CRAN
Date/Publication 2018-01-03 18:25:13 UTC
R topics documented:
ArcticLake
bcorsis
bcov
bd
bdvmf
genlung
macaques
meteorology
nhdist
Index

ArcticLake Arctic lake sediment samples of different water depth
Description
Sand, silt and clay compositions of 39 sediment samples of different water depth in an Arctic lake.
Format
ArcticLake$depth: water depth (in meters).
ArcticLake$x: compositions of three covariates: sand, silt, and clay.
Details
Sand, silt and clay compositions of 39 sediment samples at different water depth (in meters) in an Arctic lake. The additional feature is a concomitant variable or covariate, water depth, which may account for some of the variation in the compositions. In statistical terminology, we have
a multivariate regression problem with sediment composition as predictors and water depth as a
response. All row percentages sum to 100, except for rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name ARCTIC.DAT, here included under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 5, pp 5.
bcorsis Ball Correlation Sure Independence Screening
Description
Generic non-parametric sure independence screening procedure based on ball correlation. Ball
correlation is a generic multivariate measure of dependence in Banach space.
Usage
bcorsis(x, y, d = "small", weight = FALSE, method = "standard",
  dst = FALSE, parms = list(d1 = 5, d2 = 5, df = 3), R = 99, seed = 4)
Arguments
x a numeric matrix or data.frame with n rows and p columns. Each row is
an observation vector and each column corresponds to an explanatory variable;
generally p >> n.
y a numeric vector, matrix, data.frame or dist object.
d the hard cutoff rule suggests selecting d variables. Setting d = "large" or
d = "small" means n - 1 or floor(n/log(n)) variables are selected. If d is an
integer, d variables are selected. Default: d = "small"
weight when weight = TRUE, weighted ball correlation is used instead of ball correlation.
Default: weight = FALSE
method method for the sure independence screening procedure; one of "standard", "pvalue",
"lm", "gam", "interaction" and "survival". Setting method = "standard"
or "pvalue" means the standard sure independence screening procedure based on
ball correlation or on the p-value of the ball correlation test, while options "lm" and "gam"
carry out the iterative BCor-SIS procedure with ordinary linear regression and generalized
additive models, respectively. Options "interaction" and "survival"
are designed for detecting variables with potential linear interaction or variables associated
with censored responses. Default: method = "standard"
dst if dst = TRUE, y will be considered as a distance matrix. Only available
when method = "standard", method = "pvalue" or method = "interaction".
Default: dst = FALSE
parms a list of parameters, only available when method = "lm" or "gam". It contains three
parameters: d1, d2, and df. d1 is the number of initially selected variables,
d2 is the number of variables added in each iteration, and df is the degrees of
freedom of the basis in generalized additive models, which plays a role only when
method = "gam". Default: parms = list(d1 = 5, d2 = 5, df = 3)
R the number of replications. Only available when method = "pvalue".
Default: R = 99
seed the random seed. Only available when method = "pvalue".
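The hard cutoff rule for d described above can be sketched in base R. The helper name cutoff_size is hypothetical and not part of the Ball package; it only mirrors the documented rule (n - 1 for "large", floor(n/log(n)) for "small", or an explicit integer).

```r
# Hypothetical helper (not a Ball function): number of variables kept
# under the documented hard cutoff rule, for a sample of size n.
cutoff_size <- function(n, d = "small") {
  if (is.character(d)) {
    switch(d,
           small = floor(n / log(n)),  # d = "small"
           large = n - 1,              # d = "large"
           stop("d must be \"small\", \"large\", or an integer"))
  } else {
    as.integer(d)                      # explicit number of variables
  }
}

n <- 150
cutoff_size(n, "small")
cutoff_size(n, "large")
cutoff_size(n, 15)
```

With n = 150, as in the examples of this page, the "small" rule keeps far fewer variables than the "large" rule, which is why it is a sensible default when p >> n.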
Details
bcorsis implements a model-free generic screening procedure, BCor-SIS, with fewer and less restrictive assumptions. The sample sizes (number of rows or length of the vector) of the two variables
x and y must agree, and samples must not contain missing values.
The BCor-SIS procedure for a censored response is carried out when method = "survival". In that case, the matrix or data.frame passed to argument y must have exactly two columns: the first column is the event (failure) time and the second column is the censoring status, a dichotomous variable.
If dst = TRUE, argument y is treated as a distance matrix; otherwise y is treated as data.
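The input requirements stated above (matching sample sizes, no missing values, and a two-column response for method = "survival") can be checked before calling bcorsis. The following is a minimal base R sketch; the function name check_bcorsis_input is hypothetical, and the package may perform its own, different validation.

```r
# Hypothetical pre-check (not a Ball function) for the documented
# requirements on x and y.
check_bcorsis_input <- function(x, y, method = "standard") {
  x <- as.matrix(x)
  y <- as.matrix(y)  # vectors become n x 1; dist objects become n x n
  # sample sizes of x and y must agree
  stopifnot(nrow(x) == nrow(y))
  # samples must not contain missing values
  stopifnot(!anyNA(x), !anyNA(y))
  if (method == "survival") {
    # column 1: event (failure) time; column 2: dichotomous status
    stopifnot(ncol(y) == 2)
    stopifnot(length(unique(y[, 2])) <= 2)
  }
  invisible(TRUE)
}

set.seed(1)
x <- matrix(rnorm(40), nrow = 10)
# the 0/1 coding of the status column is an assumption of this example
y_surv <- cbind(time = rexp(10), status = rep(c(0, 1), 5))
check_bcorsis_input(x, y_surv, method = "survival")
```

The check stops with an error as soon as one requirement fails, so problems surface before the more expensive screening step.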
BCor-SIS is based on a recently developed universal dependence measure: ball correlation (BCor).
BCor efficiently measures the dependence between two random vectors; it takes values between 0 and 1, and equals 0 if and only if the two random vectors are independent under some mild conditions. (See the manual page for bcor.)
Theory and numerical results indicate that BCor-SIS has the following advantages:
(i) It has a strong screening consistency property without requiring finite sub-exponential moments of the data.
Consequently, even when the dimensionality is of exponential order in the sample size, BCor-SIS is still almost surely able to retain the efficient variables.
(ii) It is nonparametric and has the property of robustness.
(iii) It works well for complex responses and/or predictors, such as shape or survival data.
(iv) It can extract important features even when the underlying model is complicated.
See (Pan 2017) for theoretical properties of BCor-SIS, including statistical consistency.
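The general shape of a marginal screening procedure can be sketched in base R: score each column of x against y, rank the scores, and keep the indices of the top d columns. This is an illustration only, not the package's implementation. The function name screen_top_d is hypothetical, and the absolute Pearson correlation is used as a stand-in statistic where BCor-SIS would use ball correlation.

```r
# Generic marginal screening sketch (not the Ball implementation).
# `stat` is a stand-in dependence measure; BCor-SIS uses ball correlation.
screen_top_d <- function(x, y, d,
                         stat = function(xj, y) abs(cor(xj, y))) {
  scores <- apply(x, 2, stat, y = y)         # one score per column of x
  order(scores, decreasing = TRUE)[seq_len(d)]  # indices of the top d
}

set.seed(1)
n <- 100
p <- 50
x <- matrix(rnorm(n * p), nrow = n)
y <- 3 * x[, 1] + rnorm(n)
ix <- screen_top_d(x, y, d = 5)
ix
```

The returned vector plays the same role as the ix component described under Value: a set of column indices to carry forward into later modelling. A linear statistic such as Pearson correlation would miss purely nonlinear signals, which is the motivation for a model-free measure.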
Value
ix the vector of indices selected by the ball correlation sure independence screening procedure.
Author(s)
WenLiang Pan, WeiNan Xiao, XueQin Wang, HePing Zhang, HongTu Zhu
See Also
bcor
Examples
## Not run:
### Quick Start for bcorsis function ###
set.seed(1)
n <- 150
p <- 3000
x <- matrix(rnorm(n * p), nrow = n)
error <- rnorm(n)
y <- 3 * x[, 1] + 5 * (x[, 3])^2 + error
res <- bcorsis(y = y, x = x)
head(res[[1]])

### BCor-SIS: Censored Data Example ###
data("genlung")
result <- bcorsis(x = genlung[["covariate"]], y = genlung[["survival"]],
                  method = "survival")$ix
top_gene <- colnames(genlung[["covariate"]])[result]
head(top_gene, n = 1)

### BCor-SIS: Interaction Pursuing ###
set.seed(1)
n <- 150
p <- 3000
x <- matrix(rnorm(n * p), nrow = n)
error <- rnorm(n)
y <- 3 * x[, 1] * x[, 5] * x[, 10] + error
res <- bcorsis(y = y, x = x, method = "interaction")
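head(res[[1]])

### BCor-SIS: Iterative Method ###
library(mvtnorm)
set.seed(1)
n <- 150
p <- 3000
sigma_mat <- matrix(0.5, nrow = p, ncol = p)
diag(sigma_mat) <- 1
# draw correlated predictors using the covariance matrix built above
x <- rmvnorm(n = n, sigma = sigma_mat)
error <- rnorm(n)
y <- 3 * (x[, 1])^2 + 5 * (x[, 2])^2 + 5 * x[, 8] - 8 * x[, 16] + error
res <- bcorsis(y = y, x = x, method = "gam", d = 15)
res[[1]]
## End(Not run)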
bcov Ball Correlation and Covariance Statistics
Description
Computes ball covariance and ball correlation statistics, which are multivariate measures of dependence in Banach space.
Usage
bcov(x, y, dst = FALSE, weight = FALSE)
bcor(x, y, dst = FALSE, weight = FALSE)