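2.10 (iii) From (2.57), $\mathrm{Var}(\hat{\beta}_1) = \sigma^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2$. From the hint, $\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2$, and so $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. A more direct way to see this is to write $\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, which is less than $\sum_{i=1}^{n} x_i^2$ unless $\bar{x} = 0$.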
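(iv) For a given sample size, the bias in $\tilde{\beta}_1$ increases as $\bar{x}$ increases (holding $\sum_{i=1}^{n} x_i^2$ fixed). But as $\bar{x}$ increases, the variance of $\hat{\beta}_1$ increases relative to $\mathrm{Var}(\tilde{\beta}_1)$. The bias in $\tilde{\beta}_1$ is also small when $\beta_0$ is small. Therefore, whether we prefer $\tilde{\beta}_1$ or $\hat{\beta}_1$ on a mean squared error basis depends on the sizes of $\beta_0$, $\bar{x}$, and n (in addition to the size of $\sum_{i=1}^{n} x_i^2$).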
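The identity and the variance ordering in part (iii) can be checked numerically. The sketch below is only an illustration: the x values and the error variance are made-up assumptions, and it takes the variance of the through-the-origin slope estimator to be $\sigma^2 / \sum x_i^2$.

```python
# Illustrative inputs (assumptions, not taken from the problem).
x = [1.0, 2.0, 4.0, 7.0]
sigma2 = 1.0  # error variance

n = len(x)
xbar = sum(x) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)  # sum of (x_i - xbar)^2
sum_x2 = sum(xi ** 2 for xi in x)          # sum of x_i^2

var_ols = sigma2 / sst_x      # Var(beta1_hat), as in (2.57)
var_origin = sigma2 / sum_x2  # Var(beta1_tilde), regression through the origin

# The two printed numbers agree: sum (x_i - xbar)^2 = sum x_i^2 - n * xbar^2.
print(sst_x, sum_x2 - n * xbar ** 2)
# The through-the-origin estimator never has the larger variance.
print(var_origin <= var_ols)
```

Setting every x value so that the mean is zero makes the two variances coincide, which is the equality case noted in the text.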
3.7 We can use Table 3.2. By definition, $\beta_2 > 0$, and by assumption, Corr(x1, x2) < 0. Therefore, there is a negative bias in $\tilde{\beta}_1$: $E(\tilde{\beta}_1) < \beta_1$. This means that, on average across different random samples, the simple regression estimator underestimates the effect of the training program. It is even possible that $E(\tilde{\beta}_1)$ is negative even though $\beta_1 > 0$.
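The direction of the bias can be illustrated with a small noise-free example. This is a sketch under assumed values: the coefficients and data below are hypothetical, and y is generated without an error term so that the simple regression slope equals $\beta_1 + \beta_2 \tilde{\delta}_1$ exactly, where $\tilde{\delta}_1$ is the slope from regressing x2 on x1.

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical population coefficients: both partial effects are positive.
beta0, beta1, beta2 = 1.0, 2.0, 3.0

# Hypothetical data in which x1 and x2 are negatively correlated.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [5.0, 3.0, 4.0, 1.0, 2.0]
y = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

delta1 = slope(x1, x2)      # negative, because Corr(x1, x2) < 0
beta1_tilde = slope(x1, y)  # simple regression of y on x1, omitting x2

print(delta1 < 0)
print(beta1_tilde < beta1)  # the simple regression understates beta1
print(beta1_tilde < 0)      # here it is even negative although beta1 > 0
```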
3.8 Only (ii), omitting an important variable, can cause bias, and this is true only when the omitted variable is correlated with the included explanatory variables. The homoskedasticity assumption, MLR.5, played no role in showing that the OLS estimators are unbiased. (Homoskedasticity was used to obtain the usual variance formulas for the $\hat{\beta}_j$.) Further, the degree of collinearity between the explanatory variables in the sample, even if it is reflected in a correlation as high as .95, does not affect the Gauss-Markov assumptions. Only if there is a perfect linear relationship among two or more explanatory variables is MLR.3 violated.
3.9 (i) Because $x_1$ is highly correlated with $x_2$ and $x_3$, and these latter variables have large partial effects on y, the simple and multiple regression coefficients on $x_1$ can differ by large amounts. We have not done this case explicitly, but given equation (3.46) and the discussion with a single omitted variable, the intuition is pretty straightforward.
(ii) Here we would expect $\tilde{\beta}_1$ and $\hat{\beta}_1$ to be similar (subject, of course, to what we mean by "almost uncorrelated"). The amount of correlation between $x_2$ and $x_3$ does not directly affect the multiple regression estimate on $x_1$ if $x_1$ is essentially uncorrelated with $x_2$ and $x_3$.
(iii) In this case we are (unnecessarily) introducing multicollinearity into the regression: $x_2$ and $x_3$ have small partial effects on y and yet $x_2$ and $x_3$ are highly correlated with $x_1$. Adding $x_2$ and $x_3$ likely increases the standard error of the coefficient on $x_1$ substantially, so $\mathrm{se}(\hat{\beta}_1)$ is likely to be much larger than $\mathrm{se}(\tilde{\beta}_1)$.
(iv) In this case, adding $x_2$ and $x_3$ will decrease the residual variance without causing much collinearity (because $x_1$ is almost uncorrelated with $x_2$ and $x_3$), so we should see $\mathrm{se}(\hat{\beta}_1)$ smaller than $\mathrm{se}(\tilde{\beta}_1)$. The amount of correlation between $x_2$ and $x_3$ does not directly affect $\mathrm{se}(\hat{\beta}_1)$.
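The comparisons in parts (iii) and (iv) can be read off the usual OLS standard error formulas. As a sketch, with notation assumed here rather than taken from the solution: $SST_1$ is the total sample variation in $x_1$, $R_1^2$ is the R-squared from regressing $x_1$ on $x_2$ and $x_3$, and $\hat{\sigma}$ and $\tilde{\sigma}$ are the standard errors of the regression from the multiple and simple regressions, respectively.

```latex
\[
\mathrm{se}(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{SST_1\,(1 - R_1^2)}},
\qquad
\mathrm{se}(\tilde{\beta}_1) = \frac{\tilde{\sigma}}{\sqrt{SST_1}}
\]
```

Adding $x_2$ and $x_3$ lowers $\hat{\sigma}$ relative to $\tilde{\sigma}$ when they have large partial effects on y, and raises $R_1^2$ when they are highly correlated with $x_1$. Part (iii) is the case where the second effect dominates; part (iv) is the case where the first does. The correlation between $x_2$ and $x_3$ themselves does not appear in either expression.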
3.11 (i) $\beta_1 < 0$ because more pollution can be expected to lower housing values; note that $\beta_1$ is the elasticity of price with respect to nox. $\beta_2$ is probably positive because rooms roughly measures the size of a house. (However, it does not allow us to distinguish homes where each room is large from homes where each room is small.)
(ii) If we assume that rooms increases with quality of the home, then log(nox) and rooms are negatively correlated when poorer neighborhoods have more pollution, something that is often true. We can use Table 3.2 to determine the direction of the bias. If $\beta_2 > 0$ and Corr(x1, x2) < 0, the simple regression estimator $\tilde{\beta}_1$ has a downward bias. But because $\beta_1 < 0$, this means that the simple regression, on average, overstates the importance of pollution. [$E(\tilde{\beta}_1)$ is more negative than $\beta_1$.]
(iii) This is what we expect from the typical sample based on our analysis in part (ii). The simple regression estimate, −1.043, is more negative (larger in magnitude) than the multiple regression estimate, −0.718. As those estimates are only for one sample, we can never know which is closer to $\beta_1$. But if this is a "typical" sample, $\beta_1$ is closer to −0.718.
6.4 (i) The answer is not entirely obvious, but one must properly interpret the coefficient on alcohol in either case. If we include attend, then we are measuring the effect of alcohol consumption on college GPA, holding attendance fixed. Because attendance is likely to be an important mechanism through which drinking affects performance, we probably do not want to hold it fixed in the analysis. If we do include attend, then we interpret the estimate of the coefficient on alcohol as measuring those effects on colGPA that are not due to attending class. (For example, we could be measuring the effects that drinking alcohol has on study time.) To get a total effect of alcohol consumption, we would leave attend out.
(ii) We would want to include SAT and hsGPA as controls, as these measure student abilities and motivation. Drinking behavior in college could be correlated with one's performance in high school and on standardized tests. Other factors, such as family background, would also be good controls.
6.6 The second equation is clearly preferred, as its adjusted R-squared is notably larger than that in the other two equations. The second equation contains the same number of estimated parameters as the first, and one fewer than the third. The second equation is also easier to interpret than the third.
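The role of the parameter count in this comparison can be made concrete with the usual adjusted R-squared formula. A minimal sketch follows; the function name and the numbers passed to it are hypothetical and are not the values from the three equations in the problem.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a regression with n observations and
    k slope parameters (plus an intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: same R-squared and sample size, different k.
same_fit_fewer_params = adjusted_r2(0.30, 100, 2)
same_fit_more_params = adjusted_r2(0.30, 100, 3)

# With equal fit, the model with fewer parameters has the higher value.
print(same_fit_fewer_params > same_fit_more_params)
```

When two equations have the same number of parameters, as the first and second do here, the penalty term is identical, so the one with the higher adjusted R-squared also has the higher R-squared.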
7.3 (i) The t statistic on hsize² is over four in absolute value, so there is very strong evidence that it belongs in the equation. The optimal class size is obtained by finding the turnaround point; this is the value of hsize that maximizes $\widehat{sat}$ (other things fixed): $19.3/(2 \cdot 2.19) \approx 4.41$. Because hsize is measured in hundreds, the optimal size of graduating class is about 441.
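The turnaround calculation can be reproduced directly. In this sketch the variable names are made up, and the negative sign on the quadratic coefficient is an assumption inferred from the statement that the turnaround point is a maximum.

```python
# Coefficients as used in the calculation above; the sign of the
# quadratic term is assumed negative because the turning point is a maximum.
b_hsize = 19.3    # coefficient on hsize
b_hsize2 = -2.19  # coefficient on hsize^2

# For b1*h + b2*h^2, the turning point is h* = -b1 / (2 * b2).
turnaround = -b_hsize / (2 * b_hsize2)

print(round(turnaround, 2))    # hsize, in hundreds of students
print(round(turnaround * 100)) # graduating class size, in students
```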