Thursday, May 16, 2019

Final Exam Ec315

PART I. HYPOTHESIS TESTING

PROBLEM 1
A certain brand of fluorescent light tube was advertised as having an effective life span before burning out of 4000 hours. A random sample of 84 bulbs was burned, with a mean illumination life span of 1870 hours and a sample standard deviation of 90 hours. Construct a 95% confidence interval based on this sample and be certain to interpret this interval.

Answer
Since the population standard deviation is unknown, the t distribution is used to construct the confidence interval. The 95% confidence interval is given by

X̄ ± t(α/2, n−1) · S/√n

Details: Confidence Interval Estimate for the Mean
Sample Standard Deviation: 90
Sample Mean: 1870
Sample Size: 84
Confidence Level: 95%

Intermediate Calculations
Standard Error of the Mean: 9.8198
Degrees of Freedom: 83
t Value: 1.9890
Interval Half Width: 19.5312

Confidence Interval
Interval Lower Limit: 1850.47
Interval Upper Limit: 1889.53

Interpretation: We can be 95% confident that the true mean life span of the tubes is between 1850.47 and 1889.53 hours, far below the advertised 4000 hours.

PROBLEM 2
Given the following data from two independent data sets, conduct a one-tail hypothesis test to determine if the means are statistically equal using alpha = 0.05. Do NOT do a confidence interval.
n1 = 35, n2 = 30
x̄1 = 32, x̄2 = 25
s1 = 7, s2 = 6

Answer
H0: μ1 = μ2
H1: μ1 > μ2
The test statistic used is

t = (X̄1 − X̄2) / (S · √(1/n1 + 1/n2)), which follows a t distribution with n1 + n2 − 2 degrees of freedom, where the pooled variance is S² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2).

Decision rule: Reject the null hypothesis if the calculated value of the test statistic is greater than the critical value.

Details: t Test for Differences in Two Means
Hypothesized Difference: 0
Level of Significance: 0.05
Population 1 Sample: Sample Size 35, Sample Mean 32, Sample Standard Deviation 7
Population 2 Sample: Sample Size 30, Sample Mean 25, Sample Standard Deviation 6

Intermediate Calculations
Population 1 Sample Degrees of Freedom: 34
Population 2 Sample Degrees of Freedom: 29
Total Degrees of Freedom: 63
Pooled Variance: 43.0159
Difference in Sample Means: 7
t Test Statistic: 4.2896

Upper-Tail Test
Upper Critical Value: 1.6694
p-Value: 3.14E-05
Decision: Reject the null hypothesis.

Conclusion: Reject the null hypothesis. The sample provides enough evidence to conclude that the mean of the first population is greater than the mean of the second.

PROBLEM 3
A test was conducted to determine whether the gender of a spokesperson affected the likelihood that consumers would prefer a new product. A survey of consumers at a trade show found that 120 of 300 customers preferred the product when it was shown by a male spokesperson, while 92 of 280 customers preferred the product when it was shown by a female spokesperson. Do the samples provide sufficient evidence to indicate that the gender of the spokesperson affects the likelihood of the product being favorably regarded by consumers? Evaluate with a two-tail, alpha = .01 test. Do NOT do a confidence interval.

Answer
H0: There is no significant gender-wise difference in the proportion of customers who preferred the product.
H1: There is a significant gender-wise difference in the proportion of customers who preferred the product.

The test statistic used is the Z test:

Z = (p1 − p2) / √[ p̄(1 − p̄)(1/n1 + 1/n2) ], where p̄ = (n1·p1 + n2·p2) / (n1 + n2)

Decision rule: Reject the null hypothesis if the absolute value of the calculated test statistic is greater than the upper critical value.

Details: Z Test for Differences in Two Proportions
Hypothesized Difference: 0
Level of Significance: 0.01
Group 1 (Male): Number of Successes 120, Sample Size 300
Group 2 (Female): Number of Successes 92, Sample Size 280

Intermediate Calculations
Group 1 Proportion: 0.4000
Group 2 Proportion: 0.3286
Difference in Two Proportions: 0.0714
Average Proportion: 0.3655
Z Test Statistic: 1.7850

Two-Tail Test
Lower Critical Value: -2.5758
Upper Critical Value: 2.5758
p-Value: 0.0743
Decision: Do not reject the null hypothesis.

Conclusion: Fail to reject the null hypothesis.
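The z statistic and p-value above can be reproduced with a short script. This is a sketch using only the Python standard library; the function name is my own:

```python
from math import sqrt, erf

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sample z test for a difference in proportions (pooled)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled proportion
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-tail p-value via the standard normal CDF, Phi(t) = (1 + erf(t/sqrt(2)))/2
    phi = lambda t: 0.5 * (1 + erf(t / sqrt(2)))
    p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

z, p = two_prop_z_test(120, 300, 92, 280)
print(round(z, 4), round(p, 4))   # z ≈ 1.785, p ≈ 0.0743
```

The two-tail p-value of about 0.074 exceeds α = 0.01, which matches the decision not to reject H0.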
The sample does not provide enough evidence to support the claim that there is a significant gender-wise difference in the proportion of customers who preferred the product.

PROBLEM 4
Assuming that the population variances are equal for Male and Female GPAs, test the following sample data to see if Male and Female PhD candidate GPAs (means) are equal. Conduct a two-tail hypothesis test at α = .01 to determine whether the sample means are different. Do NOT do a confidence interval.

                      Male GPAs   Female GPAs
Sample Size              12           13
Sample Mean              2.8          4.95
Sample Standard Dev      0.25         0.8

Answer
H0: There is no significant difference in the mean GPA of males and females.
H1: There is a significant difference in the mean GPA of males and females.

The test statistic used is the independent-sample t test:

t = (X̄1 − X̄2) / (S · √(1/n1 + 1/n2)), which follows a t distribution with n1 + n2 − 2 degrees of freedom, where the pooled variance is S² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2).

Decision rule: Reject the null hypothesis if the absolute value of the calculated test statistic is greater than the upper critical value.

Details: t Test for Differences in Two Means
Hypothesized Difference: 0
Level of Significance: 0.01
Population 1 Sample (Male): Sample Size 12, Sample Mean 2.8, Sample Standard Deviation 0.25
Population 2 Sample (Female): Sample Size 13, Sample Mean 4.95, Sample Standard Deviation 0.8

Intermediate Calculations
Population 1 Sample Degrees of Freedom: 11
Population 2 Sample Degrees of Freedom: 12
Total Degrees of Freedom: 23
Pooled Variance: 0.3638
Difference in Sample Means: -2.15
t Test Statistic: -8.9042

Two-Tail Test
Lower Critical Value: -2.8073
Upper Critical Value: 2.8073
p-Value: 0.0000
Decision: Reject the null hypothesis.

Conclusion: Reject the null hypothesis. The sample provides enough evidence to support the claim that there is a significant difference in the mean GPA scores of males and females.
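The pooled-variance t statistics used in Problems 2 and 4 can be checked with a small sketch (Python standard library only; the helper name is my own):

```python
from math import sqrt

def pooled_t_stat(n1, mean1, s1, n2, mean2, s2):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, sp2, df

# Problem 2 (one-tail): compare t to the upper critical value 1.6694
t2, sp2_2, df2 = pooled_t_stat(35, 32, 7, 30, 25, 6)
print(round(t2, 4), round(sp2_2, 4), df2)   # 4.2896 43.0159 63

# Problem 4 (two-tail): compare |t| to the upper critical value 2.8073
t4, sp2_4, df4 = pooled_t_stat(12, 2.8, 0.25, 13, 4.95, 0.8)
print(round(t4, 4), round(sp2_4, 4), df4)   # -8.9042 0.3638 23
```

Both statistics agree with the spreadsheet output above: 4.2896 exceeds 1.6694 and |−8.9042| exceeds 2.8073, so both nulls are rejected.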
PART II. REGRESSION ANALYSIS

PROBLEM 5
You wish to run the regression model (less intercept and coefficients) shown below:

VOTE = URBAN + INCOME + EDUCATE

Given the Excel spreadsheet below for annual data from 1970 to 2006 (with the data for rows 5 through 35 not shown), complete all necessary entries in the Excel Regression window shown below the data. Column A holds YEAR, column B holds VOTE, column C holds URBAN, column D holds INCOME, and column E holds EDUCATE; the labels are in row 1, 1970 is in row 2, and 2006 is in row 38.

YEAR    VOTE    URBAN   INCOME   EDUCATE
1970    49.0    62.0     7488      4.3
1971    58.3    65.2     7635      8.3
1972    45.2    75.0     7879      4.5
...
2004    50.1    92.1    15321      4.9
2005    67.7    94.0    15643      4.7
2006    54.2    95.6    16001      5.1

Regression window entries (VOTE is the dependent variable; the three regressors occupy columns C through E):
Input Y Range: B1:B38
Input X Range: C1:E38
Labels: checked
Constant is Zero: unchecked
Confidence Level: 95%
Output option: New Worksheet Ply

PROBLEM 6
Use the following regression output to answer the questions below. A real estate investor has devised a model to estimate home prices in a new suburban development. Data for a random sample of 100 homes were gathered on the selling price of the home ($ thousands), the home size (square feet), the lot size (thousands of square feet), and the number of bedrooms. The following multiple regression output was generated:

Regression Statistics
Multiple R: 0.8647
R Square: 0.7222
Adjusted R Square: 0.6888
Standard Error: 16.0389
Observations: 100

                   Coefficients   Standard Error   t Stat    P-value
Intercept            -24.888        38.3735        -0.7021   0.2154
X1 (Square Feet)       0.2323        0.0184         9.3122   0.0000
X2 (Lot Size)         11.2589        1.7120         4.3256   0.0001
X3 (Bedrooms)         15.2356        6.8905         3.2158   0.1589

a. Why is the coefficient for BEDROOMS a positive number?
The selling price increases when the number of bedrooms increases, so the relationship is positive.

b. Which is the most statistically significant variable? What evidence shows this?
The most statistically significant variable is the one with the lowest p-value. Here the most statistically significant variable is square feet, with a p-value of 0.0000.

c. Which is the least statistically significant variable? What evidence shows this?
The least statistically significant variable is the one with the highest p-value. Here the least statistically significant variable is bedrooms, with a p-value of 0.1589.

d. For a 0.05 level of significance, should any variable be dropped from this model? Why or why not?
The variable bedrooms can be dropped from the model because its p-value (0.1589) is greater than 0.05.

e. Interpret the value of R squared. How does this value differ from the adjusted R squared?
R² measures model adequacy. Here R² suggests that 72.22% of the variability in selling price can be explained by the model. Adjusted R² is a modification of R² that adjusts for the number of explanatory terms in the model. Unlike R², the adjusted R² increases only if a new term improves the model more than would be expected by chance.

f. Predict the sales price of a 1134-square-foot home with a lot size of 15,400 square feet and 2 bedrooms.
Because lot size is measured in thousands of square feet, 15,400 square feet enters the model as 15.4:
Selling Price = -24.888 + 0.2323(1134) + 11.2589(15.4) + 15.2356(2) ≈ 442.4 ($ thousands), or about $442,400.

PART III. SPECIFIC KNOWLEDGE SHORT-ANSWER QUESTIONS

Problem 7
Define autocorrelation in the following terms:

a. In what type of regression is it likely to occur?
Autocorrelation is likely to occur in regressions involving time series data.

b. What is bad about autocorrelation in a regression?
The standard errors of the estimates are biased, so hypothesis tests and confidence intervals based on them are unreliable.

c. What method is used to determine if it exists? (Think of a statistical test to be used.)
The Durbin-Watson statistic is used to detect autocorrelation in a regression.

d. If found in a regression, how is it eliminated?
Appropriate transformations of the data (for example, differencing) can be applied to eliminate autocorrelation.

Problem 8
Define multicollinearity in the following terms:

a) In what type of regression is it likely to occur?
Multicollinearity occurs in multiple regressions when two or more independent variables are highly correlated.
b) Why is multicollinearity in a regression a difficulty to be resolved?
Multicollinearity in regression models is an unacceptably high level of intercorrelation among the independent variables, such that the effects of the independents cannot be separated. Under multicollinearity, the estimates remain unbiased, but assessments of the relative strength of the explanatory variables and of their joint effect are unreliable.

c) How can multicollinearity be determined in a regression?
Multicollinearity refers to excessive correlation of the predictor variables. When correlation is excessive (some use the rule of thumb of r > 0.90), standard errors of the b and beta coefficients become large, making it difficult or impossible to assess the relative importance of the predictor variables. The measures tolerance and VIF are commonly used to measure multicollinearity. Tolerance is 1 − R² for the regression of that independent variable on all the other independents, ignoring the dependent. There will be as many tolerance coefficients as there are independents. The higher the intercorrelation of the independents, the closer the tolerance approaches zero. As a rule of thumb, if tolerance is less than 0.20, a problem with multicollinearity is indicated. When tolerance is close to 0, there is high multicollinearity of that variable with the other independents, and the b and beta coefficients will be unstable. The more the multicollinearity, the lower the tolerance and the larger the standard errors of the regression coefficients.

d) If multicollinearity is found in a regression, how is it eliminated?
Multicollinearity occurs because two (or more) variables are related; they measure essentially the same thing. If one of the variables doesn't seem logically essential to your model, removing it may reduce or eliminate multicollinearity.
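The tolerance and VIF measures described above can be computed directly. The sketch below uses numpy and made-up illustrative data (the variable names x1, x2, x3 and the helper function are my own, not from the exam):

```python
import numpy as np

def vif_and_tolerance(X):
    """For each column of X, regress it on the remaining columns and
    return (tolerance, VIF) pairs, where tolerance = 1 - R^2 and VIF = 1/tolerance."""
    n, k = X.shape
    results = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept for the auxiliary regression
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
        tol = 1 - r2
        results.append((tol, 1 / tol))
    return results

# illustrative data: x2 is nearly a linear function of x1, x3 is independent
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2 * x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
for name, (tol, vif) in zip(["x1", "x2", "x3"], vif_and_tolerance(X)):
    print(f"{name}: tolerance={tol:.3f}, VIF={vif:.1f}")
# x1 and x2 show tolerance near 0 (large VIF); x3 has tolerance near 1
```

This matches the rule of thumb above: the collinear pair falls well below the 0.20 tolerance threshold, while the logically independent variable does not.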
