# Statistics HW using R statistical software, due in 24 hours

The HW is self-explanatory.

There is a typo in question 1(e). The decision rule that you should use is to assign yhat = 1 if pihat > 0.15 and yhat = 0 if pihat <= 0.15.

I will send the data file by email since it doesn’t upload here.

STATISTICS 462 – Summer 2016
Homework 7

DUE Tuesday, August 9th

Unless otherwise stated, you can use R for any of the calculations, but make sure you include
your code. Your code should not be a copy of anyone else’s! Any code you turn in should be
well organized and commented so the grader can understand your answers.

All programming questions should be submitted to the dropbox on ANGEL for this assignment
as a .pdf file using the naming convention HWNum_FirstInitialLastName.pdf. For example,
John Doe would submit a file titled HW1_JDoe.pdf for the first assignment. Your answer
to programming questions should include both code and a description of your result. I
recommend using R Markdown for writing up your answers. A template for writing up an
assignment in R Markdown can be found on ANGEL. R Markdown files can be compiled
directly within RStudio. Alternatively, answers may be saved in a Word document or LaTeX,
and converted into a .pdf file.

Non-coding questions can either be written and submitted in the same file as your coding
questions using LaTeX typesetting (see https://latex-project.org/intro.html) or they
may be handwritten and turned in separately during class.

1. Load the “wine.Rdata” dataset. This dataset contains the chemical and physical
attributes of 1,599 red wines as well as a quality assessment (quality = 1: good,
quality = 0: poor).

(a) Fit a logistic regression model with wine quality taken as the response, and the
remaining variables as covariates.
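A minimal sketch of this kind of fit in R, assuming a data frame with a 0/1 `quality` column. The real data file is not available here, so the data frame `sim`, its two columns, and the coefficients used to generate it are simulated stand-ins, not the wine data.

```r
# Simulated stand-in for the wine data (hypothetical values, two covariates only).
# With the real file you would call load("wine.Rdata") and use the data frame it
# contains; the name of that object is not given in the assignment.
set.seed(462)
n <- 200
sim <- data.frame(
  alcohol   = rnorm(n, mean = 10.4, sd = 1),
  sulphates = rnorm(n, mean = 0.65, sd = 0.15)
)

# Hypothetical "true" coefficients, used only to generate a binary response.
eta <- -12 + 1.0 * sim$alcohol + 2.0 * sim$sulphates
sim$quality <- rbinom(n, size = 1, prob = plogis(eta))

# `quality ~ .` uses every remaining column as a covariate;
# family = binomial gives logistic regression (logit link by default).
fit <- glm(quality ~ ., data = sim, family = binomial)
summary(fit)
```

The `quality ~ .` shorthand only works as intended if every other column in the data frame is meant to be a covariate.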

(b) Clearly state the statistical model definition for this logistic regression model.
Include any relevant assumptions.
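For reference, a logistic regression with $p$ covariates is commonly written in the general form below. The symbols $\pi_i$, $\beta_j$ and $x_{ij}$ are notation chosen here for illustration, not taken from the assignment.

$$
Y_i \mid x_{i1}, \dots, x_{ip} \sim \text{Bernoulli}(\pi_i) \ \text{independently}, \qquad
\log\frac{\pi_i}{1 - \pi_i} = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \qquad i = 1, \dots, n
$$

The usual assumptions are that the responses are independent given the covariates and that the log-odds are linear in the covariates.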

(c) Interpret the model coefficients. What does this output indicate about the marginal
association between each covariate and the mean response?
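As a general reminder (using the same illustrative $\beta_j$ notation, which is an assumption here), a coefficient in a logistic regression acts on the odds scale. Increasing $x_j$ by one unit while holding the other covariates fixed multiplies the odds of $Y = 1$ by $e^{\beta_j}$:

$$
\frac{\text{odds}(x_1, \dots, x_j + 1, \dots, x_p)}{\text{odds}(x_1, \dots, x_j, \dots, x_p)}
= \frac{e^{\beta_0 + \dots + \beta_j (x_j + 1) + \dots + \beta_p x_p}}{e^{\beta_0 + \dots + \beta_j x_j + \dots + \beta_p x_p}}
= e^{\beta_j}
$$

So $\beta_j > 0$ corresponds to higher odds of a good-quality rating as $x_j$ increases, and $\beta_j < 0$ to lower odds.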

(d) Predict the probability that a wine is of good quality (quality = 1) under the
following covariate settings.

| Variable | Level |
|---|---|
| fixed acidity | 10.7 |
| volatile acidity | 0.74 |
| citric acid | 0.52 |
| residual sugar | 3.6 |
| chlorides | 0.11 |
| free sulfur dioxide | 31 |
| total sulfur dioxide | 93.2 |
| density | 0.999 |
| pH | 3.5 |
| sulphates | 0.85 |
| alcohol | 12 |
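A sketch of how a prediction at fixed covariate values can be obtained with `predict()`. The model below is fitted to simulated stand-in data with only two covariates (an assumption, since the wine file is not available here), so the resulting number is not the answer for the real data.

```r
# Simulated stand-in data (hypothetical); replace with the real wine data frame.
set.seed(462)
n <- 200
sim <- data.frame(
  alcohol   = rnorm(n, mean = 10.4, sd = 1),
  sulphates = rnorm(n, mean = 0.65, sd = 0.15)
)
sim$quality <- rbinom(n, size = 1,
                      prob = plogis(-12 + sim$alcohol + 2 * sim$sulphates))
fit <- glm(quality ~ alcohol + sulphates, data = sim, family = binomial)

# newdata needs one column per covariate, named exactly as in the model formula.
new_wine <- data.frame(alcohol = 12, sulphates = 0.85)

# type = "response" returns a probability; the default ("link") returns log-odds.
p_hat <- predict(fit, newdata = new_wine, type = "response")

# The same value by hand: inverse logit of the linear predictor.
eta_new <- sum(coef(fit) * c(1, 12, 0.85))
p_by_hand <- plogis(eta_new)
```

For the full model, `new_wine` would carry all of the covariates listed in the table, with column names matching those in the data file.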

(e) Use the predict() function to get predicted probabilities from the fitted model.
Using the decision rule yhat = 1 if pihat > 0.15 and yhat = 0 if pihat <= 0.15,
transform the predicted probabilities into predicted response values. Create a
confusion matrix for these predictions (i.e. true positives, false positives, true
negatives, false negatives).
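A small sketch of thresholding probabilities at 0.15 and tabulating the result. The vectors `y` and `pihat` are made-up illustrative values; with a fitted model, `pihat` would come from `predict(fit, type = "response")`.

```r
# Hypothetical observed responses and predicted probabilities (illustration only).
y     <- c(1, 0, 0, 1, 0, 0, 1, 0)
pihat <- c(0.40, 0.10, 0.20, 0.12, 0.05, 0.15, 0.60, 0.03)

# Decision rule: yhat = 1 if pihat > 0.15, otherwise 0 (0.15 itself maps to 0).
yhat <- ifelse(pihat > 0.15, 1, 0)

# Fixing the factor levels keeps the table 2 x 2 even if a class is never predicted.
conf <- table(
  observed  = factor(y,    levels = c(0, 1)),
  predicted = factor(yhat, levels = c(0, 1))
)
conf

tp <- conf["1", "1"]  # true positives
fp <- conf["0", "1"]  # false positives
tn <- conf["0", "0"]  # true negatives
fn <- conf["1", "0"]  # false negatives
```

Rows are the observed classes and columns the predicted ones, so false positives sit in row 0, column 1.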

2. The board of directors of a professional association conducted a random sample survey
of 30 members to assess the effects of several possible amounts of changes in membership
dues. The predictor X denotes, in dollars, the change in annual dues from the previous
year posited in the survey interview, and the response is binary: Y = 1 if the interviewee
indicated that the membership will NOT be renewed at that amount of change in
dues and Y = 0 if the membership will be renewed. The output for fitting the logistic
regression model is given below. Use this to answer the following questions.

(a) Write the estimated equation as a function of X for

i. The log-odds of not renewing a membership
ii. The odds of not renewing a membership
iii. The probability of not renewing a membership
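In general form, with $b_0$ and $b_1$ standing for the estimated intercept and slope (placeholder symbols, since the fitted-model output is not reproduced here) and $\hat\pi(X)$ for the estimated probability that $Y = 1$, the three scales are related as follows:

$$
\log\frac{\hat\pi(X)}{1 - \hat\pi(X)} = b_0 + b_1 X, \qquad
\frac{\hat\pi(X)}{1 - \hat\pi(X)} = e^{b_0 + b_1 X}, \qquad
\hat\pi(X) = \frac{e^{b_0 + b_1 X}}{1 + e^{b_0 + b_1 X}}
$$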

(b) Estimate the probability that someone does NOT renew their membership when
the annual dues increase by \$32.

(c) Estimate the odds that someone does NOT renew their membership when the
annual dues increase by \$32.

(d) Estimate the probability and odds that someone DOES renew their membership
when the annual dues increase by \$5.

(e) Find the odds ratio of renewal for a scenario where the annual increase is \$5
against one where the annual increase is \$10.
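The calculations in parts (b) through (e) can be sketched in R as below. The values of `b0` and `b1` are hypothetical placeholders, not the estimates from the assignment's output, and the helper function names are made up for this illustration.

```r
# Hypothetical estimates; replace with the intercept and slope from the model output.
b0 <- -3
b1 <- 0.1

# Y = 1 means the membership is NOT renewed, so the model describes non-renewal.
odds_not_renew <- function(x) exp(b0 + b1 * x)
prob_not_renew <- function(x) plogis(b0 + b1 * x)

p32 <- prob_not_renew(32)   # probability of not renewing at a $32 increase
o32 <- odds_not_renew(32)   # odds of not renewing at a $32 increase

# Renewal is the complementary event: probability 1 - p, odds 1 / odds.
prob_renew_5 <- 1 - prob_not_renew(5)
odds_renew_5 <- 1 / odds_not_renew(5)

# Odds ratio of renewal, $5 increase versus $10 increase.
# Algebraically this equals exp(b1 * (10 - 5)).
or_renew_5_vs_10 <- (1 / odds_not_renew(5)) / (1 / odds_not_renew(10))
```

Because renewal odds are the reciprocal of non-renewal odds, the direction of the comparison matters: swapping the two scenarios inverts the odds ratio.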

(f) Estimate the increase in annual dues for which 75% of the members are expected
to not renew their membership.
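One way to approach this is to set the estimated probability to 0.75 and solve the log-odds equation for $X$. Here $b_0$ and $b_1$ again denote the estimated intercept and slope (placeholder symbols):

$$
\log\frac{0.75}{1 - 0.75} = b_0 + b_1 X
\quad\Longrightarrow\quad
X = \frac{\log 3 - b_0}{b_1}
$$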

(g) Conduct a Wald test to determine whether the dollar increase in dues is related to
the probability of membership renewal. In your answer, state the null and alternative
hypotheses, the test statistic, p-value, and conclusion based on (1) α = 0.05 and
(2) α = 0.1.
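A sketch of the Wald test mechanics for the slope, testing H0: β1 = 0 against H1: β1 ≠ 0. The estimate `b1` and standard error `se_b1` below are hypothetical placeholders; the real values come from the fitted-model output.

```r
# Hypothetical slope estimate and standard error (illustration only).
b1    <- 0.1
se_b1 <- 0.04

# Wald statistic: estimate divided by its standard error,
# approximately standard normal under H0 in large samples.
z <- b1 / se_b1

# Two-sided p-value from the standard normal distribution.
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Compare the p-value with each significance level.
reject_05 <- p_value < 0.05
reject_10 <- p_value < 0.10
```

The output of `summary()` for a `glm` fit reports this same z statistic and p-value in its coefficient table.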