library(plyr)  # provides mapvalues(), used below
temperature <- read.table("~/OneDrive - University of Saskatchewan/teaching/stat245_1909/rdemo/data/normtemp.dat.txt")
colnames(temperature) <- c("Temp", "Gender", "HR")
temperature$Gender <- mapvalues(temperature$Gender, c(2,1), c("Male", "Female"))
temperature$Gender <- as.factor(temperature$Gender)
head(temperature)
##   Temp Gender HR
## 1 96.3 Female 70
## 2 96.7 Female 71
## 3 96.9 Female 74
## 4 97.0 Female 80
## 5 97.1 Female 73
## 6 97.1 Female 75
##
## Welch Two Sample t-test
##
## data: Temp by Gender
## t = -2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.53964856 -0.03881298
## sample estimates:
## mean in group Female mean in group Male
## 98.10462 98.39385
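As a check on what the Welch output reports, the sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with t.test. It uses a simulated data frame, demo, as a stand-in for temperature (the values are made up), so its numbers will not match the output above. The call t.test(Temp ~ Gender, data = ...) is an assumption about how that output was produced, since the call itself is not shown.

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(1)
demo <- data.frame(
  Temp   = c(rnorm(65, mean = 98.1, sd = 0.70), rnorm(65, mean = 98.4, sd = 0.74)),
  Gender = factor(rep(c("Female", "Male"), each = 65))
)

# Welch two-sample t-test via the formula interface (unequal variances allowed).
welch <- t.test(Temp ~ Gender, data = demo)

# The same quantities by hand.
m   <- tapply(demo$Temp, demo$Gender, mean)    # group means
v   <- tapply(demo$Temp, demo$Gender, var)     # group variances
n   <- tapply(demo$Temp, demo$Gender, length)  # group sizes
se2 <- v / n                                   # squared standard error per group

t_stat <- unname((m["Female"] - m["Male"]) / sqrt(sum(se2)))
df_w   <- sum(se2)^2 / sum(se2^2 / (n - 1))    # Welch-Satterthwaite approximation

c(t = t_stat, df = df_w)
c(t = unname(welch$statistic), df = unname(welch$parameter))
```

The fractional degrees of freedom (such as 127.51 in the output above) come from this approximation.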
We can also use lm for comparing two means:
##
## Call:
## lm(formula = Temp ~ Gender, data = temperature)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.99385 -0.47154 0.00615 0.40615 2.40615
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 98.10462 0.08949 1096.298 <2e-16 ***
## GenderMale 0.28923 0.12655 2.285 0.0239 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7215 on 128 degrees of freedom
## Multiple R-squared: 0.03921, Adjusted R-squared: 0.0317
## F-statistic: 5.223 on 1 and 128 DF, p-value: 0.02393
## Analysis of Variance Table
##
## Response: Temp
##            Df Sum Sq Mean Sq F value   Pr(>F)      R2
## Gender      1  2.719 2.71877  5.2232 0.023932 0.03921
## Residuals 128 66.626 0.52052                  0.96079
A note:
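The results from lm are not exactly equal to the results of t.test. The Welch test in t.test does not assume equal variances in the two populations, whereas lm assumes that the male and female groups share a common variance. The lm fit is therefore equivalent to the pooled t-test, which I did not cover in this class. Because both groups here have 65 observations, the two t statistics agree in absolute value (2.285); only the degrees of freedom (127.51 versus 128), and hence the p-values (0.02394 versus 0.02393), differ slightly.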
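The equivalence between lm and the pooled t-test can be checked directly. The following is a minimal sketch on a simulated stand-in data frame, demo (made-up values, not the class data); var.equal = TRUE is the t.test argument that requests the pooled test.

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(2)
demo <- data.frame(
  Temp   = c(rnorm(65, mean = 98.1, sd = 0.7), rnorm(65, mean = 98.4, sd = 0.7)),
  Gender = factor(rep(c("Female", "Male"), each = 65))
)

# Pooled two-sample t-test: assumes a common variance in both groups.
pooled <- t.test(Temp ~ Gender, data = demo, var.equal = TRUE)

# Linear model with a two-level factor as the only predictor.
fit    <- lm(Temp ~ Gender, data = demo)
lm_row <- summary(fit)$coefficients["GenderMale", ]

# t.test reports Female - Male, the lm coefficient is Male - Female,
# so the t statistics differ only in sign; the p-values are identical.
c(pooled_t = unname(pooled$statistic), lm_t = unname(lm_row["t value"]))
c(pooled_p = pooled$p.value,           lm_p = unname(lm_row["Pr(>|t|)"]))
```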
Conclusions:
The difference between the male and female mean temperatures is statistically significant at the 0.05 level (p = 0.024). That is, we have statistical evidence that the mean temperature differs between females and males. However, the estimated difference is only about 0.29 °F, so it may not be practically significant.
##
## Welch Two Sample t-test
##
## data: HR by Gender
## t = -0.63191, df = 116.7, p-value = 0.5287
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.243732 1.674501
## sample estimates:
## mean in group Female mean in group Male
## 73.36923 74.15385
We can also use lm for comparing two means:
##
## Call:
## lm(formula = HR ~ Gender, data = temperature)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.1538 -5.1538 -0.1538 4.8462 14.8462
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 73.3692 0.8780 83.565 <2e-16 ***
## GenderMale 0.7846 1.2417 0.632 0.529
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.079 on 128 degrees of freedom
## Multiple R-squared: 0.00311, Adjusted R-squared: -0.004678
## F-statistic: 0.3993 on 1 and 128 DF, p-value: 0.5286
##
## Call:
## lm(formula = Temp ~ HR, data = temperature)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.85017 -0.39999 0.01033 0.43915 2.46549
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 96.306754 0.657703 146.429 < 2e-16 ***
## HR 0.026335 0.008876 2.967 0.00359 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.712 on 128 degrees of freedom
## Multiple R-squared: 0.06434, Adjusted R-squared: 0.05703
## F-statistic: 8.802 on 1 and 128 DF, p-value: 0.003591
## Analysis of Variance Table
##
## Response: Temp
## Df Sum Sq Mean Sq F value Pr(>F)
## HR 1 4.462 4.4618 8.8021 0.003591 **
## Residuals 128 64.883 0.5069
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We see that HR is statistically significant (p = 0.0036). A measure of practical significance is \(R^2\):
## Analysis of Variance Table
##
## Response: Temp
##            Df Sum Sq Mean Sq F value    Pr(>F)      R2
## HR          1  4.462  4.4618  8.8021 0.0035915 0.06434
## Residuals 128 64.883  0.5069                   0.93566
We see that the proportion of the variation in temperature explained by HR is very small (about 6.4%).
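The R2 column in the table above is not part of the standard anova output. One way it could have been produced, shown here as an assumption rather than the code actually used, is to divide each sum of squares by the total. The sketch uses a simulated data frame, demo, with made-up values.

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(3)
demo <- data.frame(HR = round(rnorm(130, mean = 74, sd = 7)))
demo$Temp <- 96.3 + 0.026 * demo$HR + rnorm(130, sd = 0.7)

fit <- lm(Temp ~ HR, data = demo)
tab <- anova(fit)

# Share of the total sum of squares attributed to each row of the table.
tab$R2 <- tab[["Sum Sq"]] / sum(tab[["Sum Sq"]])
tab

# For a single predictor, the first entry equals the usual R-squared.
summary(fit)$r.squared
```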
We may wonder whether temperature still differs between females and males after the effect of HR is taken into account. To find out, we fit a multiple regression model:
lm_temp_hr_gender <- lm (Temp~HR + Gender, data = temperature)
# What model is fitted?
model.matrix(lm_temp_hr_gender)[c(1:3,1:3+70),]
##    (Intercept) HR GenderMale
## 1            1 70          0
## 2            1 71          0
## 3            1 74          0
## 71           1 57          1
## 72           1 61          1
## 73           1 84          1
## (Intercept)          HR  GenderMale
## 96.25081404  0.02526674  0.26940610
# visualize the fitting results:
betas <- coef(lm_temp_hr_gender)  # the coefficients printed above
gender.n <- as.numeric(temperature$Gender)
plot(Temp~HR, data=temperature, col = gender.n, pch = gender.n )
abline(a= betas[1], b = betas["HR"], col=1)
abline(a= betas[1]+betas["GenderMale"], b = betas["HR"], col=2)
legend("topleft", col=1:2,lty=c(1,1), legend = c("Female", "Male"))
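The two abline calls draw parallel lines because the model Temp ~ HR + Gender has a single HR slope and shifts only the intercept for males. A minimal sketch of this, on a simulated stand-in data frame demo (made-up values), compares predictions for a female and a male with the same heart rate:

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(4)
demo <- data.frame(
  HR     = round(rnorm(130, mean = 74, sd = 7)),
  Gender = factor(rep(c("Female", "Male"), each = 65))
)
demo$Temp <- 96.25 + 0.025 * demo$HR + 0.27 * (demo$Gender == "Male") +
  rnorm(130, sd = 0.7)

fit   <- lm(Temp ~ HR + Gender, data = demo)
betas <- coef(fit)

# Two hypothetical people with the same heart rate but different gender.
new <- data.frame(
  HR     = c(70, 70),
  Gender = factor(c("Female", "Male"), levels = levels(demo$Gender))
)
pred <- predict(fit, newdata = new)

# The vertical gap between the two lines is the GenderMale coefficient,
# whatever value of HR is chosen.
gap <- unname(pred[2] - pred[1])
c(gap = gap, GenderMale = unname(betas["GenderMale"]))
```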
##
## Call:
## lm(formula = Temp ~ HR + Gender, data = temperature)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.86363 -0.45624 0.01841 0.47366 2.33424
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 96.250814 0.648717 148.371 < 2e-16 ***
## HR 0.025267 0.008762 2.884 0.00462 **
## GenderMale 0.269406 0.123277 2.185 0.03070 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7017 on 127 degrees of freedom
## Multiple R-squared: 0.09825, Adjusted R-squared: 0.08405
## F-statistic: 6.919 on 2 and 127 DF, p-value: 0.001406
## Analysis of Variance Table
##
## Response: Temp
##            Df Sum Sq Mean Sq F value    Pr(>F)      R2
## HR          1  4.462  4.4618  9.0617 0.0031494 0.06434
## Gender      1  2.352  2.3515  4.7758 0.0306964 0.03391
## Residuals 127 62.532  0.4924                   0.90175
Conclusions
After adjusting for HR, gender is still statistically significant at the 0.05 level (p = 0.031), but its practical significance, as measured by \(R^2\) (about 0.034), may be minor.
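The \(R^2\) attributed to Gender in the table above is the increase in \(R^2\) when Gender is added to a model that already contains HR (0.09825 - 0.06434 = 0.03391). A minimal sketch of that relationship, using a simulated stand-in data frame demo (made-up values) and sequential sums of squares from anova:

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(5)
demo <- data.frame(
  HR     = round(rnorm(130, mean = 74, sd = 7)),
  Gender = factor(rep(c("Female", "Male"), each = 65))
)
demo$Temp <- 96.25 + 0.025 * demo$HR + 0.27 * (demo$Gender == "Male") +
  rnorm(130, sd = 0.7)

fit_hr   <- lm(Temp ~ HR, data = demo)
fit_full <- lm(Temp ~ HR + Gender, data = demo)

# Increase in R-squared from adding Gender after HR.
increment <- summary(fit_full)$r.squared - summary(fit_hr)$r.squared

# Same quantity from the sequential ANOVA table of the full model.
tab          <- anova(fit_full)
gender_share <- tab["Gender", "Sum Sq"] / sum(tab[["Sum Sq"]])

c(increment = increment, gender_share = gender_share)
```

Because the sums of squares are sequential, the share attributed to Gender would generally change if Gender were entered before HR.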
https://www.fatherly.com/health-science/why-women-are-colder-than-men/
Fit a logistic model:
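The call that created model_gender_temp is not shown. A plausible form, offered here as an assumption, is a binomial glm of Gender on Temp. The sketch below fits such a model to a simulated stand-in data frame demo (made-up values) under the hypothetical name model_demo. With a factor response, glm treats the first level ("Female") as failure and models the probability of the other level ("Male").

```r
# Simulated stand-in for the `temperature` data (values are made up).
set.seed(6)
demo <- data.frame(
  Temp   = round(c(rnorm(65, mean = 98.1, sd = 0.7),
                   rnorm(65, mean = 98.4, sd = 0.7)), 1),
  Gender = factor(rep(c("Female", "Male"), each = 65))
)

# Logistic regression: log-odds of "Male" as a linear function of Temp.
model_demo <- glm(Gender ~ Temp, family = binomial, data = demo)
coef(model_demo)

# Fitted probabilities of "Male" for the observed temperatures.
prob_male <- predict(model_demo, type = "response")

# The same probabilities from the logistic function applied to the linear predictor.
b       <- coef(model_demo)
by_hand <- plogis(b["(Intercept)"] + b["Temp"] * demo$Temp)

head(cbind(prob_male, by_hand))
```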
Looking at the fitted model with data
prob_pred <- predict(model_gender_temp, type = "response")
with(temperature,
{
n.gender <- as.numeric(Gender)
o.temp <- order(Temp)
plot(n.gender-1~jitter(Temp), col=n.gender,pch=n.gender)
lines(prob_pred[o.temp]~Temp[o.temp])
}
)
Looking at the predictions
predicted.gender <- mapvalues(prob_pred > 0.5, c(1,0), c("Male", "Female"))
pred.error <- (predicted.gender != temperature$Gender)*1
pred.table <- cbind(temperature,
data.frame(predicted.gender,
"Prob of Male"=prob_pred,
pred.error)
)
pred.table
## Temp Gender HR predicted.gender Prob.of.Male pred.error
## 1 96.3 Female 70 Female 0.2477981 0
## 2 96.7 Female 71 Female 0.2926458 0
## 3 96.9 Female 74 Female 0.3167696 0
## 4 97.0 Female 80 Female 0.3292229 0
## 5 97.1 Female 73 Female 0.3419207 0
## 6 97.1 Female 75 Female 0.3419207 0
## 7 97.1 Female 82 Female 0.3419207 0
## 8 97.2 Female 64 Female 0.3548492 0
## 9 97.3 Female 69 Female 0.3679932 0
## 10 97.4 Female 70 Female 0.3813363 0
## 11 97.4 Female 68 Female 0.3813363 0
## 12 97.4 Female 72 Female 0.3813363 0
## 13 97.4 Female 78 Female 0.3813363 0
## 14 97.5 Female 70 Female 0.3948609 0
## 15 97.5 Female 75 Female 0.3948609 0
## 16 97.6 Female 74 Female 0.4085484 0
## 17 97.6 Female 69 Female 0.4085484 0
## 18 97.6 Female 73 Female 0.4085484 0
## 19 97.7 Female 77 Female 0.4223793 0
## 20 97.8 Female 58 Female 0.4363329 0
## 21 97.8 Female 73 Female 0.4363329 0
## 22 97.8 Female 65 Female 0.4363329 0
## 23 97.8 Female 74 Female 0.4363329 0
## 24 97.9 Female 76 Female 0.4503881 0
## 25 97.9 Female 72 Female 0.4503881 0
## 26 98.0 Female 78 Female 0.4645229 0
## 27 98.0 Female 71 Female 0.4645229 0
## 28 98.0 Female 74 Female 0.4645229 0
## 29 98.0 Female 67 Female 0.4645229 0
## 30 98.0 Female 64 Female 0.4645229 0
## 31 98.0 Female 78 Female 0.4645229 0
## 32 98.1 Female 73 Female 0.4787149 0
## 33 98.1 Female 67 Female 0.4787149 0
## 34 98.2 Female 66 Female 0.4929413 0
## 35 98.2 Female 64 Female 0.4929413 0
## 36 98.2 Female 71 Female 0.4929413 0
## 37 98.2 Female 72 Female 0.4929413 0
## 38 98.3 Female 86 Male 0.5071792 1
## 39 98.3 Female 72 Male 0.5071792 1
## 40 98.4 Female 68 Male 0.5214055 1
## 41 98.4 Female 70 Male 0.5214055 1
## 42 98.4 Female 82 Male 0.5214055 1
## 43 98.4 Female 84 Male 0.5214055 1
## 44 98.5 Female 68 Male 0.5355971 1
## 45 98.5 Female 71 Male 0.5355971 1
## 46 98.6 Female 77 Male 0.5497314 1
## 47 98.6 Female 78 Male 0.5497314 1
## 48 98.6 Female 83 Male 0.5497314 1
## 49 98.6 Female 66 Male 0.5497314 1
## 50 98.6 Female 70 Male 0.5497314 1
## 51 98.6 Female 82 Male 0.5497314 1
## 52 98.7 Female 73 Male 0.5637857 1
## 53 98.7 Female 78 Male 0.5637857 1
## 54 98.8 Female 78 Male 0.5777384 1
## 55 98.8 Female 81 Male 0.5777384 1
## 56 98.8 Female 78 Male 0.5777384 1
## 57 98.9 Female 80 Male 0.5915681 1
## 58 99.0 Female 75 Male 0.6052543 1
## 59 99.0 Female 79 Male 0.6052543 1
## 60 99.0 Female 81 Male 0.6052543 1
## 61 99.1 Female 71 Male 0.6187775 1
## 62 99.2 Female 83 Male 0.6321190 1
## 63 99.3 Female 63 Male 0.6452612 1
## 64 99.4 Female 70 Male 0.6581878 1
## 65 99.5 Female 75 Male 0.6708837 1
## 66 96.4 Male 69 Female 0.2585661 1
## 67 96.7 Male 62 Female 0.2926458 1
## 68 96.8 Male 75 Female 0.3045735 1
## 69 97.2 Male 66 Female 0.3548492 1
## 70 97.2 Male 68 Female 0.3548492 1
## 71 97.4 Male 57 Female 0.3813363 1
## 72 97.6 Male 61 Female 0.4085484 1
## 73 97.7 Male 84 Female 0.4223793 1
## 74 97.7 Male 61 Female 0.4223793 1
## 75 97.8 Male 77 Female 0.4363329 1
## 76 97.8 Male 62 Female 0.4363329 1
## 77 97.8 Male 71 Female 0.4363329 1
## 78 97.9 Male 68 Female 0.4503881 1
## 79 97.9 Male 69 Female 0.4503881 1
## 80 97.9 Male 79 Female 0.4503881 1
## 81 98.0 Male 76 Female 0.4645229 1
## 82 98.0 Male 87 Female 0.4645229 1
## 83 98.0 Male 78 Female 0.4645229 1
## 84 98.0 Male 73 Female 0.4645229 1
## 85 98.0 Male 89 Female 0.4645229 1
## 86 98.1 Male 81 Female 0.4787149 1
## 87 98.2 Male 73 Female 0.4929413 1
## 88 98.2 Male 64 Female 0.4929413 1
## 89 98.2 Male 65 Female 0.4929413 1
## 90 98.2 Male 73 Female 0.4929413 1
## 91 98.2 Male 69 Female 0.4929413 1
## 92 98.2 Male 57 Female 0.4929413 1
## 93 98.3 Male 79 Male 0.5071792 0
## 94 98.3 Male 78 Male 0.5071792 0
## 95 98.3 Male 80 Male 0.5071792 0
## 96 98.4 Male 79 Male 0.5214055 0
## 97 98.4 Male 81 Male 0.5214055 0
## 98 98.4 Male 73 Male 0.5214055 0
## 99 98.4 Male 74 Male 0.5214055 0
## 100 98.4 Male 84 Male 0.5214055 0
## 101 98.5 Male 83 Male 0.5355971 0
## 102 98.6 Male 82 Male 0.5497314 0
## 103 98.6 Male 85 Male 0.5497314 0
## 104 98.6 Male 86 Male 0.5497314 0
## 105 98.6 Male 77 Male 0.5497314 0
## 106 98.7 Male 72 Male 0.5637857 0
## 107 98.7 Male 79 Male 0.5637857 0
## 108 98.7 Male 59 Male 0.5637857 0
## 109 98.7 Male 64 Male 0.5637857 0
## 110 98.7 Male 65 Male 0.5637857 0
## 111 98.7 Male 82 Male 0.5637857 0
## 112 98.8 Male 64 Male 0.5777384 0
## 113 98.8 Male 70 Male 0.5777384 0
## 114 98.8 Male 83 Male 0.5777384 0
## 115 98.8 Male 89 Male 0.5777384 0
## 116 98.8 Male 69 Male 0.5777384 0
## 117 98.8 Male 73 Male 0.5777384 0
## 118 98.8 Male 84 Male 0.5777384 0
## 119 98.9 Male 76 Male 0.5915681 0
## 120 99.0 Male 79 Male 0.6052543 0
## 121 99.0 Male 81 Male 0.6052543 0
## 122 99.1 Male 80 Male 0.6187775 0
## 123 99.1 Male 74 Male 0.6187775 0
## 124 99.2 Male 77 Male 0.6321190 0
## 125 99.2 Male 66 Male 0.6321190 0
## 126 99.3 Male 68 Male 0.6452612 0
## 127 99.4 Male 77 Male 0.6581878 0
## 128 99.9 Male 79 Male 0.7191009 0
## 129 100.0 Male 78 Male 0.7304608 0
## 130 100.8 Male 77 Male 0.8103991 0
Error rate in prediction
## [1] 0.4230769
## [1] 0.5
## [1] 0.1538462
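The code behind these three numbers is not shown. Their values are consistent with the following reading, which is inferred from the arithmetic and is not stated in the notes: the first is the misclassification rate of the logistic model, the second is the error rate of a baseline that predicts the same gender for everyone, and the third is the proportional reduction in error relative to that baseline. The counts below are taken from the pred.table printed above.

```r
n_obs    <- 130
n_errors <- 28 + 27   # rows 38-65 (females called Male) + rows 66-92 (males called Female)

# Misclassification rate of the fitted model.
error_rate <- n_errors / n_obs

# Baseline: predict one gender for all 130 people; 65 of them are then wrong.
baseline_rate <- 65 / n_obs

# Proportional reduction in error relative to the baseline.
reduction <- 1 - error_rate / baseline_rate

round(c(error_rate, baseline_rate, reduction), 7)
```

Read this way, temperature alone removes only about 15% of the baseline error, which matches the small \(R^2\) found earlier.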
Create a fake dataset by adding 1 °F to the temperatures of the males:
temperature2 <- temperature
indicator.male <- temperature$Gender=="Male"
temperature2$Temp[indicator.male] <- temperature$Temp[indicator.male] + 1
Fit a logistic model to the fake dataset (model_gender_temp is refit on temperature2):
Looking at the fitted model
prob_pred <- predict(model_gender_temp, type = "response")
with(temperature2,
{
n.gender <- as.numeric(Gender)
o.temp <- order(Temp)
plot(n.gender-1~jitter(Temp), col=n.gender,pch=n.gender)
lines(prob_pred[o.temp]~Temp[o.temp])
}
)
Looking at the predictions
predicted.gender <- mapvalues(prob_pred > 0.5, c(1,0), c("Male", "Female"))
pred.error <- (predicted.gender != temperature$Gender)*1
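# Note: this binds the original `temperature`, so the Temp column printed below
# shows the unshifted values; the model was fit to temperature2 (male Temp + 1).
pred.table <- cbind(temperature,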
data.frame(predicted.gender,
"Prob of Male"=prob_pred,
pred.error)
)
pred.table
## Temp Gender HR predicted.gender Prob.of.Male pred.error
## 1 96.3 Female 70 Female 0.001548955 0
## 2 96.7 Female 71 Female 0.004405894 0
## 3 96.9 Female 74 Female 0.007418867 0
## 4 97.0 Female 80 Female 0.009620182 0
## 5 97.1 Female 73 Female 0.012466464 0
## 6 97.1 Female 75 Female 0.012466464 0
## 7 97.1 Female 82 Female 0.012466464 0
## 8 97.2 Female 64 Female 0.016141138 0
## 9 97.3 Female 69 Female 0.020876079 0
## 10 97.4 Female 70 Female 0.026961933 0
## 11 97.4 Female 68 Female 0.026961933 0
## 12 97.4 Female 72 Female 0.026961933 0
## 13 97.4 Female 78 Female 0.026961933 0
## 14 97.5 Female 70 Female 0.034758970 0
## 15 97.5 Female 75 Female 0.034758970 0
## 16 97.6 Female 74 Female 0.044707208 0
## 17 97.6 Female 69 Female 0.044707208 0
## 18 97.6 Female 73 Female 0.044707208 0
## 19 97.7 Female 77 Female 0.057333573 0
## 20 97.8 Female 58 Female 0.073252479 0
## 21 97.8 Female 73 Female 0.073252479 0
## 22 97.8 Female 65 Female 0.073252479 0
## 23 97.8 Female 74 Female 0.073252479 0
## 24 97.9 Female 76 Female 0.093154555 0
## 25 97.9 Female 72 Female 0.093154555 0
## 26 98.0 Female 78 Female 0.117776673 0
## 27 98.0 Female 71 Female 0.117776673 0
## 28 98.0 Female 74 Female 0.117776673 0
## 29 98.0 Female 67 Female 0.117776673 0
## 30 98.0 Female 64 Female 0.117776673 0
## 31 98.0 Female 78 Female 0.117776673 0
## 32 98.1 Female 73 Female 0.147845762 0
## 33 98.1 Female 67 Female 0.147845762 0
## 34 98.2 Female 66 Female 0.183990643 0
## 35 98.2 Female 64 Female 0.183990643 0
## 36 98.2 Female 71 Female 0.183990643 0
## 37 98.2 Female 72 Female 0.183990643 0
## 38 98.3 Female 86 Female 0.226622108 0
## 39 98.3 Female 72 Female 0.226622108 0
## 40 98.4 Female 68 Female 0.275792967 0
## 41 98.4 Female 70 Female 0.275792967 0
## 42 98.4 Female 82 Female 0.275792967 0
## 43 98.4 Female 84 Female 0.275792967 0
## 44 98.5 Female 68 Female 0.331065522 0
## 45 98.5 Female 71 Female 0.331065522 0
## 46 98.6 Female 77 Female 0.391428207 0
## 47 98.6 Female 78 Female 0.391428207 0
## 48 98.6 Female 83 Female 0.391428207 0
## 49 98.6 Female 66 Female 0.391428207 0
## 50 98.6 Female 70 Female 0.391428207 0
## 51 98.6 Female 82 Female 0.391428207 0
## 52 98.7 Female 73 Female 0.455305682 0
## 53 98.7 Female 78 Female 0.455305682 0
## 54 98.8 Female 78 Male 0.520688507 1
## 55 98.8 Female 81 Male 0.520688507 1
## 56 98.8 Female 78 Male 0.520688507 1
## 57 98.9 Female 80 Male 0.585370188 1
## 58 99.0 Female 75 Male 0.647236838 1
## 59 99.0 Female 79 Male 0.647236838 1
## 60 99.0 Female 81 Male 0.647236838 1
## 61 99.1 Female 71 Male 0.704531834 1
## 62 99.2 Female 83 Male 0.756028831 1
## 63 99.3 Female 63 Male 0.801084527 1
## 64 99.4 Female 70 Male 0.839585002 1
## 65 99.5 Female 75 Male 0.871825935 1
## 66 96.4 Male 69 Female 0.026961933 1
## 67 96.7 Male 62 Female 0.057333573 1
## 68 96.8 Male 75 Female 0.073252479 1
## 69 97.2 Male 66 Female 0.183990643 1
## 70 97.2 Male 68 Female 0.183990643 1
## 71 97.4 Male 57 Female 0.275792967 1
## 72 97.6 Male 61 Female 0.391428207 1
## 73 97.7 Male 84 Female 0.455305682 1
## 74 97.7 Male 61 Female 0.455305682 1
## 75 97.8 Male 77 Male 0.520688507 0
## 76 97.8 Male 62 Male 0.520688507 0
## 77 97.8 Male 71 Male 0.520688507 0
## 78 97.9 Male 68 Male 0.585370188 0
## 79 97.9 Male 69 Male 0.585370188 0
## 80 97.9 Male 79 Male 0.585370188 0
## 81 98.0 Male 76 Male 0.647236838 0
## 82 98.0 Male 87 Male 0.647236838 0
## 83 98.0 Male 78 Male 0.647236838 0
## 84 98.0 Male 73 Male 0.647236838 0
## 85 98.0 Male 89 Male 0.647236838 0
## 86 98.1 Male 81 Male 0.704531834 0
## 87 98.2 Male 73 Male 0.756028831 0
## 88 98.2 Male 64 Male 0.756028831 0
## 89 98.2 Male 65 Male 0.756028831 0
## 90 98.2 Male 73 Male 0.756028831 0
## 91 98.2 Male 69 Male 0.756028831 0
## 92 98.2 Male 57 Male 0.756028831 0
## 93 98.3 Male 79 Male 0.801084527 0
## 94 98.3 Male 78 Male 0.801084527 0
## 95 98.3 Male 80 Male 0.801084527 0
## 96 98.4 Male 79 Male 0.839585002 0
## 97 98.4 Male 81 Male 0.839585002 0
## 98 98.4 Male 73 Male 0.839585002 0
## 99 98.4 Male 74 Male 0.839585002 0
## 100 98.4 Male 84 Male 0.839585002 0
## 101 98.5 Male 83 Male 0.871825935 0
## 102 98.6 Male 82 Male 0.898371311 0
## 103 98.6 Male 85 Male 0.898371311 0
## 104 98.6 Male 86 Male 0.898371311 0
## 105 98.6 Male 77 Male 0.898371311 0
## 106 98.7 Male 72 Male 0.919923983 0
## 107 98.7 Male 79 Male 0.919923983 0
## 108 98.7 Male 59 Male 0.919923983 0
## 109 98.7 Male 64 Male 0.919923983 0
## 110 98.7 Male 65 Male 0.919923983 0
## 111 98.7 Male 82 Male 0.919923983 0
## 112 98.8 Male 64 Male 0.937225306 0
## 113 98.8 Male 70 Male 0.937225306 0
## 114 98.8 Male 83 Male 0.937225306 0
## 115 98.8 Male 89 Male 0.937225306 0
## 116 98.8 Male 69 Male 0.937225306 0
## 117 98.8 Male 73 Male 0.937225306 0
## 118 98.8 Male 84 Male 0.937225306 0
## 119 98.9 Male 76 Male 0.950987648 0
## 120 99.0 Male 79 Male 0.961855614 0
## 121 99.0 Male 81 Male 0.961855614 0
## 122 99.1 Male 80 Male 0.970388761 0
## 123 99.1 Male 74 Male 0.970388761 0
## 124 99.2 Male 77 Male 0.977058518 0
## 125 99.2 Male 66 Male 0.977058518 0
## 126 99.3 Male 68 Male 0.982253426 0
## 127 99.4 Male 77 Male 0.986288499 0
## 128 99.9 Male 79 Male 0.996264009 0
## 129 100.0 Male 78 Male 0.997122799 0
## 130 100.8 Male 77 Male 0.999645523 0
Error rate in prediction
## [1] 0.1615385
## [1] 0.5
## [1] 0.6769231
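The first number above can be recovered from a confusion matrix of actual against predicted gender. The sketch below rebuilds the predictions from the counts in the second pred.table (53 and 12 for females, 9 and 56 for males) instead of refitting the model; the vectors actual and predicted are constructed only for this illustration.

```r
# Actual gender: 65 females followed by 65 males, as in the data.
actual <- factor(rep(c("Female", "Male"), each = 65))

# Predicted gender, rebuilt from the counts in the table above.
predicted <- factor(
  c(rep(c("Female", "Male"), times = c(53, 12)),   # rows 1-65 (actual females)
    rep(c("Female", "Male"), times = c(9, 56))),   # rows 66-130 (actual males)
  levels = c("Female", "Male")
)

# Rows are actual gender, columns are predicted gender.
confusion <- table(actual, predicted)
confusion

# Off-diagonal cells are the misclassified cases.
error_rate <- 1 - sum(diag(confusion)) / sum(confusion)
error_rate
```

The matrix also shows how the 21 errors split between the groups: 12 females are classified as male and 9 males as female.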