The project asks us to investigate two categorical variables whose relationship seems interesting. The two categorical variables we have chosen to investigate are gender and smoking. This left us to decide how we were going to proceed.

The sampling frame consists of female and male students who attended John Abbott College on a full-time basis. We now had to decide what question we would ask.

We decided to go with a simple question that would give us a direct yes or no answer. The question we decided to ask was, “Do you or do you not smoke?” We took a sample of forty students: 21 males and 19 females. It should be noted that the larger the sample size, the more reliable the results are likely to be.

There are several types of chi-square test, depending on the way the data were collected and the hypothesis being tested. We will begin with the simplest case: a 2 x 2 contingency table. When a comparison is made between one sample and another, a simple rule is that the degrees of freedom equal (number of columns minus one) x (number of rows minus one).

For our data this gives us (2-1) x (2-1) = 1. The table has percentages across. Comparison down the columns suggests that men (33.3%) are less likely to smoke than women (47.4%). On the other hand, men (66.7%) are more likely to be non-smokers than women (52.6%).

In this case the response variable is smoking and the explanatory variable is gender (male/female). The null hypothesis states that gender and smoking are statistically independent; there is no relationship between smoking and gender. The alternative hypothesis states that gender and smoking are statistically associated; there is a relationship between gender and smoking.

By using the chi-square statistic we will be able to test for significance at level α. Our chi-square value computes to .819. To test at significance level α, we will use table 24.1 on page 475 of the course text.
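As a check on the value .819, the statistic can be worked out by hand from the observed counts in the attached raw data (7 and 14 for males, 9 and 10 for females; 16 smokers and 24 non-smokers in all). The sketch below assumes the usual Pearson form of the statistic without a continuity correction; the symbols O (observed count) and E (expected count) are labels introduced here for illustration and are not taken from the course text.

```latex
\[
\begin{aligned}
E &= \frac{\text{row total} \times \text{column total}}{n}: \qquad
\frac{21 \times 16}{40} = 8.4, \quad
\frac{21 \times 24}{40} = 12.6, \quad
\frac{19 \times 16}{40} = 7.6, \quad
\frac{19 \times 24}{40} = 11.4 \\[4pt]
\chi^2 &= \sum \frac{(O - E)^2}{E}
  = \frac{(7 - 8.4)^2}{8.4} + \frac{(14 - 12.6)^2}{12.6}
  + \frac{(9 - 7.6)^2}{7.6} + \frac{(10 - 11.4)^2}{11.4} \\[4pt]
 &\approx 0.233 + 0.156 + 0.258 + 0.172 \approx 0.819
\end{aligned}
\]
```

This is the value that is then looked up, with 1 degree of freedom, in table 24.1.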
As we can conclude from this table, the chi-square statistic is not significant at any conventional level: with 1 degree of freedom, a value of .819 falls well short of the 3.84 needed for significance at the .05 level. The survey shows that a prediction of smoking from gender would most likely be inaccurate.

The survey itself is not accurate, on account of having been performed on a convenience sample, which could be the major source of bias. The sample of forty is not in any way representative of the population. The subjects were chosen for the survey out of mere acquaintance. Furthermore, they were not chosen from a particular age group, but they were all students who attended John Abbott College.

To conclude, even if the data had been obtained without bias (from an SRS) and the results were the same, a prediction would still be inaccurate. There are definitely more reasons why people smoke than gender alone. Perhaps in a properly conducted survey there would be a better possibility that the chi-square value would be significant and the predictions would be far more accurate.

(Attached are the raw data, the Gender*Smoking cross-tabulation and the chi-square test.)

Question Asked: Do you or do you not smoke?

Males:
  Who smoke:
    Tallied: lllll ll
    Frequency: 7/40
    Percentage: 17.5%
  Who do not smoke:
    Tallied: lllll lllll llll
    Frequency: 14/40
    Percentage: 35%

Females:
  Who smoke:
    Tallied: lllll llll
    Frequency: 9/40
    Percentage: 22.5%
  Who do not smoke:
    Tallied: lllll lllll
    Frequency: 10/40
    Percentage: 25%

Quantitative Methods
Teacher: William Boshuck
Project #2
Presented by: Sarah Pinsonneault & Andy Patch
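
The row percentages and the chi-square value in this report can be reproduced from the four tallies. What follows is a minimal sketch in Python, not the software used to produce the attached output. The function name chi_square_2x2 and the dictionary layout are assumptions made for illustration, and the p-value line relies on the fact that, for 1 degree of freedom only, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).

```python
import math

# Observed counts from the raw tallies, as (smoke, do not smoke).
observed = {
    "Males": (7, 14),
    "Females": (9, 10),
}


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and degrees of freedom.

    `table` maps each row label to a tuple of counts, one per column.
    """
    rows = list(table.values())
    n = sum(sum(r) for r in rows)
    col_totals = [sum(col) for col in zip(*rows)]

    chi2 = 0.0
    for r in rows:
        row_total = sum(r)
        for obs, col_total in zip(r, col_totals):
            # Expected count under independence of row and column variables.
            expected = row_total * col_total / n
            chi2 += (obs - expected) ** 2 / expected

    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = chi_square_2x2(observed)

# Row percentages ("percentages across"): each count over its gender total.
row_pct = {
    gender: tuple(round(100 * c / sum(counts), 1) for c in counts)
    for gender, counts in observed.items()
}

# Tail probability; this shortcut is valid for df = 1 only.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(row_pct)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

Run as is, the script reports a chi-square of 0.819 on 1 degree of freedom, matching the value in the report, with a tail probability far above .05.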