Group work for a Monash Research Methods course

Merge branch 'master' of https://github.com/Dekker1/ResearchMethods

+71 -30
+70 -29
wk7/week7.tex
···

  \section{Introduction} \label{sec:introduction}

+ In this report we document a series of hypothesis tests on the provided data about high-ranking
+ tennis players. The focus of these hypotheses concerns a player's height and handedness with regard to
+ overall ranking. We first provide an overview of how we address these notions, with visualisations and
+ descriptions of our overall methodology. Following this, we then provide a brief discussion of what we
+ can infer given our statistical analysis techniques.
+
  \section{Method} \label{sec:method}

- We are testing two hypotheses. The first hypothesis that we test is that tall players have an advantage over smaller players. The second hypothesis that we test is that left-handed players have an advantage over right-handed players. To build an intuition of how the data behaves with respect to the hypotheses we are testing, we created visual representations using tools from the Matplotlib, and Seaborn libraries and then we perform statistical tests to measure these effects.
+ We are testing two hypotheses. The first hypothesis that we test is that tall players have an advantage
+ over smaller players. The second hypothesis that we test is that left-handed players have an advantage
+ over right-handed players. To build an intuition of how the data behaves with respect to the hypotheses
+ we are testing, we created visual representations using tools from the Matplotlib, and Seaborn libraries
+ and then we perform statistical tests to measure these effects.

  \subsection{Visualisation} \label{subsec:visualisation}

···

  \section{Results} \label{sec:results}

+ We investigate both the advantage of height and the advantage of being
+ left-handed using a $\chi^2$ test and a T-test. For every test we will state
+ the exact hypothesis and the null-hypothesis.
+
  \subsection{The advantage of height}

+ \textbf{$\chi^2$-test:} To test if there is an advantage of being tall we ran
+ a $\chi^2$-test with the following hypotheses:\\
+ $H$: Players that are taller have a higher rank \\
+ $H_0$: The rank of a player is independent of their height \\
+ \\
+ To perform the test the players are divided into groups depending on their
+ rank and on whether they are taller than the mean height for their gender. The
+ expected data is computed using the chances of being taller than the mean, and
+ the chance of being in the group of rankings. The data used is found in table
+ 1.
+
  \begin{table}[ht]
  \centering
- \label{tab:chi-height}
  \begin{tabular}{|l|r|r|r|r|}
  \hline
  & \textbf{M: 168 - 188} & \textbf{M: 189 - 210} & \textbf{F: 155 - 171} & \textbf{F: 172 - 189} \\ \hline
···
  & 7 / 8 \\
  \hline
  \end{tabular}
+ \label{tab:chiheight}
  \caption{Observed / Expected values used for the $\chi^2$-test. The groups are divided by their rank (vertical) and, per gender, their height (horizontal).}
  \end{table}

- $$
- \chi^2 \approx 7.697606186049128
- $$
+ The $\chi^2$ value found is approximately $7.697606186049128$. With 12 degrees
+ of freedom our $p$-value is $0.8082925814979871$.

- $$
- df = (5-1)(4-1) = 12
- $$
+ \textbf{T-test:} A slightly different hypothesis can be tested using a T-test:
+ \\
+ $H$: Players that are taller have significantly more points \\
+ $H_0$: The number of points a player has is independent of their height \\

- $$
- \chi^2(7.69\dots,12) \approx 0.8082925814979871
- $$
+ We ran this T-test twice, once for the women and once for the men, by
+ splitting the groups of players into two: one being taller than the mean
+ height, one being shorter than the mean height. Our T-test for the men
+ revealed a T-value of 1.711723, which has a p-value of 0.043815. For the women
+ the T-value found was 1.860241, which has a p-value of 0.032030.

+ \subsection{The advantage of left-handedness}

- \textbf {t-test men:} T score: 1.711723, P score: 0.043815
-
- \textbf {t-test women:} T score: 1.860241, P score: 0.032030
-
-
- \subsection{The advantage of left-handedness}
+ \textbf{$\chi^2$-test:} To test if there is an advantage of being left-handed
+ we ran a $\chi^2$-test with the following hypotheses:\\
+ $H$: Players that are left-handed have a higher rank \\
+ $H_0$: The rank of a player is independent of their preferred hand \\
+ \\
+ To perform the test the players are divided into groups depending on their rank
+ and on whether they play with their left hand. The expected data is computed using the
+ chances of being left-handed. The data used is found in table
+ 2.

  \begin{table}[ht]
  \centering
- \label{tab:chi-hand}
+ \label{tab:chihand}
  \begin{tabular}{|l|l|l|l|l|l|}
  \hline
  & \textbf{1 - 99} & \textbf{100 - 199} & \textbf{200 - 299} & \textbf{300 - 399} & \textbf{400 - 499} \\
···
  \caption{Observed / Expected values used for the $\chi^2$-test. The groups are divided by which hand they use (vertical) and their rank (horizontal).}
  \end{table}

+ The $\chi^2$ value found is approximately $6.467312944404331$. With 4 degrees
+ of freedom our $p$-value is $0.1668616190847413$.

- $$
- \chi^2 \approx 6.467312944404331
- $$
+ \textbf{T-test:} A slightly different hypothesis can be tested using a T-test:
+ \\
+ $H$: Players that are left-handed have significantly more points \\
+ $H_0$: The number of points a player has is independent of their preferred hand \\

- $$
- df = (2-1)(5-1) = 4
- $$
+ We ran this T-test by splitting the groups of players into two depending on
+ their preferred hand. Our T-test revealed a T-value of 0.451694,
+ which has a p-value of 0.325815.

- $$
- \chi^2(6.46\dots,4) \approx 0.1668616190847413
- $$

- \textbf {t-test:} T score: 0.451694, P score: 0.325815
+ \section{Discussion} \label{sec:discussion}

+ In our investigation we did not find any strong correlation between the
+ ranking of a player (or their number of points) and with which hand they
+ played or how tall they are. Most tests failed to pass the required $p$-value of
+ $<0.05$. The only tests that did give us positive results are the T-tests that
+ were conducted on the correlation between height and the number of points.
+ However, without the $\chi^2$-test confirming the correlation, the existence
+ of the correlation is questionable.

- \section{Discussion} \label{sec:discussion}
+ These results might not be so surprising when the visual exploration is taken
+ into account. Only slight deviations are visible in our graphs, so the tests
+ mainly confirmed our suspicion that no definitive correlation exists between
+ the different attributes.

  \end{document}
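Review note: the two $\chi^2$ $p$-values reported in week7.tex can be sanity-checked with a few lines of Python. This is only a sketch, not part of the commit. The helper name `chi2_sf_even_df` is hypothetical, and it relies on a closed form of the chi-squared upper-tail probability that holds only for an even number of degrees of freedom, which happens to cover both cases here (12 and 4). For general use, `scipy.stats.chi2.sf` would be the more usual choice.

```python
import math


def chi2_sf_even_df(statistic: float, df: int) -> float:
    """Upper-tail probability P(X >= statistic) for a chi-squared variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is only valid when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form only holds for positive even df")
    half = statistic / 2.0
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


# Test statistics and degrees of freedom as reported in week7.tex.
p_height = chi2_sf_even_df(7.697606186049128, 12)
p_hand = chi2_sf_even_df(6.467312944404331, 4)

print(f"height:     p = {p_height:.6f}")
print(f"handedness: p = {p_hand:.6f}")
```

Both results should agree with the reported $p$-values up to rounding. Since both lie above 0.05, neither $\chi^2$-test rejects its null hypothesis, which matches the Discussion section.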
+1 -1
wk8/week8.tex
···
  The re-analysis is conducted on the data provided in the
  paper\cite{dong2018methods}, using Python in conjunction with packages such as
  pandas, matplotlib, numpy and seaborn, to process and visualise the data. As
- aformentioned, only spatial data and the variables mentioned above are
+ aforementioned, only spatial data and the variables mentioned above are
  considered, for the reference days and the change occuring Day 62 (day of
  first socially disruptive event). The distribution of the difference between
  the reference period and Day 62 is visualised by plotting a histogram for each