Group work for a Monash Research Methods course

Move overfitting footnote to its first occurrence

+13 -14
mini_proj/report/waldo.tex
@@ -149,9 +149,11 @@
 (binary) tree. Each non-leaf node contain a selection criteria to its
 branches. Every leaf node contains the class that will be assigned to the
 instance if the node is reached. In other training methods, decision trees
-have the tendency to overfit, but in random forest a multitude of decision
-tree is trained with a certain degree of randomness and the mean of these
-trees is used which avoids this problem.
+have the tendency to overfit\footnote{Overfitting occurs when a model learns
+from the data too specifically, and loses its ability to generalise its
+predictions for new data (resulting in loss of prediction accuracy)}, but in
+random forest a multitude of decision tree is trained with a certain degree
+of randomness and the mean of these trees is used which avoids this problem.
 
 \subsection{Neural Network Architectures}
 
@@ -233,17 +235,14 @@
 models; requiring training on a dataset of typical images. Each network was
 trained using the preprocessed training dataset and labels, for 25 epochs
 (one forward and backward pass of all data) in batches of 150. The number of
-epochs was chosen to maximise training time and prevent
-overfitting\footnote{Overfitting occurs when a model learns from the data
-too specifically, and loses its ability to generalise its predictions for
-new data (resulting in loss of prediction accuracy)} of the training data,
-given current model parameters. The batch size is the number of images sent
-through each pass of the network. Using the entire dataset would train the
-network quickly, but decrease the network's ability to learn unique features
-from the data. Passing one image at a time may allow the model to learn more
-about each image, however it would also increase the training time and risk
-of overfitting the data. Therefore the batch size was chosen to maintain
-training accuracy while minimising training time.
+epochs was chosen to maximise training time and prevent overfitting of the
+training data, given current model parameters. The batch size is the number
+of images sent through each pass of the network. Using the entire dataset
+would train the network quickly, but decrease the network's ability to learn
+unique features from the data. Passing one image at a time may allow the
+model to learn more about each image, however it would also increase the
+training time and risk of overfitting the data. Therefore the batch size was
+chosen to maintain training accuracy while minimising training time.
 
 \subsection{Neural Network Testing}\label{nnTesting}
 
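The footnoted passage in the report describes why a random forest resists overfitting: many trees are trained with a degree of randomness and their predictions are averaged. A minimal toy sketch of that averaging idea, using depth-1 "stumps" on bootstrap resamples rather than real decision trees (the data, the `make_stump` helper, and all names here are illustrative, not the report's actual model):

```python
import random
import statistics

random.seed(0)

# Toy noisy 1-D regression data: y is roughly x plus Gaussian noise.
# A single crude learner fit to one sample of this data overfits its
# sample's noise; averaging many such learners smooths that out.
data = [(x, x + random.gauss(0, 2.0)) for x in range(20)]

def make_stump(sample):
    """A depth-1 'tree': split at a random threshold drawn from the
    sample, and predict the mean target on each side of the split."""
    threshold = random.choice(sample)[0]
    left = [y for (x, y) in sample if x <= threshold]
    right = [y for (x, y) in sample if x > threshold]
    # Guard against an empty side (threshold at the sample's extreme).
    all_y = [y for _, y in sample]
    lmean = statistics.mean(left or all_y)
    rmean = statistics.mean(right or all_y)
    return lambda x: lmean if x <= threshold else rmean

# Each "tree" sees its own bootstrap resample of the data -- this is
# the "certain degree of randomness" the report refers to.
forest = [make_stump([random.choice(data) for _ in data])
          for _ in range(200)]

def forest_predict(x):
    """The ensemble prediction is the mean over all stumps."""
    return statistics.mean(stump(x) for stump in forest)
```

Any one stump is a very poor, high-variance model, but the averaged prediction tracks the underlying trend (`forest_predict(2)` comes out well below `forest_predict(15)` on this data), which is the intuition behind the forest's resistance to overfitting.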