4. Cross Validation

4.1 Training and Test Set

We impose the i.i.d. assumption, i.e. $(\rvec X_i, Y_i) \sim F_{\rvec X, Y}$ for $i = 1, \ldots, n$. Recall that the estimate $\mhat$ is constructed by applying some estimator to the data $(\rvec X_1, Y_1), \ldots, (\rvec X_n, Y_n)$, the training data. We would then like to evaluate the accuracy of the estimated target. A principal problem is that if we reuse the training data to measure the predictive power of the estimated target, e.g. $\mhat$, the results will be overly optimistic, because the estimate has already been fitted to exactly these observations. Thus, we could instead look at how well the estimated target predicts new observations that were not used to construct it, i.e. a test set.
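
The optimism of the training error can be made concrete with a small simulation. The following is only an illustrative sketch under assumptions that are not part of the text: the regression function $\sin(2\pi x)$, the noise level, the sample sizes, and the choice of a 1-nearest-neighbour estimator are all hypothetical. The 1-nearest-neighbour fit is chosen because it reproduces every training response exactly, so its error on the training data says nothing about its error on fresh i.i.d. data.

```python
import math
import random


def simulate(n, rng):
    """Draw n i.i.d. pairs (x, y) with y = sin(2*pi*x) + Gaussian noise.

    The regression function and noise level are assumptions for illustration.
    """
    data = []
    for _ in range(n):
        x = rng.random()
        y = math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3)
        data.append((x, y))
    return data


def fit_1nn(train):
    """Return m_hat, which predicts the response of the nearest training point."""
    def m_hat(x):
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return m_hat


def mse(m_hat, data):
    """Mean squared prediction error of m_hat on the given pairs."""
    return sum((y - m_hat(x)) ** 2 for x, y in data) / len(data)


rng = random.Random(0)        # fixed seed so the run is reproducible
train = simulate(100, rng)    # data used to construct m_hat
test = simulate(1000, rng)    # fresh data from the same distribution

m_hat = fit_1nn(train)
train_error = mse(m_hat, train)  # evaluated on the data m_hat was fitted to
test_error = mse(m_hat, test)    # evaluated on data m_hat has never seen

print(f"training MSE: {train_error:.4f}")
print(f"test MSE:     {test_error:.4f}")
```

In this sketch the training error is zero by construction, since each training point is its own nearest neighbour, while the test error is strictly positive. The gap between the two is the over-optimism described above, here in its most extreme form.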