2024-09-04
Introduction
Summary
keywords
TO-DO
Homework
Exercise*
Next time
What is Statistical Learning?
Predicting the target variable $Y$ from its relationship with the inputs $X_i$.
How can we find the predicting function $f(X)$?
How can we show that the function $f$ is a "well-predicting function"?
Function Confirmation
Evaluate the error as the difference between the function's predicted value and the actual observed value.
i.e. the ideal $f$ is the function that minimises the mean-squared prediction error $E[(Y-g(X))^2]$ over all functions $g$, namely the regression function $f(x) = E[Y \mid X = x]$.
Irreducible error (error that cannot be reduced): $\epsilon = Y-f(X)$; even if we find $f$ exactly, there would still be errors in prediction, because at any given $X = x$ the observed $Y$ values are scattered around $f(x)$ by noise that $X$ cannot explain.
($\hat{f}$ is the estimate I am constructing; $f$ is the ideal prediction function we want to find.)
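The split between the error we can reduce and the error we cannot may be easier to see written out. This is a sketch under two assumptions that the notes do not state: $\hat{f}$ is treated as fixed, and the noise satisfies $E[\epsilon \mid X = x] = 0$.

$$
\begin{aligned}
E\big[(Y-\hat{f}(X))^2 \mid X = x\big]
&= E\big[(f(x) - \hat{f}(x) + \epsilon)^2 \mid X = x\big] \\
&= \underbrace{\big[f(x)-\hat{f}(x)\big]^2}_{\text{reducible}} + \underbrace{\mathrm{Var}(\epsilon \mid X = x)}_{\text{irreducible}}
\end{aligned}
$$

The cross term $2\,[f(x)-\hat{f}(x)]\,E[\epsilon \mid X = x]$ vanishes under the zero-mean assumption. A better $\hat{f}$ shrinks the first term only; the second stays even when $\hat{f} = f$.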
Introduce the concept of Neighbourhood
Needed when there is no observed $Y$ value at a specific $X = x$
(the data are too sparse)
Select a neighbourhood $N(x)$ around $x$
Take the average of the $Y$ values of all observed points $x_i$ that fall in $N(x)$, i.e. $\hat{f}(x) = \mathrm{Ave}(Y \mid X \in N(x))$
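The neighbourhood average can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `neighbourhood_average`, the fixed-radius definition of $N(x)$, and the toy data are all assumptions made for the example.

```python
def neighbourhood_average(xs, ys, x0, radius):
    """Estimate f(x0) as the mean of y_i over all x_i with |x_i - x0| <= radius.

    Here N(x0) is assumed to be a fixed-width window; other definitions
    (e.g. the k nearest points) are equally valid.
    """
    neighbours = [y for x, y in zip(xs, ys) if abs(x - x0) <= radius]
    if not neighbours:
        return None  # empty neighbourhood: no estimate is available
    return sum(neighbours) / len(neighbours)


# Toy data (made up): there is no observation exactly at x = 2.0.
xs = [1.0, 1.8, 2.1, 2.3, 4.0]
ys = [1.1, 2.0, 2.4, 2.2, 4.1]

# Points 1.8, 2.1 and 2.3 fall inside the window, so their y values are averaged.
estimate = neighbourhood_average(xs, ys, x0=2.0, radius=0.5)

# No observed x lies within 0.5 of 3.2, so there is nothing to average.
empty = neighbourhood_average(xs, ys, x0=3.2, radius=0.5)

print(estimate, empty)
```

Returning `None` for an empty window makes the "no neighbour" case explicit instead of hiding it behind a division by zero.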
What if there is no neighbour?
When does it happen?
When the number of features (the dimension) is too large: nearest neighbours tend to be far away in high dimensions (the curse of dimensionality).
When $x$ is at the boundary of the data, where neighbours exist on one side only.
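A small calculation shows why neighbours drift away as the dimension grows. The sketch below assumes the inputs are spread uniformly over a unit hypercube (an assumption made for this example, not stated in the notes) and asks how long the side of a cube-shaped neighbourhood must be to capture a given fraction of the data. The helper name `edge_length` is hypothetical.

```python
def edge_length(fraction, dimension):
    """Side length of a sub-cube holding `fraction` of uniformly spread data.

    In p dimensions a cube with side e has volume e**p, so capturing a
    fraction r of the unit cube needs e = r ** (1 / p).
    """
    return fraction ** (1.0 / dimension)


# Side length needed to capture 10% of the data as the dimension grows.
for p in (1, 2, 5, 10, 50):
    print(p, round(edge_length(0.10, p), 3))
```

In one dimension the neighbourhood covers 10% of the range of the input. In ten dimensions it has to span roughly 80% of the range of every input, so it is no longer "local".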
Finding the function
Choose a model for what you expect the function to look like.
1. Linear model
$f(X) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$; we are looking for the coefficients $\beta_i$.
2. A quadratic model may fit better.
3. A model with many variables or terms is more flexible, but it could over-fit.
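The first two candidate models can be compared with a short least-squares sketch. The data-generating curve, noise level, and seed below are made up for illustration. `np.polyfit` and `np.polyval` are standard NumPy functions; note that `np.polyfit` returns coefficients from the highest degree down.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Made-up data: a quadratic truth plus noise (an assumption for this example).
x = np.linspace(-2, 2, 40)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.3, size=x.size)


def fit_and_mse(degree):
    """Fit a polynomial of the given degree by least squares; return (coefficients, training MSE)."""
    coeffs = np.polyfit(x, y, deg=degree)   # highest-degree coefficient first
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.mean(residuals**2))


beta_linear, mse_linear = fit_and_mse(1)        # [beta_1, beta_0]
beta_quadratic, mse_quadratic = fit_and_mse(2)  # [beta_2, beta_1, beta_0]

print("linear   :", beta_linear, mse_linear)
print("quadratic:", beta_quadratic, mse_quadratic)
```

Because the linear model is a special case of the quadratic one, the quadratic fit can never have a larger training error. Whether it also predicts better on new data is a separate question.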
Trade-offs
Accuracy vs. interpretability (easy to understand and compute)
Parsimony (economical, few terms) vs. black box (extremely complicated)
When the true curve is linear, a linear model is flexible enough.
If not, a higher-degree, more flexible model could fit better.*
The red line shows the test-set error and the gray line shows the training-set error. Although the green fit performs better than the blue fit on the training data, it is less reliable on general test data.
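The gap between training error and test error can be reproduced with a small experiment. This is a sketch under assumptions: the sine-shaped truth, the noise level, the sample sizes, and the polynomial degrees are all invented for the example and do not correspond to the coloured curves in the notes.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable


def truth(x):
    """Hypothetical non-linear truth curve, chosen only for illustration."""
    return np.sin(2 * x)


# A small training set and a larger test set drawn from the same process.
x_train = rng.uniform(-1, 1, 30)
y_train = truth(x_train) + rng.normal(scale=0.3, size=x_train.size)
x_test = rng.uniform(-1, 1, 200)
y_test = truth(x_test) + rng.normal(scale=0.3, size=x_test.size)


def mse(coeffs, x, y):
    """Mean-squared error of a fitted polynomial on the data (x, y)."""
    return float(np.mean((y - np.polyval(coeffs, x)) ** 2))


results = {}
for degree in (1, 3, 9):  # increasing flexibility
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])  # (training MSE, test MSE)
```

Training error can only fall as the degree rises, because each model contains the previous one. Test error has no such guarantee, which is why model choice should be judged on held-out data.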