
Instance-Based Learning, KNN

Introduction

[Image: SL4_instance_based_Learning_intro.png]

Pros

  • Remembers the training data exactly (nothing is lost or approximated)

  • Fast: there is no training phase; learning is just storing the data

  • Simple

Cons

  • No generalization: a query that was never stored has no answer
  • Overfitting: the learner believes the data completely, noise included
  • Can't handle multiple references to the same input: if the same x is stored twice with different values, the lookup is ambiguous (see the sketch below)
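
To make these pros and cons concrete, here is a minimal sketch of pure instance-based learning (the names and the toy house data are illustrative, not from the lecture): training just stores the examples, and prediction is a table lookup.

```python
# "Learning" is remembering: build a lookup table from the data.
def train(examples):
    table = {}
    for x, y in examples:
        table[x] = y  # duplicate inputs silently overwrite each other
    return table

# Fast and simple, but an unseen query has no answer (no generalization).
def predict(table, query):
    return table.get(query)

# Toy data: (square feet, bedrooms) -> price
houses = [((1000, 2), 200_000), ((1500, 3), 300_000)]
model = train(houses)
print(predict(model, (1000, 2)))  # 200000: exact recall
print(predict(model, (1001, 2)))  # None: no generalization
```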

Cost of the house

[Image: SL4_KNN_definition.png]

KNN

[Image: SL4_KNN_definition_formal.png]
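
A minimal sketch of the k-NN rule formalized above, assuming Euclidean distance (the function names are illustrative): find the k stored points nearest to the query, then take a majority vote for classification or a mean for regression.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(data, query, k, regression=False, dist=euclidean):
    # The k training points closest to the query under the chosen distance.
    neighbors = sorted(data, key=lambda pair: dist(pair[0], query))[:k]
    labels = [y for _, y in neighbors]
    if regression:
        return sum(labels) / k                   # average the neighbors' values
    return Counter(labels).most_common(1)[0][0]  # majority vote

data = [((1, 1), 100), ((2, 2), 150), ((8, 8), 400)]
print(knn_predict(data, (1.5, 1.5), k=2, regression=True))  # 125.0
```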

KNN Bias

Preference bias: our belief about what makes a good hypothesis

  • Locality -> points near each other are similar

  • Smoothness -> averaging over neighbors makes sense

  • All features matter equally (every dimension contributes to the distance in the same way; see the sketch below)
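
The last assumption is easy to violate in practice: a feature measured on a large raw scale dominates the distance and effectively picks the neighbors by itself. A small hedged illustration (the house numbers are invented):

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# (square feet, bedrooms); the query is a 1000 sqft, 3-bedroom house.
q = (1000, 3)
a = (1010, 10)  # similar size, wildly different bedroom count
b = (1200, 3)   # different size, same bedroom count

# On raw scales, square feet dominate, so a looks closer than b.
print(euclidean(a, q) < euclidean(b, q))  # True

# After a crude rescaling of square feet, the verdict flips.
scale = lambda p: (p[0] / 1000.0, p[1])
print(euclidean(scale(a), scale(q)) < euclidean(scale(b), scale(q)))  # False
```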

Won't you compute my neighbors?

[Image: SL4_KNN_calculation_time.png]

Curse of Dimensionality

As the number of features or dimensions grows, the amount of data we need in order to generalize accurately grows exponentially.
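
A short sketch (not from the lecture) that makes the claim concrete in two ways: keeping a fixed sampling density in the unit hypercube takes 10^d points, and with a fixed budget of random points, the nearest neighbor of a query drifts farther away as d grows.

```python
import math
import random

# (1) A grid point every 0.1 along each axis of the d-dimensional
# unit hypercube requires 10^d points.
for d in (1, 2, 3, 10):
    print(f"grid spacing 0.1 in {d}-D needs 10^{d} = {10 ** d:,} points")

# (2) With a fixed number of random points, the nearest neighbor of a
# random query gets farther away as the dimension grows.
def nearest_dist(d, n=200):
    points = [[random.random() for _ in range(d)] for _ in range(n)]
    query = [random.random() for _ in range(d)]
    return min(math.dist(query, p) for p in points)

for d in (1, 2, 10, 100):
    print(f"{d:3d}-D: nearest of {200} random points is ~{nearest_dist(d):.2f} away")
```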

Some other stuff

  • Distance(x, q): this is where domain knowledge enters. Euclidean or (weighted) Manhattan distances suit numeric/regression problems; mismatch counts suit categorical attributes such as "colors". See the sketches below.
    • Define a function that measures how similar two points are.
    • The distance is really a similarity measure in disguise.
  • K: how to pick K? Even K = n can work if we replace the plain average with a distance-weighted one, which leads to:
    • Locally weighted regression
    • Locally weighted linear regression
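
Hedged sketches of these ideas (illustrative code, not the lecture's): Euclidean and weighted Manhattan distances for numeric features, a mismatch count for categorical ones, and a distance-weighted average, which keeps K = n useful and is the first step toward locally weighted regression.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b, weights=None):
    w = weights or [1.0] * len(a)
    return sum(wi * abs(x - y) for wi, x, y in zip(w, a, b))

def mismatch(a, b):
    # For categorical attributes like colors: count disagreements.
    return sum(x != y for x, y in zip(a, b))

# With K = n, a plain average ignores the query entirely; weighting
# each stored point by 1/distance keeps the prediction local.
def weighted_average(data, query, dist=euclidean, eps=1e-9):
    weights = [1.0 / (dist(x, query) + eps) for x, _ in data]
    return sum(w * y for w, (_, y) in zip(weights, data)) / sum(weights)

data = [((1.0,), 10.0), ((2.0,), 20.0), ((10.0,), 100.0)]
print(weighted_average(data, (1.5,)))  # ~17.4, dominated by the near points
```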

What have we learned

  • Instance-based learning
  • Lazy vs. eager learning
  • KNN
  • Nearest neighbor: the similarity (distance) function
    • Domain knowledge matters.
  • Classification vs. regression
  • Averaging
  • Locally weighted X regression, where X can be any learner (a line, a decision tree, a neural network, ...)

No free lunch

For any learning algorithm you create, if you average its performance over all possible instances, it does about as well as random guessing.

Solution

Use domain knowledge to choose a similarity function that fits the problem.

Motivation of this class

Teach students to use domain knowledge to pick the right approach for each learning task.