---
layout: post
title: "ML4T Notes | 03-03 Assessing a learning algorithm"
date: "2019-01-28 01:28:28"
categories: Computer Science
auth: conge
tags: Machine_Learning Trading ML4T OMSCS
---

01 - Overview


In this lesson, we'll learn methods for assessing learning algorithms.

Time: 00:00:25

02 - A closer look at KNN solutions


Time: 00:01:59

03 - What Happens as K Varies Quiz

What happens to the model when we change the value of k?

We've got three k-nearest-neighbor models here to compare.

Time: 00:01:07

04 - What Happens as D Varies Quiz

Given a polynomial model of degree d.

  1. what happens to the model when we change the value of d?
  2. True or false, as we increase d we are more likely to overfit.

Answer: 1) With d=1 we get a linear model; with d=2, a parabola; with d=3, a cubic that follows the data more closely. 2) True: as we increase d, we are more likely to overfit.

Time: 00:01:37

05 - Metric 1: RMS Error

Time: 00:01:32
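
The RMS error metric can be written in a couple of lines of NumPy; a minimal sketch (the sample values below are made up for illustration):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error: sqrt(mean((y_true - y_pred)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# One prediction is off by 2, the others are exact: RMSE = sqrt(4/3) ~= 1.155.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```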

06 - In Sample vs out of sample

07 - Quiz: Which is worse

Which sort of error would you expect to be larger?

The answer is: out-of-sample error is expected to be larger, because the model was tuned on the training data, not on data it has never seen.

Time: 00:00:07
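
A tiny NumPy sketch of the idea (the numbers are invented): the line is fit only to training points that sit exactly on y = 2x + 1, so in-sample error is essentially zero, while the held-out points deviate from that line:

```python
import numpy as np

def rms(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

# Training points lie exactly on y = 2x + 1, so the linear fit is perfect.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = 2.0 * x_train + 1.0
coeffs = np.polyfit(x_train, y_train, deg=1)

# Held-out points deviate from that line by +/- 0.5.
x_test = np.array([4.0, 5.0])
y_test = np.array([9.5, 10.5])  # the fitted line predicts 9.0 and 11.0

in_sample = rms(y_train, np.polyval(coeffs, x_train))
out_sample = rms(y_test, np.polyval(coeffs, x_test))
print(in_sample, out_sample)  # in-sample ~0.0, out-of-sample 0.5
```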

08 - Cross-validation


Time: 00:01:09
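
A hand-rolled sketch of the k-fold splitting mechanics (contiguous folds, no shuffling; library helpers such as scikit-learn's `KFold` do the same job):

```python
import numpy as np

def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds; each fold serves
    once as the test set while the remaining folds form the training set."""
    folds = np.array_split(np.arange(n), k)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx

for train_idx, test_idx in kfold_indices(10, 5):
    print("test:", test_idx.tolist(), "train:", train_idx.tolist())
```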

09 - Roll forward cross-validation

Time: 00:00:47
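
The point of roll-forward cross-validation is that training data always comes earlier in time than test data, so the model never peeks at the future. A sketch with arbitrary window sizes:

```python
def roll_forward_splits(n, train_size, test_size):
    """Yield (train, test) index lists where training always precedes
    testing in time; the window rolls forward by test_size each split."""
    start = 0
    while start + train_size + test_size <= n:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size

for train, test in roll_forward_splits(10, train_size=4, test_size=2):
    print("train:", train, "test:", test)
```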

10 - Metric 2 correlation

Time: 00:02:14
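
The correlation metric can be computed with NumPy's `np.corrcoef`, which returns a 2x2 matrix whose off-diagonal entry is the correlation coefficient (the arrays below are invented):

```python
import numpy as np

y_actual = np.array([1.0, 2.0, 3.0, 4.0])
y_pred_good = np.array([1.1, 1.9, 3.2, 3.9])  # tracks the actual values closely
y_pred_anti = -y_actual                       # perfectly anti-correlated

r_good = np.corrcoef(y_actual, y_pred_good)[0, 1]
r_anti = np.corrcoef(y_actual, y_pred_anti)[0, 1]
print(r_good, r_anti)  # r_good is close to +1, r_anti is -1
```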

11 - Correlation and RMS error

My answer is: when RMS error decreases, the correlation becomes stronger. But the r value itself can move either way as correlation strengthens. E.g.: negative correlation is stronger as r decreases toward -1, while positive correlation is stronger as r increases toward +1.

Time: 00:00:30

12 - Overfitting

In the lecture's graph of error versus degrees of freedom:

Example: parameterized polynomial models where we can add additional factors, like x, x squared, x cubed, x to the fourth, and so on.

As we increase degrees of freedom, our in-sample error keeps decreasing, but our out-of-sample error starts increasing. That divergence is how we define overfitting.

Time: 00:02:14
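
The in-sample half of that picture can be demonstrated with `np.polyfit` on made-up data: with 10 points, a degree-9 polynomial can interpolate them, so in-sample error falls toward zero as the degree grows (out-of-sample error on fresh data would be rising meanwhile):

```python
import numpy as np

def in_sample_rmse(x, y, degree):
    """RMS error of a degree-`degree` polynomial fit, measured on the
    same points it was fit to."""
    coeffs = np.polyfit(x, y, degree)
    return float(np.sqrt(np.mean((y - np.polyval(coeffs, x)) ** 2)))

# 10 hand-made "noisy" sample points, roughly linear.
x = np.linspace(-1.0, 1.0, 10)
y = np.array([-1.0, -0.7, -0.6, -0.3, -0.1, 0.2, 0.3, 0.6, 0.7, 1.1])

errs = {d: in_sample_rmse(x, y, d) for d in (1, 3, 9)}
print(errs)  # errors shrink as degree grows; degree 9 interpolates the points
```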

13 - Overfitting Quiz


Overfitting in KNN: as k increases, what will the in-sample error and out-of-sample error look like?

Overfitting Solution

The answer is b: KNN overfits for small k (at k=1 the model matches the training data exactly), so in-sample error rises as k increases.

Time: 00:00:57
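
A hand-rolled 1-D KNN regression sketch (invented data) shows why small k means overfitting: at k=1 every training point is its own nearest neighbor, so in-sample error is exactly zero, and it grows as k increases:

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k):
    """1-D KNN regression: predict the mean of the k nearest training targets."""
    preds = []
    for xq in np.atleast_1d(x_query):
        idx = np.argsort(np.abs(x_train - xq))[:k]
        preds.append(y_train[idx].mean())
    return np.array(preds)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.5, 1.8, 3.2, 4.1])

def in_sample_rmse(k):
    return float(np.sqrt(np.mean((y - knn_predict(x, y, x, k)) ** 2)))

# k=1: each point is its own nearest neighbor, so in-sample error is 0.
# k=5: every prediction is just the global mean, so in-sample error is large.
print(in_sample_rmse(1), in_sample_rmse(5))
```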

14 - A Few Other Considerations Quiz

Comparing KNN and linear regression: which is better on each of the following criteria?

Solution:

  1. Space to save the model: linear regression needs only a few parameters, but KNN requires all the training data to be saved.
  2. Compute time to train: KNN takes zero time to train. Linear regression has to take all the data and compute over it to find the parameters.
  3. Compute time to query: LinReg wins. KNN requires time to query across all the data.
  4. Ease of adding new data: KNN wins, because all you have to do is drop it in, with no recalculation. With linear regression, you have to add the new data and then recompute the factors.

Time: 00:01:14

Total Time: 00:21:10

2019-01-28 first draft