Data Analysis with Python
Week 5 Quiz Answer
Practice Quiz 1
Model Evaluation
Q1) What is the correct use of the "train_test_split" function such that 90% of the data samples will be used for training, the parameter "random_state" is set to zero, and the input variables for the features and targets are x_data and y_data, respectively?
- train_test_split(x_data, y_data, test_size=0.9, random_state=0)
- train_test_split(x_data, y_data, test_size=0.1, random_state=0)
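To see how "test_size" maps to the training share, here is a minimal sketch using scikit-learn. The arrays are made-up placeholders (100 samples, 2 features), not the course data. Because test_size is the fraction held out for testing, a value of 0.1 leaves 90% of the samples for training.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data for illustration only: 100 samples, 2 features
x_data = np.arange(200).reshape(100, 2)
y_data = np.arange(100)

# test_size=0.1 holds out 10% of the samples for testing,
# so the remaining 90% are used for training
x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.1, random_state=0
)

print(x_train.shape, x_test.shape)
```

With 100 samples this gives 90 training rows and 10 test rows. Fixing random_state makes the shuffle reproducible between runs.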
Q2) What is the problem with in-sample evaluation?
- It does not tell us how well the trained model can be used to predict new data
- It's slow because there is more data
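The limitation of in-sample evaluation can be illustrated with a small experiment. The sketch below is only an illustration under stated assumptions: the data is a synthetic noisy sine curve, and the 10th-order polynomial is an arbitrary choice of a flexible model. It fits on one half of the data and scores the model both on that half and on the half it never saw.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
x_data = rng.uniform(-3, 3, size=(40, 1))
y_data = np.sin(x_data[:, 0]) + rng.normal(0, 0.3, size=40)

x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.5, random_state=0
)

# A deliberately flexible model: 10th-order polynomial regression
model = make_pipeline(PolynomialFeatures(degree=10), LinearRegression())
model.fit(x_train, y_train)

r2_in_sample = model.score(x_train, y_train)  # scored on the data used for fitting
r2_held_out = model.score(x_test, y_test)     # scored on data the model never saw

print(f"in-sample R^2: {r2_in_sample:.3f}")
print(f"held-out R^2:  {r2_held_out:.3f}")
```

The in-sample score only measures how closely the model reproduces the data it was fitted to, so on its own it says little about predictions for new samples.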
Practice Quiz 2
Overfitting, Underfitting and Model Selection
Q1) What model should you select?
- a
- b
- c
Q2) The following is an example of:
- overfitting
- perfect fit
- underfitting
Practice Quiz 3
Practice Quiz 4
- train_test_split(x_data, y_data, test_size=0, random_state=0.4)
- train_test_split(x_data, y_data, test_size=0.4, random_state=0)
- train_test_split(x_data, y_data)
- This function finds the free parameter alpha
- The average R^2 on the test data for each of the two folds
- The predicted values of the test data using cross-validation
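The options above refer to two-fold cross-validation scores and to cross-validated predictions. In scikit-learn these correspond to cross_val_score and cross_val_predict. The following is a minimal sketch, assuming a synthetic linear dataset and a plain LinearRegression estimator (both chosen here for illustration, not taken from the quiz).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict, cross_val_score

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
x_data = rng.uniform(0, 10, size=(60, 1))
y_data = 3.0 * x_data[:, 0] + rng.normal(0, 1.0, size=60)

lr = LinearRegression()

# cv=2 returns one score per fold; for a regressor the default score is R^2
scores = cross_val_score(lr, x_data, y_data, cv=2)
print(scores, scores.mean())

# cross_val_predict returns one out-of-fold prediction per sample instead
yhat = cross_val_predict(lr, x_data, y_data, cv=2)
print(yhat.shape)
```

cross_val_score gives an array of per-fold test scores (average them with .mean()), while cross_val_predict gives predicted values for every sample, each produced by a model that did not see that sample during fitting.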
- RR=LinearRegression(alpha=10)
- RR=Ridge(alpha=10)
- RR=Ridge(alpha=1)
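For context, Ridge is the scikit-learn estimator that accepts a regularization strength alpha; LinearRegression has no such parameter. A minimal sketch follows, assuming a small synthetic dataset (the data and the comparison against ordinary least squares are illustrative additions).

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.1, size=50)

# Ridge regression object with the regularization parameter set to 10
RR = Ridge(alpha=10)
RR.fit(X, y)

# Ordinary least squares for comparison (no regularization)
ols = LinearRegression().fit(X, y)

# The L2 penalty shrinks the coefficient vector relative to least squares
print(np.linalg.norm(RR.coef_), np.linalg.norm(ols.coef_))
```

Larger values of alpha shrink the coefficients more strongly.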
- alpha=[1, 10, 100]
- [{'alpha': [1,10,100]}]
- [{'alpha': [0.001, 0.1, 1, 10, 100, 1000, 10000, 100000, 100000], 'normalize': [True, False]}]
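A parameter grid in the list-of-dictionaries form can be passed directly to GridSearchCV. The sketch below assumes a Ridge estimator, synthetic data, and cv=4; all three are illustrative choices, not part of the quiz. It leaves out the 'normalize' key because recent scikit-learn releases no longer accept that argument on Ridge.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.5, size=80)

# Only these three alpha values are tried; no other parameter is varied
parameters = [{'alpha': [1, 10, 100]}]

grid = GridSearchCV(Ridge(), parameters, cv=4)
grid.fit(X, y)

print(grid.best_params_)                    # the alpha with the best mean CV score
print(grid.cv_results_['mean_test_score'])  # one mean score per candidate alpha
```

Each key in the dictionary names a constructor parameter of the estimator, and each value lists the candidates to try for it.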
- You should always use the simplest model
- 100-th order polynomial will work better on unseen data
- The results on your training data are not the best indicator of how your model performs; you should use your test data to get a better idea
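One way to act on this is to compare candidate polynomial orders by their score on held-out data. The sketch below makes several assumptions for illustration: a synthetic cubic dataset, three arbitrary candidate orders, and a single train/test split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): a noisy cubic
rng = np.random.default_rng(1)
x_data = rng.uniform(-2, 2, size=(60, 1))
y_data = x_data[:, 0] ** 3 - 2 * x_data[:, 0] + rng.normal(0, 0.5, size=60)

x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.5, random_state=0
)

# Record (training R^2, test R^2) for each candidate polynomial order
results = {}
for order in (1, 3, 10):
    model = make_pipeline(PolynomialFeatures(degree=order), LinearRegression())
    model.fit(x_train, y_train)
    results[order] = (model.score(x_train, y_train), model.score(x_test, y_test))
    print(order, results[order])

# Select the order by test R^2, not by training R^2
best_order = max(results, key=lambda k: results[k][1])
print("order with the best test R^2:", best_order)
```

Training R^2 can only stay the same or rise as the order increases, because each higher-order model contains the lower-order ones. That is why it cannot be used on its own to choose between them.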
- 3
- 4
- 1
- Overfitting
- Perfect fit
- Underfitting