gapkim/cystfibr_dataset

Comparison of Model Selection Using PCR, PLSR, Best subsets, Ridge Regression and Lasso Regression on cystfibr dataset

The performance of five model selection methods, principal component regression (PCR), partial least squares regression (PLSR), best subsets, ridge regression and lasso regression, was compared using the ‘cystfibr’ dataset from the ‘ISwR’ library. Monte Carlo cross-validation with a sampling size of 100 was used to determine the optimal model for regressing maximum expiratory pressure (pemax) on 9 predictors. First, a detailed analysis was performed for PCR and PLSR. Then, the five methods were compared in terms of test MSE and predictor selection. PCR, PLSR and best subsets selection had the lowest test MSE. A spectral analysis was used to determine which predictors contributed positively and which negatively to ‘pemax’.
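
As a rough illustration of the Monte Carlo cross-validation step for PCR and PLSR, the following is a minimal sketch and not the code in GitHub_Proj6.R. It assumes that the "sampling size of 100" means 100 random train/test splits. The 80/20 split ratio, the seed, the use of the ‘pls’ package and all variable names are assumptions made for this example.

```r
library(ISwR)  # provides the cystfibr data frame
library(pls)   # provides pcr() and plsr(); package choice is an assumption

set.seed(1)                      # assumed seed, for reproducibility only
n_splits   <- 100                # number of random splits (assumed meaning of "sampling size")
train_frac <- 0.8                # assumed train/test ratio
n          <- nrow(cystfibr)
max_comp   <- ncol(cystfibr) - 1 # 9 predictors, so at most 9 components

# Test MSE for every split, number of components, and method
mse <- array(NA_real_,
             dim = c(n_splits, max_comp, 2),
             dimnames = list(NULL, paste0("ncomp", 1:max_comp), c("PCR", "PLSR")))

for (i in seq_len(n_splits)) {
  # Draw a fresh random training set; the remaining rows form the test set
  idx   <- sample(n, size = floor(train_frac * n))
  train <- cystfibr[idx, ]
  test  <- cystfibr[-idx, ]

  fit_pcr <- pcr(pemax ~ ., data = train, scale = TRUE, ncomp = max_comp)
  fit_pls <- plsr(pemax ~ ., data = train, scale = TRUE, ncomp = max_comp)

  for (k in seq_len(max_comp)) {
    pred_pcr <- drop(predict(fit_pcr, newdata = test, ncomp = k))
    pred_pls <- drop(predict(fit_pls, newdata = test, ncomp = k))
    mse[i, k, "PCR"]  <- mean((test$pemax - pred_pcr)^2)
    mse[i, k, "PLSR"] <- mean((test$pemax - pred_pls)^2)
  }
}

# Average test MSE over the splits, per number of components and method
avg_mse <- apply(mse, c(2, 3), mean)
avg_mse
```

The number of components with the lowest average test MSE in each column of `avg_mse` would be the one selected for that method under these assumptions. The other three methods could be added to the same loop so that all five are scored on identical splits.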

File Description

GitHub_Proj6.pdf: Project report in PDF
GitHub_Proj6.R: R script

You can view the Project Report in HTML by clicking here.