CPU Single threaded performance #22

Open
szilard opened this issue May 11, 2019 · 8 comments

@szilard
Owner

szilard commented May 11, 2019

This might be relevant when training lots of models (100s, 1000s...) on smaller data: running them in parallel, 1 model per CPU core, would probably be the most efficient, provided all the datasets (or, if training on the same data, multiple copies of the data) fit in RAM.
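A minimal sketch of the "1 model per core" idea, assuming a Python driver. `train_one` is a hypothetical stand-in for fitting one single-threaded model (it just burns CPU deterministically), and halving `os.cpu_count()` to approximate physical cores is an assumption that only holds when hyperthreading is on. In a real run one would also limit each library to a single thread (e.g. XGBoost's `nthread` or LightGBM's `num_threads` parameter).

```python
import os
from concurrent.futures import ProcessPoolExecutor


def train_one(task_id):
    """Hypothetical stand-in for training one single-threaded model."""
    total = 0.0
    for i in range(1, 50_000):  # deterministic CPU-bound busy work
        total += (task_id + 1) / i
    return task_id, total


def train_many(n_models, n_workers):
    """Run n_models independent 'trainings', at most n_workers at a time."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return dict(pool.map(train_one, range(n_models)))


if __name__ == "__main__":
    # Assumption: 2 hardware threads per physical core.
    physical_cores = max(1, (os.cpu_count() or 2) // 2)
    results = train_many(n_models=8, n_workers=physical_cores)
    print(f"trained {len(results)} models on {physical_cores} workers")
```

Separate processes are used rather than threads so that each model gets its own core and its own copy of the data, which matches the RAM caveat above.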

@szilard
Owner Author

szilard commented May 11, 2019

using c5.xlarge to get a higher-frequency CPU (4 vCPUs, so 2 physical cores, leaving some resources to the EC2 hypervisor if needed; also 8GB RAM)

c5.18xlarge (72 vCPUs, 144GB RAM) could run 36 such models in parallel on physical cores if data+train does not use more than 4GB/run; one could also test running 72 models in parallel (while measuring the effect of hyperthreading on speed/throughput) if data+train can be confined to 2GB/run.
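The capacity arithmetic above can be sketched as follows. The function name and its parameters are hypothetical; the only inputs taken from the thread are the instance sizes and the per-run memory budgets, and the 2-threads-per-core default is an assumption about hyperthreading.

```python
def max_parallel_models(vcpus, ram_gb, ram_per_run_gb,
                        use_hyperthreads=False, threads_per_core=2):
    """Upper bound on concurrent single-threaded runs (hypothetical helper).

    The bound is whichever runs out first: CPU slots or RAM slots.
    """
    cpu_slots = vcpus if use_hyperthreads else vcpus // threads_per_core
    ram_slots = int(ram_gb // ram_per_run_gb)
    return min(cpu_slots, ram_slots)


# c5.18xlarge: 72 vCPUs, 144GB RAM
print(max_parallel_models(72, 144, ram_per_run_gb=4))
print(max_parallel_models(72, 144, ram_per_run_gb=2, use_hyperthreads=True))
```

With 4GB/run both limits give 36; with 2GB/run and hyperthreads counted as slots both give 72, which is why the two memory budgets are paired with those two model counts.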

@szilard
Owner Author

szilard commented May 11, 2019

@szilard
Owner Author

szilard commented May 11, 2019

0.1m (time [s], AUC):
h2o 24.46 0.702228
xgboost 3.823 0.7324224
lightgbm 3.816 0.7298355
catboost 27.113 0.7225903
Rgbm 13.661 0.7190915
1m (time [s], AUC):
h2o 128.121 0.7623496
xgboost 26.692 0.7494959
lightgbm 20.393 0.7636987
catboost 273.306 0.7402029
Rgbm 233.34 0.7373496

RAM usage 1M:
h2o 0.6GB
xgb 1.0GB
lgbm 1.0GB
catboost 2.0GB
Rgbm 0.8GB

@szilard
Owner Author

szilard commented May 11, 2019

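| | 0.1m: time [s] | 0.1m: AUC | 1m: time [s] | 1m: AUC |
|---|---|---|---|---|
| h2o | 24.5 | 0.702 | 128.1 | 0.762 |
| xgboost | 3.8 | 0.732 | 26.7 | 0.749 |
| lightgbm | 3.8 | 0.730 | 20.4 | 0.764 |
| catboost | 27.1 | 0.723 | 273.3 | 0.740 |
| Rgbm | 13.7 | 0.719 | 233.3 | 0.737 |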

@szilard
Owner Author

szilard commented May 11, 2019

combined with previous results on c5.9xlarge (18 threads) from #13 (comment):

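c5.xlarge, 1 thread (time in seconds; last column is the 1m/0.1m time ratio):

| | 0.1m | 1m | 0.1->1m |
|---|---|---|---|
| h2o | 24.5 | 128.1 | 5.2 |
| xgboost | 3.8 | 26.7 | 7.0 |
| lightgbm | 3.8 | 20.4 | 5.3 |
| catboost | 27.1 | 273.3 | 10.1 |
| Rgbm | 13.7 | 233.3 | 17.1 |

c5.9xlarge, 18 threads (time in seconds; last column is the 1m/0.1m time ratio):

| | 0.1m | 1m | 0.1->1m |
|---|---|---|---|
| h2o | 8.7 | 14.1 | 1.6 |
| xgboost | 3.2 | 10.8 | 3.3 |
| lightgbm | 2.0 | 4.3 | 2.2 |
| catboost | 4.6 | 33.9 | 7.4 |

1->18 threads (speedup, i.e. 1-thread time / 18-thread time):

| | 0.1m | 1m |
|---|---|---|
| h2o | 2.8 | 9.1 |
| xgboost | 1.2 | 2.5 |
| lightgbm | 1.9 | 4.7 |
| catboost | 5.9 | 8.1 |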
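For reference, the ratio columns can be reproduced with a short script. This is a sketch: the dictionary layout and function names are illustrative, the 1-thread timings are the unrounded values posted earlier in this thread, and the 18-thread timings are the rounded values quoted from #13, so ratios recomputed from rounded inputs may differ in the last digit.

```python
# (time at 0.1m rows, time at 1m rows), in seconds
single_thread = {          # c5.xlarge, 1 thread (unrounded values from this thread)
    "h2o": (24.46, 128.121),
    "xgboost": (3.823, 26.692),
    "lightgbm": (3.816, 20.393),
    "catboost": (27.113, 273.306),
    "Rgbm": (13.661, 233.34),
}
multi_thread = {           # c5.9xlarge, 18 threads (rounded values from #13)
    "h2o": (8.7, 14.1),
    "xgboost": (3.2, 10.8),
    "lightgbm": (2.0, 4.3),
    "catboost": (4.6, 33.9),
}


def size_scaling(times):
    """How much slower training gets when the data grows 10x (0.1m -> 1m)."""
    return {lib: round(t_1m / t_01m, 1) for lib, (t_01m, t_1m) in times.items()}


def thread_speedup(single, multi):
    """1-thread time divided by 18-thread time, per dataset size."""
    return {
        lib: (round(single[lib][0] / multi[lib][0], 1),
              round(single[lib][1] / multi[lib][1], 1))
        for lib in multi
    }


print(size_scaling(single_thread))
print(thread_speedup(single_thread, multi_thread))
```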

@szilard
Owner Author

szilard commented May 11, 2019

numbers inside the red area are time in seconds, outside are ratios:

[Image: Screen Shot 2019-05-11 at 7 08 49 AM]

@Laurae2

Laurae2 commented May 11, 2019

Hardware/Software: #12

hist xgboost, 1 model:

| Size | Time (1T) | Time (9T) | Time (18T) | Time (36T) | Time (70T) |
|---|---|---|---|---|---|
| 0.1M | 3.062 | 2.735 | 3.407 | 10.515 | 56.710 |
| 1M | 23.293 | 12.524 | 12.929 | 25.980 | 96.465 |
| 10M | 220.200 | 86.092 | 70.121 | 106.479 | 271.683 |
| 100M | 2373.128 | 858.772 | 675.223 | 756.661 | 1142.271 |

LightGBM, 1 model:

| Size | Time (1T) | Time (9T) | Time (18T) | Time (36T) | Time (70T) |
|---|---|---|---|---|---|
| 0.1M | 2.983 | 1.801 | 2.389 | 3.266 | 5.943 |
| 1M | 15.919 | 5.363 | 4.891 | 5.568 | 10.487 |
| 10M | 180.748 | 53.689 | 48.260 | 47.033 | 53.234 |
| 100M | 1930.816 | 578.734 | 560.296 | 507.580 | 507.627 |
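One way to read the two tables above is to pick, for each data size, the thread count with the lowest time. A small sketch of that lookup follows; the variable and function names are illustrative, and the LightGBM 0.1M/9T entry is taken as 1.801 on the assumption that the doubled dot in the posted value is a typo.

```python
THREADS = (1, 9, 18, 36, 70)

# Times copied from the two tables above (same units as posted).
xgboost_hist = {
    "0.1M": (3.062, 2.735, 3.407, 10.515, 56.710),
    "1M": (23.293, 12.524, 12.929, 25.980, 96.465),
    "10M": (220.200, 86.092, 70.121, 106.479, 271.683),
    "100M": (2373.128, 858.772, 675.223, 756.661, 1142.271),
}
lightgbm = {
    "0.1M": (2.983, 1.801, 2.389, 3.266, 5.943),  # 1.801: assumed typo fix
    "1M": (15.919, 5.363, 4.891, 5.568, 10.487),
    "10M": (180.748, 53.689, 48.260, 47.033, 53.234),
    "100M": (1930.816, 578.734, 560.296, 507.580, 507.627),
}


def fastest_thread_count(timings):
    """Map each data size to the thread count with the lowest time."""
    return {
        size: THREADS[min(range(len(row)), key=row.__getitem__)]
        for size, row in timings.items()
    }


print(fastest_thread_count(xgboost_hist))
print(fastest_thread_count(lightgbm))
```

In these numbers the fastest setting moves to higher thread counts as the data grows, and on the smallest data more threads beyond 9 only slow a single model down, which is consistent with the motivation for running one model per core.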

@szilard
Owner Author

szilard commented May 13, 2019

Concurrent usage (training many models on the same hardware at the same time to see e.g. throughput etc.) will be studied in this repo by @Laurae2 (with some of my involvement) here: Laurae2/ml-perf#3

@szilard szilard changed the title Single threaded performance CPU Single threaded performance May 25, 2019