
Document other tutorials and resources for background / more info #11

Open
mghpcsim opened this issue Jun 22, 2020 · 4 comments
Labels
help wanted (Extra attention is needed), question (Further information is requested)

Comments

@mghpcsim
Contributor

On the tutorial website, there will be a resources page to provide links to other tutorial content we've done and more info on specific topics as needed.

This ticket is a catchall for those things.

This is where to put everything David has been sending to the list, for example.

@mghpcsim added the question label Jun 22, 2020
@mghpcsim added this to the PEARC Tutorial milestone Jun 22, 2020
@koomie
Contributor

koomie commented Jun 23, 2020

@mghpcsim
Contributor Author

Karl added the PEARC 17 and 19 tutorials to the GH pages site.

Others have material as well, especially on containers; we need to figure out how best to incorporate it into the GH pages site à la PEARC 17 and 19, or just add a single resources / link collation page.

@mghpcsim added the help wanted label Jul 13, 2020
@DavidBrayford
Contributor

DavidBrayford commented Jul 13, 2020

For exercise 5 we can use this tutorial as a template: https://github.com/DavidBrayford/HPAI/blob/master/tutorial/Intel_HPC_DevCon

I've also uploaded the Charliecloud container to my OneDrive account and shared a link with Chris S.

I can rewrite the recipe for PEARC20.

Charliecloud execution line for PEARC20:

No MPI:

```
ch-run -w ./pearc_tutorial_image -- python /tensorflow/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model alexnet --batch_size 128 --data_format NCHW --num_batches 100 --distortions=False --mkl=True --local_parameter_device cpu --num_warmup_batches 10 --optimizer rmsprop --display_every 10 --variable_update horovod --horovod_device cpu --num_intra_threads 8 --kmp_blocktime 1 --num_inter_threads 2
```

Using MPI inside container:

```
mpiexec -n 2 ch-run -w ./pearc_tutorial_image -- python /tensorflow/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model alexnet --batch_size 128 --data_format NCHW --num_batches 100 --distortions=False --mkl=True --local_parameter_device cpu --num_warmup_batches 10 --optimizer rmsprop --display_every 10 --variable_update horovod --horovod_device cpu --num_intra_threads 8 --kmp_blocktime 1 --num_inter_threads 2
```

This uses the MPI libraries installed within the container; be sure that the MPI version inside the container is compatible with the system version of MPI.
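As a quick sanity check (a minimal sketch, assuming the same image name as above and that both launchers support `--version`, as MPICH and Open MPI do), compare the MPI version reported on the host with the one bundled inside the image:

```
# Host MPI version (after e.g. `module load mpich.mpi`)
mpiexec --version

# MPI version installed inside the container image
ch-run ./pearc_tutorial_image -- mpiexec --version
```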

Using system MPI from within the container:

```
mpiexec -n 2 ch-run -b /where/MPI/modules/are:/where/MPI/modules/are -w ./pearc_tutorial_image -- python /tensorflow/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model alexnet --batch_size 128 --data_format NCHW --num_batches 100 --distortions=False --mkl=True --local_parameter_device cpu --num_warmup_batches 10 --optimizer rmsprop --display_every 10 --variable_update horovod --horovod_device cpu --num_intra_threads 8 --kmp_blocktime 1 --num_inter_threads 2
```

This uses the runtime MPI libraries on the system (for example, module load mpich.mpi) by importing the host environment (PATH, etc.) and mapping the host directories to equivalent directories inside the container. This improves stability, scalability, and performance on large HPC systems with a tuned MPI setup.
Ideally you want to build your application inside the container with the same vendor and version of MPI as used on the system. For example, if the host system has been optimized to use Intel MPI, install Intel MPI inside the container and build the containerized application with Intel MPI.
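For reference, a rough batch-script sketch tying this together (the Slurm directives, the module name `mpich.mpi`, and the bind-mount path `/opt/mpich` are assumptions for illustration only; adjust them to the host system and to wherever the host MPI actually lives):

```
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1

# Load the host MPI stack (hypothetical module name; it should match the
# MPI family the container image was built against)
module load mpich.mpi

# Launch the containerized benchmark with the host mpiexec, bind-mounting
# the host MPI installation into the image at the same path (hypothetical path).
# Only a subset of the tf_cnn_benchmarks flags from the full command above is shown.
mpiexec -n ${SLURM_NTASKS} ch-run \
    -b /opt/mpich:/opt/mpich \
    -w ./pearc_tutorial_image -- \
    python /tensorflow/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py \
        --model alexnet --batch_size 128 --variable_update horovod --horovod_device cpu
```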

@bkmgit

bkmgit commented Apr 12, 2021

It may be helpful to have some of this material in HPC Carpentry.
