Heuristics for estimating job run times #67

Open
espenhgn opened this issue Apr 12, 2021 · 2 comments

@espenhgn
Collaborator

espenhgn commented Apr 12, 2021

With all the NEST benchmarking efforts going on, do you have (or plan) a model for estimating wall times, i.e., some function returning wt = f(N_pops, K_conns, N_mpi, N_proc, t_sim, ...)?

I'm thinking of something for the LFP predictions, which may have to be broken down into parallel jobs in the macaque V1 case.

@jhnnsnk
Collaborator

jhnnsnk commented Apr 12, 2021

Such a function would not be trivial, since these parameters depend strongly on the machine. For network construction, the specific connection routines differ greatly in their wall-clock times. So far, we do not have such general estimates, only per-model examples to compare with.

@espenhgn
Collaborator Author

Let's not overcomplicate this. For now, all simulations run on one HPC resource (well, perhaps also JUSUF, which is similar anyway). We also only need to consider one connection routine (distance-dependent), which is already a worst-case scenario. For the sake of brevity, let N_mpi and N_proc be constants. What I had in mind for the network is that create times would be approximately linearly dependent on N_pops.sum(), connect times would perhaps follow a power law, i.e., be proportional to K_conns^c, and run times would again be linearly dependent on t_sim * N_pops.sum(), etc.

wt = a * N_pops.sum() + b * (K_conns^c).sum() + d * t_sim * N_pops.sum() + ..
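
As a minimal sketch, the three terms written out so far could be evaluated like this in plain Python. The function name `estimate_wall_time` and its argument layout are hypothetical, the coefficients a, b, c, d are placeholders that would have to be fitted per machine, and any further terms hidden behind the trailing dots are left out.

```python
def estimate_wall_time(n_pops, k_conns, t_sim, a, b, c, d):
    """Evaluate wt = a*sum(N_pops) + b*sum(K_conns^c) + d*t_sim*sum(N_pops).

    n_pops  : population sizes, one entry per population
    k_conns : connection counts, one entry per projection
    t_sim   : simulated (biological) time
    a, b, c, d : machine-specific coefficients (hypothetical, to be fitted)
    """
    n_total = sum(n_pops)
    t_create = a * n_total                        # linear in the neuron count
    t_connect = b * sum(k ** c for k in k_conns)  # power law in the connections
    t_run = d * t_sim * n_total                   # linear in t_sim * neuron count
    return t_create + t_connect + t_run
```

With N_mpi and N_proc held constant as suggested, they do not appear as arguments; a different job layout would need its own set of coefficients.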

As you have set up different network sizes (mesocircuit etc.), could we not estimate expected wall times a priori if these coefficients can be fitted? If not, what strategy do you have for setting wall times, which for now are hardcoded in the base_system_params file?
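
To illustrate what fitting the coefficients could look like, here is a rough sketch under a few assumptions: measured wall times exist for several network sizes, N_mpi and N_proc are fixed, and the exponent c is picked from a small grid so that a, b, d follow from ordinary linear least squares. The names `fit_coefficients`, `solve3`, `runs` and `c_grid` are made up for this example. In practice `numpy.linalg.lstsq` or `scipy.optimize.curve_fit` would probably be the more natural tools.

```python
def solve3(m, v):
    """Solve the 3x3 system m x = v by Gaussian elimination with pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (aug[i][3] - s) / aug[i][i]
    return x


def fit_coefficients(runs, c_grid):
    """Fit (a, b, c, d) to measured wall times.

    runs   : list of (n_pops, k_conns, t_sim, measured_wall_time) tuples
    c_grid : candidate exponents c; for each one, a, b, d are obtained by
             linear least squares and the c with the smallest error is kept
    """
    best = None
    for c in c_grid:
        feats, targets = [], []
        for n_pops, k_conns, t_sim, wt in runs:
            n_total = sum(n_pops)
            feats.append([n_total,
                          sum(k ** c for k in k_conns),
                          t_sim * n_total])
            targets.append(wt)
        # Normal equations (A^T A) x = A^T y. Fine for a sketch, but the
        # features should be rescaled if their magnitudes differ a lot.
        ata = [[sum(f[i] * f[j] for f in feats) for j in range(3)]
               for i in range(3)]
        aty = [sum(f[i] * y for f, y in zip(feats, targets))
               for i in range(3)]
        a, b, d = solve3(ata, aty)
        sse = sum((a * f[0] + b * f[1] + d * f[2] - y) ** 2
                  for f, y in zip(feats, targets))
        if best is None or sse < best[0]:
            best = (sse, a, b, c, d)
    return best[1:]
```

At least three runs with independently varying network size, connectivity and t_sim are needed, otherwise the system is singular. The fitted values only hold for the machine and job layout the runs were measured on, so a safety margin on top of the prediction would still be sensible.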
