Investigate ways to speed up venv creation #317

Open
mdboom opened this issue Dec 11, 2024 · 9 comments

Comments

@mdboom
Contributor

mdboom commented Dec 11, 2024

According to the log timestamps, benchmark runs currently spend 3 minutes creating virtual environments.

We might be able to reduce that by caching the results of the SAT solver and reusing them (they should only change when pyperformance/python_macrobenchmarks are updated).
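As a rough illustration of the caching idea, one option is to key the cache on a hash of the requirements files, so cached results are reused until those files change. This is only a sketch: `requirements_cache_key` and its parameters are hypothetical names, not part of pyperformance, and it assumes the solver's inputs are fully captured by the listed files plus any extra strings passed in (for example the Python version, since resolution can depend on it).

```python
import hashlib
from pathlib import Path


def requirements_cache_key(requirement_files, extra=()):
    """Return a hex digest that changes whenever any input file changes.

    Hypothetical helper: `requirement_files` is an iterable of paths and
    `extra` is an iterable of strings (e.g. a Python version) that should
    also invalidate the cache when they change.
    """
    digest = hashlib.sha256()
    # Sort so the key does not depend on the order the files were listed in.
    for path in sorted(Path(p) for p in requirement_files):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    for item in extra:
        digest.update(item.encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()
```

A cached lock file or venv stored under this key could then be reused until pyperformance or python_macrobenchmarks change their requirements.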

An earlier experiment with uv was about 2x faster (from memory), but that adds another dependency.

Cc: @mpage, @colesbury

@mpage
Contributor

mpage commented Dec 11, 2024

> benchmark runs currently spend 3 minutes creating virtual environments.

Is that 3 minutes per benchmark run or 3 minutes in aggregate?

@mdboom
Contributor Author

mdboom commented Dec 11, 2024

> Is that 3 minutes per benchmark run or 3 minutes in aggregate?

In aggregate.

@mdboom
Contributor Author

mdboom commented Dec 16, 2024

For anyone seeing significantly longer venv creation times than ~3 minutes, @Fidget-Spinner had success by upgrading pip to a more recent version: python/pyperformance#375 (comment)

@Fidget-Spinner
Contributor

I upgraded pip, but it only reduced the time by half, so it's still somewhere around 15-30 minutes. I know someone else who just set up pyperformance and is seeing the same thing. So maybe this really is worth implementing for machines that are somehow slower.

@mdboom
Contributor Author

mdboom commented Jan 8, 2025

Isn't it the case that if you are running locally, the venvs are reused between runs, so this cost is only paid once?

For GitHub Actions (clean VMs), of course we pay it every time, but there the 3 min cost seems negligible.

Or am I missing something?

@Fidget-Spinner
Contributor

> but there the 3 min cost seems negligible

Yeah that's the strange part. For my machine and the other person I mentioned, it takes longer than that. Something like 30 minutes to set up venvs in total. I upgraded pip to the latest version.

@mdboom
Contributor Author

mdboom commented Jan 8, 2025

If this was run in GitHub Actions, the log should have a timestamp for each line, so maybe that would offer some clues as to where the time is going.
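For anyone who does have such a log, here is a minimal sketch of how the timestamps could be used to find the slowest steps. It assumes each raw log line starts with an ISO 8601 UTC timestamp such as `2024-12-11T10:00:00.0000000Z`; that format is an assumption, so adjust the pattern if your logs differ. `parse_line` and `slowest_gaps` are hypothetical helper names.

```python
import re
from datetime import datetime, timezone

# Assumed line shape: "<ISO 8601 timestamp>Z <message>".
_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z\s?(.*)$"
)


def parse_line(line):
    """Return (timestamp, message), or None if the line has no timestamp."""
    match = _LINE.match(line)
    if match is None:
        return None
    base, fraction, message = match.groups()
    stamp = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    # Keep at most microsecond precision from the fractional seconds.
    micro = int((fraction or "0")[:6].ljust(6, "0"))
    return stamp.replace(microsecond=micro, tzinfo=timezone.utc), message


def slowest_gaps(lines, top=5):
    """Return the `top` largest (seconds, message) gaps between lines.

    Each gap is attributed to the earlier line, i.e. the step that was
    running while nothing else was logged.
    """
    parsed = [p for p in map(parse_line, lines) if p is not None]
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1])
        for earlier, later in zip(parsed, parsed[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

Feeding it the lines of a downloaded log, e.g. `slowest_gaps(open("log.txt").read().splitlines())`, would list the messages followed by the longest pauses.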

@Fidget-Spinner
Contributor

This was a manual pyperformance run, sadly, so I don't have timestamps anymore, sorry! However, I ran the benchmarks manually on my machine and watched them, and the venv creation was definitely way over 3 minutes.

@mdboom
Contributor Author

mdboom commented Jan 8, 2025

Interesting. My main objection to the approach was creating the venvs in parallel, since I don't think that's guaranteed to produce the same results each time, which is bad for reproducibility.

It might be fruitful to experiment with using uv (but still creating the venvs in order) rather than pip to see if that helps. (An alias from pip install to uv pip install may be enough.) Even if we don't end up using that on GitHub Actions (where installing uv negates the benefit a bit), having that as an option might help people running pyperformance locally who are hitting this issue.

Though I suspect there's just some environmental difference (maybe file system setup) that makes it slower in some cases -- but that's just a hunch.
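To make the uv experiment concrete, here is a minimal sketch of how the install command could be swapped while keeping everything else (including the order of venv creation) unchanged. `install_command` is a hypothetical helper, not an existing pyperformance function, and it assumes the packages come from a requirements file; it only builds the command and does not run it.

```python
import shutil


def install_command(python, requirements, uv_path=None):
    """Build the argv for installing `requirements` into `python`'s venv.

    Hypothetical helper: uses `uv pip install` when `uv_path` is given,
    otherwise falls back to the venv's own pip.
    """
    if uv_path:
        # `--python` tells uv which interpreter/environment to install into.
        return [uv_path, "pip", "install", "--python", python, "-r", requirements]
    return [python, "-m", "pip", "install", "-r", requirements]


if __name__ == "__main__":
    # Use uv only if it happens to be on PATH; otherwise keep using pip.
    print(install_command("venv/bin/python", "requirements.txt",
                          uv_path=shutil.which("uv")))
```

Keeping pip as the fallback means uv stays optional rather than becoming a hard dependency.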
