Backup-restore throughput is awful #333
Comments
Yes, this is a known problem with Tarsnap. The extract performance is currently latency-bound by the network connection to S3. @cperciva has a plan for addressing it, but at the moment we are not announcing any estimated time for this improvement. This can be mitigated somewhat by doing parallel extracts:
Thanks, the context helps a little. It's unfortunate the throughput is so limited in the product tool. I've gone ahead and forked the Edit: with about 500 workers it seems I can pull 80-140 Mbps out of EC2 even with relatively small files.
Sent my redsnapper changes upstream in case they're helpful to others: https://github.com/directededge/redsnapper/pulls . I suggest Edit: my fork is here: https://github.com/cemeyer/redsnapper
I don't think
Well, I'd appreciate it if you'd solve the restore performance problem in the EC2 has big pipes, S3 has big pipes, and I have a fast computer and fast internet connection, and even with 100+ parallel jobs, redsnapper+tarsnap can only use about 16% of my available internet bandwidth.
Any update about whether this issue will be addressed in tarsnap?
At the moment we can't provide an estimated time of completion, sorry.
Some reasons for moving on:
- Tarsnap/tarsnap#333
- https://www.tarsnap.com/faq.html#out-of-money

DM me recommendations for good cmd line backup
@cperciva and @gperciva. Any progress on improving tarsnap backup-restore throughput, as per this ticket from Nov 2018? I thought I'd ask, as I've been evaluating tarsnap for production use. The design philosophy is great, the cli tool is nice to use, but I keep seeing recent concerns re restore speed:
As per the Tarsnap "improve speed" tips, I've been experimenting with restoring named files in parallel, using this gist as a basis. I am currently running tests from my UK computer, but will also try from a US cloud host to be closer to the data. Many thanks, Greg
Brief update re Tarsnap performance test. Restore performance for 17 git repos totalling 2.7G, using 100 tarsnap clients in parallel with xargs (as per #333 (comment)):
That's a 6.5x speed up.
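For anyone who wants to try the parallel restore with xargs discussed above, here is a minimal sketch. It is not the gist mentioned in the thread and not a maintainer-provided script: the function name `parallel_restore`, the `TARSNAP` override variable, and the defaults of 100 jobs and 20 paths per job are assumptions made for illustration. It uses `tarsnap -tf` to list archive members and `tarsnap -xf` with path arguments to extract them, and it needs an xargs with `-0` and `-P` (GNU and BSD xargs both have them).

```shell
#!/bin/sh
# Sketch only: restore one archive using many tarsnap processes at once.
# Hypothetical names: parallel_restore, TARSNAP (lets you point at another binary).

parallel_restore() {
    archive=$1
    jobs=${2:-100}    # number of tarsnap processes running concurrently
    batch=${3:-20}    # number of paths handed to each tarsnap process

    # 1. List the archive members, one per line.
    # 2. Drop directory entries (assumed to be listed with a trailing "/"),
    #    so a directory is not extracted again together with all its files.
    # 3. NUL-separate the names so spaces in paths survive xargs.
    # 4. Run "tarsnap -xf ARCHIVE path..." in parallel batches.
    # Caveats: names containing newlines break this, and tarsnap may treat
    # the path arguments as patterns, so names with glob characters need care.
    "${TARSNAP:-tarsnap}" -tf "$archive" \
        | grep -v '/$' \
        | tr '\n' '\0' \
        | xargs -0 -P "$jobs" -n "$batch" "${TARSNAP:-tarsnap}" -xf "$archive"
}
```

Run from the directory you want to restore into, for example `parallel_restore my-archive 100 20`. Each process is still bound by per-request latency; any gain comes from overlapping many requests, which is why the job count matters more than the batch size.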
I have a gigabit internet connection, a fast multi-core CPU, and a fast local NVMe disk.
Yet
tarsnap -rf xxx | pv | tar -xf -
shows tarsnap is only able to download around 50 kB/s. What gives? At this rate it'll take 6 days to download a 26 GB backup set.
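The 6-day figure can be checked with a few lines of shell arithmetic. This is only a sketch: the function name `estimate_days` is made up for illustration, it assumes decimal units (1 GB = 1,000,000 kB), and it truncates to whole days.

```shell
#!/bin/sh
# Sketch: whole days needed to download SIZE_GB gigabytes at RATE_KB kB/s.
# Assumes decimal units: 1 GB = 1,000,000 kB. Integer division truncates.

estimate_days() {
    size_gb=$1
    rate_kb=$2
    seconds=$(( size_gb * 1000000 / rate_kb ))   # total transfer time in seconds
    echo $(( seconds / 86400 ))                  # 86400 seconds per day
}

# 26 GB at 50 kB/s is 520,000 seconds, just over 6 days.
estimate_days 26 50
```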