Test performance metrics #10

Open
cecilia-donnelly opened this issue Apr 13, 2023 · 2 comments

Comments

cecilia-donnelly commented Apr 13, 2023

We want to get baseline metrics for fastest possible performance, so we will set up an instance in AWS' us-west-2 region and run the following tests:

  • Large numbers of files (e.g. 100, 500, 1000)
  • Large files (greater than 400 MB)
  • Large numbers of large files (e.g. more than 8 GB at a time)
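For reproducibility, here is a minimal sketch of one way such test sets could be generated and pushed over an rclone SFTP remote. The remote name `permanent-dev` and the target path are placeholders (not the actual sftp-qa configuration), and rclone is assumed to be installed and configured.

```python
#!/usr/bin/env python3
"""Sketch: generate a directory of test files and upload it with rclone."""
import os
import subprocess
from pathlib import Path


def make_test_files(directory: Path, count: int, size_bytes: int) -> None:
    """Create `count` files of `size_bytes` random bytes each."""
    directory.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        (directory / f"file_{i:05d}.bin").write_bytes(os.urandom(size_bytes))


def upload(directory: Path, remote: str) -> None:
    """Copy the directory to the remote and stream rclone's progress output."""
    subprocess.run(["rclone", "copy", str(directory), remote, "--progress"], check=True)


if __name__ == "__main__":
    # "Large numbers of files" case: 1000 x 1 MB (placeholder remote/path)
    test_dir = Path("/tmp/load-test-1000x1MB")
    make_test_files(test_dir, count=1000, size_bytes=1_000_000)
    upload(test_dir, "permanent-dev:/archives/load-test")
```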
nfebe commented Nov 6, 2023

Load Testing Results Report

Some load testing related to these tests was carried out using https://github.com/PermanentOrg/sftp-qa and https://github.com/PermanentOrg/sftp-qa-iac

Date: 10.05.2023
Tested System: Ubuntu 22.0

Summary

The load testing session ran the variety-uploads test with five Ubuntu machines uploading simultaneously to the same account.

Key Findings

  1. CPU Usage: While the five machines were uploading, CPU usage consistently reached its maximum capacity, averaging between 95% and 100% (a rough sampling sketch follows this list).
  2. Denial of Service (DOS) Issues: There is noticeable lag on the "frontend", as pages take a long time to load while waiting for responses from the backend. Backend login requests, such as those from non-SFTP users, were observed to time out while five simultaneous SFTP users were uploading data.
  3. RAM Usage: RAM usage remained stable throughout the load testing, indicating that memory resources are not a limiting factor.
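As a rough illustration of how CPU and RAM figures like those above could be sampled during a run (assuming psutil is available; this is not the actual monitoring setup used):

```python
#!/usr/bin/env python3
"""Sketch: log CPU and RAM utilisation to a CSV file during a load test."""
import csv
import time

import psutil


def sample(interval_s: float = 5.0, duration_s: float = 3600.0,
           out_path: str = "resource-usage.csv") -> None:
    """Record one CPU/RAM reading every `interval_s` seconds."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "cpu_percent", "ram_percent"])
        end = time.time() + duration_s
        while time.time() < end:
            # cpu_percent(interval=...) blocks for the interval, so this loop
            # naturally ticks once per sampling period.
            cpu = psutil.cpu_percent(interval=interval_s)
            ram = psutil.virtual_memory().percent
            writer.writerow([time.strftime("%Y-%m-%dT%H:%M:%S"), cpu, ram])


if __name__ == "__main__":
    sample()
```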

Recommendation

Apart from obvious CPU scaling options, the recommended number of simultaneous rclone users on the m4.xlarge dev backend

Conclusion

There are still significant CPU bottlenecks with heavy parallel usage of rclone. One thing that could be aggravating this bottleneck is the Permanent backend taskrunner.

Noted here because it is related to performance metrics.

cc: @kfogel @slifty @cecilia-donnelly

@nfebe
Copy link
Contributor

nfebe commented Nov 6, 2023

@cecilia-donnelly Here is information from earlier test sessions (if it helps), carried out on the large-numbers-of-files and large-files tests.

SIZE TESTS

| Path Structure | Env | Internet Speed | Action | Retries | Attempts In Succeeded Run | # of files attempted | # of files succeeded | Size / Transferred | Time (Success) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dir with 1 file | Dev | | Upload | 0 | | 1 | 1 | 512MB | 0:02:01 |
| Dir with 1 file | Dev | | Upload | 0 | | 2 | 1 | 1GB | 0:05:34 |
| Dir with 1 file | Dev | | Upload | 4 | | 1 | 1 | 4GB | 0:14:47 |
| Dir with 3 files and folders | Dev | | Upload | 2 | | 3 | | 5.2GB | 0:00:00 |

QUANTITY TESTS

| Path Structure | Env | Internet Speed | Action | Retries | # of files attempted | # of files succeeded | Size / Transferred | Time (Success) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dir with 1000 1kb files | Dev | | Upload | 0 | 1000 | 237 | 237 B | 1:18:11 |
| Dir with 1000 5kb files | Dev | | Upload | 0 | 1000 | 230 | 1.15KB | 1:20:10 |
| Dir with 1000 10kb files | Dev | | Upload | 0 | 1000 | 234 | 2.23 MB | 0:02:01 |
| Dir with 1000 1MB files | Dev | | Upload | 0 | 1000 | 235 | 224.11 MB / 224.14 MB | 1:21:05 |
| Dir with 1000 5MB files | Dev | | Upload | 0 | 1000 | 202 | 976.56 MB / 1.1 GB | 1:20:56 |
| Dir with 1000 10MB files | Dev | | Upload | 0 | 1000 | 230 | 2.20 GB | 1:21:00 |
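
For context, the "# of files succeeded" column can be reproduced by comparing the local test directory against a listing of the remote path. A minimal sketch, assuming an rclone remote named `permanent-dev` (a placeholder, not the actual configuration):

```python
#!/usr/bin/env python3
"""Sketch: count how many files from a local test set reached the remote."""
import subprocess
from pathlib import Path


def count_succeeded(local_dir: Path, remote: str) -> tuple[int, int]:
    """Return (# of files attempted, # of files present on the remote)."""
    attempted = {p.name for p in local_dir.iterdir() if p.is_file()}
    listing = subprocess.run(
        ["rclone", "lsf", remote],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    return len(attempted), len(attempted & set(listing))


if __name__ == "__main__":
    attempted, succeeded = count_succeeded(
        Path("/tmp/load-test-1000x1MB"),
        "permanent-dev:/archives/load-test",  # placeholder remote:path
    )
    print(f"{succeeded}/{attempted} files uploaded")
```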
