Add summary for scraped sources from specfiles #216
Closes packit/packit-service#2390 Signed-off-by: Matej Focko <[email protected]>
nice work, thanks!
This “research” provides the scripts that have been used to process the scraped
sources.

### Domains with ≥ 10 occurrences
So, what do you think would be a reasonable threshold for us to ask for the firewall adjustments?
Added some of the bigger domains; it appears that with dependencies it's easy to get ≥ 10. Additionally, we don't have many blocked packages on the firewall, so I would consider only the really big hosts (like forges).
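For anyone who wants to try different thresholds, here is a minimal sketch of the kind of counting discussed above. It is not the script from this PR: the function names `count_domains` and `domains_at_least` are hypothetical, and it assumes the scraped data is available as a plain list of `SourceX` values (URLs or local file names).

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(source_urls):
    """Count how often each hostname appears among the given Source values.

    Hypothetical helper: assumes one Source value per list item.
    """
    counts = Counter()
    for url in source_urls:
        try:
            host = urlparse(url.strip()).hostname
        except ValueError:
            # Malformed URL (e.g. broken IPv6 literal); skip it.
            continue
        if host:  # local file names and macro-only values have no hostname
            counts[host] += 1
    return counts


def domains_at_least(counts, threshold=10):
    """Return (domain, count) pairs with count >= threshold, most common first."""
    return [(domain, n) for domain, n in counts.most_common() if n >= threshold]
```

Calling `domains_at_least(count_domains(urls), threshold=10)` would give the equivalent of the "≥ 10 occurrences" list; raising the threshold narrows it to the really big hosts.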
All `SourceX` fields of the specfiles have been initially scraped by the @msuchy.
This “research” provides the scripts that have been used to process the scraped
sources.
Could you please include some instructions here on how the data can be obtained (how the script should be run)?
Do you mean the script in this repo or the one that scrapes the specfiles?
Would it make sense to include both? It would be enough to mention here which files represent what.
Nice job! The outcome looks really reasonable.
Co-authored-by: Laura Barcziová <[email protected]> Signed-off-by: Matej Focko <[email protected]>
Signed-off-by: Matej Focko <[email protected]>