Clean up license names + add more licenses and disambiguations #1
(This is built on your latest commit but I'm not sure if I should open this PR here or on the upstream repo)
This PR introduces better handling of license name variants via the `clean_license_name` function. It takes a license name and normalizes it. This means that, for example, "cc-by-sa 3.0" and "custom:CCBYSA3.0" are treated as the same license. The `--list-licenses` option prints the most popular variant.

I've also spent some time adding more licenses and packages into the "unambiguous" db:
- `jaxb-ri`, it's basically just BSD-3
- `xorg-font-util`, not recognized by OSI or FSF but Debian allows it
- `unambiguous_db` now contains a bunch of xorg/x11 packages (all of which use some variant of MIT/X11), some Mesa packages, some non-free packages, and some others like `youtube-dl`, `zsh` and `zathura`.
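
For reviewers who want a feel for the variant handling described above, here is a minimal sketch of the idea. It is not the code in this PR: `clean_license_name_sketch` and `most_popular_variants` are hypothetical names, and the normalization rules (lowercasing, dropping a `custom:` prefix, stripping non-alphanumeric characters) are assumptions chosen only so that the example from the description collapses to one key.

```python
import re
from collections import Counter, defaultdict


def clean_license_name_sketch(name: str) -> str:
    """Hypothetical normalization; the rules in the actual PR may differ."""
    key = name.strip().lower()
    # Assumption: a leading "custom:" marker carries no license information.
    if key.startswith("custom:"):
        key = key[len("custom:"):]
    # Assumption: spaces, dashes and dots do not distinguish licenses.
    return re.sub(r"[^a-z0-9]", "", key)


def most_popular_variants(names):
    """Group raw names by their cleaned key and keep the most common spelling."""
    groups = defaultdict(Counter)
    for raw in names:
        groups[clean_license_name_sketch(raw)][raw] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}


if __name__ == "__main__":
    seen = ["cc-by-sa 3.0", "custom:CCBYSA3.0", "cc-by-sa 3.0", "MIT", "custom:MIT", "MIT"]
    for key, variant in most_popular_variants(seen).items():
        print(key, "->", variant)
```

Grouping on the cleaned key while counting the raw spellings is one way a listing option could show a single, most common variant per license instead of every spelling found.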