No files showing on components details page #964
The raw data for https://clearlydefined.io/definitions/pypi/pypi/-/dnspython/1.10.0 shows files as [].
The issue for pypi/pypi/-/dnspython/1.10.0 is a bug in the crawler: pypiFetch failed to find a tar.gz file and aborted the file download without reporting an error.
For github.com/linux-audit/audit-userspace/5fae55c1ad15b3cefe6890eba7311af163e9133c and git/github/golang/crypto/c084706c2272f3d44b722e988e70d4a58e60e7f4, the reason for "no files" is that only the "licensee" tool was run. On the definition page, the "Tools" section shows only licensee and curation, and the raw data section contains only the licensee portion of the JSON. For the files to be listed properly, the "clearlydefined" tool needs to be run and its corresponding JSON result must be available. In my local environment, files are available in both cases after the "source" type harvests (clearlydefined + licensee + scancode) complete. A harvest was also initiated on the dev server, and the result is available at: https://dev.clearlydefined.io/definitions/git/github/linux-audit/audit-userspace/5fae55c1ad15b3cefe6890eba7311af163e9133c/5fae55c1ad15b3cefe6890eba7311af163e9133c. Files are available and displayed once the harvest completes. These two look like cases of incomplete harvest.
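A minimal sketch of the check described above, for anyone who wants to spot these cases from the raw data. It assumes the definition JSON lists tool results under `described.tools` as `name/version` strings and files under `files`; that layout, the helper names, and the sample values are assumptions for illustration, not taken from this thread.

```python
# Tool names come from this thread; the JSON layout is an assumption.
EXPECTED_SOURCE_TOOLS = {"clearlydefined", "licensee", "scancode"}


def missing_tools(definition, expected=EXPECTED_SOURCE_TOOLS):
    """Return the expected tool names that have no entry in described.tools."""
    entries = definition.get("described", {}).get("tools", [])
    # Entries are assumed to look like "licensee/9.13.0"; keep only the name.
    present = {entry.split("/", 1)[0] for entry in entries}
    return sorted(set(expected) - present)


def looks_incomplete(definition):
    """True when no files are listed and the clearlydefined result is absent."""
    return not definition.get("files") and "clearlydefined" in missing_tools(definition)


# Hypothetical raw data resembling the "licensee and curation only" case.
sample = {"described": {"tools": ["licensee/9.13.0", "curation/abc"]}, "files": []}
print(missing_tools(sample))
print(looks_incomplete(sample))
```

A definition flagged this way would be a candidate for a full re-harvest rather than evidence that the component has no files.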
definitions/pypi/pypi/-/dnspython/1.10.0: there is no download URL in the PyPI registry for dnspython 1.10.0, so the download failed. See the commit message in clearlydefined/crawler#470
@qtomlinson Thanks for looking into this. The two GitHub components both look to have harvested successfully. There's still an issue with the one below. I can confirm there is no download package in PyPI for this component. https://clearlydefined.io/definitions/pypi/pypi/-/dnspython/1.10.0 Question: Is CD supposed to be showing "harvested" if the system can't find the package, as in this example?
@bduranc Those harvest requests will be marked as missing in the crawler (see the commit message at clearlydefined/crawler#470) and will not be marked as successful in the future.
Thanks @qtomlinson. This is a fairly important issue since it involves scans that were "harvested" but had no files to scan (or just a LICENSE file in a few other examples I had observed previously but re-harvested). But it sounds like there is a solution in place to address at least the cases like dnspython where the package download/source cannot be found. For the other two, where the package/source is indeed available, is the best solution just to re-harvest them when encountered, or is there something else we can do?
@bduranc If
@qtomlinson Should I go ahead and create a separate issue for this then?
@bduranc Typically, the clearlydefined, reuse, licensee and scancode tools are dispatched for source components. It is possible that all four tools were dispatched, but only one was processed and the other three runs were somehow not successful. Retriggering the harvests can verify whether a potential issue exists. The re-harvested data are now available and seem OK. Alternatively, a user has the option to run a harvest with a specific tool (e.g. licensee or scancode) via the REST API. In that case, only the result for the user-specified tool is available (as expected). The two components listed here might have been cases of harvesting with a specific tool (licensee). To get the complete definition in that scenario, the solution is to retrigger the harvest with all tools.
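To illustrate the difference between the two kinds of request, here is a rough sketch that only builds the request body and sends nothing. It assumes the harvest REST API accepts a JSON array of `{"tool", "coordinates"}` entries, and that a type such as "source" dispatches the full tool set while a tool name such as "licensee" runs only that tool; the entry shape and the helper name are assumptions, so check the API documentation before relying on them.

```python
import json


def build_harvest_body(coordinates_list, tool="source"):
    """Build a JSON harvest request body (assumed shape, see note above).

    tool="source" is assumed to dispatch all source tools;
    a specific name such as "licensee" is assumed to run only that tool.
    """
    entries = [{"tool": tool, "coordinates": c} for c in coordinates_list]
    return json.dumps(entries)


# Coordinates taken from this thread.
coords = ["git/github/golang/crypto/c084706c2272f3d44b722e988e70d4a58e60e7f4"]

full_harvest = build_harvest_body(coords)                 # all source tools
single_tool = build_harvest_body(coords, tool="licensee")  # licensee only
print(full_harvest)
print(single_tool)
```

Under these assumptions, a definition that shows only licensee results would be consistent with the second kind of request, and sending the first kind would be the way to get the complete definition.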
I'm seeing this frequently where a component is harvested but no files are shown.
Some recent examples: