Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

View for OpenScanHub results #441

Open
2 tasks
Tracked by #2516
lachmanfrantisek opened this issue Aug 21, 2024 · 17 comments
Open
2 tasks
Tracked by #2516

View for OpenScanHub results #441

lachmanfrantisek opened this issue Aug 21, 2024 · 17 comments
Assignees
Labels
area/fedora Related to Fedora ecosystem discuss discuss To be discussed within a team (usually on the so-called Architecture meeting next Thursday) gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.

Comments

@lachmanfrantisek
Copy link
Member

lachmanfrantisek commented Aug 21, 2024

It would be nice to present people results from the analysis including the relation to other related jobs (like SRPM/RPM) build.

This would most probably require a reporting mechanism to be implemented.
(Currently, Packit only triggers the scan.)

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

todo

  • add needed information to the service API
  • implement a basic view for the scan
@lachmanfrantisek
Copy link
Member Author

We also need to think about what needs to be displayed on Packit side and what (if) needs to be on OSH only.

@siteshwar
Copy link

siteshwar commented Aug 21, 2024

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

  • Packit submits the scan to OSH, and waits for the scan to finish.
  • If the scan was successful, download added.js and convert it to SARIF through csgrep --mode=sarif.
  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.
  • Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

EDIT: Fixed name of .js file to download.

@siteshwar
Copy link

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

@lachmanfrantisek
Copy link
Member Author

As I understand, it is Project Mycorrhiza and its a terrible name.

I like the meaning but always struggle with the spelling..;-) Btw. we are collecting use cases people want to see help with here: https://packit.limesurvey.net/project-mycorrhiza

@lachmanfrantisek
Copy link
Member Author

@siteshwar thanks for the task list.

Do you (or anyone else) know if there is a way to get HTML or some other nice visualisation/representation so we can integrate it directly instead of linking an external service? (People do not like these clicking workflows.)

@lachmanfrantisek
Copy link
Member Author

  • Packit submits the scan to OSH, and waits for the scan to finish.

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

Related issues about this:

@mfocko mfocko added area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement. labels Aug 22, 2024
@siteshwar
Copy link

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

It is in my todo list. I would get back to it at some point in future.

@davidmalcolm
Copy link

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

* Packit submits the scan to OSH, and waits for the scan to finish.

* If the scan was successful, download `scan-results.js` and convert it to SARIF through `csgrep --mode=sarif`.

* Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

* Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

@siteshwar I confess my knowledge here is mostly on the diagnostics and SARIF generation side, less on filtering, presentation, etc. I haven't played around with CodeQL yet; the pertinent docs would appear to be here. I notice that they are using a few non-standard properties via property bags. FWIW I regularly meet a CodeQL representative at the monthly OASIS SARIF TC meetings.

I'm nervous that going from, say GCC SARIF output to scan-results.js then back to SARIF might lose information. Ideally we'd use SARIF throughout the process to keep as much info as possible (perhaps using property bags if there's anything that OSH wants to capture that isn't yet expressible in SARIF). I'm familiar with a large subset of SARIF 2.1.0, so let me know if you have any questions about it.

@davidmalcolm
Copy link

@siteshwar Also, please note that (with my upstream GCC hat on), I regard GCC's "json" diagnostic output format to be "legacy"; all of my future development for machine-readable diagnostics is on its SARIF output format instead. See https://gcc.gnu.org/wiki/SARIF for details on GCC's SARIF support, bug tracking, etc.

@kdudka
Copy link

kdudka commented Aug 23, 2024

@davidmalcolm OSH still captures the plain-text format of GCC diagnostics. Although csdiff supports all the three formats (plain-text, JSON, SARIF), it is not easy to migrate csmock-plugin-gcc to the newer diagnostic formats. As it is implemented now, the diagnostic output from all GCC processes is written to a single text file. This works well for the plain-text format but not for the JSON-based formats because concatenation of multiple JSON files is not a valid JSON.

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@davidmalcolm
Copy link

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@kdudka Sadly not. I had thought that it respected the dumpfile command-line arguments, but it turns out it doesn't. Sorry about that. See GCC bug 110522 for more info, and ideas on fixing this.

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

BTW, which version of GCC are you using?

@siteshwar
Copy link

siteshwar commented Aug 28, 2024

BTW, which version of GCC are you using?

The scans that are run through Packit are run on rawhide, so we use the same version of gcc as available in rawhide. Currently it is 14.1.1-7.fc41, it can be actually seen in the stdout.log of any scan.

@kdudka
Copy link

kdudka commented Aug 28, 2024

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

Yes. If GCC was able to write a SARIF file with a unique name in a specified directory (like valgrind --xml-file=${dir}/%p-%n.xml ... does), it would be much easier to build automation on top of it. Nevertheless, if GCC is able to write the SARIF file to a specified path, we should be able to script around it with flock and mktemp.

Also does GCC always provide absolute paths to source code files in the SARIF data? Relative paths to source code files are currently translated by the compiler wrapper that processes the diagnostic messages printed by GCC to stderr.

BTW, which version of GCC are you using?

OSH uses the GCC version that is available in the buildroot. So, as @siteshwar says, it will be the Fedora Rawhide version by default for OSH tasks triggered by Packit. OSH needs to work with GCC versions available in EPEL buildroots, too, but csmock-plugin-gcc can probe for the needed features in GCC and eventually fallback to the plain-text capture that it uses today.

csutils/csmock#41 introduced an option to use a custom version of GCC Analyzer, which can be different from the GCC version that drives the native build (as long as these GCC versions can be installed in parallel into a single buildroot).

@siteshwar
Copy link

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

There is a limit of 10 MB on the results file. The SARIF file should be compressed before being uploaded.

@Venefilyn
Copy link
Collaborator

Venefilyn commented Sep 20, 2024

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

I think outputting it with CodeQL would be great to keep things within GitHub. As for the dashboard we could make use of the SARIF output as well, for that I ask that packit-service can indicate in a response that a SARIF file exists or is being generated, and create a new endpoint to return the SARIF results

Given SARIF has a schema it should be relatively trivial to handle it in the dashboard as we can just install the types or generate the types to make things easier for us through something like https://www.npmjs.com/package/@types/sarif/v/2.1.5

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

It's a project/WG name for an idea! It's a good name 😉

@lbarcziova
Copy link
Member

I have started looking into the implementation details of this and have a few things to discuss/confirm first:

  • is it ok if we focus on and display only the newly found defects, i.e. added.js? (there is also the general results file and fixed defects)
  • before implementing the conversion to SARIF on our (Packit) side, I wanted to confirm whether this functionality doesn't generally make more sense to be in OSH directly, could other OSH users benefit from this as well?
  • displaying of the SARIF on Packit's side: I agree with @Venefilyn point, Packit isn't serving GitHub only (even though most of our users are here), so it would be nice to have a general solution for displaying the SARIF on our dashboard, I haven't looked into the implementation details of that yet. But I wanted to confirm we want to have 2 ways of displaying the results - both on dashboard and in GitHub UI. GitHub's functionality for this looks really nice and for GitHub users would be great, from what I checked we would upload the SARIF using this endpoint and for that we would need to update permissions for our app regarding Code scanning alerts (see docs).

@siteshwar
Copy link

I have started looking into the implementation details of this and have a few things to discuss/confirm first:

  • is it ok if we focus on and display only the newly found defects, i.e. added.js? (there is also the general results file and fixed defects)

Yes, we should just point out new findings.

  • before implementing the conversion to SARIF on our (Packit) side, I wanted to confirm whether this functionality doesn't generally make more sense to be in OSH directly, could other OSH users benefit from this as well?

@kdudka Shall this be done from OSH side?

  • displaying of the SARIF on Packit's side: I agree with @Venefilyn point, Packit isn't serving GitHub only (even though most of our users are here), so it would be nice to have a general solution for displaying the SARIF on our dashboard, I haven't looked into the implementation details of that yet. But I wanted to confirm we want to have 2 ways of displaying the results - both on dashboard and in GitHub UI. GitHub's functionality for this looks really nice and for GitHub users would be great, from what I checked we would upload the SARIF using this endpoint and for that we would need to update permissions for our app regarding Code scanning alerts (see docs).

I am fine with using Packit dashboard, as long as it enforces an action from users on new findings.

@lbarcziova lbarcziova added the discuss discuss To be discussed within a team (usually on the so-called Architecture meeting next Thursday) label Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fedora Related to Fedora ecosystem discuss discuss To be discussed within a team (usually on the so-called Architecture meeting next Thursday) gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.
Projects
Status: in-progress
Development

No branches or pull requests

7 participants