[RFC] Patch building and Webhooks API #307
Comments
Do you mean adding an endpoint to the KernelCI API? I think a better approach is to have a micro-service receiving the webhooks from Patchwork and then relaying things to the KernelCI API. We have something like this already for LAVA callbacks. That way, the KernelCI API doesn't need to "know" about Patchwork and we can update the bridge any time just by restarting the client-side service. It's a key principle of the new API to be modular and let all the specific logic happen on the client side.
Yes, the …
I would suggest making it entirely specific to Patchwork and whatever data BPF CI currently has, with a dedicated micro-service just for this use-case. This would be the first time we add "pre-merge" testing capability to KernelCI, so we can keep things as simple as possible and tailored to this particular use-case. Then when we start adding other kinds of pre-merge testing sources we might have other dedicated micro-services for them, and potentially consolidate things with a more standard way of triggering the pipeline based on patch series, but I don't think we need to solve this kind of generic problem right now just for the BPF CI integration.
Going the dedicated micro-service route will certainly help with iterations and with decoupling logic from the core part. But now I am wondering how to handle sending information back to Patchwork?
As for BPF CI, we already have a CI set up and ready to send results into …
Yes, it's one of the key design principles of the new API. The legacy API had too many custom built-in features and that made it very hard to do anything with it. Testing the upstream kernel requires a very modular design as each use-case is different - here it's Patchwork for BPF but nearly every maintainer has some custom tooling.
Yes, the latter. It's much better to keep the micro-service stateless; the Node.data field can be used for this kind of extra data.
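For illustration, the kind of extra Patchwork data that could live in a node's data field might look like the sketch below; all key names are hypothetical and not part of the current API schema.

```python
# Hypothetical example only: the kind of Patchwork context a patchset node's
# "data" field could carry. None of these keys are defined by the API schema.
patchset_node_data = {
    "patchwork": {
        "project": "netdevbpf",
        "series_id": 123456,
        "mbox_url": "https://patchwork.kernel.org/series/123456/mbox/",
        "submitter": "jane@example.com",   # needed for email notifications later
    }
}
```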
This is fine; in fact the micro-service doesn't even need to be in the kernelci-pipeline repository. It could be something hosted as part of BPF CI, or it could be in kernelci-pipeline with a separate docker-compose file like we have for the LAVA callback handler. Having a micro-service dedicated to BPF patch testing is absolutely the kind of thing the new API was designed for.
OK, I guess it'll be good to clarify what you want to cover with your existing CI and what you're expecting KernelCI to do. If you want the KernelCI infrastructure to build kernels and run tests for BPF patches, would it be duplicating some things you're already doing with your current BPF CI? Would it be a way to transition from BPF CI to KernelCI?
In kcidb/orm/data.py there is … Does KernelCI store any patchset information like mbox URLs and/or titles? If not, what's the best place to add these fields? KCIDB itself?
Part of the reason I want this functionality is to reduce the BPF CI footprint. I'd like to rely on KernelCI as much as possible in the future, while keeping the more Meta-workload-specific BPF tests on the BPF CI side.
Once we have …
@yurinnick I think we need to clarify a few things. KCIDB is only a database for storing data from any CI system. It doesn't run builds or tests or anything itself. The point of it is to have a common email report (and dashboard) with results aggregated from many CI systems e.g. 0-Day, Syzbot, CKI, Gentoo, Linaro's TuxSuite as well as the "native" KernelCI results.

The "native" KernelCI results are the builds, tests and anything that is orchestrated by KernelCI itself. There's a legacy database and frontend for it and it's running with Jenkins. This is now being replaced with the new API & Pipeline, which will have a new web dashboard too at some point. So you could already choose to send your current BPF CI results to KCIDB. I'm not an expert regarding the schema and how it handles patch sets, that's more a question for @spbnick.

Then if you want to rely on KernelCI to run tests for BPF, that's not related to KCIDB. It's something we can do with the new API & Pipeline by setting up a dedicated pipeline service to make the bridge between the patch sets you have to test for BPF and KernelCI. I believe this is what you wanted to do: offload your current CI system, or maybe refocus it on some "internal" testing while KernelCI would gradually cover all the upstream-oriented testing. Is that right? I'm not familiar enough yet with your setup around Patchwork and GitHub but we can go through the details if integrating it with KernelCI is what you want to achieve.
As part of a proof of concept, I added Patchwork functionality to various KernelCI parts: …
Could you please explain which service is going to be receiving what kind of data from Patchwork to start with? We basically need something to know the Git base revision and a list of patch files. Then the simplest way to deal with this would be to apply the patches on top of the Git revision and create a source tarball "checkout" with it, and the rest can remain unchanged. Once we have this in place, I think we can gradually add more features specific to patches like keeping them as separate files, doing incremental builds etc., but that will require many development iterations I think.
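As a rough sketch of that "apply the patches and create a checkout tarball" flow (the paths, repository layout and function names below are placeholders, not actual pipeline code):

```python
import subprocess
import tarfile

def make_patched_checkout(repo_dir, base_commit, patch_files, tarball_path):
    """Apply a patch series on top of a base revision and pack a source tarball."""
    # Check out the base revision the series applies to.
    subprocess.run(["git", "-C", repo_dir, "checkout", base_commit], check=True)
    # Apply the series in order; "git am" preserves authorship from the patch files.
    subprocess.run(["git", "-C", repo_dir, "am", *patch_files], check=True)
    # Pack the patched tree as the "checkout" tarball the rest of the pipeline uses.
    with tarfile.open(tarball_path, "w:gz") as tar:
        tar.add(repo_dir, arcname="linux")
    return tarball_path
```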
So I created kernelci-patchwork-webhook where I receive webhook data (the exact structure of it is WIP) and transform it into a new node request here. Once a new node is created with …
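A stripped-down sketch of that transformation step; the API URL, the /node endpoint path and the node fields are assumptions based on this thread, not the actual kernelci-patchwork-webhook code:

```python
import requests

API_URL = "https://kernelci-api.example.com/latest"  # placeholder API instance
TOKEN = "secret-token"                               # placeholder API token

def patchwork_event_to_node(event):
    """Turn a Patchwork webhook payload (structure still WIP) into a node request."""
    node = {
        "name": "patchset",
        "data": {
            "series_id": event["series"]["id"],
            "mbox_url": event["series"]["mbox"],
            "base_commit": event.get("base_commit"),
        },
    }
    # Submit the new node to the KernelCI API; the pipeline picks it up from there.
    resp = requests.post(
        f"{API_URL}/node",
        json=node,
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```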
I assume the new KernelCI API will be sending its testing results into KCIDB, so whatever you submit there would automatically end up in KCIDB. We have some data coming in already from the WIP instance, though I only see checkouts at this point. @gctucker, please correct me if I misunderstood anything, and if not, do we already have more data to send from the API, perhaps? If you have any data from your own CI system to send to KCIDB in addition to that, you'll be more than welcome. Reach out to me on Slack, or the mailing list, and we can arrange that, no problem.
That's right, as discussed yesterday in the meeting, the data from the new API will be forwarded to KCIDB, and that can include patch sets. @spbnick The staging API instance is not very stable and the bridge that sends data to KCIDB probably hasn't been very operational. That should improve with the Early Access instance which should be more production-like, so hopefully there'll be more data sent to KCIDB in September.
@gctucker please let me know which pull requests we can review and merge, and which ones I should rework.
For the record, here's a sample Patchwork page showing results of the current BPF CI with a link to the GitHub PR that ran the tests: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/
Thanks @yurinnick for this, it looks like a very good first step. It's mostly functional and ready to be added to the code base; I think the only issue is that we might need to update the API schema for node revisions, or at least precisely define how to handle patchsets. Node revisions are important for the whole pipeline so we can't really just deal with patchsets only using the arbitrary data field. I'll create a GitHub issue to cover this part, then we can discuss it and evaluate various options. Then once that's clarified I think it'll unblock things to land a first functional implementation.
Following up from kernelci/kernelci-pipeline#295, it would seem like a good starting point to look at how to manage patchsets via the API. Right now, the …

One issue is that if we keep the same Git information for a checkout node that has patches, querying the results for a particular kernel revision could be misleading. If patches are applied, it's not the same source code as the base Git revision, so the results for the patched kernel shouldn't be found when looking for a particular Git revision.

So here's a proposal for how to deal with this in Node data, using some hash algorithm for the patch series like KCIDB or CKI (here in some simplified YAML because it's easier):

# Root node for the base Git revision
id: 123
parent: null
name: checkout
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: null
artifacts:
  linux.tar.gz: https://some.storage.com/linux.tar.gz
---
# Build node for the base revision
id: 124
parent: 123
name: kbuild-gcc-10-x86
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: null
artifacts:
  bzImage: https://some.storage.com/bzImage-1
---
# Patched revision
id: 125
parent: 123
name: patchset
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: 6b7c28b5b6fc0c8194b8b5fd54f2f57ecc2d73fa45ba97125e775c55cf6825cf
artifacts:
  0001-accel-ivpu-Use-struct_size.patch: https://some.storage.com/0001.patch
  0002-accel-ivpu-Remove-configuration-of-MMU-TBU1-and-TBU3.patch: https://some.storage.com/0002.patch
---
# Build node for the patched revision
id: 126
parent: 125
name: kbuild-gcc-10-x86
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: 6214ba5dcb3dbf46a5f479b087207f0cc83f11516c0a764bb590a76d00964e38
artifacts:
  bzImage: https://some.storage.com/bzImage-2

A few things to note about this proposal: …
Some more investigation needs to be done for incremental builds, but I think this can provide the basis for a solution as a first step. We could have a hierarchy of patchset nodes. Additional information such as the patch author etc. should already be in the patch file itself, so while we might add it to the data via the API it seems like an incremental improvement. We could easily store this in the data field. How does this sound?
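For illustration, if the patchset field is defined as a digest of the series (one possible definition, not necessarily exactly what KCIDB or CKI use), computing it could be as simple as:

```python
import hashlib

def patchset_hash(patch_paths):
    """SHA-256 over the concatenated patch files, taken in series order."""
    digest = hashlib.sha256()
    for path in patch_paths:
        with open(path, "rb") as patch:
            digest.update(patch.read())
    return digest.hexdigest()

# Example:
# patchset_hash(["0001-accel-ivpu-Use-struct_size.patch",
#                "0002-accel-ivpu-Remove-configuration-of-MMU-TBU1-and-TBU3.patch"])
```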
One important part we haven't discussed yet is how we are going to pick the base revision for a patch or patchset. In my basic implementation I set …
A few questions have arisen since last week:
Given [2], what's the best way to create …
It doesn't matter in this proposal where the nodes are created; this is only about how it would work from the API side initially. The patchset nodes would be child nodes of checkout nodes, and the checkout node has the information about the base Git commit. Surely a client could send a checkout node with the tarball and the base Git revision data as well as a patchset child node with the patchset data? Or a client for patchsets could first look up base Git revisions in checkout nodes; there are many ways to do this depending on the use-case.
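One of those client-side flows could look roughly like the sketch below; the endpoint paths and query parameters are guesses for the sake of the example, not the actual API:

```python
import requests

API_URL = "https://kernelci-api.example.com/latest"  # placeholder API instance

def create_patchset_node(commit, patchset_digest, artifacts):
    """Find the checkout node for a base commit and attach a patchset child node."""
    # Look up the existing checkout node for the base Git revision.
    checkouts = requests.get(
        f"{API_URL}/nodes",
        params={"name": "checkout", "revision.commit": commit},
        timeout=30,
    ).json()
    parent = checkouts[0]  # naive: assume exactly one matching checkout

    # Create the patchset node as a child of that checkout node.
    node = {
        "name": "patchset",
        "parent": parent["id"],
        "revision": {"commit": commit, "patchset": patchset_digest},
        "artifacts": artifacts,
    }
    return requests.post(f"{API_URL}/node", json=node, timeout=30).json()
```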
The Webhook API should help to integrate KernelCI with patch management (Patchwork) and version control systems (GitHub, GitLab). It will provide an interface to trigger non-upstream patch builds that should be able to publish results back later.

Implementation

As the first step, I'd like to implement Patchwork integration with KernelCI. The following changes should be implemented:
- Add a /webhooks/patchwork API that will expect certain (TBD) input from the Patchwork side, enough to build and test a kernel and report back results
- Extend kernelci.Node, KernelBuildMetadata and kernelci.config.Tree with patch-related fields

Minimal patch information

We need the patch and all dependent patches' information, as well as submitter information for email notifications.

Discussion
- Where should patch information live in Node? In the data field or a separate field?
- What should be sent to /webhooks/patchwork? (Collaboration with Patchwork developers)
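Purely as a starting point for that discussion, a minimal /webhooks/patchwork payload could carry something like the following; every field name here is a suggestion to be agreed with the Patchwork developers:

```python
# Hypothetical webhook payload: the minimum needed to build, test and report back.
example_payload = {
    "project": "netdevbpf",
    "series": {
        "id": 123456,
        "name": "[PATCH bpf-next 0/2] example series",
        "mbox": "https://patchwork.kernel.org/series/123456/mbox/",
    },
    "base": {
        "tree": "bpf-next",
        "url": "https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git",
        "commit": "374a7f47bf401441edff0a64465e61326bf70a82",
    },
    "submitter": {"name": "Jane Developer", "email": "jane@example.com"},
}
```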