Support BPF program symbolization #826

Closed
danielocfb opened this issue Sep 24, 2024 · 4 comments · Fixed by #854
Assignees
Labels
enhancement New feature or request rust Pull requests that update Rust code

Comments

@danielocfb
Collaborator

danielocfb commented Sep 24, 2024

As part of the effort of improving our kernel symbolization logic, we would like to support symbolization of addresses mapping to BPF programs. Here is a brain dump roughly outlining what (I think) is necessary to support such symbolization. Everything and anything could be wrong ...

  • in /proc/<pid>/maps / PROCMAP_QUERY BPF programs would be represented with a "name" of bpf_prog_<some-hex-number>; some-hex-number seems to be the program's "tag" and can be used for finding more information
  • use bpf_prog_get_next_id to iterate over loaded programs and find the one matching the tag
  • use bpf_prog_get_fd_by_id to retrieve program file descriptor
  • use bpf_obj_get_info_by_fd to retrieve program information using said file descriptor
  • use bpf_prog_info.{nr_jited_line_info, jited_line_info, line_info_rec_size} and similar, in conjunction with the kernel's BTF information, to retrieve function name and source code path
    • I'd hope there are examples floating around for this, but haven't checked
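
The two pure-data steps in the list above can be sketched without touching the BPF syscalls: extracting the tag from a `bpf_prog_<hex>` name, and picking the line-info record that covers an address. This is only an illustrative sketch, not the blazesym implementation. The function names `parse_bpf_prog_tag` and `find_line_record` are hypothetical, and the code assumes that the tag is 8 bytes printed as 16 hex digits (possibly followed by `_<program-name>`) and that the addresses taken from `jited_line_info` are sorted in ascending order.

```rust
/// Size of a BPF program tag in bytes (assumed to match `BPF_TAG_SIZE`
/// from the kernel UAPI headers).
const BPF_TAG_SIZE: usize = 8;

/// Extract the tag from a name of the form `bpf_prog_<hex-tag>`,
/// optionally followed by `_<program-name>` (hypothetical helper).
fn parse_bpf_prog_tag(name: &str) -> Option<[u8; BPF_TAG_SIZE]> {
    let rest = name.strip_prefix("bpf_prog_")?;
    // The tag is everything up to an optional `_<name>` suffix.
    let hex = rest.split('_').next()?;
    // Require exactly 16 ASCII hex digits, so byte slicing below is safe.
    if hex.len() != BPF_TAG_SIZE * 2 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut tag = [0u8; BPF_TAG_SIZE];
    for (i, byte) in tag.iter_mut().enumerate() {
        // Each tag byte is encoded as two hex digits, in byte order.
        *byte = u8::from_str_radix(&hex[i * 2..i * 2 + 2], 16).ok()?;
    }
    Some(tag)
}

/// Return the index of the line-info record covering `addr`, i.e. the
/// last record whose start address is <= `addr` (hypothetical helper).
/// Assumes `jited_addrs` is sorted in ascending order.
fn find_line_record(jited_addrs: &[u64], addr: u64) -> Option<usize> {
    // Number of records starting at or before `addr`.
    let count = jited_addrs.partition_point(|&start| start <= addr);
    // No record starts at or before `addr` if `count` is zero.
    count.checked_sub(1)
}
```

The index returned by `find_line_record` would then select the matching line-info record, whose name and file offsets still have to be resolved through the program's BTF string table; that part is not shown here.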

In terms of integration into blazesym, a good starting point to look at would probably be

fn create_kernel_resolver(&self, src: &Kernel) -> Result<KernelResolver> {

What data to cache (and at what level) is somewhat of an open question. At the very least I'd say we should remember the result for a given address and reuse it on repeated symbolization. But perhaps a more coarse-grained approach (e.g., caching at the function level, if there is such a thing, or remembering what BPF program maps to what tag) may be useful as well. I have no idea of the performance characteristics of any of the APIs we need to interface with.
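
The simplest option mentioned above, remembering the result for a given address, could look roughly like the following. This is a minimal sketch under stated assumptions, not how blazesym caches anything: `BpfSym` and `AddrCache` are hypothetical types, and the `resolve` closure stands in for whatever expensive kernel queries end up being needed.

```rust
use std::collections::HashMap;

/// Hypothetical symbolization result for an address inside a BPF program.
#[derive(Clone, Debug, PartialEq)]
struct BpfSym {
    name: String,
    file: Option<String>,
    line: Option<u32>,
}

/// Hypothetical per-address cache. Negative results (`None`) are cached
/// as well, so that unknown addresses do not trigger repeated lookups.
struct AddrCache {
    entries: HashMap<u64, Option<BpfSym>>,
}

impl AddrCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached result for `addr`, invoking `resolve` only on
    /// the first request for that address.
    fn get_or_resolve<F>(&mut self, addr: u64, resolve: F) -> Option<&BpfSym>
    where
        F: FnOnce(u64) -> Option<BpfSym>,
    {
        self.entries
            .entry(addr)
            .or_insert_with(|| resolve(addr))
            .as_ref()
    }
}
```

A coarser-grained variant would key the map on the program tag (or on an address range) instead of the exact address. Note that this sketch never invalidates entries, which would matter if BPF programs are unloaded and their addresses reused.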

As I mentioned above, I think we may need some basic BTF support (mostly for string lookup?) as well as BPF syscall bindings. Usage of libbpf-rs is a possibility (should contain both), though I don't know if we really want to add a dependency to libbpf-rs and libbpf longer term. But we can think about that once a POC is working.

We would also require some prerequisite work introducing proper kernel testing infrastructure to be able to test this symbolization on injected programs as well. At this point I think it mostly comes down to loading BPF programs, as we already support testing on arbitrary kernels using vmtest. Again, this should be provided by libbpf-rs, which I think is a no-brainer to use in a testing context.

@danielocfb
Collaborator Author

cc @jfernandez

@danielocfb danielocfb added enhancement New feature or request help wanted Extra attention is needed rust Pull requests that update Rust code labels Sep 24, 2024
@javierhonduco

Very interested in this work, happy to review and test any PRs! :)

@danielocfb danielocfb self-assigned this Oct 8, 2024
@danielocfb danielocfb removed the help wanted Extra attention is needed label Oct 8, 2024
d-e-s-o added a commit to d-e-s-o/blazesym that referenced this issue Oct 18, 2024
This change adds the remaining plumbing for symbolizing BPF program
kernel addresses. When a kernel address falls into a BPF program, we
query all the necessary information to see if the kernel is able to
provide us with source code information about said address and furnish
up the corresponding CodeInfo object to include it in the symbolization
result.

Closes: libbpf#826

Signed-off-by: Daniel Müller <[email protected]>
@d-e-s-o
Collaborator

d-e-s-o commented Oct 18, 2024

This is now out for review #854

@javierhonduco feel free to try it out and report back. Also, let me know if you have any questions.

@d-e-s-o d-e-s-o linked a pull request Oct 18, 2024 that will close this issue
d-e-s-o added a commit to d-e-s-o/blazesym that referenced this issue Oct 21, 2024
@javierhonduco

Great stuff @danielocfb! Thanks for the heads up! This week I won't have too much time to take a look at this, but will make sure to do it early next week.

danielocfb pushed a commit to d-e-s-o/blazesym that referenced this issue Oct 22, 2024

3 participants