GSoC: Support for BTF and other BPF decoding improvements proposal rough draft
anolasc13 at gmail.com
Thu Apr 7 14:45:30 UTC 2022
Hello, I'd like to contribute to the strace project in GSoC. I wrote a
proposal for it. Suggestions for improvement are always welcome!
Support for BTF and other BPF decoding improvements
SuHsueyu, anolasc13 at gmail.com
BTF information is important for debugging. The purpose of this
project is to extract BTF information for eBPF maps, process it in a
human-readable manner, and improve decoding for the bpf() syscall's
map manipulation sub-calls. This proposal also includes a small
prototype for testing feasibility as well as a description of the
Retrieve the BTF information for the eBPF maps and use it to enhance
decoding of keys and values of map elements passed to/from the kernel
in map-manipulation-related `bpf` syscalls.
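To make the goal concrete, here is a minimal sketch of what type-aware decoding of a map value could look like. It does not use real BTF structures: `struct member_desc`, `format_value` and `format_example` are hypothetical names invented for illustration, and the sketch assumes every member is an unsigned integer in host byte order. A real implementation would build the member list from the BTF data retrieved from the kernel.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one BTF struct member. */
struct member_desc {
	const char *name;
	size_t offset;	/* byte offset inside the raw value */
	size_t size;	/* 1, 2, 4 or 8; treated as unsigned */
};

/* Read an unsigned integer of the given size in host byte order. */
static int
read_uint(const unsigned char *p, size_t size, uint64_t *out)
{
	uint8_t u8; uint16_t u16; uint32_t u32; uint64_t u64;

	switch (size) {
	case 1: memcpy(&u8, p, 1); *out = u8; return 0;
	case 2: memcpy(&u16, p, 2); *out = u16; return 0;
	case 4: memcpy(&u32, p, 4); *out = u32; return 0;
	case 8: memcpy(&u64, p, 8); *out = u64; return 0;
	default: return -1;	/* unsupported member size */
	}
}

/* Format raw bytes as "{name=value, ...}".
 * Returns the number of characters written, or -1 on error. */
int
format_value(char *out, size_t out_len, const void *raw, size_t raw_len,
	     const struct member_desc *m, size_t n)
{
	size_t pos;
	int rc = snprintf(out, out_len, "{");

	if (rc < 0 || (size_t) rc >= out_len)
		return -1;
	pos = (size_t) rc;

	for (size_t i = 0; i < n; i++) {
		uint64_t v;

		/* Reject members that do not fit inside the raw value. */
		if (m[i].offset > raw_len || m[i].size > raw_len - m[i].offset)
			return -1;
		if (read_uint((const unsigned char *) raw + m[i].offset,
			      m[i].size, &v) < 0)
			return -1;
		rc = snprintf(out + pos, out_len - pos, "%s%s=%llu",
			      i ? ", " : "", m[i].name, (unsigned long long) v);
		if (rc < 0 || (size_t) rc >= out_len - pos)
			return -1;
		pos += (size_t) rc;
	}

	rc = snprintf(out + pos, out_len - pos, "}");
	if (rc < 0 || (size_t) rc >= out_len - pos)
		return -1;
	return (int) (pos + (size_t) rc);
}

/* Example: a value laid out as two 32-bit fields (made-up layout). */
int
format_example(char *out, size_t out_len)
{
	const struct { uint32_t pid; uint32_t cpu; } value = { 1234, 2 };
	const struct member_desc members[] = {
		{ "pid", 0, sizeof(uint32_t) },
		{ "cpu", sizeof(uint32_t), sizeof(uint32_t) },
	};

	return format_value(out, out_len, &value, sizeof(value), members, 2);
}
```

Calling `format_example` with a large enough buffer fills it with `{pid=1234, cpu=2}` instead of the opaque byte string that a decoder without type information would have to print.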
Debugging eBPF programs and maps requires BTF information. BTF (BPF
Type Format) is a metadata format for eBPF that encodes type and
source information for BPF programs and maps. With BTF, we can
therefore obtain a deeper understanding of eBPF and the bpf() syscall.
The bpf() syscall not only allows you to load an eBPF program into the
kernel, but also allows you to get meta information about it. This
project will extract BTF data via the bpf() syscall, decode it, and
use the result to improve the decoding of the bpf() syscall itself.
bpftool: bpftool is a tool that allows you to view and manipulate
eBPF programs and maps. To retrieve BTF information about eBPF
programs and maps, we can use 'bpftool btf dump prog id <prog id>' or
'bpftool map dump id <map id>', which rely on the BTF objects loaded
in the kernel. bpftool uses libbpf to retrieve these BTF objects.
The bpftool source code is included in the Linux source code. bpftool
is a great example of a tool for retrieving BTF information. It can
provide a deep interpretation of BPF data to the user, which is the
project's purpose. And I think bpftool is a useful and valuable
link to bpftool:
For this proposal, I wrote a prototype:
[RetriveBTFDemo](https://github.com/ANOLASC/RetriveBTFDemo). It has
the basic ability to retrieve BTF data that has been loaded into the
kernel. I propose adding several crucial features in GSoC 2022 to make
it more complete.
Result of my demo:
sudo ./rt 6
Implementation of prototype
BTF information loaded in the kernel or in an ELF file can be
inspected using the Linux syscall bpf(). The prototype extracts
information loaded in the kernel and prints it to the console, based
on a command-line argument supplied by the user, which is currently a
map ID. The prototype's workflow is as follows:
1. The prototype attempts to obtain the map file descriptor that
corresponds to the map ID provided by the user on the command line.
To retrieve the map file descriptor, it uses the bpf syscall with the
'BPF_MAP_GET_FD_BY_ID' command. The map file descriptor is part of
the BPF virtual file system.
2. After getting the corresponding map file descriptor, the prototype
will retrieve the map information using bpf syscall with
3. Finally, the prototype will decode the map information and print
each map information entry to the console.
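The three steps above can be sketched as follows. This is not the prototype's actual code, only a minimal illustration under stated assumptions: the helper names (`sys_bpf`, `map_fd_by_id`, `map_info_by_fd`, `print_map_info`) are made up for this example, and the command used in step 2 is assumed to be `BPF_OBJ_GET_INFO_BY_FD`, since the text does not name it.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no bpf() wrapper, so go through syscall(2). */
static long
sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

/* Step 1: map ID -> file descriptor; returns -1 and sets errno on failure. */
int
map_fd_by_id(uint32_t id)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_id = id;
	return (int) sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr, sizeof(attr));
}

/* Step 2: file descriptor -> struct bpf_map_info.
 * Assumption: BPF_OBJ_GET_INFO_BY_FD is the command used here. */
int
map_info_by_fd(int fd, struct bpf_map_info *info)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	memset(info, 0, sizeof(*info));
	attr.info.bpf_fd = (uint32_t) fd;
	attr.info.info_len = sizeof(*info);
	attr.info.info = (uint64_t) (uintptr_t) info;
	return (int) sys_bpf(BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof(attr));
}

/* Step 3: print selected fields, including the BTF type IDs. */
int
print_map_info(uint32_t id)
{
	struct bpf_map_info info;
	int fd = map_fd_by_id(id);
	int rc;

	if (fd < 0) {
		perror("BPF_MAP_GET_FD_BY_ID");
		return -1;
	}
	rc = map_info_by_fd(fd, &info);
	if (rc < 0) {
		perror("BPF_OBJ_GET_INFO_BY_FD");
	} else {
		printf("id=%u type=%u name=%s key_size=%u value_size=%u\n",
		       info.id, info.type, info.name,
		       info.key_size, info.value_size);
		printf("btf_id=%u btf_key_type_id=%u btf_value_type_id=%u\n",
		       info.btf_id, info.btf_key_type_id,
		       info.btf_value_type_id);
	}
	close(fd);
	return rc < 0 ? -1 : 0;
}
```

Both commands normally require elevated privileges, so a call such as `print_map_info(6)` is expected to fail with an error when run as an unprivileged user or when no map with that ID exists.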
The prototype is simple and leaves room for improvement, for example
retrieving BTF information from the .BTF section of an ELF file and
classifying maps in more detail. BPF has many map types, such as hash
table maps, program array maps, and stack trace maps, and it would be
beneficial if the decoder could distinguish between them.
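As a rough illustration of what such a classification might involve, the sketch below maps a few `enum bpf_map_type` values from `<linux/bpf.h>` to names and flags the array-like types whose keys are indices. The function names and the returned strings are assumptions made for this example, and only a handful of map types are covered.

```c
#include <linux/bpf.h>

/* Return an illustrative name for a map type (strings are made up). */
const char *
map_type_name(unsigned int type)
{
	switch (type) {
	case BPF_MAP_TYPE_HASH:             return "hash";
	case BPF_MAP_TYPE_ARRAY:            return "array";
	case BPF_MAP_TYPE_PROG_ARRAY:       return "prog_array";
	case BPF_MAP_TYPE_PERF_EVENT_ARRAY: return "perf_event_array";
	case BPF_MAP_TYPE_PERCPU_HASH:      return "percpu_hash";
	case BPF_MAP_TYPE_PERCPU_ARRAY:     return "percpu_array";
	case BPF_MAP_TYPE_STACK_TRACE:      return "stack_trace";
	default:                            return "unknown";
	}
}

/* Array-like maps are indexed by a 32-bit integer, so their keys can
 * be printed as plain numbers even without BTF information. */
int
map_key_is_index(unsigned int type)
{
	switch (type) {
	case BPF_MAP_TYPE_ARRAY:
	case BPF_MAP_TYPE_PROG_ARRAY:
	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
	case BPF_MAP_TYPE_PERCPU_ARRAY:
		return 1;
	default:
		return 0;
	}
}
```

A decoder could use a check like `map_key_is_index` to decide whether a key is worth printing as an integer index or whether it needs BTF to be interpreted.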
Why it is innovative and what it will contribute
The project can provide a new view of eBPF programs and the bpf()
syscall by retrieving BTF for eBPF and BPF data and performing more
extensive analysis on the returned BTF information. This can give
strace users much more detailed information when tracing the bpf()
syscall.
Community Bonding May 20 - June 12
During the community bonding phase, I will delve into eBPF, set up a
coding and debugging environment, learn about the community workflow,
and get a sense of how things function here.
Phase 1 June 1 - June 29
Week 1 June 1 - June 7
I will follow TDD during development, so I will write test cases first
Week 2 June 8 - June 14
Week for writing complete test cases with full coverage
Week 3 June 15 - June 21
Week for adding the ability to retrieve map info to the decoder
Week 4 June 22 - June 28
Week for classifying map types
Phase 2 June 29 - July 27
Week 5 June 29 - July 5
Week for reading bpf source code and improving decoding information
for bpf syscall
Week 6 July 6 - July 12
Week for improving BPF_MAP_CREATE
Week 7 July 13 - July 19
Week for improving BPF_MAP_LOOKUP_ELEM
Week 8 July 20 - July 26
Week for improving BPF_MAP_UPDATE_ELEM
Phase 3 July 27 - August 24
Week 9 July 27 - August 2
Week for improving BPF_MAP_DELETE_ELEM
Week 10 August 3 - August 9
Week for improving BPF_MAP_GET_NEXT_KEY
Week 11 August 10 - August 16
Week for code reviewing and bug fixing
Week 12 August 17 - August 23
Week for code reviewing and bug fixing. Buffer time in case something
cannot be finished on schedule.
I am an undergraduate student of software engineering.
Relevant skills that will help to achieve the goal
I am new to the open-source community. I am proficient in C/C++ and
familiar with the fundamentals of git. Through MIT 6.S081, I have
gained a fundamental understanding of operating systems. I believe
these abilities will help me complete this project.
I have not contributed to any open-source projects yet; this is the
first one I have been involved in, and I hope it will serve as a
springboard for future open-source activity. During the GSoC 2022
period, I am available to work on this project full-time.
- Name: SuHsueyu
- Email: anolasc13 at gmail.com
- Github: github.com/ANOLASC