[PATCH v3] bpf: support new commands BPF_MAP_*_BATCH
Paul Chaignon
paul.chaignon at gmail.com
Sat Apr 4 13:44:34 UTC 2020
On Fri, Apr 03, 2020 at 08:14:27PM +0300, Dmitry V. Levin wrote:
> On Sun, Mar 29, 2020 at 11:49:44AM +0200, Paul Chaignon wrote:
> > * xlat/bpf_commands.in (BPF_MAP_LOOKUP_BATCH): New constant introduced
> > by Linux commit v5.6-rc1~151^2~46^2~23^2~7.
> > (BPF_MAP_LOOKUP_AND_DELETE_BATCH): New constant introduced by Linux
> > commit v5.6-rc1~151^2~46^2~23^2~4.
> > (BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH): New constants introduced
> > by Linux commit v5.6-rc1~151^2~46^2~23^2~6.
> > * bpf.c (BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH),
> > BEGIN_BPF_CMD_DECODER(BPF_MAP_UPDATE_BATCH),
> > BEGIN_BPF_CMD_DECODER(BPF_MAP_DELETE_BATCH)): Decode the new commands.
> > (decode_BPF_MAP_LOOKUP_AND_DELETE_BATCH): New macro.
> > (SYS_FUNC(bpf)): Decode the new commands.
> > * bpf_attr.h (BPF_MAP_LOOKUP_BATCH_struct,
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct, BPF_MAP_UPDATE_BATCH_struct,
> > BPF_MAP_DELETE_BATCH_struct): New structs introduced by Linux commit
> > v5.6-rc1~151^2~46^2~23^2~7.
> > (BPF_MAP_LOOKUP_BATCH_struct_size,
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct_size,
> > BPF_MAP_UPDATE_BATCH_struct_size, BPF_MAP_DELETE_BATCH_struct_size): New
> > macros.
> > * NEWS: Mention this.
> > * tests/bpf.c (BPF_MAP_LOOKUP_BATCH_checks, BPF_MAP_UPDATE_BATCH_checks,
> > BPF_MAP_DELETE_BATCH_checks): Tests for the new commands.
> >
> > Signed-off-by: Paul Chaignon <paul.chaignon at gmail.com>
>
> Thanks, see my comments below.
Thanks for the review Dmitry!
>
> > ---
> > Changelogs:
> > Changes in v3:
> > - Add tests for BPF_MAP_UPDATE_BATCH and BPF_MAP_DELETE_BATCH.
> > - Do not print unused fields in_batch and out_batch.
> > - Print count on entering and exiting.
> > - Fix printing of flags.
> > Changes in v2:
> > - Add ATTRIBUTE_ALIGNED(8) to uint64_t attributes.
> > - Rebase on master.
> >
> > NEWS | 3 ++
> > bpf.c | 86 ++++++++++++++++++++++++++++++++++++++++++++
> > bpf_attr.h | 24 +++++++++++++
> > tests/bpf.c | 74 ++++++++++++++++++++++++++++++++++++++
> > xlat/bpf_commands.in | 4 +++
> > 5 files changed, 191 insertions(+)
> >
> > diff --git a/NEWS b/NEWS
> > index 6806c1e3..2965d629 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -14,6 +14,9 @@ Noteworthy changes in release ?.? (????-??-??)
> > and syscall time output (using --absolute-timestamps,
> > --relative-timestamps, and --syscall-times options, respectively).
> > * Implemented PTRACE_GETREGS API support on hppa.
> > + * Implemented decoding of BPF_MAP_LOOKUP_BATCH,
> > + BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, and
> > + BPF_MAP_DELETE_BATCH bpf syscall commands.
> > * Enhanced decoding of BPF_MAP_CREATE and BPF_PROG_ATTACH bpf syscall
> > commands.
> > * Updated lists of BPF_* constants.
> > diff --git a/bpf.c b/bpf.c
> > index c432181c..04fa369e 100644
> > --- a/bpf.c
> > +++ b/bpf.c
> > @@ -954,6 +954,88 @@ BEGIN_BPF_CMD_DECODER(BPF_TASK_FD_QUERY)
> > }
> > END_BPF_CMD_DECODER(RVAL_DECODED)
> >
> > +BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH)
> > +{
> > + unsigned long count;
> > +
> > + if (entering(tcp)) {
> > + set_tcb_priv_ulong(tcp, attr.count);
> > +
> > + PRINT_FIELD_ADDR64("{", attr, in_batch);
> > + PRINT_FIELD_ADDR64(", ", attr, out_batch);
> > + PRINT_FIELD_ADDR64(", ", attr, keys);
> > + PRINT_FIELD_ADDR64(", ", attr, values);
> > + PRINT_FIELD_U(", ", attr, count);
> > + PRINT_FIELD_FD(", ", attr, map_fd, tcp);
> > + PRINT_FIELD_FLAGS(", ", attr, elem_flags,
> > + bpf_map_lookup_elem_flags, "BPF_???");
> > + PRINT_FIELD_X(", ", attr, flags);
> > +
> > + return 0;
>
> If I understand correctly, this results to END_BPF_CMD_DECODER being
> skipped on entering syscall, which essentially means that
> decode_attr_extra_data is not going to be called on entering,
> and the closing "}" is not going to be printed either.
You're right, the "tprintf" should be moved before the "return 0".
>
> I suppose this side effect of an innocent looking "return 0" statement
> was not intended.
>
> > + }
> > +
> > + count = get_tcb_priv_ulong(tcp);
> > + if (count != attr.count) {
> > + PRINT_FIELD_U("=> {", attr, count);
> > + tprintf("}");
> > + }
>
> Here, on exiting syscall, converesely, END_BPF_CMD_DECODER is not skipped, which means
> that both decode_attr_extra_data is going to be called and the closing "}"
> is going to be printed.
>
> I suppose the intent was exactly the opposite, i.e. to execute
> END_BPF_CMD_DECODER on entering syscall and skip it on exiting.
That's the expected behavior actually (apart from the "tprintf" that
should be moved). I tried to follow the implementation of clone3 [1]
such that the following would be printed if attr.count is changed:
bpf(BPF_MAP_LOOKUP_BATCH, {[...], count=32, [...]} => {count=30}, 40)
with everything after and including "=>" printed on syscall exit.
However, I missed that the bpf_attr structure for the batch commands is
not anonymous, so this should be:
bpf(BPF_MAP_LOOKUP_BATCH, {batch={[...], count=32, [...]} => {count=30}}, 40)
I'll send a v4 to fix that and the tprintf.
1 - https://github.com/strace/strace/blob/f17cbaa4bf2ae64af1b56c099ed878c461d53804/clone.c#L225-L227
Paul
More information about the Strace-devel
mailing list