[PATCH v3] bpf: support new commands BPF_MAP_*_BATCH

Paul Chaignon paul.chaignon at gmail.com
Sat Apr 4 13:44:34 UTC 2020


On Fri, Apr 03, 2020 at 08:14:27PM +0300, Dmitry V. Levin wrote:
> On Sun, Mar 29, 2020 at 11:49:44AM +0200, Paul Chaignon wrote:
> > * xlat/bpf_commands.in (BPF_MAP_LOOKUP_BATCH): New constant introduced
> > by Linux commit v5.6-rc1~151^2~46^2~23^2~7.
> > (BPF_MAP_LOOKUP_AND_DELETE_BATCH): New constant introduced by Linux
> > commit v5.6-rc1~151^2~46^2~23^2~4.
> > (BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH): New constants introduced
> > by Linux commit v5.6-rc1~151^2~46^2~23^2~6.
> > * bpf.c (BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH),
> > BEGIN_BPF_CMD_DECODER(BPF_MAP_UPDATE_BATCH),
> > BEGIN_BPF_CMD_DECODER(BPF_MAP_DELETE_BATCH)): Decode the new commands.
> > (decode_BPF_MAP_LOOKUP_AND_DELETE_BATCH): New macro.
> > (SYS_FUNC(bpf)): Decode the new commands.
> > * bpf_attr.h (BPF_MAP_LOOKUP_BATCH_struct,
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct, BPF_MAP_UPDATE_BATCH_struct,
> > BPF_MAP_DELETE_BATCH_struct): New structs introduced by Linux commit
> > v5.6-rc1~151^2~46^2~23^2~7.
> > (BPF_MAP_LOOKUP_BATCH_struct_size,
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct_size,
> > BPF_MAP_UPDATE_BATCH_struct_size, BPF_MAP_DELETE_BATCH_struct_size): New
> > macros.
> > * NEWS: Mention this.
> > * tests/bpf.c (BPF_MAP_LOOKUP_BATCH_checks, BPF_MAP_UPDATE_BATCH_checks,
> > BPF_MAP_DELETE_BATCH_checks): Tests for the new commands.
> > 
> > Signed-off-by: Paul Chaignon <paul.chaignon at gmail.com>
> 
> Thanks, see my comments below.

Thanks for the review Dmitry!

> 
> > ---
> > Changelogs:
> >   Changes in v3:
> >   - Add tests for BPF_MAP_UPDATE_BATCH and BPF_MAP_DELETE_BATCH.
> >   - Do not print unused fields in_batch and out_batch.
> >   - Print count on entering and exiting.
> >   - Fix printing of flags.
> >   Changes in v2:
> >   - Add ATTRIBUTE_ALIGNED(8) to uint64_t attributes.
> >   - Rebase on master.
> > 
> >  NEWS                 |  3 ++
> >  bpf.c                | 86 ++++++++++++++++++++++++++++++++++++++++++++
> >  bpf_attr.h           | 24 +++++++++++++
> >  tests/bpf.c          | 74 ++++++++++++++++++++++++++++++++++++++
> >  xlat/bpf_commands.in |  4 +++
> >  5 files changed, 191 insertions(+)
> > 
> > diff --git a/NEWS b/NEWS
> > index 6806c1e3..2965d629 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -14,6 +14,9 @@ Noteworthy changes in release ?.? (????-??-??)
> >       and syscall time output (using --absolute-timestamps,
> >       --relative-timestamps, and --syscall-times options, respectively).
> >    * Implemented PTRACE_GETREGS API support on hppa.
> > +  * Implemented decoding of BPF_MAP_LOOKUP_BATCH,
> > +    BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, and
> > +    BPF_MAP_DELETE_BATCH bpf syscall commands.
> >    * Enhanced decoding of BPF_MAP_CREATE and BPF_PROG_ATTACH bpf syscall
> >      commands.
> >    * Updated lists of BPF_* constants.
> > diff --git a/bpf.c b/bpf.c
> > index c432181c..04fa369e 100644
> > --- a/bpf.c
> > +++ b/bpf.c
> > @@ -954,6 +954,88 @@ BEGIN_BPF_CMD_DECODER(BPF_TASK_FD_QUERY)
> >  }
> >  END_BPF_CMD_DECODER(RVAL_DECODED)
> >  
> > +BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH)
> > +{
> > +	unsigned long count;
> > +
> > +	if (entering(tcp)) {
> > +		set_tcb_priv_ulong(tcp, attr.count);
> > +
> > +		PRINT_FIELD_ADDR64("{", attr, in_batch);
> > +		PRINT_FIELD_ADDR64(", ", attr, out_batch);
> > +		PRINT_FIELD_ADDR64(", ", attr, keys);
> > +		PRINT_FIELD_ADDR64(", ", attr, values);
> > +		PRINT_FIELD_U(", ", attr, count);
> > +		PRINT_FIELD_FD(", ", attr, map_fd, tcp);
> > +		PRINT_FIELD_FLAGS(", ", attr, elem_flags,
> > +				  bpf_map_lookup_elem_flags, "BPF_???");
> > +		PRINT_FIELD_X(", ", attr, flags);
> > +
> > +		return 0;
> 
> If I understand correctly, this results to END_BPF_CMD_DECODER being
> skipped on entering syscall, which essentially means that
> decode_attr_extra_data is not going to be called on entering,
> and the closing "}" is not going to be printed either.

You're right, the "tprintf" should be moved before the "return 0".

> 
> I suppose this side effect of an innocent looking "return 0" statement
> was not intended.
> 
> > +	}
> > +
> > +	count = get_tcb_priv_ulong(tcp);
> > +	if (count != attr.count) {
> > +		PRINT_FIELD_U("=> {", attr, count);
> > +		tprintf("}");
> > +	}
> 
> Here, on exiting syscall, converesely, END_BPF_CMD_DECODER is not skipped, which means
> that both decode_attr_extra_data is going to be called and the closing "}"
> is going to be printed.
> 
> I suppose the intent was exactly the opposite, i.e. to execute
> END_BPF_CMD_DECODER on entering syscall and skip it on exiting.

That's the expected behavior actually (apart from the "tprintf" that
should be moved).  I tried to follow the implementation of clone3 [1]
such that the following would be printed if attr.count is changed:

  bpf(BPF_MAP_LOOKUP_BATCH, {[...], count=32, [...]} => {count=30}, 40)

with everything after and including "=>" printed on syscall exit.

However, I missed that the bpf_attr structure for the batch commands is
not anonymous, so this should be:

  bpf(BPF_MAP_LOOKUP_BATCH, {batch={[...], count=32, [...]} => {count=30}}, 40)

I'll send a v4 to fix that and the tprintf.

1 - https://github.com/strace/strace/blob/f17cbaa4bf2ae64af1b56c099ed878c461d53804/clone.c#L225-L227 

Paul


More information about the Strace-devel mailing list