[PATCH v3] bpf: support new commands BPF_MAP_*_BATCH

Paul Chaignon paul.chaignon at gmail.com
Sat Apr 4 16:13:37 UTC 2020


On Sat, Apr 04, 2020 at 05:05:55PM +0300, Dmitry V. Levin wrote:
> On Sat, Apr 04, 2020 at 03:44:34PM +0200, Paul Chaignon wrote:
> > On Fri, Apr 03, 2020 at 08:14:27PM +0300, Dmitry V. Levin wrote:
> > > On Sun, Mar 29, 2020 at 11:49:44AM +0200, Paul Chaignon wrote:
> > > > * xlat/bpf_commands.in (BPF_MAP_LOOKUP_BATCH): New constant introduced
> > > > by Linux commit v5.6-rc1~151^2~46^2~23^2~7.
> > > > (BPF_MAP_LOOKUP_AND_DELETE_BATCH): New constant introduced by Linux
> > > > commit v5.6-rc1~151^2~46^2~23^2~4.
> > > > (BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH): New constants introduced
> > > > by Linux commit v5.6-rc1~151^2~46^2~23^2~6.
> > > > * bpf.c (BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH),
> > > > BEGIN_BPF_CMD_DECODER(BPF_MAP_UPDATE_BATCH),
> > > > BEGIN_BPF_CMD_DECODER(BPF_MAP_DELETE_BATCH)): Decode the new commands.
> > > > (decode_BPF_MAP_LOOKUP_AND_DELETE_BATCH): New macro.
> > > > (SYS_FUNC(bpf)): Decode the new commands.
> > > > * bpf_attr.h (BPF_MAP_LOOKUP_BATCH_struct,
> > > > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct, BPF_MAP_UPDATE_BATCH_struct,
> > > > BPF_MAP_DELETE_BATCH_struct): New structs introduced by Linux commit
> > > > v5.6-rc1~151^2~46^2~23^2~7.
> > > > (BPF_MAP_LOOKUP_BATCH_struct_size,
> > > > BPF_MAP_LOOKUP_AND_DELETE_BATCH_struct_size,
> > > > BPF_MAP_UPDATE_BATCH_struct_size, BPF_MAP_DELETE_BATCH_struct_size): New
> > > > macros.
> > > > * NEWS: Mention this.
> > > > * tests/bpf.c (BPF_MAP_LOOKUP_BATCH_checks, BPF_MAP_UPDATE_BATCH_checks,
> > > > BPF_MAP_DELETE_BATCH_checks): Tests for the new commands.
> > > > 
> > > > Signed-off-by: Paul Chaignon <paul.chaignon at gmail.com>

[...]

> > > > diff --git a/bpf.c b/bpf.c
> > > > index c432181c..04fa369e 100644
> > > > --- a/bpf.c
> > > > +++ b/bpf.c
> > > > @@ -954,6 +954,88 @@ BEGIN_BPF_CMD_DECODER(BPF_TASK_FD_QUERY)
> > > >  }
> > > >  END_BPF_CMD_DECODER(RVAL_DECODED)
> > > >  
> > > > +BEGIN_BPF_CMD_DECODER(BPF_MAP_LOOKUP_BATCH)
> > > > +{
> > > > +	unsigned long count;
> > > > +
> > > > +	if (entering(tcp)) {
> > > > +		set_tcb_priv_ulong(tcp, attr.count);
> > > > +
> > > > +		PRINT_FIELD_ADDR64("{", attr, in_batch);
> > > > +		PRINT_FIELD_ADDR64(", ", attr, out_batch);
> > > > +		PRINT_FIELD_ADDR64(", ", attr, keys);
> > > > +		PRINT_FIELD_ADDR64(", ", attr, values);
> > > > +		PRINT_FIELD_U(", ", attr, count);
> > > > +		PRINT_FIELD_FD(", ", attr, map_fd, tcp);
> > > > +		PRINT_FIELD_FLAGS(", ", attr, elem_flags,
> > > > +				  bpf_map_lookup_elem_flags, "BPF_???");
> > > > +		PRINT_FIELD_X(", ", attr, flags);
> > > > +
> > > > +		return 0;
> > > 
> > > If I understand correctly, this results to END_BPF_CMD_DECODER being
> > > skipped on entering syscall, which essentially means that
> > > decode_attr_extra_data is not going to be called on entering,
> > > and the closing "}" is not going to be printed either.
> > 
> > You're right, the "tprintf" should be moved before the "return 0".
> > 
> > > 
> > > I suppose this side effect of an innocent looking "return 0" statement
> > > was not intended.
> > > 
> > > > +	}
> > > > +
> > > > +	count = get_tcb_priv_ulong(tcp);
> > > > +	if (count != attr.count) {
> > > > +		PRINT_FIELD_U("=> {", attr, count);
> > > > +		tprintf("}");
> > > > +	}
> > > 
> > > Here, on exiting syscall, converesely, END_BPF_CMD_DECODER is not skipped, which means
> > > that both decode_attr_extra_data is going to be called and the closing "}"
> > > is going to be printed.
> > > 
> > > I suppose the intent was exactly the opposite, i.e. to execute
> > > END_BPF_CMD_DECODER on entering syscall and skip it on exiting.
> > 
> > That's the expected behavior actually (apart from the "tprintf" that
> > should be moved).  I tried to follow the implementation of clone3 [1]
> > such that the following would be printed if attr.count is changed:
> > 
> >   bpf(BPF_MAP_LOOKUP_BATCH, {[...], count=32, [...]} => {count=30}, 40)
> > 
> > with everything after and including "=>" printed on syscall exit.
> 
> It's important to call decode_attr_extra_data on entering because this
> extra data might be the reason why syscall returned an error and our
> decoder was skipped on exiting syscall.

Ok, that makes sense.

Then, I think I need to do "END_BPF_CMD_DECODER(0)" for entering and
"return RVAL_DECODED" for exiting.  With the appropriate tprintf, that
should give me:

  bpf(BPF_MAP_LOOKUP_BATCH, {batch={[...], count=32, [...]} => {count=30}, ...}, 40)

when decode_attr_extra_data() actually prints something.

> 
> This is the reason why the decoder of clone3 syscall calls
> print_nonzero_bytes on entering syscall.
> 
> 
> -- 
> ldv


More information about the Strace-devel mailing list