[rfc] new generic ptrace api for setting syscall info

Renzo Davoli renzo at cs.unibo.it
Tue Nov 2 13:50:58 UTC 2021

I am sorry for the delay. Life synch problems.
> at a high level, i like where this is going.  i think we should break them up as
> (1) that's how you get things reviewed & accepted faster,
> (2) certain pieces need more help from arches than others,
> (3) they look easy to break up & make logically independent,
> (4) i selfishly care more for some aspects than others :),
> (5) i want to avoid an "all or nothing" type of functionality merge.
Agree, entirely, including (4) for my interests.
> the functionality that matters to me with PTRACE_SET_SYSCALL_INFO:
> * set syscall nr (e.g. skip it) 
> * change syscall arguments
> * set syscall return value
> * set syscall error value
> imo, without these, PTRACE_SET_SYSCALL_INFO is not super useful, and seems like
> it'd be hard to partially implement any of these.  especially as syscall_set_nr
> is the worst performer, and is a bit of the foundation here.  a survey of the
> current asm/syscall.h tree:
> * syscall_set_nr
>   (1/22) only csky has this atm
> * syscall_set_arguments
>   (14/22) missing alpha arc h8300 hexagon m68k mips parisc sh
> * syscall_set_return_value
>   (17/22) missing alpha h8300 hexagon m68k sh
arc arm arm64 csky hexagon ia64 mips nds32 nios2 openrisc parisc powerpc riscv s390 sh sparc x86 xtensa

It is reasonable to design PTRACE_SET_SYSCALL_INFO for the same set of archs.
* syscall_set_nr
  (1/18) only csky has this atm
* syscall_set_arguments
  (12/18) missing arc hexagon mips parisc sh
* syscall_set_return_value
  (16/18) missing hexagon sh
> you've also thrown in these:
> * change IP
> * change SP
> i'm guessing you've done that simply to mirror the API?  i would argue that
> these are significantly less interesting (i won't say never as i could think
> of a few).  
Not entirely. In vuos we set the IP to repeat the system call.
For what concerns the change of SP, I am with you: it is just to preserve the simmetry.
In similar situations, it has happened to me many times that I discover a suitable usage later.
> so what if we start off with a PTRACE_SET_SYSCALL_INFO that does
> not support changing these values.  i'm not sure we could get away with using
> a sentinel value here (e.g. require they be set to -1 else we EINVAL), so
> maybe requiring them to be the same as would be OK ?  it's reasonable imo to
> assume the tracer first called PTRACE_GET_SYSCALL_INFO before modifying, so
> they'll have easy access to the current values.  if they don't match, we will
> return EINVAL.  this will allow merging of the API without also implementing
> the corresponding APIs in asm/ptrace.h.
> * instruction_pointer_set
>   (11/22) missing alpha arc h8300 hexagon ia64 m68k microblaze nds32 nios2 openrisc xtensa
(12/18) missing arc hexagon ia64 nds32 nios2 xtensa
> * stack_pointer_set
>   (4/22) only mips riscv sh x86
(4/18) only mips riscv sh x86
> i guess we're just ignoring arch entirely.  not unreasonable considering
> changing execution architecture state is ... not useful, if even possible
> for many architectures.  
> so should we enforce it be 0 and return an EINVAL
> if it isn't ?  otherwise it turns into a field we can never use for anything
> else in the set path.  afaict, there is no AUDIT_xxx value that is zero atm,
> and never will be since it encodes the ELF machine (EM_*), and EM_NONE==0.
I envision the default usage of PTRACE_SET_SYSCALL_INFO to be:
  ptrace(PTRACE_GET_SYSCALL_INFO, tracee_pid, sizeof(info), &info)
	/* modify something in info */
  ptrace(PTRACE_SET_SYSCALL_INFO, tracee_pid, sizeof(info), &info)
so I propose to ignore the value of arch, otherwise programmers need to remember to 
set the field to zero prior to call PTRACE_SET_SYSCALL_INFO.

Looking at the future and aiming at the goal to have the entire PTRACE_SET_SYSCALL_INFO
implementend although through some steps, we need to provide a way to inform
the programmers about which features have been implemented (change IP/change SP), so that
those features can be used if present or a workaround applied.

I propose to check if IP/SP have been modified on archs where change IP/SP
have not implemented yet and return an EINVAL in case.
The same approach could be applied to an attempt to modify arch.
> next you have seccomp improvements.  these are interesting, but i think
> important to not block PTRACE_SET_SYSCALL_INFO on since seccomp support is
> a bit spotty already across arches.  and it seems like it calls for a bit
> of API bikeshedding.  while your reuse of the existing INFO ops doesn't
> feel too terrible since there are direct parallels between them, adding a
> new op that only shows up on set feels a little awkward.  it'd be nice if
> we could stick to PTRACE_SYSCALL_INFO_SECCOMP somehow.  with the fields we
> have now, we have the syscall number, args, and return value.  so can't we
> do what you want ?  there's no access to setting an error distinct from a
> return value ... but oes seccomp allow that either ?  if not, extending
> that limitation from seccomp to ptrace-seccomp doesn't feel bad.

One of the features we agree to support in the first patch to propose is:
> * set syscall nr (e.g. skip it)
Skip the system call also means to provide a suitable return value/errno
to be returned to the caller.
So in this case we need to encode for PTRACE_SET_SYSCALL_INFO in some way:
 * the info that the syscall must be skipped 
 * rval/is_error.
we provide a specific tag (as I propose) PTRACE_SYSCALL_INFO_SECCOMP_SKIP
and we use the "exit" branch of the union in  struct ptrace_syscall_info
to provide rval/errno,
we set seccomp.nr to -1 and then we need to "recycle" the fields in the seccomp
branch (nr, args[6], and ret_data).
seccomp.ret_data is not suitable for the syscall return value as it is _u32, on 64 bits archs
one may need to return 64 bit values. Modify the struct ptrace_syscall_info is
out of scope, it would break the compatibility with the PTRACE_GET_SYSCALL current usage.
So we should use two of the args for rval/errno.

I am quite supportive for my proposal, When a user wants to skip the system call,
they use a specific tag and set the "exit" fields, as it is an exit, not an entry.
There is no syscall to enter if it has to be skipped.

> in terms of contributing, i think you can see where my interests align,
> and so i'd be happy to chip in for those parts.  but for others, i'll
> just wish you the best of luck :).

I think we have a quite good alignemnt. Luck is always needed anyway.


More information about the Strace-devel mailing list