[RFC] make strace handle SIGTRAP properly
Denys Vlasenko
dvlasenk at redhat.com
Fri May 20 19:24:21 UTC 2011
On Fri, 2011-05-20 at 16:18 +0200, Tejun Heo wrote:
> Hello, Denys.
>
> On Fri, May 20, 2011 at 02:08:03PM +0200, Denys Vlasenko wrote:
> > During recent lkml discussions about fixing some long-standing problems
> > with ptrace, I had to look in strace source and experiment with it a
> > bit. (CC-ing some participants).
> >
> > One irritating thing I noticed is that we *still* don't handle
> > user-generated SIGTRAPs. There are users who do want that to work:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=162774
> >
> > Currently, strace "handles" SIGTRAP by code like this:
> >
> > #if defined (I386)
> >     if (upeek(tcp, 4*EAX, &eax) < 0)
> >         return -1;
> >     if (eax != -ENOSYS && !(tcp->flags & TCB_INSYSCALL)) {
> >         if (debug)
> >             fprintf(stderr, "stray syscall exit: eax = %ld\n", eax);
> >         return 0;
> >     }
> > #elif ...
>
> That is truly scary; however, it doesn't have to be this scary even
> without TRACESYSGOOD.
I don't see how one can determine that a given SIGTRAP is not a
syscall-stop, let alone whether it is a "real" SIGTRAP (such as
one generated with "kill -trap") or one generated by execve.
> Primarily, what's being reported to the ptracer via wait(2) (and
> SIGCHLD) are traps that the tracee is taking. Entering a trap means
> that the tracee leaves RUNNING state and enters TRACED state. When in
> this state, the tracee won't resume execution until directed to do so
> by the ptracer with PTRACE_CONT.
>
> There are different trap sites inside the kernel, which appear as
> different trap types to the ptracer. Most trap sites are there to
> report certain events: the tracee is about to do something or has
> just finished something. Other than continuing (and possibly
> injecting a signal via @data depending on the trap site, but please
> don't do this), there isn't a whole lot the ptracer wants to do with
> these traps themselves.
>
> However, two are somewhat special. The first one is the signal
> delivery trap. This trap site sits in the signal delivery path (of
> course) and gets triggered right after the signal is dequeued from
> the pending queue but before it is actually delivered. The ptracer
> can change the siginfo and signo or even squash the signal altogether.
>
> The second is group stop trap. When a stop signal is received, the
> whole process (task group) enters group stop. IOW, delivery of a stop
> signal by any task in a process initiates group stop and generation
> (not delivery, so the action of sending signal itself) of SIGCONT ends
> it. Once group stop is initiated, each task in the process
> participates in the group stop by stopping if not ptraced and by
> trapping at group stop trap site if ptraced.[1]
I believe my current documentation draft does describe all of the above.
> So, when a trap is reported via wait(2), the first thing to do is to
> determine which trap the tracee has taken. This should have been easy
> and apparent; unfortunately it is a bit convoluted and undocumented,
> but it's doable.
>
> The following exit_code values are used:
>
> * The signal being delivered for signal delivery trap.
>
> * The signal number which initiated the group stop for group stop
> traps.
>
> * SIGTRAP | optional PTRACE_EVENT_* << 8 for other traps.
>
> However, it's immediately apparent that exit_code itself isn't
> sufficient in determining the specific trap site taken.
> PTRACE_GETSIGINFO can shed some light.
>
> * On signal delivery, it contains the siginfo of the signal. In this
> case, si_code always contains either 0 or negative number. It
> can't contain a positive number no matter how the signal is
> generated.
Not always. Compile this with gcc -m32 -nostartfiles -nostdlib FILE.s:

_start: .globl _start
        int3
        movl $42,%ebx
        movl $1,%eax
        int $0x80

This delivers SIGTRAP with si_code = 0x80, which is not a negative
number.
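
For reference, 0x80 is the value the Linux headers define as SI_KERNEL,
which fits a trap raised by the kernel itself rather than by kill(2) or
sigqueue(3). Below is a minimal sketch of the sign test under
discussion. The helper name si_code_from_user is an assumption made for
illustration and is not an strace function.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>

/* On Linux, SI_KERNEL has the same value as the si_code seen above. */
static_assert(SI_KERNEL == 0x80, "SI_KERNEL is expected to be 0x80 on Linux");

/*
 * Hypothetical helper (not part of strace): returns nonzero when si_code
 * is 0 or negative, which is the range the quoted text attributes to
 * signal delivery. SI_USER is 0 and SI_QUEUE is negative, so signals
 * sent with kill(2) or sigqueue(3) pass; SI_KERNEL (0x80) does not.
 */
static int si_code_from_user(int si_code)
{
    return si_code <= 0;
}
```

So a test of the form "si_code <= 0" alone would not classify the int3
case above as a signal delivery.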
> * On group stop, there is no siginfo and PTRACE_GETSIGINFO will fail
> with -EINVAL.
>
> * On other traps, si_code equals exit_code, i.e. SIGTRAP | optional
> PTRACE_EVENT_* << 8.
>
> So, here are the steps a program can take to determine the trap type.
>
> 1. Test whether exit_code is SIGTRAP | PTRACE_EVENT_* << 8. If so,
> it's one of PTRACE_EVENT traps (let's include TRACESYSGOOD in this
> category too).
>
> 2. Otherwise, execute PTRACE_GETSIGINFO. If the returned si_code is
> 0 or negative, signal is being delivered. If there's no siginfo,
> tracee is participating in a group stop. If si_code equals
> exit_code (it would have to be SIGTRAP), it stopped for a ptrace
> trap without PTRACE_EVENT_* code (!TRACESYSGOOD).
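
The two quoted steps can be sketched as a pure function over the wait(2)
status and the result of PTRACE_GETSIGINFO. This is only an illustration
under stated assumptions: the enum, the function name classify_stop and
its parameters are hypothetical and do not exist in strace, and the
caller is assumed to have already issued PTRACE_GETSIGINFO itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical names for illustration; none of these exist in strace. */
enum stop_kind {
    STOP_PTRACE_EVENT,    /* PTRACE_EVENT_* stop or TRACESYSGOOD syscall stop */
    STOP_SIGNAL_DELIVERY, /* si_code is 0 or negative */
    STOP_GROUP_STOP,      /* PTRACE_GETSIGINFO failed: no siginfo */
    STOP_PLAIN_TRAP,      /* si_code == exit_code == SIGTRAP */
    STOP_UNKNOWN          /* none of the quoted rules matched */
};

/*
 * status:       value returned by wait(2) for a stopped tracee
 * have_siginfo: nonzero if PTRACE_GETSIGINFO succeeded
 * si_code:      siginfo.si_code, meaningful only if have_siginfo is nonzero
 */
static enum stop_kind classify_stop(int status, int have_siginfo, int si_code)
{
    assert(WIFSTOPPED(status));

    int exit_code = (status >> 8) & 0xffff; /* SIGTRAP | event << 8 */
    int sig = exit_code & 0xff;             /* same as WSTOPSIG(status) */
    int event = exit_code >> 8;             /* PTRACE_EVENT_* or 0 */

    /* Step 1: PTRACE_EVENT_* traps, with TRACESYSGOOD counted among them. */
    if (sig == (SIGTRAP | 0x80) || (sig == SIGTRAP && event != 0))
        return STOP_PTRACE_EVENT;

    /* Step 2: consult the siginfo. */
    if (!have_siginfo)
        return STOP_GROUP_STOP;
    if (si_code <= 0)
        return STOP_SIGNAL_DELIVERY;
    if (si_code == exit_code)
        return STOP_PLAIN_TRAP;

    /* E.g. the int3 case above: exit_code is SIGTRAP, si_code is 0x80. */
    return STOP_UNKNOWN;
}
```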
This last bit of info is interesting, thanks. I added it to
"Ptrace documentation, draft #3" I just sent.
--
vda