detach() logic

Dmitry V. Levin ldv at altlinux.org
Wed Jun 19 14:35:38 UTC 2013


On Wed, Jun 19, 2013 at 03:51:44PM +0200, Denys Vlasenko wrote:
> On 06/19/2013 02:35 PM, Dmitry V. Levin wrote:
> > On Wed, Jun 19, 2013 at 01:11:56PM +0200, Denys Vlasenko wrote:
> >> Dmitry, I am a bit worried about the flow in this function still.
> >> Let's take a look:
> >>
> >>                 error = ptrace(PTRACE_DETACH, tcp->pid, 0, 0);
> >>                 if (error == 0) {
> >>                         /* On a clear day, you can see forever. */
> >>                 }
> >>                 else if (errno != ESRCH) {
> >>                         /* Shouldn't happen. */
> >>                         perror_msg("detach: ptrace(PTRACE_DETACH, ...)");
> >>                 }
> >>                 else
> >>                 /* ESRCH: process is either not stopped or doesn't exist. */
> >>                 if (my_tkill(tcp->pid, 0) < 0) {
> >>                         if (errno != ESRCH)
> >>                                 /* Shouldn't happen. */
> >>                                 perror_msg("detach: checking sanity");
> >>                         /* else: process doesn't exist. */
> >> ^^^^^^^^^^^^^^^^^
> >> Well, it may not exist already, but was it *waited for*?
> >> IOW: we may still need to enter waitpid loop.
> >> This may rarely trigger - say, we do "strace -p PROCESS",
> >> and process exits just as we ^C the strace,
> >> and we may end up here.
> >> OTOH, not-waited-for child reparents to init when we exit,
> >> so... do we ever detach() NOT on strace exit, where dead
> >> children are a problem? I see one location:
> >>   if (event == PTRACE_EVENT_EXEC) {
> >>       if (detach_on_execve && !skip_one_b_execve)
> >>               detach(tcp); /* do "-b execve" thingy */
> >> Maybe in the name of correctness we should wait for the process
> >> if we see ESRCH? Possibly with WNOHANG for paranoid reasons.
> > 
> > In case of "-b execve", the tracee is in syscall-stop state already, so
> 
> To nitpick, it is in PTRACE_EVENT_EXEC stop...
> 
> ...or rather, we only know that it *was* in PTRACE_EVENT_EXEC.
> 
> It may no longer be true if it was suddenly nuked by SIGKILL
> a microsecond later while we are calling detach() on it.

Or an untraced thread called execve() at this moment.

> Then DETACH fails with ESRCH, tkill(0) fails with ESRCH (I guess...),
> and with current code we do nothing, leaving a zombie.
> 
> Actually, that may be a good thing, we want *its parent*
> to consume its exit status. But that parent can be *us* if we act
> on our own child.

Yes, exactly.  If TCB_STRACE_CHILD bit is set, then strace is the parent
and therefore is expected to wait for it.

> > PTRACE_DETACH should succeed and there should be no need to wait (and if
> > PTRACE_DETACH failed, then the tracee is no more so strace is expected
> > to wait for it).
> 
> My point is, we *don't* wait if both DETACH and probing tkill(0)
> fail with ESRCH. This might be wrong in some situations.

I suppose in that case, if TCB_STRACE_CHILD bit is set, strace should
waitpid the tracee, expecting ECHILD or WIFSIGNALED status.
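
A rough sketch of that extra step, only to illustrate the idea.  It is not
a patch against strace: reap_after_esrch(), the enum and the
is_strace_child argument (standing in for a TCB_STRACE_CHILD check) are
hypothetical names, and the real code would presumably also pass __WALL.
WNOHANG is used so that a tracee that is not a zombie yet cannot block us.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of strace. */
enum reap_result {
	REAP_SKIPPED,	/* not our child: its real parent should wait */
	REAP_SIGNALED,	/* status collected: killed by a signal */
	REAP_EXITED,	/* status collected: exited normally */
	REAP_PENDING,	/* our child, but no status available yet */
	REAP_ECHILD,	/* nothing to wait for */
	REAP_ERROR	/* unexpected waitpid result */
};

/*
 * To be called after both PTRACE_DETACH and the tkill(pid, 0) probe
 * failed with ESRCH.  is_strace_child stands in for the
 * TCB_STRACE_CHILD flag check.
 */
static enum reap_result
reap_after_esrch(pid_t pid, int is_strace_child)
{
	int status;
	pid_t rc;

	if (!is_strace_child)
		return REAP_SKIPPED;

	rc = waitpid(pid, &status, WNOHANG);
	if (rc < 0)
		return errno == ECHILD ? REAP_ECHILD : REAP_ERROR;
	if (rc == 0)
		return REAP_PENDING;
	if (WIFSIGNALED(status))
		return REAP_SIGNALED;
	if (WIFEXITED(status))
		return REAP_EXITED;
	return REAP_ERROR;
}

/* Demo: fork a child, SIGKILL it, then poll until it is reaped. */
static enum reap_result
demo_killed_child(void)
{
	const struct timespec ts = { 0, 1000000 };	/* 1 ms */
	enum reap_result r;
	pid_t pid = fork();

	if (pid < 0)
		return REAP_ERROR;
	if (pid == 0)
		for (;;)
			pause();	/* child: sleep until killed */

	kill(pid, SIGKILL);
	while ((r = reap_after_esrch(pid, 1)) == REAP_PENDING)
		nanosleep(&ts, NULL);
	return r;
}
```

The demo polls only because there is a short window between kill()
returning and the child becoming waitable; in detach() a single WNOHANG
attempt may well be enough.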


-- 
ldv