detach() logic
Dmitry V. Levin
ldv at altlinux.org
Wed Jun 19 14:35:38 UTC 2013
On Wed, Jun 19, 2013 at 03:51:44PM +0200, Denys Vlasenko wrote:
> On 06/19/2013 02:35 PM, Dmitry V. Levin wrote:
> > On Wed, Jun 19, 2013 at 01:11:56PM +0200, Denys Vlasenko wrote:
> >> Dmitry, I am a bit worried about the flow in this function still.
> >> Let's take a look:
> >>
> >>     error = ptrace(PTRACE_DETACH, tcp->pid, 0, 0);
> >>     if (error == 0) {
> >>         /* On a clear day, you can see forever. */
> >>     }
> >>     else if (errno != ESRCH) {
> >>         /* Shouldn't happen. */
> >>         perror_msg("detach: ptrace(PTRACE_DETACH, ...)");
> >>     }
> >>     else
> >>         /* ESRCH: process is either not stopped or doesn't exist. */
> >>         if (my_tkill(tcp->pid, 0) < 0) {
> >>             if (errno != ESRCH)
> >>                 /* Shouldn't happen. */
> >>                 perror_msg("detach: checking sanity");
> >>             /* else: process doesn't exist. */
> >>                      ^^^^^^^^^^^^^^^^^^^^^
> >> Well, it may not exist already, but was it *waited for*?
> >> IOW: we may still need to enter waitpid loop.
> >> This may rarely trigger - say, we do "strace -p PROCESS",
> >> and process exits just as we ^C the strace,
> >> and we may end up here.
> >> OTOH, not-waited-for child reparents to init when we exit,
> >> so... do we ever detach() NOT on strace exit, where dead
> >> children are a problem? I see one location:
> >>     if (event == PTRACE_EVENT_EXEC) {
> >>         if (detach_on_execve && !skip_one_b_execve)
> >>             detach(tcp); /* do "-b execve" thingy */
> >> Maybe in the name of correctness we should wait for the process
> >> if we see ESRCH? Possibly with WNOHANG for paranoid reasons.
> >
> > In case of "-b execve", the tracee is in syscall-stop state already, so
>
> To nitpick, it is in PTRACE_EVENT_EXEC stop...
>
> ...or rather, we only know that it *was* in PTRACE_EVENT_EXEC.
>
> It may no longer be true if it was suddenly nuked by SIGKILL
> a microsecond later while we are calling detach() on it.
Or an untraced thread called execve() at this moment.
> Then DETACH fails with ESRCH, tkill(0) fails with ESRCH (I guess...),
> and with current code we do nothing, leaving a zombie.
>
> Actually, that may be a good thing, we want *its parent*
> to consume its exit status. But that parent can be *us* if we act
> on our own child.
Yes, exactly. If TCB_STRACE_CHILD bit is set, then strace is the parent
and therefore is expected to wait for it.
> > PTRACE_DETACH should succeed and there should be no need to wait (and if
> > PTRACE_DETACH failed, then the tracee is no more so strace is expected
> > to wait for it).
>
> My point is, we *don't* wait if both DETACH and probing tkill(0)
> fail with ESRCH. This might be wrong in some situations.
I suppose in that case, if TCB_STRACE_CHILD bit is set, strace should
waitpid the tracee, expecting ECHILD or WIFSIGNALED status.
--
ldv