[PATCH] check process death if ptrace() fails, v2

Wed Dec 10 06:17:30 UTC 2008

> Makes sense, although a few ptrace() calls are special
> (they are not acting on tcp and/or they can fail
> and this isn't a problem).

Yes.  PTRACE_ATTACH and PTRACE_TRACEME are special, for example.
Ones where failure is explicitly ignored already are also fine too.

> The problem is, we are somewhere in a middle of post-syscall
> processing, busily extracting register values, struct stat's
> etc, and suddenly ptrace fails. We can stuff errno, ptrace op,
> params, etc, into tcp->ptrace_errno, tcp->ptrace_last_op,
> this is easy. "Not easy" part comes here - what to do next
> with the rest of unfinished syscall result processing?

A ptrace failure in trace_syscall or its callees will bail out early and
not do any more after the first failing ptrace call.  That's fine.

> > What I think we want is to record the errno in a new tcb field
> > and go back to the normal loop.
> 
> What we will do with it? When to print it?
> If task has died, we don't print it (print "killed by" instead),
> this is ok. If it did not die, it does not show up in waitpid,
> and we never touch this tcb again. Thus, we will never use
> tcp->ptrace_errno.

If we resume normal processing, then we'll get to the PTRACE_SYSCALL at
"tracing:".  If PTRACE_SYSCALL fails with ESRCH, it means either that the
task is already dead, or that it is not really stopped.  In both of those
cases, we will see it in a future wait call.  If PTRACE_SYSCALL gets some
other error, that's unexpected and we can barf as it does now.

If a following PTRACE_SYSCALL succeeded, then it means the earlier ptrace
failure was not because the task was gone, but was some other specific
problem in fetching memory or registers.  (Getting ESRCH for such errors
would be unexpected.)  However, the rest of tracing can proceed.  So for
this, instead of barfing and bailing all of strace eagerly as we do now,
we could just print a graceful message like the "<unavailable>" in place of
the missing output for the problem syscall, and then go on as normal.  

> >  but probably better instead to have a
> > check for a saved ptrace error in the main loop after the WIFSIGNALED and
> > WIFEXITED cases.  If we get further and there is a saved ptrace error, then
> > print the "<unavailable>" or perhaps something with the strerror string.
> 
> I'm not sure what are you referring to by "get further".

I mean when it didn't take the WIFSIGNALED or WIFEXITED cases (which do
"continue;") so it gets to:

		if (!WIFSTOPPED(status)) {
			fprintf(stderr, "PANIC: pid %u not stopped\n", pid);
			droptcb(tcp);
			continue;
		}
		if (debug)
			fprintf(stderr, "pid %u stopped, [%s]\n",
				pid, signame(WSTOPSIG(status)));

Below there is the normal signal/syscall trace processing.  What I meant
here was that after a successful PTRACE_SYSCALL we just go around again
(not printing anything yet).  Then if wait gets a WIFSTOPPED report for a
tcb with a recorded ptrace error, print the pending error ("<unavailable>"
or whatever it is) only then (probably in printleader).

> Testing on  hp-rx6600-03.rhts.bos.redhat.com:/root/fix472053/strace.2:
> 
> [root at hp-rx6600-03 strace.2]# ./strace ./k; echo $?
> execve("./k", ["./k"], [/* 22 vars */]) = 0
> ...
> getpid()                                = 20732
> kill(20732, SIGKILL <unavailable>
> +++ killed by SIGKILL +++
> Killed
> 137

Here I think <unfinished ...> makes most sense if you got the SIGKILL
report with no syscall-exit report.  That is, always use that marker if the
syscall-entry report for PID X was followed by a report other than the
syscall-exit report for PID X.

If you got a syscall-exit report but then got PEEKUSR errors, then ideal
would be:

kill(20732, SIGKILL)			 = ? <unavailable>
+++ killed by SIGKILL +++

In another case where there was result-arg printing, that might be:

foo(123, 345, ... <unavailable>)	= ? <unavailable>

for a case where it would normally be:

foo(123, 345, "result buffer")		= 13

That is, whenever there is a syscall-exit report seen, then there is a ")"
and a "= something" printed.

> Here is the part which illustrates how we are forced to struggle
> with unknowns:
> 
>                 if (trace_syscall(tcp) < 0) {
> +if (tcp->ptrace_errno) {
> + tcp->ptrace_errno = 0;
> + secure_ptrace_SYSCALL(tcp, 0);
> + continue;
> +}
> 
> Here, syscall decode finally bailed out, confused and angry.
> I saw in testing that without this code block, strace
> bails out completely.

Yes, that's the behavior we're going to change.  But don't do it with a
"continue;".  As I described above, let it get to the regular logic.
That is:

		if (trace_syscall(tcp) < 0 && !tcp->ptrace_errno) {

or something along those lines.

> With it, I basically blindly reset error indicator and blindly
> perform PTRACE_SYSCALL. I can't just fall through
> to already existing PTRACE_SYSCALL further below, it would error
> out. Does not look too nice.

Please make the "tracing:" call apply the logic I described above.

Probably the right place to reset ptrace_errno is in printleader or
whereever exactly it's used to decide to print something.

Thanks,
Roland