[PATCH] make strace handle SIGTRAP properly

Wed May 25 11:53:50 UTC 2011

On Wed, 2011-05-25 at 02:23 +0400, Dmitry V. Levin wrote:
> > +static void error_msg_and_die(const char *fmt, ...) /* TODO: __attribute__ ((noreturn, format (printf, 1, 2))) */ ;
> > +static void error_msg_and_die(const char *fmt, ...)
> 
> I really don't like this interface, it would be less flexible than error(3).

I used to use [p]error_msg[_and_die](const char *fmt, ...)
functions in several other projects. They proved to be no less versatile
than error(), but are significantly more readable. Compare:

error(1, errno, "Can't write to '%s'", filename);

with:

perror_msg_and_die("Can't write to '%s'", filename);

I would need to read "man error" to understand that 1st form
exits. In second form, it's blindingly obvious.

FOO_and_die() has two less params and can be annotated "noreturn" to
help gcc optimize calls better. error() can't be annotated as such.

Also, "man error" says:

"CONFORMING TO
       These functions and variables are GNU extensions, and should not
be used in programs intended to be portable."

Seeing many directives a-la "#if defined(SUNOS4) || defined(FREEBSD)"
I had an impression strace does not assume we are on glibc...


> > +{
> > +	char *msg;
> > +	va_list p;
> > +
> > +	va_start(p, fmt);
> > +	msg = NULL;
> > +	vasprintf(&msg, fmt, p);
> > +	if (msg) {
> > +		fprintf(stderr, "%s: %s\n", progname, msg);
> > +		free(msg);
> > +	}
> > +	va_end(p);
> > +
> > +	cflag = 0; /* or else cleanup may print summary */
> > +        cleanup();
> > +	fflush(NULL);
> > +        _exit(1);
> 
> When such function is called in parent process, exit() is more appropriate
> than _exit().  Otherwise, cleanup() must not be called.

I don't want to be limited to calling error_msg_and_die in parent
process only. However, since we don't use atexit et al, exit(1)
is essentially fflush(NULL) + _exit(1). I can use that.

You are right regarding cleanup(): I need to compare getpid() with the
pid of stracing process (usually parent, unless -D), and skip cleanup()
if they don't match.

Fixing...


> > +/*
> > + * Test whether the kernel support PTRACE_O_TRACESYSGOOD.
> > + * First fork a new child, call ptrace(PTRACE_SETOPTIONS) on it,
> > + * and then see whether it will stop with (SIGTRAP | 0x80).
> > + */
> > +static void
> > +test_ptrace_setoptions_for_all(void)
> > +{
> > +	const unsigned int test_options = PTRACE_O_TRACESYSGOOD | PTRACE_O_TRACEEXEC;
> > +	int pid;
> > +	int it_worked = 0;
> > +
> > +	pid = fork();
> > +	if (pid < 0)
> > +		error_msg_and_die("fork failed");
> > +
> > +	if (pid == 0) {
> > +		pid = getpid();
> > +		if (ptrace(PTRACE_TRACEME, 0L, 0L, 0L) < 0)
> > +			/* "parent, something is deeply wrong!" */
> > +			kill(pid, SIGKILL);
> 
> The approach used in test_ptrace_setoptions() was just to _exit(1) in this
> case.  Why this test does this part differently?

_exit(1) here will exit *in the child*. Parent would see WIFEXITED and
decide that it's just merely a test failire, and will continue (without
using options it tested for).

Whereas the more logical action is to abort, because PTRACE_TRACEME not
working is a *very* unexpected situation. We probably are better off
bailing out than trying to continue "against all odds".

kill(pid, SIGKILL) you pointed at will result parent taking
if (!WIFSTOPPED(status)) ... branch here:

                if (WIFEXITED(status))
                        break;
                if (!WIFSTOPPED(status)) {
                        kill(pid, SIGKILL);
                        error_msg_and_die("%s: unexpected wait status %x", __func__, status);
                }

which is exactly what I want to happen in this case. _exit(1) would go
through "if (WIFEXITED(status)) break".

> 
> > +		kill(pid, SIGSTOP);
> > +		_exit(0); /* parent should see entry into this syscall */
> > +	}
> > +
> > +	while (1) {
> > +		int status, tracee_pid;
> > +
> > +		errno = 0;
> > +		tracee_pid = wait(&status);
> > +		if (tracee_pid <= 0) {
> > +			if (errno == EINTR)
> > +				continue;
> 
> In test_ptrace_setoptions() there is a check for errno == ECHILD.
> I suppose it won't harm in this case, too.

This check should do... what? Something like this?

if (errno == ECHILD)
    error_msg_and_die("Your kernel is terminally broken: it can't even "
            "properly wait for a single child. Bailing out");

If yes, then the below code does that already:

> > +			kill(pid, SIGKILL);
> > +			error_msg_and_die("%s: unexpected wait result %d", __func__, tracee_pid);
> 
> Well, I'm not quite happy with test_* functions aborting the whole strace
> unconditionally.

There are cases when it makes sense to abort. This is such case:
wait(&status) doesn't see the child we _just_ created_. This is
total "WTF???" case. I don't see how continuing here is better.

Alternative POV can be: this code path most likely will never
be taken, but if it will be, we are going to see users reporting
the message which will be easy to track down. Whereas continuing here
would just make strace not use an option, hiding this "WTF?" case
from users and us. (Not that I expect strace to actually work on kernels
which can't even wait() for one simple child. It will probably dies
later anyway, but possibly with much more obscure error msg...)

> > +		}
> > +		if (WIFEXITED(status))
> > +			break;
> 
> I'd also check for exit status here.

We need to exit loop regardless of exit code: there is no child anymore,
we have no one to wait for.

However, just for paranoid reasons we can check that WEXITSTATUS == 0.
What do you propose to do if it isn't zero?

> > +		if (!WIFSTOPPED(status)) {
> > +			kill(pid, SIGKILL);
> > +			error_msg_and_die("%s: unexpected wait status %x", __func__, status);
> > +		}
> 
> Same unhappiness about error_msg_and_die.

The child is very simple:

                if (ptrace(PTRACE_TRACEME, 0L, 0L, 0L) < 0)
                        /* "parent, something is deeply wrong!" */
                        kill(pid, SIGKILL);
                kill(pid, SIGSTOP);
                _exit(0); /* parent should see entry into this syscall */

But we saw neither WIFEXITED nor WIFSTOPPED from it. What should we do?
I prefer to die, giving user enough info to provide a useful bug report.
"error_msg_and_die("%s: unexpected wait status %x", __func__, status)"
does exactly that.


> > +		if (WSTOPSIG(status) == SIGSTOP) {
> > +			/*
> > +			 * We don't check "options aren't accepted" error.
> > +			 * If it happens, we'll never get (SIGTRAP | 0x80),
> > +			 * and thus will decide to not use the option.
> > +			 * IOW: the outcome of the test will be the same.
> > +			 */
> > +			ptrace(PTRACE_SETOPTIONS, pid, 0L, test_options);
> 
> I'm not sure that this lengthy comment is more obvious than the check
> itself.

Please clarify: do you propose to perform the check and emit error msg
if this ptrace() call failed?
(without exiting - here it's a survivable error).

> > +		}
> > +		if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
> > +			it_worked = 1;
> > +		}
> > +		if (ptrace(PTRACE_SYSCALL, pid, 0L, 0L) < 0) {
> > +			kill(pid, SIGKILL);
> > +			error_msg_and_die("%s: PTRACE_SYSCALL doesn't work");
> 
> What about ESRCH?

Since we know exactly what child does, we _know_ PTRACE_SYSCALL must
succeed here.

> > @@ -2597,7 +2701,7 @@ Process %d attached (waiting for parent)
> >  			fprintf(stderr, "pid %u stopped, [%s]\n",
> >  				pid, signame(WSTOPSIG(status)));
> >  
> > -		if (ptrace_setoptions && (status >> 16)) {
> > +		if (status >> 16) {
> 
> Is this change accurate enough?
> How this (status >> 16) should be handled if strace haven't requested it?

All PTRACE_EVENTs defined by Linux so far can be harmlessly ignored.
We can emit a diagnostic if we see an unexpected one.


> >  int debug = 0, followfork = 0;
> > -unsigned int ptrace_setoptions = 0;
> > +unsigned int ptrace_setoptions_followfork = 0;
> > +static unsigned int ptrace_setoptions_for_all = 0;
> > +/* Which WSTOPSIG(status) value marks syscall traps? */
> > +static unsigned int SYSCALLTRAP = SIGTRAP;
> 
> I think giving macros-like names to variables is confusing,
> even when these variables are going to be used in place of macros.
 

Agreed. Renaming to syscall_trap_sig.

Please take a look at the patch below.

-- 
vda


diff -d -urpN strace.8/strace.c strace.9/strace.c

--- strace.8/strace.c	2011-05-24 23:06:41.064757632 +0200
+++ strace.9/strace.c	2011-05-25 13:47:47.308762755 +0200
@@ -87,7 +87,7 @@ int debug = 0, followfork = 0;
 unsigned int ptrace_setoptions_followfork = 0;
 unsigned int ptrace_setoptions_for_all = 0;
 /* Which WSTOPSIG(status) value marks syscall traps? */
-static unsigned int SYSCALLTRAP = SIGTRAP;
+static unsigned int syscall_trap_sig = SIGTRAP;
 int dtime = 0, xflag = 0, qflag = 0;
 cflag_t cflag = CFLAG_NONE;
 static int iflag = 0, interactive = 0, pflag_seen = 0, rflag = 0, tflag = 0;
@@ -116,6 +116,7 @@ int tracing_paths = 0;
 
 static int exit_code = 0;
 static int strace_child = 0;
+static int strace_tracer_pid = 0;
 
 static char *username = NULL;
 uid_t run_uid;
@@ -229,15 +230,20 @@ static void error_msg_and_die(const char
 	msg = NULL;
 	vasprintf(&msg, fmt, p);
 	if (msg) {
+		fflush(NULL);
 		fprintf(stderr, "%s: %s\n", progname, msg);
 		free(msg);
 	}
 	va_end(p);
 
-	/* TODO? cflag = 0; -- or else cleanup() may print summary */
-	cleanup();
-	fflush(NULL);
-	_exit(1);
+	/* Careful: don't do cleanup not from tracer process */
+	if (strace_tracer_pid == getpid()) {
+		/* TODO? cflag = 0; -- or else cleanup() may print summary */
+		cleanup();
+	}
+
+	/*fflush(NULL); - exit does it */
+	exit(1);
 }
 
 #ifdef SVR4
@@ -616,7 +622,8 @@ startup_child(char **argv)
 	if ((pid != 0 && daemonized_tracer) /* parent: to become a traced process */
 	 || (pid == 0 && !daemonized_tracer) /* child: to become a traced process */
 	) {
-		pid = getpid();
+		if (pid == 0)
+			pid = getpid();
 #ifdef USE_PROCFS
 		if (outf != stderr) close(fileno(outf));
 #ifdef MIPS
@@ -710,6 +717,8 @@ startup_child(char **argv)
 	}
 
 	/* We are the tracer.  */
+	strace_tracer_pid = (pid != 0 ? pid : getpid());
+
 	tcp = alloctcb(daemonized_tracer ? getppid() : pid);
 	if (daemonized_tracer) {
 		/* We want subsequent startup_attach() to attach to it.  */
@@ -864,7 +873,7 @@ test_ptrace_setoptions_for_all(void)
 	}
 
 	if (it_worked) {
-		SYSCALLTRAP = (SIGTRAP | 0x80);
+		syscall_trap_sig = (SIGTRAP | 0x80);
 		ptrace_setoptions_for_all = test_options;
 		if (debug)
 			fprintf(stderr, "ptrace_setoptions_for_all = %#x\n",
@@ -889,6 +898,8 @@ main(int argc, char *argv[])
 
 	progname = argv[0] ? argv[0] : "strace";
 
+	strace_tracer_pid = getpid();
+
 	/* Allocate the initial tcbtab.  */
 	tcbtabsize = argc;	/* Surely enough for all -p args.  */
 	if ((tcbtab = calloc(tcbtabsize, sizeof tcbtab[0])) == NULL) {
@@ -1000,7 +1011,7 @@ main(int argc, char *argv[])
 					progname, optarg);
 				break;
 			}
-			if (pid == getpid()) {
+			if (pid == strace_tracer_pid) {
 				fprintf(stderr, "%s: I'm sorry, I can't let you do that, Dave.\n", progname);
 				break;
 			}
@@ -1858,7 +1869,7 @@ detach(struct tcb *tcp, int sig)
 				break;
 			}
 			error = ptrace_restart(PTRACE_CONT, tcp,
-					WSTOPSIG(status) == SYSCALLTRAP ? 0
+					WSTOPSIG(status) == syscall_trap_sig ? 0
 					: WSTOPSIG(status));
 			if (error < 0)
 				break;
@@ -2512,6 +2523,8 @@ handle_ptrace_event(int status, struct t
 			fprintf(stderr, "PTRACE_EVENT_EXEC on pid %d (ignored)\n", tcp->pid);
 		return 0;
 	}
+	/* Some PTRACE_EVENT_foo we didn't ask for?! */
+	fprintf(stderr, "Unexpected status %x on pid %d\n", status, tcp->pid);
 	return 1;
 }
 #endif
@@ -2751,7 +2764,7 @@ Process %d attached (waiting for parent)
 			goto tracing;
 		}
 
-		if (WSTOPSIG(status) != SYSCALLTRAP) {
+		if (WSTOPSIG(status) != syscall_trap_sig) {
 			if (WSTOPSIG(status) == SIGSTOP &&
 					(tcp->flags & TCB_SIGTRAPPED)) {
 				/*