[PATCH] do not detach from tracee if we get ptrace error

Denys Vlasenko dvlasenk at redhat.com
Tue Jan 3 19:12:54 UTC 2012


Before this patch, if a thread was killed by an exit in another thread
while we happened to be poking it, we printed the "????(" marker
and detached from the thread. Since we removed the "detach before death"
logic, this no longer matches how the exits of other threads are handled.

An example:

[pid  1780] exit_group(1)               = ?
[pid  1778] ????( <unfinished ...>
Process 1778 detached
[pid  3881] exit_group(42)              = ?
[pid  3880] exit_group(42)              = ?
[pid  5860] +++ exited with 1 +++
[pid  2444] +++ exited with 1 +++
[pid  2440] +++ exited with 1 +++
[pid  2437] +++ exited with 1 +++
[pid  2434] +++ exited with 1 +++
[pid  1856] +++ exited with 1 +++
[pid  1853] +++ exited with 1 +++
[pid  3881] +++ exited with 42 +++
[pid  3880] +++ exited with 42 +++
[pid  1849] +++ exited with 1 +++
[pid  1780] +++ exited with 1 +++
[pid  1765] +++ exited with 1 +++
[pid  1533] +++ exited with 1 +++
[pid  1365] +++ exited with 1 +++
[pid  1356] +++ exited with 1 +++
[pid  1352] +++ exited with 1 +++
[pid  1351] +++ exited with 1 +++
+++ exited with 1 +++

Note that "+++ exited with 1 +++" is never printed for 1778.

After the patch:

[pid 17765] exit_group(1)               = ?
[pid 32362] exit_group(42)              = ?
[pid 32362] +++ exited with 42 +++
[pid 21680] <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 42}], 0) = 32362
[pid 17791] futex(0x2b98c4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 21680] --- {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=32362, si_status=42, si_utime=0, si_stime=0} (Child exited) ---
[pid 21680] ????( <unfinished ...>
[pid 21517] +++ exited with 1 +++
[pid 17791] +++ exited with 1 +++
[pid 21986] +++ exited with 1 +++
[pid 22499] +++ exited with 1 +++
[pid 21680] +++ exited with 1 +++
[pid 22502] <... futex resumed> )       = ? <unavailable>
[pid 22502] +++ exited with 1 +++
[pid 22035] +++ exited with 1 +++
[pid 18224] +++ exited with 1 +++
[pid 17903] +++ exited with 1 +++
[pid 17899] +++ exited with 1 +++
[pid 17816] +++ exited with 1 +++
[pid 17813] +++ exited with 1 +++
[pid 17804] +++ exited with 1 +++
[pid 17768] +++ exited with 1 +++
[pid 17765] +++ exited with 1 +++
[pid 17787] +++ exited with 1 +++
+++ exited with 1 +++

Now 21680's exit is shown in the same way as exits of all other threads.


* strace (trace): do not detach from tracee which experienced ptrace error.

-- 
vda

diff -d -urpN strace.1/strace.c strace.2/strace.c
--- strace.1/strace.c	2012-01-03 19:42:51.154290001 +0100
+++ strace.2/strace.c	2012-01-03 19:23:13.800107214 +0100
@@ -2606,10 +2606,18 @@ trace()
  					tprints(" <unfinished ...>");
  					printtrailer();
  				}
-				detach(tcp);
+				/* We assume that ptrace error was caused by process death.
+				 * We used to detach(tcp) here, but since we no longer
+				 * implement "detach before death" policy/hack,
+				 * we can let this process report its death to us
+				 * normally, via WIFEXITED or WIFSIGNALED wait status.
+				 */
  			} else {
+				/* When does this happen? */
+				/* FIXME: PTRACE_KILL is deprecated - use tgkill(SIGKILL)? */
  				ptrace(PTRACE_KILL,
  					tcp->pid, (char *) 1, SIGTERM);
+				/* FIXME: don't do this? */
  				droptcb(tcp);
  			}
  			continue;



