clone on Linux

Ganesan Rajagopal rganesan at myrealbox.com
Wed Jan 3 08:48:47 UTC 2001


>>>>> "Ganesan" == Ganesan Rajagopal <rganesan at novell.com> writes:

> Thank you! This has been *very* helpful. I could get SUBTERFUGUE working
> very nicely with a test pthreads program. Though I am only a beginning
> Python programmer I am able to follow the code in p_linux_i386_trick.py
> (which I assume has the relevant code). I would like to add this support to
> strace if no one is duplicating the effort already (Wichert, is anyone
> actively working on fixing clone support for strace?).

Okay, I spent some time with this and the new kernel sources, and have got
strace working with the 2.4.0-prerelease kernel. There was really very little
to do; Mike (SUBTERFUGUE) and other kernel developers have already done the
real work needed in the kernel (thanks, guys!).

> Anyway, coming back to strace, I have been debugging strace with
> multi-threaded programs in two scenarios. When the program is started under
> strace control, strace doesn't trace the cloned processes without the -f
> option. With the -f option strace hangs in a wait4() call. Calling
> strace with a -p option doesn't seem to do anything useful. I am still
> debugging this.

All that is needed is a __WALL option in wait4(). I started debugging
strace -p with a cloned process and found that wait4 returned ECHILD.
Looking at the kernel sources, it was clear that the __WALL option would
do the trick. I made this change and I am able to get something working - I
am really excited about this :-). I also got some sort of clone support
working with the 2.2.18 kernel by using the __WCLONE option. More on that later.

There are some problems, though:

1. When you hit Ctrl-C and come out of strace -p, the traced process seems to
be left stopped; sending a CONT signal from a terminal seems to get it going
again.

2. Once in a while I get a message "--- SIGRT_0 (Real-time signal 0) ---"
and strace exits. This happens more often with the 2.2.18 kernel. I have not
debugged this further but it appears that strace needs to handle a
LinuxThreads specific signal.

3. Starting a multi-threaded program under strace -f control does not work
with 2.2.18. It works on the 2.4.0-prerelease kernel; however, strace hangs
in wait4 just before exit for a process that has already become
<defunct>. This also appears to be a problem with SIGRT_0.

> Looking at the strace code, it appears that strace explicitly sets a break
> point after a fork()/clone() and then removes it later. It also converts a
> vfork() into a fork(). SUBTERFUGUE instead appears to translate fork() and
> vfork() into a clone call with CLONE_PTRACE flag set. I can't find any
> information if CLONE_PTRACE also automatically sends a SIGSTOP to the child.

I didn't touch this for now. However, it does seem to be a good idea to
clone with the CLONE_PTRACE flag and have the child stop automatically instead
of inserting a breakpoint manually.
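
As a rough illustration of the flag translation described above, the sketch below computes the clone() flags a tracer could substitute for fork() and vfork(). The names spawn_kind and traced_clone_flags() are hypothetical and appear in neither strace nor SUBTERFUGUE. It covers the flag arithmetic only; whether the kernel also stops the new child automatically is the open question from the quoted text and is not assumed here.

```c
#include <assert.h>
#include <signal.h>

/* Values from the Linux kernel headers, defined only if missing. */
#ifndef CLONE_VM
#define CLONE_VM     0x00000100
#endif
#ifndef CLONE_PTRACE
#define CLONE_PTRACE 0x00002000
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK  0x00004000
#endif

/* Hypothetical: which call the traced program originally made. */
enum spawn_kind { SPAWN_FORK, SPAWN_VFORK };

/*
 * Hypothetical helper: clone() flags equivalent to fork() or vfork(),
 * with CLONE_PTRACE added so the child of a traced process is traced too.
 */
static unsigned long traced_clone_flags(enum spawn_kind kind)
{
    /* fork() behaves like clone() with only SIGCHLD as the exit signal. */
    unsigned long flags = SIGCHLD | CLONE_PTRACE;

    assert(kind == SPAWN_FORK || kind == SPAWN_VFORK);

    /* vfork() additionally shares memory and suspends the parent. */
    if (kind == SPAWN_VFORK)
        flags |= CLONE_VFORK | CLONE_VM;
    return flags;
}
```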

> Finally, what happens when you PTRACE_ATTACH to a cloned process? Does it
> automatically set CLONE_PTRACE flag for that process?

Looking at the kernel sources for the ptrace() system call, this does not
appear to be true. I think this needs to be fixed in the kernel. My strace
patch is attached.

Index: strace.c
===================================================================
RCS file: /cvsroot/strace/strace/strace.c,v
retrieving revision 1.20
diff -u -r1.20 strace.c
--- strace.c	2000/09/03 23:57:48	1.20
+++ strace.c	2001/01/03 08:45:51
@@ -1586,6 +1586,15 @@
 
 #else /* !USE_PROCFS */
 
+#ifdef LINUX
+#ifndef __WCLONE
+#define __WCLONE	0x80000000
+#endif
+#ifndef __WALL
+#define __WALL		0x40000000
+#endif
+#endif /* LINUX */
+	
 static int
 trace()
 {
@@ -1594,6 +1603,8 @@
 	int status;
 	struct tcb *tcp;
 #ifdef LINUX
+	/* __WALL is only supported by 2.4 kernels */
+	static int wait4_options = __WALL;
 	struct rusage ru;
 #endif /* LINUX */
 
@@ -1601,7 +1612,26 @@
 		if (interactive)
 			sigprocmask(SIG_SETMASK, &empty_set, NULL);
 #ifdef LINUX
-		pid = wait4(-1, &status, 0, cflag ? &ru : NULL);
+		pid = wait4(-1, &status, wait4_options, cflag ? &ru : NULL);
+		if (pid < 0 && (wait4_options & __WALL) && errno == EINVAL) {
+			/* this kernel does not support __WALL */
+			wait4_options &= ~__WALL;
+			errno = 0;
+			pid = wait4(-1, &status, wait4_options,
+				    cflag ? &ru : NULL);
+		}
+		if (pid < 0 && !(wait4_options & __WALL) && errno == ECHILD) {
+			/* most likely a "cloned" process */
+			pid = wait4(-1, &status, __WCLONE,
+				    cflag ? &ru : NULL);
+			if (pid == -1) {
+				fprintf(stderr, "strace: clone wait4 "
+					"failed: %s\n", strerror(errno));
+			}
+		}
+			
+		
+		    
 #endif /* LINUX */
 #ifdef SUNOS4
 		pid = wait(&status);

-- 
R. Ganesan (rganesan at novell.com)       | Ph: 91-80-5721856 Ext: 2149
Novell India Development Center.       | #include <std_disclaimer.h>




