clone on Linux
Ganesan Rajagopal
rganesan at myrealbox.com
Wed Jan 3 08:48:47 UTC 2001
>>>>> "Ganesan" == Ganesan Rajagopal <rganesan at novell.com> writes:
> Thank you! This has been *very* helpful. I could get SUBTERFUGUE working
> very nicely with a test pthreads program. Though I am only a beginning
> Python programmer I am able to follow the code in p_linux_i386_trick.py
> (which I assume has the relevant code). I would like to add this support to
> strace if no one is duplicating the effort already (Wichert, is anyone
> actively working on fixing clone support for strace?).
Okay, I spent some time with this and the new kernel sources and have got strace
working with the 2.4.0-prerelease kernel. There was really very little to do:
Mike (SUBTERFUGUE) and other kernel developers have already done the real
work needed in the kernel (thanks, guys!).
> Anyway, coming back to strace, I have been debugging strace with
> multi-threaded programs in two scenarios. When the program is started under
> strace control, strace doesn't trace the cloned processes without the -f
> option. With the -f option strace hangs in a wait4() call. Calling
> strace with the -p option doesn't seem to do anything useful. I am still
> debugging this.
All that is needed is the __WALL option in wait4(). I started debugging
strace -p with a cloned process and found that wait4() returned
ECHILD. Looking at the kernel sources, it was clear that the __WALL option would
do the trick. I made this change and I am able to get something working - I
am really excited about this :-). I also got some sort of clone support
working with the 2.2.18 kernel by using the __WCLONE option. More on that later.
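To illustrate the wait4() behaviour described above, here is a minimal sketch.
It is not part of the patch and not how strace itself is structured. It assumes
Linux with glibc, and the helper names (spawn_clone_child, wait_exit_code,
demo_clone_wait) are made up for illustration. A child created by clone() with
an exit signal other than SIGCHLD is not seen by a plain wait4(), which fails
with ECHILD; passing __WCLONE or __WALL lets the parent reap it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Child body: exit at once with a recognizable status. */
static int child_fn(void *arg)
{
    (void)arg;
    return 7;
}

/*
 * Create a "clone child": flags == 0 means no shared address space and
 * an exit signal of 0 instead of SIGCHLD. Returns the child pid or -1.
 */
static pid_t spawn_clone_child(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    pid_t pid;

    if (stack == NULL)
        return -1;
    /* Assumes a downward-growing stack, so pass the top of the buffer. */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE, 0, NULL);
    /* Without CLONE_VM the child runs on its own copy of this memory. */
    free(stack);
    return pid;
}

/* Wait for pid with the given options: exit code, or -errno on failure. */
static int wait_exit_code(pid_t pid, int options)
{
    int status;
    pid_t r;

    do {
        r = wait4(pid, &status, options, NULL);
    } while (r == -1 && errno == EINTR);
    if (r == -1)
        return -errno;
    /* 256 marks "terminated, but not by a normal exit". */
    return WIFEXITED(status) ? WEXITSTATUS(status) : 256;
}

/* Returns 0 if the child could be created and behaved as described. */
static int demo_clone_wait(void)
{
    pid_t pid = spawn_clone_child();
    int plain, all;

    if (pid <= 0)
        return -1;
    plain = wait_exit_code(pid, 0);      /* clone child is not eligible */
    all = wait_exit_code(pid, __WALL);   /* clone and non-clone children */
    assert(plain == -ECHILD);
    assert(all == 7);
    return 0;
}
```

Calling demo_clone_wait() from a small main() is enough to try this out. On a
kernel without __WALL the second wait would fail with EINVAL instead, which is
the case the patch falls back from.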
There are some problems, though:
1. When you hit Ctrl-C to come out of strace -p, the traced process seems to
be left stopped. Sending a CONT signal from a terminal seems to get it going
again.
2. Once in a while I get the message "--- SIGRT_0 (Real-time signal 0) ---"
and strace exits. This happens more often with the 2.2.18 kernel. I have not
debugged this further, but it appears that strace needs to handle a
LinuxThreads-specific signal.
3. Starting a multi-threaded program under strace -f control does not work
with 2.2.18. It works on the 2.4.0-prerelease kernel; however, strace hangs
in wait4() just before exit, waiting for a process that has already become
<defunct>. This also appears to be a problem with SIGRT_0.
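The workaround in problem 1 amounts to sending SIGCONT to the stopped
process. A minimal sketch of that, assuming a child that stops itself with
SIGSTOP as a stand-in for the process strace leaves behind (no ptrace is
involved here, and the helper names are hypothetical):

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that stops itself, then exits with status 3 once resumed. */
static pid_t spawn_stopped_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        raise(SIGSTOP);   /* child freezes here until SIGCONT arrives */
        _exit(3);
    }
    return pid;
}

/* Resume a stopped child and reap it: exit code, or -1 on any failure. */
static int resume_and_reap(pid_t pid)
{
    int status;

    assert(pid > 0);
    /* WUNTRACED makes waitpid() report the stop instead of blocking. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return -1;
    /* Same effect as "kill -CONT <pid>" from a terminal. */
    if (kill(pid, SIGCONT) == -1)
        return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Whether strace should send the signal itself when it detaches is a separate
question; the sketch only shows that SIGCONT is enough to get a stopped
process running again.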
> Looking at the strace code, it appears that strace explicitly sets a break
> point after a fork()/clone() and then removes it later. It also converts a
> vfork() into a fork(). SUBTERFUGUE instead appears to translate fork() and
> vfork() into a clone call with CLONE_PTRACE flag set. I can't find any
> information if CLONE_PTRACE also automatically sends a SIGSTOP to the child.
I didn't touch this for now. However, it does seem to be a good idea to
clone with the CLONE_PTRACE flag and have the child stop automatically instead
of inserting a breakpoint manually.
> Finally, what happens when you PTRACE_ATTACH to a cloned process? Does it
> automatically set CLONE_PTRACE flag for that process?
Looking at the kernel sources for the ptrace() system call, this does not
appear to be the case. I think this needs to be fixed in the kernel. The
strace patch is attached.
Index: strace.c
===================================================================
RCS file: /cvsroot/strace/strace/strace.c,v
retrieving revision 1.20
diff -u -r1.20 strace.c
--- strace.c 2000/09/03 23:57:48 1.20
+++ strace.c 2001/01/03 08:45:51
@@ -1586,6 +1586,15 @@
#else /* !USE_PROCFS */
+#ifdef LINUX
+#ifndef __WCLONE
+#define __WCLONE 0x80000000
+#endif
+#ifndef __WALL
+#define __WALL 0x40000000
+#endif
+#endif /* LINUX */
+
static int
trace()
{
@@ -1594,6 +1603,8 @@
int status;
struct tcb *tcp;
#ifdef LINUX
+ /* __WALL is only supported by 2.4 kernels */
+ static int wait4_options = __WALL;
struct rusage ru;
#endif /* LINUX */
@@ -1601,7 +1612,26 @@
if (interactive)
sigprocmask(SIG_SETMASK, &empty_set, NULL);
#ifdef LINUX
- pid = wait4(-1, &status, 0, cflag ? &ru : NULL);
+ pid = wait4(-1, &status, wait4_options, cflag ? &ru : NULL);
+ if (pid == -1 && (wait4_options & __WALL) && errno == EINVAL) {
+ /* this kernel does not support __WALL */
+ wait4_options &= ~__WALL;
+ errno = 0;
+ pid = wait4(-1, &status, wait4_options,
+ cflag ? &ru : NULL);
+ }
+ if (pid == -1 && !(wait4_options & __WALL) && errno == ECHILD) {
+ /* most likely a "cloned" process */
+ pid = wait4(-1, &status, __WCLONE,
+ cflag ? &ru : NULL);
+ if (pid == -1) {
+ fprintf(stderr, "strace: clone wait4 "
+ "failed: %s\n", strerror(errno));
+ }
+ }
+
+
+
#endif /* LINUX */
#ifdef SUNOS4
pid = wait(&status);
--
R. Ganesan (rganesan at novell.com) | Ph: 91-80-5721856 Ext: 2149
Novell India Development Center. | #include <std_disclaimer.h>