[PATCH v5 1/3] Introduce -l/--syscall-limit option
Dmitry V. Levin
ldv at strace.io
Tue Mar 28 11:53:30 UTC 2023
Hi,
On Tue, Mar 28, 2023 at 01:11:50PM +0530, Sahil Siddiq wrote:
> On Monday, 27 March 2023 03:00:37 IST Dmitry V. Levin wrote:
> > On Sun, Mar 26, 2023 at 08:34:17PM +0530, Sahil Siddiq wrote:
> > > On Sunday, 26 March 2023 16:06:04 IST Dmitry V. Levin wrote:
> > > > On Sun, Mar 26, 2023 at 03:57:24PM +0530, Sahil Siddiq wrote:
> > > > > Hi,
> > > > >
> > > > > Thank you for the feedback. There are a few things that I haven't
> > > > > really understood.
> > > > >
> > > > > On Sunday, 26 March 2023 01:11:07 IST Dmitry V. Levin wrote:
> > > > > > On Mon, Mar 20, 2023 at 11:10:56AM +0530, Sahil Siddiq wrote:
> > > > > > [...]
> > > > > > for example, a tracee
> > > > > > doesn't invoke syscalls, strace -l won't finish.
> > > > >
> > > > > I didn't understand this example. In case the argument to -l is greater
> > > > > than the number of syscalls that are invoked, wouldn't strace proceed
> > > > > as usual? If the tracee does not invoke any syscall, the syscall_limit
> > > > > counter will not decrease in syscall_exiting_trace().
> > > >
> > > > Here is an example: take your strace--syscall-limit.c test, insert sleep(60)
> > > > right before the waitpid loop, and see what happens.
> > >
> > > I am still a bit confused. I inserted sleep(60) so that the test snippet now looks
> > > like this:
> > >
> > > 	sleep(60);
> > > 	while ((waitpid(child, &status, 0)) != child) {
> > > 		if (errno == EINTR)
> > > 			continue;
> > > 		perror_msg_and_fail("waitpid: %d", child);
> > > 	}
> > >
> > > I first ran the following command:
> > >
> > > strace --signal='!SIGCHLD,SIGCONT' --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3 ./strace--syscall-limit
> > >
> > > This gives the same output that I would expect. It prints only the first three syscalls
> > > that were traced along with the "process <pid> attached/detached" log lines and it
> > > terminates after that.
> > >
> > > However, when running "bash ./strace--syscall-limit.test", the test indeed does not
> > > terminate.
> > >
> > > Running "ps f" gives the following output:
> > >
> > > 11441 pts/2 Ss 0:00 /bin/bash
> > > 254572 pts/2 S+ 0:00 \_ bash ./strace--syscall-limit.test
> > > 254648 pts/2 S+ 0:00 \_ ../../src/strace -o log --signal=!SIGCHLD,SIGCONT --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3
> > > 254651 pts/2 S+ 0:00 \_ ../strace--syscall-limit
> > > 254652 pts/2 Z+ 0:00 \_ [strace--syscall] <defunct>
> > >
> > > I wonder why this is the case. I'll try to figure this out.
> >
> > The test doesn't hang. What you see is strace waiting for the
> > syscall behind sleep(60) to exit; at that moment strace will detach that
> > process and the test will finish.
>
> Sorry, I am still trying to understand this. I think I have understood what's
> going on here. But I am still trying to figure out why this is happening. Is
> it only because the tracee gets detached immediately after the event occurs or
> is there something else too?

Yes, the tracee is detached only after the exit stop event occurs.

> The tracee will detach only after the syscall
> behind sleep(60) finally exits successfully, right?

Correct.

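To illustrate the point, here is a minimal sketch of a ptrace loop. It is
not strace's actual code: trace_until_limit() and kill_and_reap() are
hypothetical names, the child simply calls getppid() forever, and every
syscall is counted rather than a filtered set. The limit is only checked
when waitpid() reports a syscall-exit-stop, so a tracee blocked inside a
long syscall keeps the tracer blocked in waitpid() as well.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill the child and collect its exit status (hypothetical helper). */
void
kill_and_reap(pid_t pid)
{
	int status;

	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
}

/*
 * Trace a child that loops on getppid() and detach after `limit`
 * syscall-exit-stops.  Returns the number of exit stops seen,
 * or -1 on error.
 */
int
trace_until_limit(int limit)
{
	pid_t child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);		/* let the parent set options first */
		for (;;)
			getppid();
	}

	int status;
	if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
		return -1;
	/* Make syscall stops distinguishable: they report SIGTRAP | 0x80. */
	if (ptrace(PTRACE_SETOPTIONS, child, NULL,
		   (void *) (long) PTRACE_O_TRACESYSGOOD) < 0) {
		kill_and_reap(child);
		return -1;
	}

	int exits = 0;
	int in_syscall = 0;
	while (exits < limit) {
		if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0) {
			kill_and_reap(child);
			return -1;
		}
		/* The tracer sleeps here until the tracee reaches its next stop. */
		if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
			return -1;
		if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
			/* Syscall stops alternate: entry, exit, entry, ... */
			in_syscall = !in_syscall;
			if (!in_syscall)
				++exits;	/* the limit is counted on exit only */
		}
	}

	ptrace(PTRACE_DETACH, child, NULL, NULL);
	kill_and_reap(child);	/* the demo child would otherwise loop forever */
	return exits;
}
```

Calling trace_until_limit(3) should return 3 where ptrace is permitted. If
the child slept in a long syscall instead of looping, the waitpid() call
inside the loop is where the tracer would sit until that syscall exits.
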
> In the ps tree, am I right
> to assume that the child process does not terminate properly, which is why it is
> reported as defunct?

The child process is defunct because the parent process hasn't called
waitpid() yet.

> > As I said before, all you need is to break the event loop
> > almost as if strace was terminated.
>
> I was experimenting with the test a bit more and I have a few queries:
>
> 1. I applied your changes and experimented with it a bit. The test now ends
> immediately. However, I noticed that in this case, strace terminates completely
> and the tracee is reparented. Once this happens, the process seems to be in the
> interruptible sleep state. If the tracee is interactive and requires input from the
> user, it is unable to do so because the process no longer remains in the foreground.

What is the alternative? When strace is interrupted by a signal,
it forwards the signal to the strace_child process, but in this case
there is no signal to forward. Also, I suppose --syscall-limit
would be used together with -p.

> I tried this out using a multi-threaded program that you can find here:
>
> https://github.com/valdaarhun/Intro_To_OS_Udacity/tree/main/priority_readers_writers
>
> 2. In the child process in strace--syscall-limit.c, I added the following lines
> before "int pid = getpid();"
>
> char *argv[] = {(char *)"/usr/bin/true", NULL};
> execv(argv[0], argv);
>
> I then changed strace--syscall-limit.test to run -b execve instead of -l.
>
> Comment out:
>
> set -- --syscall-limit=3 "$@"
>
> and add this:
>
> set -- -b execve "$@"
>
> Also, comment out:
>
> [ -n "$args" -a \( -z "${args##*-e trace=*}" -o \
> -z "${args##*-etrace=*}" -o \
> -z "${args##*--trace=*}" \) ] ||
> set -- --trace="chdir,getpid" "$@"
>
> I ran the test, and its behaviour seems to be similar. "ps f" displays
> a similar ps tree. Does that mean the behaviour of -b execve needs to be
> changed too?

What behaviour would you suggest fixing?
The child tracee has been detached, and the parent tracee is doing
something; everything seems to behave as expected.

--
ldv