[PATCH v5 1/3] Introduce -l/--syscall-limit option

Dmitry V. Levin ldv at strace.io
Tue Mar 28 11:53:30 UTC 2023


Hi,

On Tue, Mar 28, 2023 at 01:11:50PM +0530, Sahil Siddiq wrote:
> On Monday, 27 March 2023 03:00:37 IST Dmitry V. Levin wrote:
> > On Sun, Mar 26, 2023 at 08:34:17PM +0530, Sahil Siddiq wrote:
> > > On Sunday, 26 March 2023 16:06:04 IST Dmitry V. Levin wrote:
> > > > On Sun, Mar 26, 2023 at 03:57:24PM +0530, Sahil Siddiq wrote:
> > > > > Hi,
> > > > >
> > > > > Thank you for the feedback. There are a few things that I haven't
> > > > > really understood.
> > > > >
> > > > > On Sunday, 26 March 2023 01:11:07 IST Dmitry V. Levin wrote:
> > > > > > On Mon, Mar 20, 2023 at 11:10:56AM +0530, Sahil Siddiq wrote:
> > > > > > [...]
> > > > > > for example, a tracee
> > > > > > doesn't invoke syscalls, strace -l won't finish.
> > > > >
> > > > > I didn't understand this example. In case the argument to -l is greater
> > > > > than the number of syscalls that are invoked, wouldn't strace proceed
> > > > > as usual? If the tracee does not invoke any syscall, the syscall_limit
> > > > > counter will not decrease in syscall_exiting_trace().
> > > >
> > > > Here is an example: take your strace--syscall-limit.c test, insert sleep(60)
> > > > right before the waitpid loop, and see what happens.
> > >
> > > I am still a bit confused. I inserted sleep(60) so that the test snippet now looks
> > > like this:
> > >
> > > 	sleep(60);
> > > 	while ((waitpid(child, &status, 0)) != child) {
> > > 		if (errno == EINTR)
> > > 			continue;
> > > 		perror_msg_and_fail("waitpid: %d", child);
> > > 	}
> > >
> > > I first ran the following command:
> > >
> > > strace --signal='!SIGCHLD,SIGCONT' --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3 ./strace--syscall-limit
> > >
> > > This gives the same output that I would expect. It prints only the first three syscalls
> > > that were traced along with the "process <pid> attached/detached" log lines and it
> > > terminates after that.
> > >
> > > However, when running "bash ./strace--syscall-limit.test", the test indeed does not
> > > terminate.
> > >
> > > Running "ps f" gives the following output:
> > >
> > > 11441 pts/2    Ss     0:00 /bin/bash
> > > 254572 pts/2    S+     0:00  \_ bash ./strace--syscall-limit.test
> > > 254648 pts/2    S+     0:00      \_ ../../src/strace -o log --signal=!SIGCHLD,SIGCONT --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3
> > > 254651 pts/2    S+     0:00          \_ ../strace--syscall-limit
> > > 254652 pts/2    Z+     0:00              \_ [strace--syscall] <defunct>
> > >
> > > I wonder why this is the case. I'll try to figure this out.
> > 
> > The test doesn't hang; what you see is strace waiting for the
> > syscall behind sleep(60) to exit.  At that moment strace will detach
> > that process and the test will finish.
> 
> Sorry, I am still trying to understand this. I think I have understood what's
> going on here. But I am still trying to figure out why this is happening. Is
> it only because the tracee gets detached immediately after the event occurs or
> is there something else too?

Yes, the tracee is detached only after the syscall exit stop event occurs.

> The tracee will detach only after the syscall
> behind sleep(60) finally exits successfully, right?

Correct.
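
To illustrate the point, here is a minimal sketch of a tracer that detaches
only at a syscall-exit-stop.  It is not strace's event loop:
detach_after_exit_stops() and the ten getpid calls in the child are
hypothetical names and choices made for this example, and it assumes Linux
with ptrace permitted.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not strace code: trace a forked child and detach
 * it after `limit` syscall-exit-stops.  Returns the number of exit stops
 * seen, or -1 on error.
 */
int detach_after_exit_stops(int limit)
{
	int status, exits = 0, in_syscall = 0;
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		kill(getpid(), SIGSTOP);	/* let the parent set options */
		for (int i = 0; i < 10; i++)
			syscall(SYS_getpid);
		_exit(0);
	}

	/* Wait for the initial SIGSTOP, then mark syscall stops with 0x80. */
	if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
		goto fail;
	if (ptrace(PTRACE_SETOPTIONS, child, NULL,
		   (void *) (long) PTRACE_O_TRACESYSGOOD) < 0)
		goto fail;

	while (exits < limit) {
		if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0)
			goto fail;
		if (waitpid(child, &status, 0) != child)
			goto fail;
		if (!WIFSTOPPED(status))
			return exits;	/* tracee is gone before the limit */
		if (WSTOPSIG(status) != (SIGTRAP | 0x80))
			continue;	/* not a syscall stop */
		/* Syscall stops alternate: entry, exit, entry, exit, ... */
		in_syscall = !in_syscall;
		if (!in_syscall)
			exits++;	/* counted at the exit stop only */
	}

	/* The tracee is in a ptrace stop here, so detaching is allowed. */
	if (ptrace(PTRACE_DETACH, child, NULL, NULL) < 0)
		goto fail;
	if (waitpid(child, &status, 0) != child)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? exits : -1;

fail:
	kill(child, SIGKILL);
	waitpid(child, &status, 0);
	return -1;
}
```

Calling detach_after_exit_stops(3) should return 3 once the third syscall
has exited.  If the child were blocked inside a long syscall instead, the
waitpid() in the loop would simply block until that syscall returned, which
is the behaviour observed with sleep(60).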

> In the ps tree, am I right
> to assume that the child process does not terminate properly, which is why it is
> reported as defunct?

The child process is defunct because the parent process hasn't called
waitpid() yet.
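
A small sketch that shows this, assuming Linux with /proc mounted.  The
helper names proc_state() and zombie_until_reaped() are made up for this
example and are not part of the test suite.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Read the one-letter process state from /proc/<pid>/stat (Linux only).
 * Returns '?' if the file cannot be read or parsed.
 */
static char proc_state(pid_t pid)
{
	char path[64], buf[512];
	size_t n;
	char *p;
	FILE *fp;

	snprintf(path, sizeof(path), "/proc/%d/stat", (int) pid);
	fp = fopen(path, "r");
	if (!fp)
		return '?';
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';

	/* The state letter follows the last ')' of the "(comm)" field. */
	p = strrchr(buf, ')');
	return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/*
 * Hypothetical helper: fork a child that exits at once, then report its
 * state before and after the parent reaps it.  Returns 0 on success.
 */
int zombie_until_reaped(char *before, char *after)
{
	siginfo_t info;
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0)
		_exit(0);

	/* WNOWAIT waits for the exit but leaves the child unreaped. */
	if (waitid(P_PID, child, &info, WEXITED | WNOWAIT) < 0)
		return -1;
	*before = proc_state(child);

	/* Reaping the child removes its process table entry. */
	if (waitpid(child, NULL, 0) != child)
		return -1;
	*after = proc_state(child);
	return 0;
}
```

Before the waitpid() call the child is reported in state 'Z', which ps
prints as <defunct>.  After it, the /proc entry is gone and proc_state()
returns '?'.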

> > As I said before, all you need is to break the event loop
> > almost as if strace was terminated.
> 
> I was experimenting with the test a bit more and I have a few queries:
> 
> 1. I applied your changes and experimented with it a bit. The test now ends
>     immediately. However, I noticed that in this case, strace terminates completely
>     and the tracee is reparented. Once this happens, the process seems to be in the
>     interruptible sleep state. If the tracee is interactive and requires input from the
>     user, it is unable to do so because the process no longer remains in the foreground.

What is the alternative?  When strace is interrupted by a signal,
it forwards the signal to the strace_child process, but in this case
there is no signal to forward.  Also, I suppose --syscall-limit
would be used together with -p.

>     I tried this out using a multi-threaded program that you can find here:
> 
>     https://github.com/valdaarhun/Intro_To_OS_Udacity/tree/main/priority_readers_writers
> 
> 2. In the child process in strace--syscall-limit.c, I added the following lines
>     before "int pid = getpid();"
> 
>         char *argv[] = {(char *)"/usr/bin/true", NULL};
>         execv(argv[0], argv);
>     
>     I then changed strace--syscall-limit.test to run -b execve instead of -l.
>     
>     Comment out:
> 
>         set -- --syscall-limit=3 "$@"
> 
>     and add this:
> 
>         set -- -b execve "$@"
> 
>     Also, comment out:
> 
>         [ -n "$args" -a \( -z "${args##*-e trace=*}" -o \
>                 -z "${args##*-etrace=*}" -o \
>                 -z "${args##*--trace=*}" \) ] ||
>         set -- --trace="chdir,getpid" "$@"
> 
>     I ran the test, and its behaviour seems to be similar. "ps f" displays
>     a similar ps tree. Does that mean the behaviour of -b execve needs to be
>     changed too?

What behaviour would you suggest to fix?
The child tracee has been detached, and the parent tracee is doing
something, everything seems to behave as expected.


-- 
ldv

