[PATCH v5 1/3] Introduce -l/--syscall-limit option

Sahil Siddiq icegambit91 at gmail.com
Tue Mar 28 07:41:50 UTC 2023


Hi,

On Monday, 27 March 2023 03:00:37 IST Dmitry V. Levin wrote:
> On Sun, Mar 26, 2023 at 08:34:17PM +0530, Sahil Siddiq wrote:
> > On Sunday, 26 March 2023 16:06:04 IST Dmitry V. Levin wrote:
> > > On Sun, Mar 26, 2023 at 03:57:24PM +0530, Sahil Siddiq wrote:
> > > > Hi,
> > > >
> > > > Thank you for the feedback. There are a few things that I haven't
> > > > really understood.
> > > >
> > > > On Sunday, 26 March 2023 01:11:07 IST Dmitry V. Levin wrote:
> > > > > On Mon, Mar 20, 2023 at 11:10:56AM +0530, Sahil Siddiq wrote:
> > > > > [...]
> > > > > for example, a tracee
> > > > > doesn't invoke syscalls, strace -l won't finish.
> > > >
> > > > I didn't understand this example. In case the argument to -l is greater
> > > > than the number of syscalls that are invoked, wouldn't strace proceed
> > > > as usual? If the tracee does not invoke any syscall, the syscall_limit
> > > > counter will not decrease in syscall_exiting_trace().
> > >
> > > Here is an example: take your strace--syscall-limit.c test, insert sleep(60)
> > > right before the waitpid loop, and see what happens.
> >
> > I am still a bit confused. I inserted sleep(60) so that the test snippet now looks
> > like this:
> >
> >     sleep(60);
> >     while ((waitpid(child, &status, 0)) != child) {
> >         if (errno == EINTR)
> >             continue;
> >         perror_msg_and_fail("waitpid: %d", child);
> >     }
> >
> > I first ran the following command:
> >
> > strace --signal='!SIGCHLD,SIGCONT' --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3 ./strace--syscall-limit
> >
> > This gives the same output that I would expect. It prints only the first three syscalls
> > that were traced along with the "process <pid> attached/detached" log lines and it
> > terminates after that.
> >
> > However, when running "bash ./strace--syscall-limit.test", the test indeed does not
> > terminate.
> >
> > Running "ps f" gives the following output:
> >
> > 11441 pts/2    Ss     0:00 /bin/bash
> > 254572 pts/2    S+     0:00  \_ bash ./strace--syscall-limit.test
> > 254648 pts/2    S+     0:00      \_ ../../src/strace -o log --signal=!SIGCHLD,SIGCONT --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3
> > 254651 pts/2    S+     0:00          \_ ../strace--syscall-limit
> > 254652 pts/2    Z+     0:00              \_ [strace--syscall] <defunct>
> >
> > I wonder why this is the case. I'll try to figure this out.
> 
> The test doesn't hang, what you see is that strace is waiting for the
> syscall behind sleep(60) to exit, at that moment strace will detach that
> process and the test will finish.

Sorry, I am still trying to understand this. I think I understand what is
going on here, but I am still trying to figure out why it happens. Is it only
because the tracee gets detached immediately after the event occurs, or is
there something else too? The tracee will be detached only after the syscall
before sleep(60) finally exits successfully, right? In the ps tree, am I right
to assume that the child process does not terminate properly, which is why it
is reported as defunct?
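
For reference, a standalone sketch (not part of the patch or the test; the
helper names proc_state() and child_state_before_wait() are made up) of the
usual way a <defunct> entry appears on Linux: a child that has already exited
stays in state "Z" until its parent collects it with waitpid(). In the test,
the parent is still inside sleep(60) at that point, so this may be all that ps
is showing. The sketch assumes /proc is mounted.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the state letter from /proc/<pid>/stat, or '?' on error. */
char proc_state(pid_t pid)
{
	char path[64], buf[512];

	snprintf(path, sizeof(path), "/proc/%d/stat", (int) pid);
	FILE *fp = fopen(path, "r");
	if (!fp)
		return '?';
	size_t n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';

	/* The state field follows the closing ')' of the command name. */
	char *p = strrchr(buf, ')');
	return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/* State of a child that has exited but has not been reaped yet. */
char child_state_before_wait(void)
{
	pid_t child = fork();
	if (child < 0)
		return '?';
	if (child == 0)
		_exit(0);	/* the child terminates normally right away */

	/* WNOWAIT: block until the child exits, but leave it unreaped. */
	siginfo_t info;
	if (waitid(P_PID, (id_t) child, &info, WEXITED | WNOWAIT) < 0)
		return '?';

	char state = proc_state(child);	/* the parent has not reaped it yet */
	waitpid(child, NULL, 0);	/* reap it: the zombie entry goes away */
	return state;
}
```

Calling child_state_before_wait() from a small main() and printing the result
should show the zombie state even though the child exited cleanly with
status 0.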

> As I said before, all you need is to break the event loop
> almost as if strace was terminated.
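
My reading of this, as a toy model in plain C (no ptrace involved; struct
event, finish_time() and the break_loop flag are hypothetical names, and this
is not strace's actual event loop): if the tracer only detaches a tracee when
that tracee reports its next event, it has to wait for the slowest remaining
tracee, whereas breaking the loop as soon as the limit is reached lets it
finish at once. The model assumes limit >= 1.

```c
#include <stddef.h>

#define MAX_TRACEES 16

/* One reported stop in the toy model. */
struct event {
	int time;		/* seconds since the tracer started */
	int tracee;		/* index of the tracee that stopped */
	int is_syscall_exit;	/* nonzero if this is a traced syscall exit */
};

/*
 * Return the time at which the tracer is done, or -1 on bad arguments.
 * break_loop == 0: once the limit is reached, a tracee is detached only
 *                  when it reports its next event.
 * break_loop != 0: once the limit is reached, leave the loop immediately
 *                  (all tracees are then detached together).
 */
int finish_time(const struct event *ev, size_t n, int ntracees,
		int limit, int break_loop)
{
	int attached[MAX_TRACEES] = { 0 };
	int nattached = ntracees;
	int now = 0;

	if (ntracees < 1 || ntracees > MAX_TRACEES)
		return -1;
	for (int i = 0; i < ntracees; i++)
		attached[i] = 1;

	for (size_t i = 0; i < n && nattached > 0; i++) {
		int t = ev[i].tracee;

		if (t < 0 || t >= ntracees || !attached[t])
			continue;	/* event from a detached tracee */
		now = ev[i].time;

		if (limit > 0 && ev[i].is_syscall_exit)
			limit--;	/* one more traced syscall was printed */
		if (limit > 0)
			continue;	/* limit not reached, keep tracing */
		if (break_loop)
			return now;	/* stop right here */

		attached[t] = 0;	/* detach only this tracee */
		nattached--;
	}
	return now;
}
```

With three syscall exits from the child at t = 0, 1, 2 and the parent's sleep
returning at t = 60, the model finishes at 60 without break_loop and at 2
with it, which matches the delay seen in the test.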

I was experimenting with the test a bit more and I have a few queries:

1. I applied your changes and experimented with them a bit. The test now ends
    immediately. However, I noticed that in this case strace terminates completely
    and the tracee is reparented. Once this happens, the tracee seems to be in the
    interruptible sleep state. If the tracee is interactive and requires input from the
    user, it cannot read that input because it is no longer in the foreground.

    I tried this out using a multi-threaded program that you can find here:

    https://github.com/valdaarhun/Intro_To_OS_Udacity/tree/main/priority_readers_writers
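
A small sketch of how a process can check whether it is still in the
foreground (foreground_status() is a made-up helper, not something from the
linked program). As far as I understand, once strace exits a job-control
shell takes the terminal back, so a leftover tracee ends up in a background
process group, and a read from the terminal there is normally answered with
SIGTTIN or an error instead of input.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if the calling process is in the foreground process group of
 * its controlling terminal, 0 if it is in a background group, and -1 if
 * there is no usable controlling terminal.
 */
int foreground_status(void)
{
	/* /dev/tty refers to the controlling terminal, if there is one. */
	int fd = open("/dev/tty", O_RDONLY | O_NOCTTY);
	if (fd < 0)
		return -1;

	pid_t fg = tcgetpgrp(fd);	/* foreground process group ID */
	close(fd);
	if (fg < 0)
		return -1;

	return fg == getpgrp();
}
```

Calling this in the tracee before and after strace goes away should show the
value changing from 1 to 0 if the explanation above is right.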

On Sun, Mar 26, 2023 at 08:34:17PM +0530, Sahil Siddiq wrote:
> [...]
> However, when running "bash ./strace--syscall-limit.test", the test indeed does not
> terminate.
>
> Running "ps f" gives the following output:
>
> 11441 pts/2    Ss     0:00 /bin/bash
> 254572 pts/2    S+     0:00  \_ bash ./strace--syscall-limit.test
> 254648 pts/2    S+     0:00      \_ ../../src/strace -o log --signal=!SIGCHLD,SIGCONT --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3
> 254651 pts/2    S+     0:00          \_ ../strace--syscall-limit
> 254652 pts/2    Z+     0:00              \_ [strace--syscall] <defunct>
>
> I wonder why this is the case. I'll try to figure this out.


2. In the child process in strace--syscall-limit.c, I added the following lines
    before "int pid = getpid();"

        char *argv[] = {(char *)"/usr/bin/true", NULL};
        execv(argv[0], argv);
    
    I then changed strace--syscall-limit.test to run -b execve instead of -l.
    
    Comment out:

        set -- --syscall-limit=3 "$@"

    and add this:

        set -- -b execve "$@"

    Also, comment out:

        [ -n "$args" -a \( -z "${args##*-e trace=*}" -o \
                -z "${args##*-etrace=*}" -o \
                -z "${args##*--trace=*}" \) ] ||
        set -- --trace="chdir,getpid" "$@"

    I ran the test, and its behaviour seems to be similar. "ps f" displays
    a similar ps tree. Does that mean the behaviour of -b execve needs to be
    changed too?

Regards,
Sahil



