[PATCH v5 1/3] Introduce -l/--syscall-limit option

Sahil Siddiq icegambit91 at gmail.com
Tue Mar 28 18:54:48 UTC 2023


Hi,

Thank you for the prompt response.

On Tuesday, 28 March 2023 17:23:30 IST Dmitry V. Levin wrote:
> Hi,
>
> On Tue, Mar 28, 2023 at 01:11:50PM +0530, Sahil Siddiq wrote:
> >[...]
> > Sorry, I am still trying to understand this. I think I have understood
> > what's going on here. But I am still trying to figure out why this is
> > happening. Is it only because the tracee gets detached immediately after
> > the event occurs or is there something else too?
>
> Yes, only after the exit stop event occurs.
>
> > The tracee will detach only after the syscall
> > behind sleep(60) finally exits successfully, right?
>
> Correct.
>
> > In the ps tree, am I right
> > to assume that the child process does not terminate properly which is why
> > it is reported as defunct?
>
> The child process is defunct because the parent process hasn't called
> waitpid() yet.
>

Ok, got it, this makes sense now.

> > > As I said before, all you need is to break the event loop
> > > almost as if strace was terminated.
> >
> > I was experimenting with the test a bit more and I have a few queries:
> >
> > 1. I applied your changes and experimented with it a bit. The test now
> > ends immediately. However, I noticed that in this case, strace terminates
> > completely and the tracee is reparented. Once this happens, the
> > process seems to be in the interruptible sleep state. If the tracee
> > is interactive and requires input from the user, it is unable to do
> > so because the process no longer remains in the foreground.
>
> What is the alternative?  When strace is interrupted by signal,
> it forwards the signal to strace_child process, but in this case
> there is no signal to forward. 

I am not sure yet. I am still playing around with the code.

> Also, I suppose --syscall-limit would be used together with -p.

No, not necessarily. I thought it would be nice to use it without -p as well.
For example, the "strace--syscall-limit-*.test" tests run without -p.

> > [...]
> > 2. In the child process in strace--syscall-limit.c, I added the following
> > lines before "int pid = getpid();"
> >
> > char *argv[] = {(char *)"/usr/bin/true", NULL};
> > execv(argv[0], argv);
> >
> > I then changed strace--syscall-limit.test to run -b execve instead of -l.
> >
> > Comment out:
> >     set -- --syscall-limit=3 "$@"
> >
> > and add this:
> >     set -- -b execve "$@"
> >
> > Also, comment out:
> >     [ -n "$args" -a \( -z "${args##*-e trace=*}" -o \
> >         -z "${args##*-etrace=*}" -o \
> >         -z "${args##*--trace=*}" \) ] ||
> >         set -- --trace="chdir,getpid" "$@"
> >
> > I ran the test, and its behaviour seems to be similar. "ps f" displays
> > a similar ps tree. Does that mean the behaviour of -b execve needs to
> > be changed too?
>
> What behaviour would you suggest to fix?
> The child tracee has been detached, and the parent tracee is doing
> something, everything seems to behave as expected.

Right, this behaves as expected.

I was going through the code that handles "-b execve" and I have one more query.
If I am not mistaken, as soon as the "TE_STOP_BEFORE_EXECVE" event occurs, the
tracees are detached in "dispatch_event". I was under the impression that the
implementations of "-b execve" and -l are similar in this respect. I don't yet
understand why detaching this way causes no problem for "-b execve" but may
cause one for -l.
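
For reference, the -l behaviour discussed in this thread (a counter that only
decreases on syscall exit, and an event loop that stops once the counter
reaches zero) can be modelled with a toy loop. This is only an illustrative
sketch, not strace's code: the enum, its values and toy_event_loop() are all
hypothetical names, and real tracing events are reduced to two kinds.

```c
#include <stddef.h>

/* Hypothetical, simplified event kinds; not strace's trace_event values. */
enum toy_event {
	TOY_SYSCALL_EXIT,	/* a traced syscall has finished */
	TOY_OTHER_STOP		/* any other stop (signal, group-stop, ...) */
};

/*
 * Toy model of an event loop with a syscall limit.
 * The counter is decremented only on syscall-exit events; when it reaches
 * zero the loop is broken.  Returns the number of events consumed.
 */
static size_t toy_event_loop(const enum toy_event *events, size_t n, long limit)
{
	size_t consumed = 0;

	for (size_t i = 0; i < n; ++i) {
		++consumed;
		if (events[i] == TOY_SYSCALL_EXIT && limit > 0 && --limit == 0)
			break;	/* limit reached: leave the event loop */
	}
	return consumed;
}
```

In this model, a sequence with three syscall exits and a limit of 3 stops
right after the third exit, while a sequence without any syscall exit runs to
the end of its input. That mirrors the earlier remark that a tracee which
invokes no syscalls never triggers the limit.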

On Sun, Mar 26, 2023 at 08:34:17PM +0530, Sahil Siddiq wrote:
> On Sunday, 26 March 2023 16:06:04 IST Dmitry V. Levin wrote:
> > On Sun, Mar 26, 2023 at 03:57:24PM +0530, Sahil Siddiq wrote:
> > > Hi,
> > >
> > > Thank you for the feedback. There are a few things that I haven't
> > > really understood.
> > >
> > > On Sunday, 26 March 2023 01:11:07 IST Dmitry V. Levin wrote:
> > > > On Mon, Mar 20, 2023 at 11:10:56AM +0530, Sahil Siddiq wrote:
> > > > [...]
> > > > for example, a tracee
> > > > doesn't invoke syscalls, strace -l won't finish.
> > >
> > > I didn't understand this example. In case the argument to -l is greater
> > > than the number of syscalls that are invoked, wouldn't strace proceed
> > > as usual? If the tracee does not invoke any syscall, the syscall_limit
> > > counter will not decrease in syscall_exiting_trace().
> >
> > Here is an example: take your strace--syscall-limit.c test, insert sleep(60)
> > right before the waitpid loop, and see what happens.
>
> I am still a bit confused. I inserted sleep(60) so that the test snippet now looks
> like this:
>
>     sleep(60);
>     while ((waitpid(child, &status, 0)) != child) {
>         if (errno == EINTR)
>             continue;
>         perror_msg_and_fail("waitpid: %d", child);
>     }
>
> I first ran the following command:
>
> strace --signal='!SIGCHLD,SIGCONT' --quiet=path-resolution -f -a 9 --trace=chdir,getpid --syscall-limit=3 ./strace--syscall-limit
>
> This gives the output that I would expect. It prints only the first three syscalls
> that were traced, along with the "process <pid> attached/detached" log lines, and it
> terminates after that.
>
> However, when running "bash ./strace--syscall-limit.test", the test indeed does not
> terminate.

I was going through this test again and I just realized that it might be working as
expected. I timed the test and realized that it takes exactly 2 minutes for the test
to execute completely. The first minute is spent executing:

    run_prog > /dev/null

and the next minute is spent running:

    run_strace --signal='!SIGCHLD,SIGCONT' --quiet=path-resolution -f -a 9 "$@" $args > "$EXP"

The contents of the files referenced by "$EXP" and "$LOG" match.
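
The sleep-then-wait snippet quoted above depends on perror_msg_and_fail()
from the strace test suite. A standalone sketch of the same pattern, with a
configurable delay so the timing can be tried with something shorter than 60
seconds, might look like the following. The function name sleep_then_reap()
and the error-return convention are assumptions made for this example, and
the sleep() retry loop is an addition that the test itself does not have.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical standalone variant of the test's wait loop.
 * Sleeps for delay_sec seconds, then reaps the given child, retrying
 * waitpid() when it is interrupted by a signal.
 * Returns 0 and stores the wait status in *status, or -1 on error.
 */
static int sleep_then_reap(pid_t child, unsigned int delay_sec, int *status)
{
	/* sleep() returns the number of seconds left if a signal handler
	 * interrupted it; keep sleeping until the full delay has elapsed. */
	unsigned int left = delay_sec;
	while (left > 0)
		left = sleep(left);

	while (waitpid(child, status, 0) != child) {
		if (errno == EINTR)
			continue;	/* interrupted: try again */
		return -1;		/* any other error is fatal here */
	}
	return 0;
}
```

While the parent is inside the delay, a child that has already exited stays
unreaped; it is collected only once the waitpid() loop runs.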

Could you please give me another example where detaching the tracee
immediately may cause a problem?

Regards,
Sahil



