[PATCHv4] print stack trace after each syscall

Sun Oct 6 06:42:14 UTC 2013

On Sat, Sep 21, 2013 at 6:45 PM, Luca Clementi <luca.clementi at gmail.com> wrote:
> On Sat, Sep 21, 2013 at 12:54 PM, Masatake YAMATO <yamato at redhat.com> wrote:
>> Hi,
>>
>> Thank you for reviewing the patch.
>> But I need more discussion.
>>
>> On Thu, 19 Sep 2013 18:05:06 -0700, Luca Clementi <luca.clementi at gmail.com> wrote:
>> ...
>>>> Why is the delete_mmap_cache invocation needed after wait-related system calls?
>>>>
>>>
>>> I can't find a logical explanation for that.
>>> I think that part of the code should be removed as you indicated.
>>
>> O.k. Let's remove the code block.
>>
>> On Thu, 19 Sep 2013 18:05:06 -0700, Luca Clementi <luca.clementi at gmail.com> wrote:
>>> Although I have one minor comment.
>>> Since we are printing the stack trace on return from a system call,
>>> shouldn't we update the mapping _before_ we print the stack trace?
>>> In most situations it does not matter, except for execve (where we can
>>> avoid a "backtracing_error").
>>>
>>> I simply moved the delete_mmap_cache up in the trace_syscall_exiting:
>>
>>
>> I think there is no difference between _after_ and _before_.
>>
>>
>> Let's think about calling mmap then getpid.
>>
>> ** _after_ case **
>>
>>        -1. mmap runs in the kernel and the map of the process is modified.
>>         0. control returns from the kernel for the mmap syscall, then
>>         1. the stack for the mmap call is printed, then
>>         2. the cache is cleared, then
>>         3. getpid is called, then
>>         4. the cache is rebuilt (in print_stack), then
>>         5. the stack for the getpid call is printed (in print_stack).
>>
>>         Note: the map is referred to in 4, but between 0 and 5 the map of
>>         the process is not modified.
>>
>> ** _before_ case **
>>
>>        -1. mmap runs in the kernel and the map of the process is modified.
>>         0. control returns from the kernel for the mmap syscall, then
>>         1. the cache is cleared, then
>>         2. the cache is rebuilt (in print_stack), then
>>         3. the stack for the mmap call is printed, then
>>         4. getpid is called, then
>>         5. the stack for the getpid call is printed using the cache
>>            built in 2.
>>
>> In both cases, the newer cache is used when printing the stack trace
>> for getpid.
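
The two orderings above can be sketched as a small simulation. This is only an illustration under stated assumptions: the sim_* names and the "generation" counter are hypothetical stand-ins, not strace's real data structures, and the sketch assumes an earlier syscall has already built the cache before mmap runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a traced process; not strace's struct tcb. */
struct sim_proc {
    int map_generation;   /* bumped each time the kernel changes the mappings */
    bool cache_valid;     /* false after the cache has been cleared */
    int cache_generation; /* generation the cache was last built from */
};

/* Models a syscall such as mmap modifying the process map in the kernel. */
static void sim_kernel_changes_map(struct sim_proc *p)
{
    p->map_generation++;
}

/* Models clearing the cache (the role delete_mmap_cache plays in the thread). */
static void sim_delete_cache(struct sim_proc *p)
{
    p->cache_valid = false;
}

/* Models print_stack: rebuild the cache if it was cleared, then report
 * which map generation the printed trace was based on. */
static int sim_print_stack(struct sim_proc *p)
{
    if (!p->cache_valid) {
        p->cache_generation = p->map_generation;
        p->cache_valid = true;
    }
    return p->cache_generation;
}

/* Runs "mmap then getpid" in either order. Returns the generation used for
 * the getpid trace; stores the one used for the mmap trace in *mmap_used. */
static int sim_mmap_then_getpid(bool clear_before_print, int *mmap_used)
{
    struct sim_proc p = { 0, false, -1 };

    assert(mmap_used != NULL);
    sim_print_stack(&p);        /* assumed earlier syscall builds the cache */
    sim_kernel_changes_map(&p); /* step -1: mmap runs in the kernel */

    if (clear_before_print) {   /* _before_ case */
        sim_delete_cache(&p);
        *mmap_used = sim_print_stack(&p);
    } else {                    /* _after_ case */
        *mmap_used = sim_print_stack(&p);
        sim_delete_cache(&p);
    }
    return sim_print_stack(&p); /* trace printed for getpid */
}
```

In this model the getpid trace is based on the post-mmap map in both orders. The orders differ only in the trace printed for the map-changing syscall itself, which in the _after_ order comes from a cache built before the map changed.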
>>
>> As you wrote, execve is special.
>> We are interested in the context in which execve was invoked,
>> so we should capture the stack in `trace_syscall_entering'.
>> After execve, that context is destroyed.
>
> I agree with this, but then you will have to buffer the output
> and print it only in trace_syscall_exiting (to preserve the
> current output, which in my opinion is more user friendly).
> Mine was a dirty and bad workaround, I admit.
>
>> ...However, I've found a more important thing.
>> libunwind scans /proc/$pid/maps on every print_stacktrace invocation!
>> If libunwind provided `unw_get_proc_name_and_path',
>> map cache management would not be needed in strace.
>
> I noticed that. Honestly, I was thinking of first getting this
> patch in and then, as a second step, writing another patch that
> provides a custom get_dyn_info_list_addr callback [1] (which is where
> the scan of the maps is done).
> In the custom get_dyn_info_list_addr we can then use the info we
> already have in our mmap_cache.
> My first concern was to get the functionality in; then we can work on
> improving its performance.
>
> The problem is that libunwind will scan the maps anyway on every stack
> trace (i.e. on every system call, which is a performance killer). If you
> rewrite get_dyn_info_list_addr to use our mmap_cache, you can parse the
> maps file only when really needed and use the cache the rest of the time.
>
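
A minimal sketch of the "parse only when really needed" idea, assuming a simple valid flag on the cache. All names here (map_entry, map_cache, cache_find and so on) are hypothetical; this is neither strace's mmap_cache nor a libunwind callback, and it takes the maps text as a string instead of reading /proc/$pid/maps so that it stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 64

/* One parsed line of a maps file (hypothetical layout). */
struct map_entry {
    unsigned long start, end, offset;
    char perms[5];
    char path[256];
};

struct map_cache {
    struct map_entry entries[MAX_ENTRIES];
    size_t count;
    bool valid;       /* false until loaded, or after invalidation */
    unsigned reloads; /* how many times the text was actually parsed */
};

/* Parses "start-end perms offset dev inode [path]"; 0 on success. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
    e->path[0] = '\0';
    int n = sscanf(line, "%lx-%lx %4s %lx %*s %*s %255[^\n]",
                   &e->start, &e->end, e->perms, &e->offset, e->path);
    return n >= 4 ? 0 : -1; /* the path is optional (anonymous mappings) */
}

/* Rebuilds the cache from the full text of a maps file. */
static void cache_reload(struct map_cache *c, const char *text)
{
    char line[512];

    c->count = 0;
    while (*text != '\0' && c->count < MAX_ENTRIES) {
        size_t len = strcspn(text, "\n");
        size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

        /* Copy one line so sscanf cannot run into the next one. */
        memcpy(line, text, copy);
        line[copy] = '\0';
        if (parse_maps_line(line, &c->entries[c->count]) == 0)
            c->count++;
        text += len;
        if (*text == '\n')
            text++;
    }
    c->valid = true;
    c->reloads++;
}

/* To be called when a syscall may have changed the mappings. */
static void cache_invalidate(struct map_cache *c)
{
    c->valid = false;
}

/* Finds the mapping containing ip, parsing the text only if needed. */
static const struct map_entry *cache_find(struct map_cache *c,
                                          const char *maps_text,
                                          unsigned long ip)
{
    assert(c != NULL && maps_text != NULL);
    if (!c->valid)
        cache_reload(c, maps_text);
    for (size_t i = 0; i < c->count; i++)
        if (ip >= c->entries[i].start && ip < c->entries[i].end)
            return &c->entries[i];
    return NULL;
}
```

With this shape, repeated lookups between two invalidations cost one parse in total. The cache would only be invalidated after syscalls that can change the mappings, which this sketch leaves to the caller.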

Hey Masatake,
Are you gonna send a patch to fix the map reloading even when using -e?

I don't mean to hurry you; I was just wondering how you want to proceed.

Sincerely,
Luca