Experience from writing trace_inputs.py

Fri Mar 21 14:49:55 UTC 2014

We got a race conditions with our emails, sorry for the duplicated arguments.

In short, we are in agreement.

2014-03-21 9:43 GMT-04:00 Philippe Ombredanne <pombredanne at nexb.com>:
> In your case using the standard Python shlex module may be of some help?

Eh, definitely a lack of knowledge on my part as I didn't know about
shlex. I would have saved some work there.

> Indeed in newer strace versions some things may either have been fixed
> or there are new features that could help alleviate some of your
> issues.
> You might be using 4.5.20: a goodie but oldie.
> For instance it does not have the important --in your case-- -y option
> for absolute file descriptor paths decoding.

Yes I'm on 4.5.20. I mildly recall hitting problems when trying to
compile strace's tip of tree since I think (?) it required a newer gcc
version than the stock one (4.6.3) in 12.04. That was in 2012. At that
point, I wanted the tool to be usable as is for devs so I decided
against requiring a newer version.

>> In practice, I stumbled upon a lot of issues while reading the logs;
>> - Processes being killed arbitrarily, causing all sorts of incomplete logs.
>> The chromium unit tests tends to open a lot of processes quickly and kill
>> them just for fun.
>
> In general using -f may yield some weirdness.
> My experience is that -ff works great even in these cases.
> If you found some bugs there, please drop a mail here!

I'd have to reproduce using latest strace version. If I do, I'll report back.

>> - Even using one log file per process with -ff still cause occasional
>> corrupted log file on heavily multithreaded processes. Note that using a
>> workstation will a lots of core may probably help exhibit this problem more.
>
> ... using -ff is a better way to trace multiple threads and processes correctly.
> I am using it on very large trace with 4.8 and HEAD without much of a
> bad behavior (by large I mean 300 to 500K processes and 20 to 30 GB
> strace outputs).
>
> If you see some issues that can be reproduced, please drop a mail here
> so someone can investigate.

That's larger than what I traced, usually <1gb and a few hundreds of processes.

>> - No clear separation of call / return in the logs, causing a funny search
>> to figure out which call completed. There's an heuristic in
>> Strace.Content.Process._pending_calls.
>
> There are no unfinished/resumed calls when using -ff. So IMHO you
> should be using -ff -tt/ttt and then reassemble the per-process traces
> in one big file that you can then stream parse line by line. Using the
> strace-log-merge util will do this for you (this is in Perl and if
> this is an issue this would be easy enough to write that in Python).
> The calls will be in the right sequence and each line will have a pid.
> This approach is used in strace tests itself : see [2] for a tiny
> shell script example.

I definitely recall seeing unfinished/resumed calls when using -ff, at
least on 4.5.20. If I reproduce on newer version, will report back.

Thanks, I didn't know about strace-log-merge.

>> - Encoding, since linux doesn't prescribe utf-8, I have to hope for the
>> best. Note: I should use -x, I don't recall why I had commented it out.
>> Maybe just laziness.
>
> Do you have an example of a problematic output?
> FWIW, the approach taken by rsync with file paths listing is the following:
>  - try to decode paths using the locale/file system encoding (possibly utf-8)
>  - if this fail, encode failing characters in octal
>  - alternatively force utf-8 with an option and fail if decoding cannot be done
>
> Strace has a similar logic with -x alright, except it tries only ASCII
> decoding and falls back to hex.
> IMHO that should work out well for a json structured output.
>
> Why did you drop using -x?

Didn't drop, just hadn't started using it. I don't recall why; it's
been almost 2 years since I effectively touched this code. In
practice, it was not necessary for my end goal. Writing
trace_inputs.py was just a mean to achieve what I wanted to do
(identify all run time dependencies), not a goal in itself. So that's
why it has a very limited feature set.

>> - Only after the process hierarchy is figured out I could figure out the
>> starting current working directory of each process, causing a lot of "late
>> bound path". See RelativePath class. It definitely slows down reading the
>> logs and adds potential corruption of the paths in the trace reader if it
>> incorrectly guessed the starting cwd.
>
> This is an issue indeed. YMMV, but in your case (i.e. Ubuntu) where
> most exe have likely been compiled with a GNU toolchain you would end
> up having typically a getcwd syscall early on in most processes and
> typically in the first one.

FTR, the chromium unit tests usually fork and start working right away
instead of exec'ing, so most processes didn't log an initial getcwd().
But yes -y would have saved me a lot of work.

>> - No header to list the hooked functions I should expect to appear, and the
>> number of arguments in each. Since it changes from OS version to another, I
>> made the code assert on unknown function so I can add them manually.
>
> Good point: someone (Mike F./vapier?) suggested we should return some
> directory of supported calls and calls specs. How would you see such
> an output structure to be useful?

In general the reader has to know what it is looking for, so it's not
that much of a big deal. But I think it'd be generally interesting.

>> - Variable number of arguments to a function coupled with split log lines
>> and interesting quoting caused some headaches IIRC. (Or was it OSX doing
>> this? Sorry I don't recall)
>
> If you are talking about suspend/resume, then using -ff takes that away.
> As for varargs --if this happens-- I guess this is the way the
> syscalls are. Not much that can be done.
> To the best of my knowledge quoting in strace is rather consistent,
> but they are surely some gremlins left and right.
> Do you have an example of this?

Actually, I think I misremembered and it was indeed an issue with
dtrace (sorry for the confusion). For the varargs, for example open()
accepts either 2 or 3 arguments but the code is handling it just fine.
It's just a fact of life and the reader has to live with it. I should
have skipped this comment.

>> Of all the problems, the log corruption with heavily multithreaded process
>> on a multi-core system has been the most problematic. Everything else is
>> solvable but when the log is corrupted, there isn't much to do. Last time I
>> tried, which is a long long time ago, I was still able to reproduce
>> occasional corruptions with our test named "browser_tests".
>
> Please send a note here with details. I am using strace routinely to
> trace execution of 300K+ processes at once and have not seen a bad
> behavior to date. Also please consider checking out 4.8 or HEAD.

I didn't keep any broken log and never trace nowadays. Will report
with one if I reproduce again with a newer version.

>> A separate issue I got was that I was not able to trace native client
>> sandboxed unit test processes unless strace was running under sudo but to do
>> that I need to attach to the already started process to be able to not run
>> the unit test itself as root. It's been a pain to implement. Note strictly
>> strace's fault, just FYI.
>
> Is this "native client" of yours using some setuid/setgid? If so,
> there are known limitations since strace uses the ptrace syscall an
> inherits from its limitations which are about security. This is more
> or less explained in the man (2) ptrace and man (1) strace though this
> is an area that has never been quite clear to me. Have your tried if
> the -u  option could be of any help to you?

Native Client = https://code.google.com/p/nativeclient/

I didn't use -u, I should have. It would have been simpler. Again,
lack of knowledge (or properly reading the man page, I naively assumed
this functionality wasn't there).

>> What I'd recommend;
>> - json is good for smallish document but is not great for unbounded
>> documents. Logs can get >2gb. IMHO, I'm not sure it's a good fit.
>
> What about if a trace is not ONE json doc, but instead MANY?

We agree on this, I explained it more in a separate email.

>> Something custom would be fine with me. Just expose a reference code to read
>> it back in 1 language, 2-3 popular languages if possible.
>
> Once we start to have a more or less working feature, would you be
> willing to chip in? Like a possible Python reference code?

I can't promise anything currently but the format I proposed in the
other email would be parseable in a relatively short amount of code.
We'll see in time. :)

> About the cwd, see above: adding absolute paths everywhere makes the
> cwd possibly a non issue. That said, the pid, ppid and cwd should
> eventually be data attached to every call in a structured output, even
> if this is a bit redundant.

FTR, I think it's simpler (IMHO) to only record what was called and
have the reader reconstruct it. In particular, to not make the
structured output too configurable. This makes the reader code
simpler, since there's less code paths. More options = more file
formats = more reader code paths = less tested code. That's my humble
opinion, feel free to ignore. My preference is to have one canonical
way to print it out that is dense enough but has all the required
information to read it deterministically, and that's it.

> If you were to encode this by hand in json, how would you structure this?
> Also I wonder if attaching some arbitrary unique id to every call
> could help reassemble multiple entries sanely.

Wrote example in the other thread.

> I guess the strace maintainers should take this as a compliment :P

It is. :)

M-A