Experience from writing trace_inputs.py

Fri Mar 21 13:35:59 UTC 2014

On Thu, Mar 20, 2014 at 9:50 PM, Marc-Antoine Ruel <maruel at chromium.org> wrote:
> Hi,

Hi Marc-Antoine!
and thank you for this detailed feedback!

> I was pointed out to this thread. I wrote trace_inputs.py.

For reference I think the code you are talking about can be found at [1]

> It was used to
> isolate chromium tests and figure out which files are touched by each test
> case. It doesn't care about anything except files being opened so strace log
> parsing was kept to a minimum. Note that I've been mostly using the stock
> ubuntu strace version, which is quite old. Some of the issues below may have
> been fixed already.

Indeed in newer strace versions some things may either have been fixed
or there are new features that could help alleviate some of your
issues.
You might be using 4.5.20: a goodie but oldie.
For instance it does not have the important --in your case-- -y option
for absolute file descriptor paths decoding.

> For background context, note that I wrote the tool in
> python because 1) I'm lazy and 2) it was easier to use a single tool to
> deploy quickly on linux, OSX and Windows and that the goal of this tool is
> to hide as much as possible the underlying differences between each OS.

IMHO, lazyness is a quality!

> In practice, I stumbled upon a lot of issues while reading the logs;
> - Processes being killed arbitrarily, causing all sorts of incomplete logs.
> The chromium unit tests tends to open a lot of processes quickly and kill
> them just for fun.

In general using -f may yield some weirdness.
My experience is that -ff works great even in these cases.
If you found some bugs there, please drop a mail here!

> - Using a single log file resulted in severely corrupted log file, looks
> like multithreading is not properly interlocked in the output file.

I cannot say for sure but there could bugs in 4.5.20 that you may have hit?
That version is 4 years old!
Yet practically ....

> - Even using one log file per process with -ff still cause occasional
> corrupted log file on heavily multithreaded processes. Note that using a
> workstation will a lots of core may probably help exhibit this problem more.

... using -ff is a better way to trace multiple threads and processes correctly.
I am using it on very large trace with 4.8 and HEAD without much of a
bad behavior (by large I mean 300 to 500K processes and 20 to 30 GB
strace outputs).

If you see some issues that can be reproduced, please drop a mail here
so someone can investigate.

> - No clear separation of call / return in the logs, causing a funny search
> to figure out which call completed. There's an heuristic in
> Strace.Content.Process._pending_calls.

There are no unfinished/resumed calls when using -ff. So IMHO you
should be using -ff -tt/ttt and then reassemble the per-process traces
in one big file that you can then stream parse line by line. Using the
strace-log-merge util will do this for you (this is in Perl and if
this is an issue this would be easy enough to write that in Python).
The calls will be in the right sequence and each line will have a pid.
This approach is used in strace tests itself : see [2] for a tiny
shell script example.

> - Encoding, since linux doesn't prescribe utf-8, I have to hope for the
> best. Note: I should use -x, I don't recall why I had commented it out.
> Maybe just laziness.

Do you have an example of a problematic output?
FWIW, the approach taken by rsync with file paths listing is the following:
 - try to decode paths using the locale/file system encoding (possibly utf-8)
 - if this fail, encode failing characters in octal
 - alternatively force utf-8 with an option and fail if decoding cannot be done

Strace has a similar logic with -x alright, except it tries only ASCII
decoding and falls back to hex.
IMHO that should work out well for a json structured output.

Why did you drop using -x?

> - Encoding, I had to write a state machine to read the logs properly, see
> the >110 lines of strace_process_quoted_arguments(). That's independent of
> -x.

Agreed this is an issue and that is one reason for getting a
structured output. Let me make a separate post with a python snippet
that may help you there.

> - No clear parent-child relationship between the logs when using -ff,

Using the -ff -ttt plus strace-log-merge reassembly would help in the
short term as you ending with each strace output line prefixed with
the pid.
But that does not help with parent/child yet as you still need to
track the fork/clone calls to get that. Definitely something we want
to have in a structured output!

> causing the reader code to play a game to figure out which process started
> which processes.

Again using the -ff -ttt plus strace-log-merge reassembly will give
you a clean timeline and a pid for each strace line. With that, the
first process becomes clearly the first in the trace.

> - Only after the process hierarchy is figured out I could figure out the
> starting current working directory of each process, causing a lot of "late
> bound path". See RelativePath class. It definitely slows down reading the
> logs and adds potential corruption of the paths in the trace reader if it
> incorrectly guessed the starting cwd.

This is an issue indeed. YMMV, but in your case (i.e. Ubuntu) where
most exe have likely been compiled with a GNU toolchain you would end
up having typically a getcwd syscall early on in most processes and
typically in the first one. This gives you the initial cwd and should
help tracing the cwd further (though you would still need to trace
cwd-relative paths and possibly dir changes) and may remove the
possible need for late bounding in your cases. Using the -y option
decodes all descriptors as absolute paths, making the tracing of the
cwd less important too.
FWIW there is a pending GSOC project idea for advanced path decoding
to eventually return all paths args as absolute, making this a non
issue.

> - Generally inconsistent output, see all the RE_* in Strace.Content. No
> regexp is guaranteed to be able to read a log line.

Yep. The output is mostly open coded at this stage and a goal is to
fix this with the introduction of a structured output.

> - No header to list the hooked functions I should expect to appear, and the
> number of arguments in each. Since it changes from OS version to another, I
> made the code assert on unknown function so I can add them manually.

Good point: someone (Mike F./vapier?) suggested we should return some
directory of supported calls and calls specs. How would you see such
an output structure to be useful?

> - Variable number of arguments to a function coupled with split log lines
> and interesting quoting caused some headaches IIRC. (Or was it OSX doing
> this? Sorry I don't recall)

If you are talking about suspend/resume, then using -ff takes that away.
As for varargs --if this happens-- I guess this is the way the
syscalls are. Not much that can be done.
To the best of my knowledge quoting in strace is rather consistent,
but they are surely some gremlins left and right.
Do you have an example of this?

> Of all the problems, the log corruption with heavily multithreaded process
> on a multi-core system has been the most problematic. Everything else is
> solvable but when the log is corrupted, there isn't much to do. Last time I
> tried, which is a long long time ago, I was still able to reproduce
> occasional corruptions with our test named "browser_tests".

Please send a note here with details. I am using strace routinely to
trace execution of 300K+ processes at once and have not seen a bad
behavior to date. Also please consider checking out 4.8 or HEAD.

> A separate issue I got was that I was not able to trace native client
> sandboxed unit test processes unless strace was running under sudo but to do
> that I need to attach to the already started process to be able to not run
> the unit test itself as root. It's been a pain to implement. Note strictly
> strace's fault, just FYI.

Is this "native client" of yours using some setuid/setgid? If so,
there are known limitations since strace uses the ptrace syscall an
inherits from its limitations which are about security. This is more
or less explained in the man (2) ptrace and man (1) strace though this
is an area that has never been quite clear to me. Have your tried if
the -u  option could be of any help to you?

> What I'd recommend;
> - json is good for smallish document but is not great for unbounded
> documents. Logs can get >2gb. IMHO, I'm not sure it's a good fit.

What about if a trace is not ONE json doc, but instead MANY? A whole
trace would not be a valid json thing, but each line (in the sense of
one line per call for instance) would be its own json? This would make
it smallish enough and would work find for unbounded traces as well as
streaming parsing.

> Especially
> with regards to 64 ints and json assumes unicode, which  may not be correct
> on some systems. That said, I don't have a good recommendation either.

About ints and floats, the only sane way might be to treat everything
as a JSON string. As for unicode, I guess this is a good thing. Yet
you quite rightly point that it may be problemtatic in some cases. As
said earlier, using ASCII and returning hex if failing with -x should
work in all cases imho.

> Something custom would be fine with me. Just expose a reference code to read
> it back in 1 language, 2-3 popular languages if possible.

Once we start to have a more or less working feature, would you be
willing to chip in? Like a possible Python reference code?

> - Optionally listing at the start the current process info; pid, ppid, cwd,
> list of hooked functions and the number of args for each, etc. Would save a
> *lot* of processing when reading the logs of a process tree.

About the cwd, see above: adding absolute paths everywhere makes the
cwd possibly a non issue. That said, the pid, ppid and cwd should
eventually be data attached to every call in a structured output, even
if this is a bit redundant.

> - Always generate 2 entries per call, one enter, one exit. Don't try to be
> smart and merge when possible. Be consistent instead! The exit call should
> be extremely easy to match with the corresponding enter. In particular in
> case of reentrancy. It's less messy if the thread gets killed in the middle
> of execution.

If you were to encode this by hand in json, how would you structure this?
Also I wonder if attaching some arbitrary unique id to every call
could help reassemble multiple entries sanely.

> - python is ridiculously slow at parsing, ensure whatever you do doesn't
> make reading the log *too* slow. In particular, keep support for streaming
> the log instead of forcing to read all the log at once, like a big json
> file.

See my previous comment about having one json "doc" per "line" to make
it easy to stream parse large traces.
Do you thing this could work?

> It may look like I was dismayed at strace log parsing but you haven't see
> Windows' tracerpt.exe output. See LogmanTrace.Tracer.CsvReader.next() to a
> contest of which log format is the worst to read. Whoever wrote this junk
> shouldn't be allowed to code ever again.

I guess the strace maintainers should take this as a compliment :P

> I hope my comments are useful.

They are ++!

[1] https://code.google.com/p/swarming/source/browse/trace_inputs.py?repo=client
[2] https://sourceforge.net/p/strace/code/ci/master/tree/tests/net-fd.test

-- 
Philippe Ombredanne