Structured output?

Philippe Ombredanne pombredanne at nexb.com
Mon Feb 3 12:46:08 UTC 2014


On Mon, Feb 3, 2014 at 9:15 AM, Zev Weiss <zev at bewilderbeest.net> wrote:
> I'm not sure how much use it'd be in another context, but for what it's worth
> I've written a partial parser for its current output format (assuming a few
> flags, 'strace -fqtttTv' basically) for a project of mine, a trace replayer
> called ARTC: https://research.cs.wisc.edu/wind/Software/artc/

That was my point. You wrote a partial lexer, I wrote another partial
one, and these projects wrote parsers too:
https://code.google.com/p/pystrace/source/browse/strace.py
https://github.com/johnlcf/Stana/blob/master/straceParserLib/StraceParser.py
https://code.google.com/p/swarming/source/browse/trace_inputs.py?repo=client#786
http://search.cpan.org/~dgl/Sys-Trace-0.03/lib/Sys/Trace/Impl/Strace.pm
https://github.com/yhuai/tableplacement/blob/master/tableplacement-experiment/straceAnalyzer/strace_analyzer.py

and there are likely several other examples.
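
To illustrate what each of these partial parsers ends up reimplementing,
here is a minimal Python sketch. It assumes lines of the shape written by
something like 'strace -f -ttt -T -o FILE' (PID, epoch timestamp, call,
result, duration). The regular expression, the parse_line name and the
record fields are illustrative assumptions, not taken from any of the
projects above, and the arguments are left as one unparsed string.

```python
import json
import re

# Assumed line shape (hypothetical, simplified):
#   PID EPOCH.USEC name(args) = result [errno text] <duration>
CALL_RE = re.compile(
    r'^(?P<pid>\d+)\s+(?P<ts>\d+\.\d+)\s+'
    r'(?P<name>\w+)\((?P<args>.*)\)\s+=\s+'
    r'(?P<result>.*?)(?:\s+<(?P<duration>\d+\.\d+)>)?$'
)


def parse_line(line):
    """Return a dict for a complete syscall line, or None otherwise."""
    m = CALL_RE.match(line.rstrip('\n'))
    if m is None:
        # Signals, exit notices and unfinished/resumed halves land here.
        return None
    rec = m.groupdict()
    rec['pid'] = int(rec['pid'])
    # The timestamp stays a string so no precision is lost.
    if rec['duration'] is not None:
        rec['duration'] = float(rec['duration'])
    return rec


if __name__ == '__main__':
    sample = ('1234 1391431568.123456 '
              'open("/etc/passwd", O_RDONLY) = 3 <0.000012>')
    # One JSON object per trace line.
    print(json.dumps(parse_line(sample)))
```

Lines that do not match return None and need separate handling, which is
exactly the kind of per-project special casing a structured output format
would remove.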

> So parsing it as-is is certainly feasible (though I had to add a small
> format-string fix at one point), but yeah, having a more easily
> parseable output format available would be quite nice...
> of those you listed I'd probably vote for JSON, though frankly as long
> as it's something other than XML I'd be happy.

We are thinking alike.

> One possible complication is how to handle multi-threaded
> (or multi-process) traces, where syscalls might "interrupt"
> each other in the output stream.  I guess the obvious options
> would be explicit entry/exit timestamps in the output and a
> semi-out-of-order output stream (i.e. output only on return),
> or special record types for submission vs. completion
> (analogous to what it's like now).
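
The "output only on return" option can be approximated on top of the
current format. The Python sketch below assumes the usual
"<unfinished ...>" and "<... NAME resumed>" markers with a leading PID on
each line; the stitch name and both regular expressions are hypothetical,
and only one pending call per PID is tracked.

```python
import re

# Entry half:  PID [TS] name(args so far  <unfinished ...>
UNFINISHED_RE = re.compile(
    r'^(?P<head>(?P<pid>\d+)\s+.*?)\s*<unfinished \.\.\.>$'
)
# Exit half:   PID [TS] <... name resumed> rest of args) = result
RESUMED_RE = re.compile(
    r'^(?P<pid>\d+)\s+(?:\d+\.\d+\s+)?<\.\.\. \w+ resumed>\s*(?P<tail>.*)$'
)


def stitch(lines):
    """Yield complete syscall lines, emitting split calls on return."""
    pending = {}  # pid -> entry half of the interrupted call
    for line in lines:
        line = line.rstrip('\n')
        m = UNFINISHED_RE.match(line)
        if m:
            pending[m.group('pid')] = m.group('head')
            continue
        m = RESUMED_RE.match(line)
        if m and m.group('pid') in pending:
            # Keep the entry timestamp, append the completion half.
            yield pending.pop(m.group('pid')) + ' ' + m.group('tail')
            continue
        yield line
```

The stitched stream is only semi-ordered: a call that was interrupted
appears at the point where it returned, but carries its entry timestamp.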

FWIW, using -ff (i.e. writing the trace of each process to a separate
output file) mitigates this a bit and produces far fewer "interrupts",
if any, depending on what you are tracing. The per-process traces can
then be merged into a single sorted file with the strace-log-merge
script if you prefer that type of input. For instance, this is used in
the strace self tests.
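
For illustration only, the merge step can be sketched in a few lines of
Python. This is not the strace-log-merge script itself; it assumes every
per-process line starts with a -ttt style epoch timestamp, that each
per-process trace is already in chronological order, and the merge_traces
and tagged names are made up for the example.

```python
import heapq


def tagged(pid, lines):
    """Turn 'TS rest' lines into sortable (time, pid, ts_text, rest) tuples."""
    for line in lines:
        ts, _, rest = line.rstrip('\n').partition(' ')
        yield (float(ts), pid, ts, rest)


def merge_traces(per_pid):
    """Merge already-sorted per-process traces into one sorted stream.

    per_pid maps an integer PID to an iterable of that process's lines.
    """
    streams = [tagged(pid, lines) for pid, lines in per_pid.items()]
    # heapq.merge is lazy, so large traces are not loaded into memory.
    for _, pid, ts, rest in heapq.merge(*streams):
        yield f'{pid} {ts} {rest}'
```

With real files, each value in per_pid would be an open file object for
one per-process output file, and the PID would come from its file name
suffix.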

-- 
Philippe Ombredanne



