Trace operations on a particular file/device?

Wed Feb 2 01:45:00 UTC 2011

On 2011-02-02, Roland McGrath <roland at redhat.com> wrote:

>> [...] for all the use-cases I run into, simply remembering what path
>> was associated with a descriptor returned by open() would suffice.
>> Though I never need it, one could remember descriptors returned by
>> creat(), and copy path info when dup[23]() is called as well.

[...]

> What I had always imagined along these lines is somewhat more
> grandiose. That is, something like an XML form of trace output, along
> with richer semantics encoded somewhere, that know things like that a
> particular argument to a particular call is not just "an int" but is
> "an open file descriptor", and that the return value from other calls
> is not just "an int" but is "a file descriptor being established" or
> "a file descriptor being destroyed".

Outputting XML would just make the "output size" problem even worse.
What I'm concerned with is what happens when you have a program that
runs for days before a problem occurs.  Tracing all reads/writes to
all files/devices can generate output that is problematically large
(e.g. can't be e-mailed), so I'm looking for a way to capture only
operations on the one device I'm interested in.

> Then the fancy post-processing would consist of using the trace
> records of each establishing/destroying call (i.e. open, close, dup,
> etc.) to maintain a table mapping known descriptor values to the past
> trace record that last established it (and last destroyed it
> thereafter), and annotating the trace record of a call using a
> descriptor so that the descriptor argument is a link back to the
> establishing/destroying record.

At least for my uses, that's easy enough to do with the output I have
now.  Parsing the current output format isn't difficult.
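As a rough illustration of that kind of post-processing, here is a
minimal Python sketch. It assumes plain single-process strace output of
the form `name(args) = result` (no PID prefix, no unfinished/resumed
lines), and it treats any leading integer argument as a file
descriptor, which is wrong for calls such as kill(). The function name
filter_trace and the regular expressions are made up for this example;
they are not part of strace.

```python
import re

# Assumed line shape: name(args) = result
CALL_RE = re.compile(r'^(\w+)\((.*)\)\s+=\s+(-?\d+)')
# First double-quoted string in the argument list (the path for open/creat).
PATH_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')
# A leading decimal integer argument, assumed to be a descriptor.
FD_RE = re.compile(r'^(\d+)\b')


def filter_trace(lines, target):
    """Yield only the trace lines that involve a descriptor opened on target."""
    fd_path = {}  # descriptor -> path it was opened with
    for line in lines:
        m = CALL_RE.match(line)
        if not m:
            continue
        name, args, ret = m.group(1), m.group(2), int(m.group(3))

        if name in ("open", "openat", "creat"):
            pm = PATH_RE.search(args)
            if ret >= 0 and pm:
                fd_path[ret] = pm.group(1)  # remember the path for this fd
                if pm.group(1) == target:
                    yield line
            continue

        fm = FD_RE.match(args)
        if not fm:
            continue
        fd = int(fm.group(1))
        path = fd_path.get(fd)

        if name in ("dup", "dup2", "dup3") and ret >= 0:
            if path is None:
                fd_path.pop(ret, None)  # new fd now refers to an unknown file
            else:
                fd_path[ret] = path     # copy the path info to the new fd
        if path == target:
            yield line
        if name == "close" and ret == 0:
            fd_path.pop(fd, None)       # the descriptor may be reused later
```

Feeding it the lines of a saved trace, e.g.
filter_trace(open("trace.txt"), "/dev/ttyS0"), yields only the calls
made on descriptors that were opened on that path. The file name
trace.txt is just a placeholder.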

>> > In Linux, it's possible to deduce some information about descriptors
>> > by looking at the /proc/PID/fd/FD symlink target.  strace doesn't do
>> > anything like that.
>> 
>> Yup.  I could attempt to write some sort of front-end that figures out
>> file descriptors based on that and then runs strace, but that only
>> works for long-running processes and capturing the initial open() and
>> ioctl() calls would be difficult.
>
> What I was thinking of was more like hacking strace to look up
> /proc/PID/fd/FD at the time of each fd-using syscall.  Then it could
> put (3 [/dev/ttyS0], ...) in the trace output.  Or that could be used
> to feed a name-matching filter option, etc.

I thought about that, but was worried that it might impose too much
overhead compared with just "remembering" a path for a file
descriptor.  I suppose it would probably be a net win whenever the
/proc lookup let strace skip writing an output line.
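For reference, the per-call lookup being discussed is cheap to
prototype. The sketch below is Linux-only and assumes the tracer has
permission to read the traced process's /proc entries. The helper
names (fd_target, annotate, matches) are hypothetical and do not
correspond to anything in strace.

```python
import fnmatch
import os


def fd_target(pid, fd):
    """Return the symlink target of /proc/PID/fd/FD, or None if unavailable."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # Process gone, descriptor closed, no permission, or no /proc.
        return None


def annotate(pid, fd):
    """Format a descriptor as '3 [/dev/ttyS0]' when its target is known."""
    target = fd_target(pid, fd)
    return f"{fd} [{target}]" if target else str(fd)


def matches(pid, fd, pattern):
    """True if the descriptor's target matches a shell-style pattern."""
    target = fd_target(pid, fd)
    return target is not None and fnmatch.fnmatch(target, pattern)
```

A filter built on matches(pid, fd, "/dev/ttyS*") could decide per
syscall whether to emit a line at all. The cost is one readlink() per
fd-using syscall, which is the overhead in question above.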

I may try out a few different options on the Python version of strace.

-- 
Grant