RFC: path display and path filtering features.
Grant Edwards
grant.b.edwards at gmail.com
Tue Feb 15 16:01:57 UTC 2011
On 2011-02-15, Roland McGrath <roland at redhat.com> wrote:
>> 1) Support for non-Linux OSes isn't done yet. For example, the
>> handling of the clone system call where it is determined whether
>> to copy or share the descriptor state assumes the OS is Linux.
>
> Maintenance of strace has been thoroughly Linux-centric for years
> now. I really think it's just fine for a new feature to be entirely
> under #ifdef LINUX and to let people concerned with other systems
> worry about it when they want to put the effort into it.
I can work with that. :)
>> 2) It is assumed that TRACE_DESCRIPTOR means arg[0] is a file
>> descriptor, and that TRACE_FILE means arg[0] is path. That's not
>> 100% accurate, but so far it's close enough for all of my use
>> cases.
I should probably be more specific about when I used the
TRACE_DESCRIPTOR/TRACE_FILE/arg[0] assumption. That assumption is
used only for the test to determine whether or not to display a
syscall.
The code that acually handles the fd/path association tests explicitly
for syscalls that can affect fd/path association. The displaying of
paths along with fd arguments is, of course, handled by the individual
syscall formatting routines (I probably haven't found all of those
cases yet either).
>> 6) There are probably some more obsure ways to associate an fd with a
>> path that I haven't caught (for example, the fcntl DUPFD operations).
>
> These are to me the same item, under the general heading of how the
> fd-tracking is tied into syscall decoding. I don't think this
> short-cut is adequate, and I also don't see the need to have done it
> that way at all.
>
> You already have individual hooks in the specific syscall decoders
> where an fd is presented. So why do you need a pre-pass that
> operates blindly on all system calls? The same calls that decode an
> fd for printing can be where the tie into the tracking happens, can't
> they?
>
> The only generic case I really see is that return-value printing is
> generic in the way strace is structured now, so when the return value
> is an fd, you need to handle that. That seems like a good fit for a
> flag bit in the sysent table (saying the return value is an fd being
> established).
>
> Hmm, perhaps this is meant so that the fd-tracking happens on the
> calls that are not being printed at all due to the use of -e options?
Exactly. I wanted to keep the fd-tracking separate from the display
code so that fd-tracking is not affected by filtering options.
Requiring that a user wanting to trace on a "path" not filter out
syscalls like open/close/dup/clone wouldn't be too awful (they're
pretty low-frequency events), but it seems like something people would
trip over.
For example, one of the programs I use always does input handling
using a select/FIONREAD/read sequence. The select and FIONREAD calls
just clutter up the outout. So, if all I want to see are the
read/write calls on /dev/ttyS0, it's awfully nice to be able to do
strace -P /dev/ttyS0 -e read,write ...
I'd like to keep the fd-tracking and fd/path-match code separate from
the display code, so what I'm thinking about trying next for the
"match" code is to:
1) Assume that syscalls with neither TRACE_DESCRIPTOR nor TRACE_FILE set
don't contain a descriptor or a path, and can be ignored.
2) Add special cases for syscalls with the DESCRIPTOR/FILE flag set
where it isn't only arg[0] that should be tested.
What I don't my approach approach in general is that information on a
syscall would now be stored in 4 different places:
* The sysent table
* The fd-tracking routine
* The fd/path match routine
* The syscall decoder functions
One way to avoid this would be to go the route that the Python
implementation of strace took: put sufficient argument descriptions
(name/type/decoding info) into the sysent table to allow use of
"generic" functions for displaying, fd-tracking, fd/path-matching.
One of the things that makes this possible in the Python version is
that the argument decoding is less sophisticated, and things like
ioctl arguments aren't decoded in a manner dependent on each other.
But I don't want to sacrifice the current decoding of things like
ioctl() calls. [FWIW, what I care about mostly is serial ports, and
the ability to decode termios ioctl calls in detail is invaluable to
me.]
>> 3) The code that tracks fd table state is invoked even if paths are
>> neither being displayed nor filtered-upon. That means that the
>> handling of a few system calls (e.g. open/close/dup) takes
>> slightly longer than it needs to in the cases where neither of the
>> two new options is being used.
>
> That should be fixed and it certainly ought to be easy and clean to
> conditionalize the extra work centrally.
Yes, that's simple enough to do -- I just haven't done it yet.
>> 4) I have no idea how portable the readlink() on /proc/<pid>/fd/<fd>
>> mechanism is, or if there are alternative mechanisms for other OSes.
>
> It is entirely Linux-specific, but that is fine. If other systems
> have a mechanism, people interested in their support can write the
> code. Just make the code use clean divisions into subroutines as
> good practice would dictate regardless, and this is ready for future
> porting.
>
>> 5) The indentation needs to be fixed. I ran all the sources through
>> indent in order to get them consistent before I started work, and
>> the attached diffs were done after everything was re-indented.
>
> That's just not the way things are done. The prevailing formatting
> conventions of the strace source are not my first choice either, but
> we leave things how they have been and make new code match.
There are different indentation styles in different files. Can I pick
any of them? (If so, I'll go with the indentation style in count.c.)
>> 7) Though descriptors with the close-on-exec flag set aren't handled
>> properly, the path will get replaced when the child process does
>> associate a new file with the file descriptor.
>
> I think that's fine--perhaps even a feature. That is, it seems
> worthwhile to see the last-known association of an fd when it's used
> in a later call that fails with EBADF.
That's sort of what I was thinking. It would presumably tell you what
the program's author thought he was doing, and the return value would
tell you that it failed (and why).
> Ideally, you the fd-tracking would notice FD_CLOEXEC being set or
> cleared in an fcntl call, and the O_CLOEXEC flag being used in an
> establishing call like open. This seems fairly straightforward: it's
> just tracking a flag bit along with tracking a file name--it also
> seems wortwhile to track the open modes (O_RDONLY vs O_RDWR, etc.)
> It's then pretty trivial to keep an "exec generation" count in a tcb
> and its fd-tracking records, so you can report the known details
> about an fd as "last opened as file foo but then closed implicitly by
> exec of bar". That's desireable just the same as is reporting "last
> opened as file foo but then closed by a close call".
That would be pretty handy when troubleshooting cases of accesses to a
"stale" file descriptor.
>> If, with some additional work, the new options would be accepted, then
>> I'd be happy to work on addressing these (or other) issues.
>
> These days, I'm not really a gatekeeper of patches, that's mostly
> Dmitry. But this seems like useful stuff to me. So, I would say it
> is worth working on it to make it worthy of including. The current
> issues, both trivia like indentation, and substance like lack of
> correct identification of fd and file name parameters/results in all
> calls, make it not yet worthy.
I completely agree that it's not ready to be merged yet.
But I wanted to test the waters before continuing. Since it meets my
requirements, I can't really justify the expenditure of additional
resources unless doing so would make it useful to others and mean that
it could be merged (IOW: I can tell the people that worry about
budgets that the non-recurring expense of getting merged will it save
on the recurring expense of maintaining a fork).
--
Grant Edwards grant.b.edwards Yow! What's the MATTER
at Sid? ... Is your BEVERAGE
gmail.com unsatisfactory?
More information about the Strace-devel
mailing list