[PATCH] -e write=trace doesn't dump data after error return

Thu Aug 9 22:05:43 UTC 2007

On Thu, Aug 09, 2007 at 11:21:00PM +0200, Jan Kratochvil wrote:
> On Sat, 28 Jul 2007 15:11:46 +0200, J. Bruce Fields wrote:
> ...
> > When you provide the commandline option "-e write=fd", strace still
> > doesn't dump the full write data in the case where the write system call
> > returns an error.
> 
> I think this is the right behavior according to the man page
> 	-e write=set
> 		Perform a full hexadecimal and ASCII dump of
> 		all the data written to file descriptors listed in the
> 		^^^^^^^^^^^^^^^^^^^^^^^
> There is not a wording `all the data _attempted_to_be_ written to'.
> (->NOTABUG)

I suppose we could update the documentation to say "all data written
(successfully or unsuccessfully) to...".  Do you think there's a real
requirement that write data be dumped only on successful writes, or that
somebody's script somewhere depends on that?  It seems more likely to me
that the documentation was written this way just for simplicity's sake,
but I'll defer to experience.

I ran into this because I was debugging interactions between knfsd and
mountd, which communicate over /proc files; in cases like this (and
there are lots of them), a failed write may well be failing for a reason
that depends on the particular data written.

Even in the case of a regular file, knowing what the program was trying
to write at the time of a failure could occasionally be helpful.  (E.g.
"Why'd we get -ENOSPC?  Oh, I see, it's trying to write such-and-such a
record over and over again.  I know where that code is...")

> You can use the regular `-e trace=write' functionality for the write attempts,
> you can see the whole data being written using the `-s SIZE' option.

Oh, I'd missed that, thanks.
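
For illustration only, here is a minimal sketch of a failing write whose
payload is the interesting part.  It is not taken from the knfsd/mountd
case: the function name, the payload, and the choice of a read-only
descriptor (which makes write() fail with EBADF) are all assumptions
made for the example.

```python
import errno
import os


def failed_write(payload: bytes):
    """Attempt a write that is expected to fail; return (errno, payload).

    Hypothetical reproducer: the descriptor is opened read-only, so the
    write() call should be rejected with EBADF.
    """
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        os.write(fd, payload)
    except OSError as e:
        # The data we tried to write is still fully known to the caller.
        return e.errno, payload
    finally:
        os.close(fd)
    return 0, payload


if __name__ == "__main__":
    err, data = failed_write(b"hypothetical-record 42\n")
    print(errno.errorcode.get(err, err), data)
```

Saved under a hypothetical name such as repro.py and run as
`strace -e trace=write -s 64 python3 repro.py`, the failing write()
line should include the attempted data, since -s raises the limit on how
much of each string argument is printed.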

> > (You could even wonder whether it makes sense to skip dumping in the
> > read case.  I tend to suspect the likelihood of there being interesting
> > data in the read buffer is small in that case,
> 
> http://www.opengroup.org/onlinepubs/009695399/functions/read.html
> 	If a read() is interrupted by a signal before it reads any data, it
> 	shall return -1 with errno set to [EINTR].
> 	If a read() is interrupted by a signal after it has successfully read
> 	some data, it shall return the number of bytes read.
> 
> Therefore if no data were read it would report syserror(tcp) and no data make
> sense to dump.  If some data were read then syserror(tcp) would not happen (as
> the returned syscall value would be larger than zero).

Yeah, I know.  My only thought was that it's conceivable the process's
read buffer could still contain some useful hint, even if it's just
stale data from some previous use of the buffer.  But I think that's
farfetched.  The write case isn't--I actually have needed the full data
passed into a failed write to debug real problems on several occasions.
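
The policy under discussion can be sketched as a small model.  This is
a hypothetical illustration, not strace's actual code: the function
name, its parameters, and the choice to dump only the bytes actually
written on success are assumptions.

```python
def dump_length(kind: str, ret: int, count: int,
                dump_failed_writes: bool = False) -> int:
    """Return how many buffer bytes a tracer would dump for one call.

    kind  -- "read" or "write"
    ret   -- syscall return value (negative means an error return)
    count -- byte count the caller passed to the syscall
    """
    if kind == "read":
        # On error no data was transferred; on success exactly `ret`
        # bytes were placed in the buffer.
        return max(ret, 0)
    if kind == "write":
        if ret < 0:
            # The caller's buffer is fully defined even when the write
            # fails, so dumping it is possible if the policy allows it.
            return count if dump_failed_writes else 0
        return ret
    raise ValueError("kind must be 'read' or 'write'")
```

With dump_failed_writes left at False this models the current behavior
described above; setting it to True models the proposed change, where a
failed write still gets its full buffer dumped.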

I also hadn't noticed the -s option--thanks for pointing that out!  It
would have been sufficient for my purposes.

I still think it'd be useful to include failed writes in the write=fd
case, but the existence of -s makes that less important to me.

--b.