bugfix for strace for less-aligned kernel memory
Dmitry V. Levin
ldv at altlinux.org
Tue Mar 31 22:04:50 UTC 2015
On Thu, Mar 19, 2015 at 03:48:35PM -0500, Bolo wrote:
> This bugfix / patch consists of two portions
>
> 1) A 'Z' option which forces strace to use ptrace() memory access
> routines all the time instead of vm-based memory access routines if
> they are present. This allows a fallback if you find that VM based
> requests are failing or producing incorrect output. The existing
> ptrace based methods are quite robust and tested :-)
As there are no known issues with the kernel implementation of
process_vm_readv, why should anybody want to disable the fast
process_vm_readv based code in favour of the slow PTRACE_PEEKDATA based one?
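
To put a rough number on "slow": PTRACE_PEEKDATA transfers one word per
call, while a single process_vm_readv call can copy a whole range. The
sketch below only counts the calls; peekdata_calls is a hypothetical helper
for illustration, and it assumes the peeks are issued at word-aligned
addresses.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: how many PTRACE_PEEKDATA
 * calls are needed to cover len bytes starting at addr, assuming every
 * peek fetches one long at a word-aligned address.
 */
static size_t
peekdata_calls(unsigned long addr, size_t len)
{
	const unsigned long word = sizeof(long);
	unsigned long first, last;

	if (len == 0)
		return 0;

	/* Round the first and the last byte down to their word start. */
	first = addr & ~(word - 1);
	last = (addr + len - 1) & ~(word - 1);

	return (last - first) / word + 1;
}
```

Under that assumption a 256-byte read costs 256 / sizeof(long) ptrace
calls, and one more if the start address is not word-aligned.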
> 2) A correction for a long-term bug in the VM based memory access
> method of strace. This bug has been in strace in various forms since
> at least 4.5.19.
>
> The ptrace based code is correct.
Well, it was a genuine bug in the process_vm_readv based code, thank you
for reporting it, although I did not understand it from your description
and patch on first reading.
It's fixed by commit v4.10-56-g4832134 which also contains a regression
test. There is also a follow-up cleanup commit v4.10-57-gea1fea6.
> What happens is that the vm based access code in strace doesn't deal
> correctly with alignments of data in the kernel. It assumes that
> everything is at a certain alignment. This is an incorrect assumption;
> when there are a lot of objects the kernel starts putting string data at
> lesser aligned addresses. I discovered this through debugging
> of strace failures and finding the kernel relaxing alignment constraints.
> This is true of both older kernels (such as RHEL 5.9) and newer ones
> (such as RHEL 6.6).
I've tried to reproduce alignment issues with process_vm_readv from your
description, but without success.
> There was also an issue with the code not treating page boundaries
> correctly due to incorrect page arithmetic.
The actual bug was using the wrong address for the page arithmetic.
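
For illustration, the page arithmetic in question looks roughly like this.
This is a minimal sketch and not the strace code: bytes_to_page_end is a
hypothetical helper, and it assumes the page size is a power of two. The
point is that the address fed into it has to be the current address in the
tracee, that is, the one the next read will actually start from.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: how many of the wanted
 * bytes can be read starting at tracee address addr without crossing
 * into the next page.  page_size is assumed to be a power of two.
 */
static size_t
bytes_to_page_end(unsigned long addr, size_t want, size_t page_size)
{
	/* Bytes left in the page that contains addr. */
	size_t in_page = page_size - (addr & (page_size - 1));

	return want < in_page ? want : in_page;
}
```

With 4096-byte pages, a 256-byte request at 0x1ffe is cut down to 2 bytes,
while the same request at 0x2000 goes through unchanged.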
> When this alignment constraint is relaxed, or strace incorrectly issues
> reads across page boundaries, strace fails with an error of
>
> "umovestr: short read (%d < %d) @0x%lx"
>
> due to the incorrect code. In addition, since the vm-based code doesn't
> correctly update the address and lengths of the region to be accessed,
> the fallback code, which is implemented correctly, fails to work.
There is no fallback to the PTRACE_PEEKDATA based code once
process_vm_readv has read at least one byte.
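
A rough sketch of that rule, not the strace implementation: the reader
callback, the STR_* codes and fake_reader are all hypothetical stand-ins so
that the loop can be exercised without a tracee. Each chunk stops at a page
boundary, both addresses advance together after every chunk, and a failure
after the first successful chunk is reported as a short read instead of
triggering a fallback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for process_vm_readv: bytes copied, or -1. */
typedef long (*reader_fn)(unsigned long addr, char *dst, size_t len);

enum {
	STR_DONE = 1,		/* terminating NUL found */
	STR_NO_NUL = 0,		/* len bytes read, no NUL seen */
	STR_SHORT_READ = -1,	/* failed after some bytes: no fallback */
	STR_TRY_FALLBACK = -2	/* failed before any byte was read */
};

static int
read_string(reader_fn rd, unsigned long addr, char *laddr, size_t len,
	    size_t page_size)
{
	size_t nread = 0;

	while (len > 0) {
		/* Never cross the end of the page that contains addr. */
		size_t chunk = page_size - (addr & (page_size - 1));
		long r;

		if (chunk > len)
			chunk = len;

		r = rd(addr, laddr, chunk);
		if (r <= 0)
			return nread ? STR_SHORT_READ : STR_TRY_FALLBACK;

		if (memchr(laddr, '\0', (size_t) r))
			return STR_DONE;

		/* Advance the remote and the local address together. */
		addr += r;
		laddr += r;
		len -= r;
		nread += r;
	}
	return STR_NO_NUL;
}

/*
 * Fake tracee memory for the demo: one 16-byte page mapped at
 * [16, 32), filled with 'x' except for a NUL at address 18.
 */
static long
fake_reader(unsigned long addr, char *dst, size_t len)
{
	size_t i;

	if (addr < 16 || addr + len > 32)
		return -1;	/* touches unmapped memory */
	for (i = 0; i < len; i++)
		dst[i] = (addr + i == 18) ? '\0' : 'x';
	return (long) len;
}
```

With this fake memory, a read starting at 16 finds the string "xx", a read
starting at 29 runs off the mapped page after 3 bytes and yields
STR_SHORT_READ, and a read starting at 32 fails before any byte is read and
yields STR_TRY_FALLBACK.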
> The new code also deals with arbitrary page sizes correctly in extracting
> data using the vm mode, instead of relying upon a 4k pagesize.
>
> 3) Reproducing this error depends upon where the kernel is putting items
> in memory, and may also be based on the load of the kernel. As such,
> this error can be difficult to reproduce.
Actually, the reproducer is quite simple:
http://sourceforge.net/p/strace/code/ci/HEAD/tree/tests/umovestr.c
--
ldv