x32 gcc sign extension bug (was: Long/address printing in non-native personalities)

Dmitry V. Levin ldv at altlinux.org
Wed Jul 20 14:48:10 UTC 2016


On Sat, Jun 11, 2016 at 04:11:38AM +0300, Dmitry V. Levin wrote:
> On Fri, Jun 10, 2016 at 06:58:57PM +0300, Eugene Syromyatnikov wrote:
> > Hello.
> > 
> > It's been brought to my attention that one of strace tests (nsyscalls) is
> > failing when x32 ABI is used.
> > 
> > $ cat tests-mx32/nsyscalls.log
> > 1c1
> > < syscall_546(0xbadc0ded, 0xbadc1ded, 0xbadc2ded, 0xbadc3ded, 0xbadc4ded, 0xbadc5ded) = -1 (errno 38)
> > ---
> > > syscall_546(0xbadc0ded, 0xbadc1ded, 0xbadc2ded, 0xbadc3ded, 0xbadc4ded, 0xffffffffbadc5ded) = -1 (errno 38)
> > nsyscalls.test: failed test: ../strace -e trace=none ./nsyscalls output mismatch
> > FAIL nsyscalls.test (exit status: 1)
> > 
> > Upon investigation, it has been revealed that the problem is in the way strace
> > handles tracee's long integer variables, which, in turn, has been brought to
> > light by the quirk in the way argument passing code being generated by GCC.
> > More specifically, when a 6-parameter syscall is called (as nsyscalls tests
> > does), 7 parameters should be passed to libc's syscall() wrapper: syscall
> > number itself and its parameters. As a result, 7th argument should be passed
> > via stack in accordance with x32 ABI (
> > https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnx4MzJhYml8Z3g6MzVkYzQwOWIwOGU4YzViYw
> > , page 22). Also, in conformance with x32 ABI, all arguments should be extended
> > to 8 bytes in size. Standard does not explicitly specify the way this should be
> > performed, and that's where GCC does something strange: it sign-extends
> > parameters passed via stack while zero-extends parameters passed via registers
> > (or, more precisely, doesn't care about higher bits at all in latter case, for
> > obvious reasons):
> > 
> > Dump of assembler code for function main:
> > 0x00400370 <+0>: push %rbx
> > 0x00400371 <+1>: mov $0xbadc4ded,%r9d
> > 0x00400377 <+7>: mov $0xbadc3ded,%r8d
> > 0x0040037d <+13>: mov $0xbadc2ded,%ecx
> > 0x00400382 <+18>: mov $0xbadc1ded,%edx
> > 0x00400387 <+23>: mov $0xbadc0ded,%esi
> > 0x0040038c <+28>: sub $0x8,%esp
> > 0x0040038f <+31>: mov $0x40000222,%edi
> > 0x00400394 <+36>: xor %eax,%eax
> > => 0x00400396 <+38>: pushq $0xffffffffbadc5ded
> > 0x0040039b <+43>: callq 0x400350 <syscall at plt>
> 
> Eugene, thanks for the analysis, this explains the issue.
> 
> GCC actually produces the following instructions:
> 
> 
>         pushq   %rbx
>         movl    $-1159967251, %r9d
>         movl    $-1159971347, %r8d
>         movl    $-1159975443, %ecx
>         movl    $-1159979539, %edx
>         movl    $-1159983635, %esi
>         subl    $8, %esp
>         movl    $1073742370, %edi
>         xorl    %eax, %eax
>         pushq   $-1159963155
>         call    syscall
> 
> This is translated into the code you've cited.
> 
> > And this is what leads to test failure — strace does not care about tracee's
> > size of long when it prints syscall arguments (printargs() in syscall.c simply
> > calls tprintf("%s%#lx", i ? ", " : "", tcp->u_arg[i])). Moreover, it similarly
> > does not care about tracee's pointer size in places where it prints them (my
> > first attempt to fix this issue was by (incorrectly) utilising
> > printnum_ulong(), but it failed hilariously, since umoven_or_printaddr(), which
> > is apparently called before actual printing, leads to precisely the same
> > results since it simply calls tprintf("%#lx", addr)). As a result, I propose
> > the following change where all cases when tracee's long (or pointer, since they
> > have the same size in Linux ABIs AFAIK) should be printed are handled via
> > printaddr() and logic regarding dispatching current_wordsize is added to
> > printaddr() itself (maybe it also should be renamed to printlong() or something
> > more suitable).
> 
> In strace, we are trying to display things from the kernel perspective rather
> than from the userspace perspective.  When these points of view don't
> match, we are trying to present the kernel's side of the story.
> 
> For example, when the kernel takes a pid argument, we print it as int.
> 
> With regards to address printing, the x86_64 kernel operates with
> addresses as whole 64-bit words both in x86_64 and x32 modes, so explicit
> cast to a 32-bit word won't be an appropriate solution.
> 
> Imagine an invocation of a 6-arg syscall with the last argument being an
> address, e.g. pselect6.  Such syscall cannot be invoked properly on x32
> using a syscall() wrapper because, as you've shown, the 32-bit address
> will be sign-extended, producing a potentially wrong address that would
> result to a failed syscall; I think strace would be less useful if it
> stopped showing the 64-bit address actually passed to this syscall.
> 
> For this reason, I recommend x86_64 strace over native x32 strace for
> tracing x32 processes because x32 strace doesn't show these high bits
> (which are visible to the kernel) while x86_64 strace does show them.
> 
> Said that, I fully support your proposal of printing addresses consistently
> using printaddr().  There are still about two dozens cases that should be
> converted to use printaddr() instead of direct tprintf() calls.
> 
> printargs(), however, is used in different contexts, so changing it
> to use printaddr() might result to NULL being printed instead of 0
> in places where such a change is not quite appropriate.

I've changed printargs() to print 64-bit args on x32 personality,
and changed nsyscalls.test to specify 64-bit syscall args on x32.


-- 
ldv
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.strace.io/pipermail/strace-devel/attachments/20160720/1ce29343/attachment.bin>


More information about the Strace-devel mailing list