[patch] get_scno should test TCB_INSYSCALL at the beginning

jochen jochen at penguin-breeder.org
Wed Sep 24 00:37:06 UTC 2003


Hi,

a testcase is difficult if you don't have access to a zSeries machine.
If you have one, start java (IBM JDK 1.4.1) on it (running SUSE Enterprise
Server 8 here, 6 cpus, 2 gig memory) and strace it. As soon as the java
program gets signaled while in rt_sigsuspend, strace will just terminate:

$ uname -a
Linux gfree18 2.4.19-3suse-SMP #1 SMP Wed Nov 6 22:34:43 UTC 2002 s390 unknown

$ strace -p <some java thread>
...
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], [RTMIN], 8) = 0
gettimeofday({1064391119, 839071}, NULL) = 0
nanosleep({0, 8929000}, NULL)           = 0
rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([] <unfinished ...>
--- SIGRTMIN (Real-time signal 0) ---

$ strace strace -p <same pid>
ptrace(PTRACE_SYSCALL, 13839, 0x1, SIGUSR2) = 0
--- SIGCHLD (Child exited) ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP], 0x40000000, NULL) = 13839
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
ptrace(PTRACE_PEEKUSER, 13839, psw_addr, [0x40085642]) = 0
ptrace(PTRACE_PEEKTEXT, 13839, 0x4008563e, [0xf0640ab3]) = 0
ptrace(PTRACE_PEEKUSER, 13839, gpr2, [0xffffffda]) = 0
ptrace(PTRACE_PEEKUSER, 13839, orig_gpr2, [0x403c4340]) = 0
ptrace(PTRACE_PEEKUSER, 13839, gpr3, [0x8]) = 0
ptrace(PTRACE_PEEKDATA, 13839, 0x403c4340, [0x80004207]) = 0
ptrace(PTRACE_PEEKDATA, 13839, 0x403c4344, [0]) = 0
write(2, "rt_sigsuspend([HUP INT QUIT USR1"..., 44rt_sigsuspend([HUP INT QUIT USR1 TERM RTMIN]) = 44
ptrace(PTRACE_SYSCALL, 13839, 0x1, SIG_0) = 0
--- SIGCHLD (Child exited) ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR2], 0x40000000, NULL) = 13839
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
write(2, " <unfinished ...>\n", 18 <unfinished ...>
)     = 18
write(2, "--- SIGUSR2 (User defined signal"..., 40--- SIGUSR2 (User defined signal 2) ---
) = 40
open("/proc/13839/status", O_RDONLY)    = 3
read(3, "Name:\tjava\nState:\tT (stopped)\nTg"..., 2048) = 902
close(3)                                = 0
ptrace(PTRACE_SYSCALL, 13839, 0x1, SIGUSR2) = 0
--- SIGCHLD (Child exited) ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP], 0x40000000, NULL) = 13839
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
ptrace(PTRACE_PEEKUSER, 13839, psw_addr, [0x403b16b4]) = 0
ptrace(PTRACE_PEEKTEXT, 13839, 0x403b16b0, [0x7f40707]) = 0
ptrace(PTRACE_PEEKUSER, 13839, gpr4, [0]) = 0
ptrace(PTRACE_PEEKTEXT, 13839, 0x707, [0x7f40707]) = -1 EIO (Input/output error)
ptrace(PTRACE_DETACH, 13839, 0x1, SIG_0) = 0
_exit(0)                                = ?

ok, here is what's happening: after getting the signal the process enters
a signal handler, thus leaves kernel space, and is stopped with SIGTRAP.
now strace comes and tries to figure out which syscall has
happened (although it should know better, because the process was
already in a syscall and should now fall out of it).

However, because the process is now in a signal handler, the get_scno code
for s390 cannot find the correct assembler instruction, so it tries to
decode garbage, generates an invalid memory access, and causes strace
to terminate (without notice...)
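
To make the failure concrete, here is a minimal sketch of the check the
trace above suggests: look at the word peeked from psw_addr - 4 and see
whether its last two bytes are an SVC instruction (opcode 0x0a followed by
the syscall number). looks_like_svc is a hypothetical helper written for
this mail, not the code in strace, and it assumes the peeked word is read
big-endian as on s390.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: `word` is the value returned by PTRACE_PEEKTEXT at
 * psw_addr - 4, so its low halfword holds the two bytes just before
 * the PSW address. SVC is a 2-byte instruction: 0x0a, then the number.
 * Hypothetical helper for illustration only.
 */
static int looks_like_svc(uint32_t word, int *scno)
{
    assert(scno != NULL);

    /* Second-lowest byte must be the SVC opcode. */
    if (((word >> 8) & 0xff) != 0x0a)
        return 0;               /* not an SVC: nothing to decode */

    /* Lowest byte is the immediate, i.e. the syscall number. */
    *scno = (int)(word & 0xff);
    return 1;
}
```

With the two words from the trace, 0xf0640ab3 ends in 0x0ab3 and decodes
as SVC 0xb3 (179 decimal), while 0x07f40707 ends in 0x0707, which is not
an SVC at all. Anything decoded from that second word is garbage.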

with the latest strace and the latest linux for s390 this behaviour is no
longer visible, because the kernel simply tells the tracing process the
syscall number. However, to support older kernels, the code causing the
bad memory access is still part of strace.

There are two ways to fix this: a) test directly before the old
s390 code whether the process is actually entering the kernel or leaving
it, or b) test this at the beginning of get_scno.

IMHO b) is far better, because determining the syscall number is bound
to fail when you leave the kernel. If you have a look at the ia32
code for example, it also just peeks a (when leaving the kernel) more or
less random address and takes whatever value it finds as the syscall number.

to summarize: get_scno should only be invoked when the kernel is
entered, because otherwise there is no way to determine the actual
syscall number (the process might be in a signal handler at that time).
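
As a rough sketch of option b), under stated assumptions: struct tcb here
is a cut-down stand-in, the value of TCB_INSYSCALL is illustrative, and
decode_scno_from_tracee is a hypothetical placeholder for the
arch-specific peeking. It does not model the global variables that
get_scno sets for syscall_fixup, so it says nothing about whether the
early return is safe in the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; strace defines its own flag bits. */
#define TCB_INSYSCALL 0x1

/* Cut-down stand-in for strace's per-tracee state. */
struct tcb {
    int flags;   /* TCB_INSYSCALL set while the tracee is in a syscall */
    long scno;   /* syscall number recorded on entry */
};

/* Counts how often the (hypothetical) arch-specific decode runs. */
static int decode_calls;

/* Hypothetical placeholder for peeking registers/text of the tracee. */
static long decode_scno_from_tracee(struct tcb *tcp)
{
    (void)tcp;
    decode_calls++;
    return 179;  /* fixed number so the sketch is deterministic */
}

static int get_scno_sketch(struct tcb *tcp)
{
    assert(tcp != NULL);

    /* Leaving the kernel: keep the number recorded on entry. */
    if (tcp->flags & TCB_INSYSCALL)
        return 1;

    /* Entering the kernel: only now is decoding meaningful. */
    tcp->scno = decode_scno_from_tracee(tcp);
    return 1;
}
```

The point of testing at the top is that no architecture's decode path
runs on syscall exit, so the s390 instruction peek never sees the signal
handler's code.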

regards
-- jochen

On Tue, Sep 23, 2003 at 03:53:17PM -0700, Roland McGrath wrote:
> If there is an actual bug caused by the current code, please post a test
> case to demonstrate it.  I am not convinced your patch is safe, because
> get_scno sets various global variables that may be used by syscall_fixup.
> (Yes, the code is horrible spaghetti and was that way before I got to it.
> I'm just trying not to make it worse.)
> 
> 
> Thanks,
> Roland
> 



