wait4 testcase for y2038 systems

Sat Mar 7 02:57:26 UTC 2020

On Fri, Mar 06, 2020 at 06:26:40PM -0800, Alistair Francis wrote:
> On Fri, Mar 6, 2020 at 6:13 PM Dmitry V. Levin <ldv at altlinux.org> wrote:
> >
> > Hi,
> >
> > On Fri, Mar 06, 2020 at 05:38:33PM -0800, Alistair Francis wrote:
> > > Hey,
> > >
> > > I have a question.
> > >
> > > Since the 5.1 kernel, new 32-bit architectures (like RV32) don't have
> > > the wait4 kernel syscall, as it isn't y2038 safe. Other 32-bit
> > > architectures will eventually also stop using the wait4 kernel syscall
> > > (by 2038 at least).
> > >
> > > glibc still supports the wait4() function call; internally, glibc
> > > converts the program's wait4() call into waitid kernel calls.
> >
> > glibc will support the wait4() function call using __NR_wait4 at least
> > until the minimal Linux kernel version supported by glibc is raised to
> > 5.4, which is not going to happen any time soon.
> 
> That's not what glibc does. On older architectures it will continue to
> support the __NR_wait4 syscall. On new 32-bit architectures (like
> RV32) that require the 5.1+ kernel there is no __NR_wait4 syscall and
> glibc will use __NR_waitid.
> 
> You can see the code here:
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/wait4.c;h=3a8bed1169c799a8fb20360c06f2808cf648c874;hb=HEAD
> 
> Note that due to a bug in __NR_waitid it will actually only do it for
> the 5.4+ kernel, which is the minimum version for RV32.

I don't see any difference between your description and mine. :)

> > To be honest, I don't see any difference between __NR_wait4 and
> > __NR_waitid from y2038 perspective as both of them use the same
> > struct rusage.
> 
> The time in rusage is still 32-bit for both calls. This is because the
> time is a diff (so it won't really overflow).
> 
> As __NR_waitid is a superset of __NR_wait4 I think it was just easier
> to have only one. I also think originally there was a plan for
> __NR_waitid_time64 but that seems to have been dropped.

Yes, I remember that discussion.

> > > This means the strace wait4 testcase fails as it is run with these arguments:
> > >   strace -o log -e trace=wait4 -esignal=none ./wait4
> > >
> > > The wait4 test case prints out wait4 calls, but the actual wait4
> > > function that is called doesn't do any wait4 kernel calls; instead it
> > > does waitid kernel calls.
> > >
> > > What would be the best way to handle this test for y2038 32-bit systems?
> >
> > The main purpose of the strace test suite is to test strace.  Yes,
> > sometimes we discover bugs in other projects [1], but that's not the
> > purpose of the strace test suite.  A test of wait4 syscall decoder shall
> > invoke wait4 syscall.  If libc wait4() cannot be relied upon, the test
> > shall invoke __NR_wait4 syscall directly, like many other tests do when
> 
> That is the problem, there is no __NR_wait4 syscall for new 32-bit archs.

Why do you think this is a problem?  If there is no __NR_wait4, then the
test of wait4 decoder should be skipped.  How is it different from your
change to tests/clock_nanosleep.c?

> > they cannot use libc functions.  For example, tests/waitid.c invokes
> > __NR_waitid syscall directly because libc waitid() doesn't expose the 5th
> > argument.
> 
> In this case we are fine as __NR_waitid still exists.

OK, at least something still exists. :)

-- 
ldv