Wrong traces for connected sockets of socketpair with -ff option

Masatake YAMATO yamato at redhat.com
Wed Sep 19 12:53:25 UTC 2018


On Wed, 19 Sep 2018 02:40:44 +0900 (JST), Masatake YAMATO <yamato at redhat.com> wrote:
> Reproduced event with -f option.
> 
> It seems that strace failed to get the information for the sockets.
> 
> ========================================================================
> [jet at localhost]~/var/strace% ./strace -o /tmp/LOGX -f -yy -e socketpair /usr/bin/google-chrome
> ./strace -o /tmp/LOGX -f -yy -e socketpair /usr/bin/google-chrome
> ./strace: ERROR is received while getting socket info for 4718911
> ./strace: ERROR is received while getting socket info for 4718912
> ./strace: ERROR is received while getting socket info for 4718955
> 
> diff --git a/socketutils.c b/socketutils.c
> index c3f28a1b..60bef756 100644
> --- a/socketutils.c
> +++ b/socketutils.c
> @@ -240,6 +240,9 @@ receive_responses(struct tcb *tcp, const int fd, const unsigned long inode,
>  		if (!is_nlmsg_ok(h, ret))
>  			return false;
>  		for (; is_nlmsg_ok(h, ret); h = NLMSG_NEXT(h, ret)) {
> +			if (h->nlmsg_type == NLMSG_ERROR)
> +				error_msg("ERROR is received while getting socket info for %lu",
> +					  inode);
>  			if (h->nlmsg_type != expected_msg_type)
>  				return false;
>  			const int rc = parser(NLMSG_DATA(h),
> ========================================================================
> 
> The kernel may send more infomation about the error, printing it may be the
> first step for fixing this.

By attach strace to strace, I got:

    socket(AF_NETLINK, SOCK_RAW, NETLINK_SOCK_DIAG) = 4
    sendmsg(4, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, \
	    msg_namelen=12, msg_iov=[{iov_base={{len=40, type=0x14 /* NLMSG_??? */, \
	    flags=NLM_F_REQUEST, seq=0, pid=0}, \
	    "\x01\x00\x00\x00\xff\xff\xff\xff\x83\x9f\x6b\x00\x05\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff"}, \
	    iov_len=40}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) \
	    = 40
    recvmsg(4, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, \
	    msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NLMSG_ERROR, \
	    flags=0, seq=0, pid=14881}, {error=-ENOENT, msg={{len=40, \
-----------------------------------------^^^^^^^^^^^^^	    
	    type=SOCK_DIAG_BY_FAMILY, flags=NLM_F_REQUEST, seq=0, pid=0}, \
	    {sdiag_family=AF_UNIX, sdiag_protocol=0, \
	    udiag_states=1<<TCP_ESTABLISHED|1<<TCP_SYN_SENT|1<<TCP_SYN_RECV|1<<TCP_FIN_WAIT1|1<<TCP_FIN_WAIT2|1<<TCP_TIME_WAIT|1<<TCP_CLOSE|1<<TCP_CLOSE_WAIT|1<<TCP_LAST_ACK|1<<TCP_LISTEN|1<<TCP_CLOSING|1<<TCP_NEW_SYN_RECV|0xffffe001, \
	    udiag_ino=7053187, udiag_show=UDIAG_SHOW_NAME|UDIAG_SHOW_PEER, \
	    udiag_cookie=[4294967295, 4294967295]}}}}, iov_len=8192}], \
	    msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60
    write(3, "UNIX:[7053186]>, 30<", 20) = 20

kernel returns =-ENOENT for the inodes of socketpair.
Why?

I looked into the kernel side:

unix_diag_dump()@linux/net/unix/diag.c:

    static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
    {
	    struct unix_diag_req *req;
	    int num, s_num, slot, s_slot;
	    struct net *net = sock_net(skb->sk);

	    req = nlmsg_data(cb->nlh);

	    s_slot = cb->args[0];
	    num = s_num = cb->args[1];

	    spin_lock(&unix_table_lock);
	    for (slot = s_slot;
		 slot < ARRAY_SIZE(unix_socket_table);
		 s_num = 0, slot++) {
		    struct sock *sk;

		    num = 0;
		    sk_for_each(sk, &unix_socket_table[slot]) {
			    if (!net_eq(sock_net(sk), net))
				    continue;
		    ...

The last condition looks interesting.
The condition tells that the network namespace where the strace process is and
the network namespace where the target process must be the same.

To verify the impact of the condition, I run modified version of strace:

diff --git a/socketutils.c b/socketutils.c
index c3f28a1b..0a12eb08 100644
--- a/socketutils.c
+++ b/socketutils.c
@@ -240,6 +240,11 @@ receive_responses(struct tcb *tcp, const int fd, const unsigned long inode,
 		if (!is_nlmsg_ok(h, ret))
 			return false;
 		for (; is_nlmsg_ok(h, ret); h = NLMSG_NEXT(h, ret)) {
+			if (h->nlmsg_type == NLMSG_ERROR) {
+				char buf[256];
+				sprintf(buf, "ls -l /proc/self/ns/net /proc/%u/ns/net >> /tmp/NET", tcp->pid);
+				system (buf);
+			}
 			if (h->nlmsg_type != expected_msg_type)
 				return false;
 			const int rc = parser(NLMSG_DATA(h),

When the kernel returns =-ENOENT, the strace process prints both network namespaces.

I got following strace output by running chrome

    27008 socketpair(AF_UNIX, SOCK_STREAM, 0, [14<UNIX:[1566896]>, 15<UNIX:[1566897]>]) = 0
    27015 socketpair(AF_UNIX, SOCK_STREAM, 0, [24<UNIX:[1562213]>, 25<UNIX:[1562214]>]) = 0
    26534 socketpair(AF_UNIX, SOCK_STREAM, 0, [281<UNIX:[1561522->1561523]>, 282<UNIX:[1561523->1561522]>]) = 0

In the /tmp/NET:

For pid 27008:

lrwxrwxrwx. 1 yamato yamato 0 Sep 19 21:34 /proc/27008/ns/net -> net:[4026532596]
lrwxrwxrwx. 1 yamato yamato 0 Sep 19 21:34 /proc/self/ns/net -> net:[4026532001]

The net namespace for 27008 and that for the strace process are different.

For pid 27015:

lrwxrwxrwx. 1 yamato yamato 0 Sep 19 21:34 /proc/27015/ns/net -> net:[4026532596]
lrwxrwxrwx. 1 yamato yamato 0 Sep 19 21:34 /proc/self/ns/net -> net:[4026532001]

The net namespace for 27015 and that for the strace process are different.

For 26534, nothing is recorded in /tmp/NET because kernel doesn't return -ENOENT
for the socket (inode 1561522 and 15615230).

So I would like to say, this is not a bug.
This is the limitation of strace tracing a process running in a different namespace
from where strace runs.

On Wed, 19 Sep 2018 02:40:44 +0900 (JST), Masatake YAMATO <yamato at redhat.com> wrote:
>> Hi everyone,
>> 
>> I might come cross a bug for syscall 'socketpair' with option -ff.
>> 
>> When using -ff, strace is supposed to display the ip:port associated with
>> the sockfd.
>> 
>> According to my understanding, connected sockets should look like:
>> socketpair(AF_UNIX, SOCK_STREAM, 0, [27<UNIX:[7162769->7162770]>,
>> 28<UNIX:[7162770->7162769]>]) = 0
>> 
>> The syscall 'socketpair' always generates connected socket, but somehow,
>> strace also generates the following trace:
>> socketpair(AF_UNIX, SOCK_STREAM, 0, [223<UNIX:[7162686]>,
>> 224<UNIX:[7162687]>]) = 0
>> which looks like connectless sockets.
>> 
>> This happens everytime.
>> To triger this, by using $strace -f -yy -o chromium chromium-browser
>> 
>> Best,
>> Zhouyang
> -- 
> Strace-devel mailing list
> Strace-devel at lists.strace.io
> https://lists.strace.io/mailman/listinfo/strace-devel

Masatake YAMATO


More information about the Strace-devel mailing list