[PATCH v5 1/3] Introduce seccomp-assisted syscall filtering

Paul Chaignon paul.chaignon at gmail.com
Sat Sep 21 20:52:56 UTC 2019


On Sat, Sep 21, 2019 at 07:02:24PM +0300, Dmitry V. Levin wrote:
> From: Chen Jingpiao <chenjingpiao at gmail.com>
> 
> With this patch, strace can rely on seccomp to only be stopped at syscalls
> of interest, instead of stopping at all syscalls.  The seccomp filtering
> of syscalls is opt-in only; it must be enabled with the -n option.  Kernel
> support is first checked with check_seccomp_filter(), which also ensures
> the BPF program derived from the syscalls to filter is not larger than the
> kernel's limit.
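
For readers following along, the size check mentioned above could look roughly like the sketch below. It assumes the limit in question is BPF_MAXINSNS from <linux/filter.h>; the helper name filter_fits_kernel_limit() is made up for illustration and is not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <linux/filter.h>	/* BPF_MAXINSNS */

/*
 * Hypothetical helper, not from the patch: report whether a classic BPF
 * program of n_insns instructions is within the kernel's per-program
 * instruction limit.  An empty program is rejected as well.
 */
static bool
filter_fits_kernel_limit(size_t n_insns)
{
	return n_insns > 0 && n_insns <= BPF_MAXINSNS;
}
```

A caller would presumably fall back to plain PTRACE_SYSCALL tracing when this returns false.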

[...]

> +#else /* !HAVE_LINUX_SECCOMP_H */
> +
> +# warning <linux/seccomp.h> is not available, seccomp filtering is not supported
> +
> +static void
> +check_seccomp_filter_properties(void)
> +{
> +	seccomp_filtering = false;
> +}
> +
> +void
> +init_seccomp_filter(void)
> +{
> +}
> +
> +int
> +seccomp_filter_restart_operator(const struct tcb *tcp)
> +{
> +	return PTRACE_SYSCALL;
> +}

Should these stubs be made "static inline"?  They're only called at
startup, so leaving them as they are is probably fine.
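
To make the question concrete, here is a rough sketch of what the "static inline" variant might look like.  It assumes the stubs would move into the header under the !HAVE_LINUX_SECCOMP_H branch, replacing the extern declarations there; that placement is my assumption, not something the patch does.

```c
#include <sys/ptrace.h>

struct tcb;	/* opaque here; strace defines the real type elsewhere */

/* No-op stub: nothing to set up when seccomp is unavailable. */
static inline void
init_seccomp_filter(void)
{
}

/* Without seccomp, every tracee is restarted with PTRACE_SYSCALL. */
static inline int
seccomp_filter_restart_operator(const struct tcb *tcp)
{
	(void) tcp;	/* unused in the stub */
	return PTRACE_SYSCALL;
}
```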

> +
> +#endif
> +
> +void
> +check_seccomp_filter(void)
> +{
> +	check_seccomp_filter_properties();
> +
> +	if (!seccomp_filtering)
> +		error_msg("seccomp filter is requested but unavailable");
> +}
> +extern void init_seccomp_filter(void);
> +extern int seccomp_filter_restart_operator(const struct tcb *);
> +
> +#endif /* !STRACE_SECCOMP_FILTER_H */

[...]

> @@ -2650,6 +2679,13 @@ dispatch_event(const struct tcb_wait_data *wd)
>  	case TE_RESTART:
>  		break;
>  
> +	case TE_SECCOMP:
> +		if (seccomp_before_sysentry) {
> +			restart_op = PTRACE_SYSCALL;
> +			break;
> +		}
> +		ATTRIBUTE_FALLTHROUGH;
> +
>  	case TE_SYSCALL_STOP:
>  		if (trace_syscall(current_tcp, &restart_sig) < 0) {
>  			/*
> @@ -2665,6 +2701,42 @@ dispatch_event(const struct tcb_wait_data *wd)
>  			 */
>  			return true;
>  		}
> +		if (seccomp_filtering) {
> +			/*
> +			 * Syscall and seccomp stops can happen in different
> +			 * orders depending on kernel.  strace tests this in
> +			 * check_seccomp_order_tracer().
> +			 *
> +			 * Linux 3.5--4.7:
> +			 * (seccomp-stop before syscall-entry-stop)
> +			 *         +--> seccomp-stop ->-PTRACE_SYSCALL->-+
> +			 *         |                                     |
> +			 *     PTRACE_CONT                   syscall-entry-stop
> +			 *         |                                     |
> +			 * syscall-exit-stop <---PTRACE_SYSCALL-----<----+
> +			 *
> +			 * Linux 4.8+:
> +			 * (seccomp-stop after syscall-entry-stop)
> +			 *                 syscall-entry-stop
> +			 *
> +			 *         +---->-----PTRACE_CONT---->----+
> +			 *         |                              |
> +			 *  syscall-exit-stop               seccomp-stop
> +			 *         |                              |
> +			 *         +----<----PTRACE_SYSCALL---<---+
> +			 *
> +			 * Note in Linux 4.8+, we restart in PTRACE_CONT after
> +			 * syscall-exit to skip the syscall-entry-stop.  The
> +			 * next seccomp-stop will be treated as a syscall
> +			 * entry.
> +			 *
> +			 * The below line implements this behavior. Note

"below line" -> "line below"

There should also be two spaces before "Note".

> +			 * exiting(current_tcp) actually marks a
> +			 * syscall-entry-stop because the flag was inverted in
> +			 * the above call to trace_syscall.
> +			 */
> +			restart_op = exiting(current_tcp) ? PTRACE_SYSCALL : PTRACE_CONT;
> +		}
>  		break;
>  
>  	case TE_SIGNAL_DELIVERY_STOP:
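
As a sanity check of my reading of the comment above, the restart logic can be modeled as a small pure function.  This is only a sketch: enum stop_kind, struct stop_action and pick_action() are hypothetical names that do not exist in strace, and the model ignores signals and errors from trace_syscall().

```c
#include <stdbool.h>
#include <sys/ptrace.h>

/* Hypothetical model types, not part of strace. */
enum stop_kind { STOP_SECCOMP, STOP_SYSCALL_ENTRY, STOP_SYSCALL_EXIT };

struct stop_action {
	bool decode;	/* pass this stop to the syscall decoder? */
	int restart_op;	/* ptrace request used to resume the tracee */
};

static struct stop_action
pick_action(enum stop_kind stop, bool seccomp_before_sysentry)
{
	struct stop_action a = { true, PTRACE_SYSCALL };

	switch (stop) {
	case STOP_SECCOMP:
		/*
		 * Linux 3.5--4.7: the syscall-entry-stop is still to come,
		 * so skip decoding and wait for it.  Linux 4.8+: treat the
		 * seccomp-stop itself as the syscall entry.
		 */
		a.decode = !seccomp_before_sysentry;
		break;
	case STOP_SYSCALL_ENTRY:
		/* Entry decoded; PTRACE_SYSCALL gets us the exit stop. */
		break;
	case STOP_SYSCALL_EXIT:
		/* Exit decoded; run freely until the next seccomp-stop. */
		a.restart_op = PTRACE_CONT;
		break;
	}
	return a;
}
```

In this model seccomp_before_sysentry only changes whether the seccomp-stop is decoded; the restart request depends solely on whether an entry or an exit was just handled, which matches the exiting(current_tcp) test in the hunk.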

[...]
