[GSoC][RFC]: seccomp-assisted syscall filtering

Tue Mar 27 13:29:00 UTC 2018

On 03/26 10:30, Dmitry V. Levin wrote:
> On Mon, Mar 26, 2018 at 10:05:57PM +0800, Chen Jingpiao wrote:
> > On 03/25 10:36, Dmitry V. Levin wrote:
> > > On Wed, Mar 21, 2018 at 10:17:08PM +0800, Chen Jingpiao wrote:
> > > > On 03/12 02:29, Eugene Syromiatnikov wrote:
> > > > > On Mon, Mar 12, 2018 at 10:38:37AM +0800, Chen Jingpiao wrote:
> > [...]
> > > It may be worth adding an option to explicitly enable/disable this
> > > seccomp-based filter while it's being developed.  When it's ready for
> > > non-experimental use, it will be enabled automatically depending on the
> > > kernel support and tracing options, but we might want to keep the option
> > > of disabling the feature explicitly.
> > 
> > Yes.
> > 
> > > 
> > > Please note the following important points of this project that
> > > I'd recommend to mention in the proposal:
> > > 
> > > - Runtime check for the seccomp semantics implemented by the kernel,
> > >   similar to the runtime check for PTRACE_SEIZE, with fallback to the
> > >   traditional filtering.
> > 
> > Yes, something like this [1]. I will do more research.
> > 
> > A demo:
> > 
> > bool
> > test_seccomp_filter(void)
> > {
> > #ifdef SECCOMP_MODE_FILTER
> > 	/*
> > 	 * With a NULL filter the call never succeeds: EFAULT means the
> > 	 * kernel supports SECCOMP_MODE_FILTER, EINVAL means it does not.
> > 	 */
> > 	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, NULL, 0, 0) < 0
> > 	    && errno == EFAULT)
> > 		return true;
> > 	debug_msg("SECCOMP_MODE_FILTER doesn't work: %s", strerror(errno));
> > 	return false;
> > #else
> > 	debug_msg("SECCOMP_MODE_FILTER doesn't work: SECCOMP_MODE_FILTER is not defined");
> > 	return false;
> > #endif
> > }
> 
> Also, we can check at runtime whether the kernel implements the 4.8+ or the
> older seccomp semantics; a simple check based on the kernel version is less
> reliable because features are sometimes backported to older kernels.
> 
> Another case where we will have to use traditional PTRACE_SYSCALL
> filtering is "strace -p".

Thank you. I missed this case.
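
To keep the fallback cases in one place, the decision could be sketched
roughly as below.  This is only an illustration: struct filter_ctx and
want_seccomp_filter() are hypothetical names, not existing strace code, and
the "-e trace=all" case comes from the optimization point quoted further down.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs; none of these names exist in strace. */
struct filter_ctx {
	bool kernel_has_seccomp_filter;	/* result of a runtime probe */
	bool attaching;			/* true for "strace -p PID" */
	size_t n_traced;		/* syscalls selected by -e trace= */
	size_t n_syscalls;		/* size of the syscall table */
};

/* Return true if installing a seccomp filter looks worthwhile. */
static bool
want_seccomp_filter(const struct filter_ctx *ctx)
{
	if (!ctx->kernel_has_seccomp_filter)
		return false;	/* fall back to plain PTRACE_SYSCALL */
	if (ctx->attaching)
		return false;	/* a filter has to be installed by the
				 * tracee itself, so "strace -p" cannot
				 * use one */
	if (ctx->n_traced >= ctx->n_syscalls)
		return false;	/* -e trace=all: nothing to filter out */
	return true;
}
```

An explicit enable/disable option, once it exists, would simply be one more
early return in this function.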

> 
> > > - Optimization of the BPF code, for example, in some cases it's better
> > >   to list traced syscalls, in other cases - to list those syscalls that
> > >   are not traced.  Sometimes (e.g. -e trace=all) there is no point in
> > >   enabling a seccomp-based filter even if the kernel supports it.
> > 
> > I have thought about this problem, but I have not found a satisfying way
> > to deal with it.  One option is to count the traced syscalls and compare
> > the count with nsyscalls.  The other is to disable the seccomp filter when
> > number_set.not is set (though it seems the upcoming advanced filtering
> > syntax removes trace_set).
> > 
> > The second solution is not a good idea for a command like:
> > 
> > $ strace -etrace=!%class[,/regex ...] PROG
> > 
> > $ grep -w TF linux/x86_64/syscallent.h | wc -l
> > 60
> 
> If the set is a negation, this is not an obstacle, bpf filters can check
> elements of the set almost the same way as with simple non-negated sets.
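
If I understand this correctly, a negated set only swaps the two return
actions of the filter.  A minimal sketch of what I have in mind follows;
build_filter() is a hypothetical helper, not strace code, and it assumes a
single architecture (no seccomp_data.arch check) and at most 255 listed
syscalls because BPF jump offsets are 8-bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Hypothetical helper: emit a classic BPF program that returns
 * SECCOMP_RET_TRACE for syscalls that should be traced and
 * SECCOMP_RET_ALLOW for the rest.  "listed" holds the syscall numbers
 * named in the set; "negated" tells whether the set is "!listed".
 * Returns the number of instructions, or 0 if the program does not fit.
 */
static size_t
build_filter(struct sock_filter *prog, size_t max,
	     const unsigned int *listed, size_t n, bool negated)
{
	/* The match and fall-through actions swap when the set is negated. */
	unsigned int on_match = negated ? SECCOMP_RET_ALLOW : SECCOMP_RET_TRACE;
	unsigned int on_miss = negated ? SECCOMP_RET_TRACE : SECCOMP_RET_ALLOW;
	size_t len = 0;

	if (n > 255 || n + 3 > max)
		return 0;

	/* A = seccomp_data.nr */
	prog[len++] = (struct sock_filter)
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr));

	for (size_t i = 0; i < n; i++) {
		/* if (A == listed[i]) jump forward to the "match" return */
		prog[len++] = (struct sock_filter)
			BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, listed[i],
				 (unsigned char) (n - i), 0);
	}

	prog[len++] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_K, on_miss);
	prog[len++] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_K, on_match);
	return len;
}
```

The program length is the same for "set" and "!set", so what matters for
optimization is only which of the two lists is shorter.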
> 
> > Do you have any good suggestions? Thank you.
> 
> While it's good to have a good answer to this question,
> the official gsoc coding period hasn't started yet so
> you still have plenty of time to think about it.
> 
> > > With regards to the proposed timeline, please note the following subtasks
> > > may take more time than expected:
> > > - Integrating with the upcoming advanced filtering syntax (one of last
> > >   year's GSoC projects that is not merged yet but will hopefully be
> > >   merged soon).
> > > - Reviewing and merging to master.
> > 
> > Ok.
> > 
> > Thank you, I updated the proposal [2].
> 
> I'm slowly getting used to your style of writing in English ;)
> but it would be great if you could make your proposal somewhat more readable.

Sorry.

--
Chen Jingpiao