Fault Injection in strace [GSOC 2016]. A sample implementation.

haris iqbal haris.phnx at gmail.com
Tue Mar 8 17:06:51 UTC 2016


On Tue, Mar 8, 2016 at 10:34 PM, haris iqbal <haris.phnx at gmail.com> wrote:
> On Tue, Mar 8, 2016 at 10:00 PM, Gabriel Laskar <gabriel at lse.epita.fr> wrote:
>> On Tue, 8 Mar 2016 18:30:37 +0530
>> haris iqbal <haris.phnx at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have implemented a very rudimentary option in strace to inject
>>> faults into selected system calls. It reuses the code written for the
>>> -e option: the system calls to fail are passed through the newly
>>> implemented -g option. It seems to be working.
>>>
>>> Since it is not a patch, how should I share the code for everyone to
>>> see?
>>>
>>
>> Can't you give us a diff?
>
> Yes, here it is. I wrote this this evening, so the code is rough and
> provisional. I just wanted to check whether the approach I have in mind
> would work.
>
> diff --git a/strace.c b/strace.c
> index 49d6f3d..42cd9a6 100644
> --- a/strace.c
> +++ b/strace.c
> @@ -1487,6 +1487,11 @@ get_os_release(void)
>   * Don't want main() to inline us and defeat the reason
>   * we have a separate function.
>   */
> +
> +void fail_syscall(int);
> +void set_fail_flag(void);
> +void set_failing_parameters(const char *str);
> +
>  static void ATTRIBUTE_NOINLINE
>  init(int argc, char *argv[])
>  {
> @@ -1523,7 +1528,7 @@ init(int argc, char *argv[])
>          "k"
>  #endif
>          "D"
> -        "a:e:o:O:p:s:S:u:E:P:I:")) != EOF) {
> +        "a:e:g:j:o:O:p:s:S:u:E:P:I:")) != EOF) {
>          switch (c) {
>          case 'b':
>              if (strcmp(optarg, "execve") != 0)
> @@ -1600,6 +1605,13 @@ init(int argc, char *argv[])
>          case 'e':
>              qualify(optarg);
>              break;
> +        case 'g':
> +            qualify(optarg);
> +            set_fail_flag();
> +            break;
> +        case 'j':
> +            set_failing_parameters(optarg);
> +            break;
>          case 'o':
>              outfname = xstrdup(optarg);
>              break;
> @@ -2322,7 +2334,8 @@ show_stopsig:
>       * This should be syscall entry or exit.
>       * Handle it.
>       */
> -    if (trace_syscall(tcp) < 0) {
> +    int temp = trace_syscall(tcp);
> +    if (temp < 0) {
>          /*
>           * ptrace() failed in trace_syscall().
>           * Likely a result of process disappearing mid-flight.
> @@ -2337,6 +2350,11 @@ show_stopsig:
>          return true;
>      }
>
> +    if(temp == 3)
> +    {
> +        fail_syscall(pid);
> +    }
> +
>  restart_tracee_with_sig_0:
>      sig = 0;
>
> diff --git a/syscall.c b/syscall.c
> index 6efcde5..44fe5f3 100644
> --- a/syscall.c
> +++ b/syscall.c
> @@ -785,12 +785,43 @@ static void get_error(struct tcb *, const bool);
>  static int getregs_old(pid_t);
>  #endif
>
> +/* Proof-of-concept code for fault injection */
> +
> +void fail_syscall(int);
> +void set_fail_flag(void);
> +void set_failing_parameters(const char*);
> +
> +
> +
> +unsigned int fail_flag = 0;
> +int probability = 100;
> +void set_fail_flag()
> +{
> +    fail_flag = 1;
> +    return;
> +}
> +
> +void set_failing_parameters(const char* str)
> +{
> +    probability = atoi(str);
> +    return;
> +}
> +
> +/* END */
> +
> +
>  static int
>  trace_syscall_entering(struct tcb *tcp)
>  {
>      int res, scno_good;
>
>      scno_good = res = get_scno(tcp);
> +    if ((tcp->qual_flg & QUAL_TRACE) && fail_flag == 1)
> +    {
> +        // probability still needs to be applied using random number generation
> +        tprintf("%s system call failed with probability %d\n", syscall_name(tcp->scno), probability);
> +        return 3;
> +    }
>      if (res == 0)
>          return res;
>      if (res == 1)
> @@ -1241,6 +1272,16 @@ get_regset(pid_t pid)
>  }
>  #endif /* ARCH_REGS_FOR_GETREGSET */
>
> +void fail_syscall(int pid)
> +{
> +    void * fail_struct = calloc(1, sizeof(ARCH_IOVEC_FOR_GETREGSET));
> +    tprintf("done");
> +
> +    ptrace(PTRACE_SETREGS, pid, NT_PRSTATUS, fail_struct);
> +    free(fail_struct);
> +}

I used this technique (overwriting the tracee's registers from a zeroed
buffer) only to make the system call fail. I know it is not a practical
approach; I did it just to test whether the idea would work.

What would be the proper way to fail a given system call?
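
For reference in this discussion, here is a minimal sketch of one
technique commonly used with ptrace, assuming Linux on x86-64: at the
syscall-entry stop, replace the syscall number (orig_rax) with -1 so the
kernel skips the call, then at the matching syscall-exit stop write
-errno into rax. The helper name demo_inject_errno() and the choice of
getppid(2) as the victim call are hypothetical and only for illustration;
this is not strace code and not necessarily how strace should do it.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not strace code): fork a child that calls
 * getppid(2), force that call to fail with `err`, and return the errno
 * the child observed.  Returns -1 if tracing is unavailable or fails.
 */
int
demo_inject_errno(int err)
{
	assert(err > 0 && err < 255);	/* must fit in an exit status */
#if defined(__linux__) && defined(__x86_64__)
	struct user_regs_struct regs;
	int status, sig = 0, skipped = 0, done = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Tracee: request tracing, then stop until the tracer is ready. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(255);
		raise(SIGSTOP);
		errno = 0;
		long rc = syscall(SYS_getppid);	/* never fails on its own */
		_exit(rc == -1 ? errno : 0);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
		return -1;
	/* Report syscall stops as SIGTRAP|0x80 so they are unambiguous. */
	if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
		   (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
		goto fail;
	for (;;) {
		if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0 ||
		    waitpid(pid, &status, 0) < 0)
			goto fail;
		if (WIFEXITED(status))
			return WEXITSTATUS(status);
		if (!WIFSTOPPED(status))
			return -1;	/* killed by a signal, already reaped */
		sig = 0;
		if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
			sig = WSTOPSIG(status);	/* pass real signals through */
			continue;
		}
		if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) < 0)
			goto fail;
		if (!done && !skipped &&
		    regs.orig_rax == (unsigned long long)SYS_getppid) {
			/* Entry stop: an invalid number makes the kernel skip the call. */
			regs.orig_rax = -1;
			skipped = 1;
		} else if (skipped) {
			/* Exit stop: the tracee sees -1 with errno set to err. */
			regs.rax = -(long)err;
			skipped = 0;
			done = 1;
		} else {
			continue;	/* some other syscall, leave it alone */
		}
		if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) < 0)
			goto fail;
	}
fail:
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return -1;
#else
	return -1;	/* the register names used above are x86-64 specific */
#endif
}
```

Where ptrace is permitted, demo_inject_errno(ENOMEM) should return
ENOMEM even though getppid(2) never fails on its own. Unlike zeroing
every register, this changes only the syscall number and the return
value, so the rest of the tracee's state stays intact.
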

> +
> +
>  void
>  get_regs(pid_t pid)
>  {
>
>
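
The probability TODO in the trace_syscall_entering() hunk above could be
handled along these lines. This is only a sketch: parse_percent() and
should_inject() are hypothetical names, not strace functions, and rand()
is assumed to be good enough for a proof of concept (the modulo
introduces a slight bias).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical replacement for the atoi() call: parse a percentage in
 * the range 0..100.  Returns -1 on malformed or out-of-range input.
 */
int
parse_percent(const char *str)
{
	char *end;
	long val;

	assert(str != NULL);
	errno = 0;
	val = strtol(str, &end, 10);
	if (errno != 0 || end == str || *end != '\0' || val < 0 || val > 100)
		return -1;
	return (int) val;
}

/*
 * Hypothetical helper: return 1 if this call should be failed, given a
 * percentage.  rand() % 100 yields a value in 0..99.
 */
int
should_inject(int percent)
{
	if (percent <= 0)
		return 0;
	if (percent >= 100)
		return 1;
	return rand() % 100 < percent;
}
```

With helpers like these, the check in trace_syscall_entering() could
become fail_flag == 1 && should_inject(probability), with srand() called
once at startup.
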
>>
>> --
>> Gabriel Laskar



-- 

With regards,

Md Haris Iqbal,
Placement Coordinator, MTech IT
NITK Surathkal,
Contact: +91 8861996962



