Add Opaque

 

Break up code blocks by inserting opaque predicates. Requires that at least the InitOpaque transform has been previously issued and, preferably, that one or more UpdateOpaque transformations have been given.

A LigerLabs video discusses this transformation.

Option  Arguments  Description
--Transform AddOpaque Add opaque predicates to split up control-flow.
--AddOpaqueCount INTSPEC How many opaques to add to each function. Default=1.
--AddOpaqueKinds call, bug, true, junk, fake, question, * Comma-separated list of the types of insertions of bogus computation allowed. Default=call,bug,true,junk,question.
  • call = if (false) RandomFunction()
  • bug = if (false) BuggyStatement else RealStatement
  • true = if (true) RealStatement
  • junk = if (false) asm(".byte random bytes")
  • fake = if (false) NonExistingFunction()
  • question = if (true || false) RealStatement else CopyOfRealStatement
  • * = Turns all options on.
--AddOpaqueObfuscate BOOL Perform some light obfuscation of copied code when using the 'question' opaque predicate. Default=true.
--AddOpaqueSplitBasicBlocks BOOL Split up basic blocks (sequences of statements without control flow) so that opaque predicates can be inserted between them. Default=false.
--AddOpaqueInline BOOL Inline the split out functions when using 'question' opaque predicates. Default=false.
--AddOpaqueSplitKinds top, block, deep, recursive, level, inside Comma-separated list specifying the order in which different split methods are attempted when --AddOpaqueKinds=question is specified. Default=top,block,deep,recursive.
  • top = split the top-level list of statements into two functions funcname_split_1 and funcname_split_2.
  • block = split a basic block (list of assignment and call statements) into two functions.
  • deep = split out a nested control structure of height>2 into its own function funcname_split_1.
  • recursive = same as block, but calls to split functions are also allowed to be split out.
  • level = split out a statement at a level specified by --AddOpaqueSplitLevel.
  • inside = split out a statement at the innermost nesting level.
--AddOpaqueSplitLevel INTSPEC Levels that can be split out when --AddOpaqueSplitKinds=level is specified. Default=1.
--AddOpaqueStructs list, array, input, env, * Comma-separated list of the kinds of data structures used to generate opaque expressions. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • input = Generate opaque expressions that depend on input. Requires --Inputs to set invariants over input.
  • env = Generate opaque expressions from entropy. Requires --InitEntropy.
  • * = Same as list,array,input,env
 

Diversity

There are two sources of diversity:

  • The types of opaque predicates used, which is controlled by the InitOpaque transformation.
  • The location in the target function where the split takes place, which is randomized.
 

Usage

This is the code generated for each argument to --AddOpaqueKinds:

  • call:
    if expr=false then
       call to random existing function
    

  • fake:
    if expr=false then
       call to non-existing function
    

  • true:
    if expr=true then
       existing statement
    

  • bug:
    if expr=true then
       existing statement
    else
       buggified version of the statement
    

  • question:
    if expr=true || false then
       existing statement
    else
       obfuscated version of the statement
    

  • junk:
    if expr=false then
       asm(".byte RandomBytes")
    

 

Question-mark Predicates

Consider the following script:

tigress --Seed=0 \
   --Inputs="+1:int:42,-1:length:1?10" \
   --Transform=InitImplicitFlow \
   --Transform=InitEntropy \
   --Transform=InitOpaque \
      --Functions=main \
      --InitOpaqueCount=2 \
      --InitOpaqueStructs=list,array,input,env \
   --Transform=AddOpaque \
      --Functions=obf3 \
      --AddOpaqueKinds=question \
      --AddOpaqueSplitKinds=inside \
      --AddOpaqueCount=10 \
   arith.c --out=arith_out.c

The result will look something like this:

     {
      strcmp_result5 = (int )strlen(*(_4_main__argv + (_4_main__argc - 1)));
      }
      if (x >= strcmp_result5 + y) {
        {

        }
        if (_2entropy * 3 >= _4_main__opaque_list1_1->data) {
          {
          atoi_result10 = atoi(*(_4_main__argv + 1));
          }
          if (y >= atoi_result10 + x) {
            {

            }
            if (_4_main__opaque_array_0[(atoi_result10 & 2147483647) % 30] >= 22) {
              {

              }
              if (_4_main__opaque_list2_2->data <= (_2entropy | 6) + (_2entropy & 6)) {
                {
                strcmp_result15 = (int )strlen(*(_4_main__argv + (_4_main__argc - 1)));
                }
                if (x >= strcmp_result15 + y) {
                  {
                  atoi_result17 = atoi(*(_4_main__argv + 1));
                  }
                  if (y >= atoi_result17 + y) {
                    {
                    atoi_result19 = atoi(*(_4_main__argv + 1));
                    }
                    if (y >= atoi_result19 + y) {
                      {
                      strcmp_result21 = (int )strlen(*(_4_main__argv + (_4_main__argc - 1)));
                      }
                      if (y >= strcmp_result21 + atoi_result10) {
                        {

                        strcmp_result23 = (int )strlen(*(_4_main__argv + (_4_main__argc - 1)));
                        if (y <= strcmp_result23 + x) {
                          QUESTION_10_1(& z);
                        } else {
                          QUESTION_10_1_COPY(atoi_result24, & z);
                        }

                        }
                      } else {
                        {
                        QUESTION_9_1_COPY(& z, 6.);
                        }
                      }
                      {

                      }
                    } else {
                      {
                      QUESTION_8_1_COPY(& z, 1L);
                      }
                    }
                    {

                    }
                  } else {
                    {
                    QUESTION_7_1_COPY(& z, z);
                    }
                  }
                  {

                  }
                } else {
                  {
                  QUESTION_6_1_COPY(0, & z);
                  }
                }
                {

                }
              } else {
                {
                QUESTION_5_1_COPY(& z, 0);
                }
              }
              {

              }
            } else {
              {
              QUESTION_4_1_COPY(0, & z);
              }
            }
            {

            }
          } else {
            {
            QUESTION_3_1_COPY(& z, 0.);
            }
          }
          {

          }
        } else {
          {
          QUESTION_2_1_COPY(4L, & z);
          }
        }
        {

        }
      } else {
        {
        QUESTION_1_1_COPY(6L, & z);
        }
      }
      {

      }
    }
}

Notice that we attempt (and sometimes fail) to select opaque predicates expr1<=expr2 such that

  • all the expr_i are independent of each other (i.e. analyzing a function should require you to break all the predicates in it);
  • the expr_i depend on input (either because they are created from an entropy source or from invariants over the command line arguments); and
  • expr1 and expr2 are computed from different sources.

For the question-mark predicate we use the function splitter. Experiment with different splitting methods, by setting --AddOpaqueSplitKinds=... and --AddOpaqueSplitLevel=..., to get the effect that you want. --AddOpaqueObfuscate=true lightly obfuscates the code in one of the branches; you can, of course, obfuscate the split-out functions yourself.

 

Issues

  • --AddOpaqueKinds=fake will result in undefined symbols being generated. You need to coerce the linker to ignore such errors. With gcc you can use this option:

    -Wl,--unresolved-symbols=ignore-in-object-files 
    

    No similar option seems to exist for clang.

  • --AddOpaqueKinds=question places the code in outlined functions, for reasons I really can't remember at the moment. It should probably be fixed, but in the meantime you can just inline the calls:

          --Transform=AddOpaque \
             --Functions=... \
             --AddOpaqueKinds=question \
          --Transform=Inline \
             --Functions=/.*QUESTION.*/ \