Flatten

 
free web templates

This is a classic control-flow transformation that removes structured flow.

free web templates

Flattening is useful as a precursor to other transformations. For example, our AntiBranchAnalysis transformation flattens a function prior to replacing direct branches by calls to a branch function since this creates more opportunities for encoding. Also, the Merge transformation can optionally flatten functions before merging them, thus more thoroughly mixing their code.

OptionArgumentsDescription
--Transform Flatten Flatten a function using Chenxi Wang's algorithm
--FlattenDumpBlocks BOOLSPEC Print the linearized blocks. Default=false.
--FlattenSplitBasicBlocks BOOLSPEC If true, then basic blocks (sequences of assignment and call statements without intervening branches) will be split up into indiviual blocks. If false, they will be kept intact. Default=false.
--FlattenRandomizeBlocks BOOLSPEC If true, then basic block sequences will be randomized. Default=false.
--FlattenTrace bool Print a message before each block gets executed. Useful for debugging. Default=false.

 

Diversity

free web templates

There are several sources of diversity:

  • We support four kinds of "dispatch," i.e. how the next block is selected.
  • Basic blocks can be kept intact, or can be split up into individual statements.
  • The order of blocks can be randomized.
  • The next variable can be obfuscated using different kinds of opaque predicates.
  • Conditional branches can be encoded in several different ways.
 

Dispatch

free web templates

These are the four kinds of block dispatch we currently support, and what a reursive factorial function looks like after transformation:

  • In a switch dispatch, each block becomes a case in a switch statement, and the switch is wrapped inside an infinite loop. This is the traditional implementation.
    int fac(int x) { 
      int tmp ;
      unsigned long next ;
      next = 4;
      while (1) {
        switch (next) {
        case 4: 
        if (x == 1) {
          next = 3;
        } else {
          next = 2;
        }
        break;
        case 3: return (1); break;
        case 2: tmp = fac(x - 1); next = 1; break;
        case 1: return (x * tmp); break;
        }
      }
    }
    
  • In a goto dispatch, direct gotos are used to jump between blocks.
    int fac(int x) { 
      int tmp ;
      goto lab4;
      lab4: 
      if (! (x == 1)) {
        goto lab2;
      }
      lab3: 
      return (1);
      lab2: 
      tmp = fac(x - 1);
      goto lab1;
      lab1: 
      return (x * tmp);
    }
    
  • In an indirect dispatch, indirect gotos through a jump table are used to switch between blocks.
    int fac(int x) { 
      int tmp ;
      unsigned long next ;
      static void *jumpTab[4]  = {&& lab1, && lab2, && lab3, && lab4};
      next = 4;
      goto *(jumpTab[next - 1]);
      lab4:
      if (x == 1) {
        next = 3;
      } else {
        next = 2;
      }
      goto *(jumpTab[next - 1]);
      lab3: 
      return (1);
      goto *(jumpTab[next - 1]);
      lab2: 
      tmp = fac(x - 1);
      next = 1;
      goto *(jumpTab[next - 1]);
      lab1:
      return (x * tmp);
      goto *(jumpTab[next - 1]);
    }
    
  • In a call dispatch, each block becomes its own function, and indirect calls through a table of function pointers are used to switch between blocks.
    struct argStruct {
       int tmp ;
       int *x ;
       int returnv ;
       unsigned long next ;
    };
    
    void block_1(struct argStruct *arg ) { 
      arg->returnv = *(arg->x) * arg->tmp;
      arg->next = 5;
      return;
    }
    
    void block_2(struct argStruct *arg ) { 
      arg->tmp = fac(*(arg->x) - 1);
      arg->next = 1;
      return;
    }
    
    void block_4(struct argStruct *arg ) { 
      if (*(arg->x) == 1) {
        arg->next = 3;
      } else {
        arg->next = 2;
      }
      return;
    }
    
    void block_3(struct argStruct *arg ) { 
      arg->returnv = 1;
      arg->next = 5;
      return;
    }
    
    int fac(int x ){ 
      struct argStruct arg ;
      static void (*jumpTab[4])(struct argStruct *arg )  = {& block_1, & block_2,& block_3, & block_4};
      arg.next = 4;
      arg.x = & x;
      while (1) {
        if (arg.next > 4) {
          return (arg.returnv);
        } else {
          (*(jumpTab[arg.next - 1]))(& arg);
        }
      }
    }
    
OptionArgumentsDescription
--FlattenDispatch switch, goto, indirect, call, * Dispatch method. Comma-separated list. Default=switch.
  • switch = dispatch by while(1) {switch (next) {blocks}}
  • goto = dispatch by {labl1: block1; goto block2;}
  • indirect = dispatch by goto* (jtab[next])
  • call = each block is outlined into its own function
  • * = select a dispatch method at random.

 

Conditional Branches

free web templates

If the --FlattenConditionalKinds=flag is set, there will be no explicit conditional tests in the code. Rather, they will be transformed into an assembly code snippet (x86 only, at this point) that does the comparison, extracts the flag register, and then the address to jump to will be computed based on the individual flags set. Similar encodings are available for Merge and Virtualize transformations.

OptionArgumentsDescription
--FlattenConditionalKinds branch, compute, flag Ways to transform conditional branches. Default=branch.
  • branch = Use normal branches, such as if (a>b) goto L1 else goto L2
  • compute = Compute the branch, such as x=(a>b); goto *(expression over x)
  • flag = Compute the branch from the values of the flag register, such as asm("cmp a b;pushf;pop"); goto *(expression over flag register)

 

Obfuscating the Next Variable

free web templates

You can use opaque predicates and anti-implicit flow transformations to further obfuscate the next variable.

OptionArgumentsDescription
--FlattenObfuscateNext BOOLSPEC Whether the dispatch variable should be obfuscated with opaque expressions or not. Default=false.
--FlattenOpaqueStructs list, array, * Type of opaque predicate to use. Traditionally, for this transformation, array is used. Default=array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--FlattenImplicitFlowNext bool Insert implicit flow between the computation of a an address and the subsequent indirect jump or call. This applies to call and indirect transformations only. Default=false.
--FlattenImplicitFlow S-Expression The type of implicit flow to insert. See --AntiTaintAnalysisImplicitFlow for a description. Default=none.

 

Issues

  • Consider this example taken from gcc's comp-goto-1.c torture test:
    goto *(base_addr + insn.f1.offset);
    
    This kind of arithmetic on the program counter is going to fail for transformations that completely restructure the code, such as flattening.
  • The --FlattenConditionalKinds=flag transformation doesn't work for floats, for reasons that escape me. I think maybe floats are left on the stack and then there's a SEGV when the function returns, but who knows? :(
  • On MacOS/llvm the --FlattenDispatch=call --FlattenConditionalKinds=flag transformation generates a segmentation fault, on Linux/gcc it does not. There are other similar issues that pop up occasionally on this platform. Presumably this is due to some compiler problem related to inline assembly. Update: I think this has been fixed now. Inline assembly with instructions that manipulate the stack pointer can apparently be problematic.
 

Acknowledgments

free web templates

Saumya Debray and Babak Yadegari suggested the --FlattenConditionalKinds=flag transformation.

 

References