Jit

This transformation translates a function F into a new function F' consisting of a sequence of intermediate code instructions such that, when F' is executed, F is dynamically compiled to machine code. That is, it is an example of runtime code generation, also called just-in-time compilation or dynamic unpacking.

For example, the first few lines of a jitted function may look like this:

int fac(int x ) { 
    ...
    p3 = jit_init();
    jit_enable_optimization(p3, 3L);
    label4 = jit_get_label(p3);
    jit_add_prolog(p3, & _3_obf_int_binary_I___foo, 0);
    localSize6 = jit_allocai(p3, 8L);
    jit_add_op(p3, JIT_DECL_ARG, ((0 << 4) | (2 << 2)) | 2, 0, 4, 0L, 0L, 0);
    jit_add_op(p3, JIT_DECL_ARG, ((0 << 4) | (2 << 2)) | 2, 0, 4, 0L, 0L, 0);
    jit_add_op(p3, JIT_ADD | 2, ((2 << 4) | (1 << 2)) | 3, 0 | ((0 << 1) | 
                   (1 << 4)), 0 | ((2 << 1) | (0 << 4)), (jit_value )(localSize6 + 4), 0L, 0);
    jit_add_op(p3, JIT_GETARG, ((0 << 4) | (2 << 2)) | 3, 0 | ((0 << 1) | 
                    (2 << 4)), 1L, 0L, 0L, 0);
    jit_add_op(p3, JIT_ST | 1, ((0 << 4) | (1 << 2)) | 1, 0 | ((0 << 1) | 
                    (1 << 4)), 0 | ((0 << 1) | (2 << 4)), 0L, 4, 0);
    jit_add_op(p3, JIT_ADD | 2, ((2 << 4) | (1 << 2)) | 3, 0 | ((0 << 1) | 
                    (3 << 4)), 0 | ((2 << 1) | (0 << 4)), (jit_value )(localSize6 + 0), 0L, 0);
    jit_add_op(p3, JIT_GETARG, ((0 << 4) | (2 << 2)) | 3, 0 | ((0 << 1) | 
                    (4 << 4)), 0L, 0L, 0L, 0);
    jit_add_op(p3, JIT_ST | 1, ((0 << 4) | (1 << 2)) | 1, 0 | ((0 << 1) | 
                    (3 << 4)), 0 | ((0 << 1) | (4 << 4)), 0L, 4, 0);
    ...
    jit_generate_code(p6);
    ...
    result58 = (*fac_foo)(x);
    return (result58);
}
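
The numeric arguments in the listing above are ordinary C constant expressions built from shifts and ORs, so the compiler folds each of them into a single literal. The sketch below only reproduces that arithmetic. The helper names pack3 and pack2 are hypothetical, and no meaning is assigned to the individual fields, since the listing does not document them.

```c
#include <assert.h>

/* Hypothetical helper with the shape ((a << 4) | (b << 2)) | c,
   as seen in the third argument of the jit_add_op calls above. */
static int pack3(int a, int b, int c)
{
    /* Assumption: b and c fit in two bits each, so the fields do not overlap. */
    assert(b >= 0 && b < 4 && c >= 0 && c < 4);
    return ((a << 4) | (b << 2)) | c;
}

/* Hypothetical helper with the shape 0 | ((a << 1) | (b << 4)),
   as seen in the register-like arguments above. */
static int pack2(int a, int b)
{
    assert(a >= 0 && b >= 0);
    return 0 | ((a << 1) | (b << 4));
}
```

For instance, pack3(0, 2, 2) evaluates to 10 and pack3(2, 1, 3) to 39, which is the value each of those argument expressions has once it is folded.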
 

Diversity

Intermediate code operators (JIT_ADD, JIT_ST, etc. in the example above) are randomized every time Tigress is invoked. Thus, the statement

op11 = jit_add_op(p6, JIT_JMP | 2, ((0 << 4) | (0 << 2)) | 2, 0, 0L, 0L, 0L);

may be compiled into

op11 = jit_add_op(p6, 42, 2, 0, 0L, 0L, 0L);

where "42" will be a different literal for every invocation.
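
One way to picture this kind of per-invocation diversity is a seeded permutation that assigns each symbolic operator a literal. This is only a sketch and not how Tigress is implemented: the operator names, make_opcode_table, and the small random generator are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical symbolic operators; the real set is larger. */
enum { OP_ADD, OP_ST, OP_GETARG, OP_JMP, NUM_OPS };

/* Small linear congruential generator, standing in for the
   obfuscator's actual source of randomness. */
static uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Fill table[] with a seed-dependent permutation of 0..NUM_OPS-1
   using a Fisher-Yates shuffle. table[OP_x] is the literal for OP_x. */
static void make_opcode_table(uint32_t seed, int table[NUM_OPS])
{
    uint32_t state = seed;
    for (int i = 0; i < NUM_OPS; i++)
        table[i] = i;
    for (int i = NUM_OPS - 1; i > 0; i--) {
        int j = (int)(next_random(&state) % (uint32_t)(i + 1));
        int tmp = table[i];
        table[i] = table[j];
        table[j] = tmp;
    }
}

/* Return 1 if every literal in 0..NUM_OPS-1 is used exactly once. */
static int is_permutation(const int table[NUM_OPS])
{
    int seen[NUM_OPS] = {0};
    for (int i = 0; i < NUM_OPS; i++) {
        assert(table[i] >= 0 && table[i] < NUM_OPS);
        if (seen[table[i]]++)
            return 0;
    }
    return 1;
}
```

The same seed always yields the same table, so the generator of the jitting code and the jit library built alongside it can agree on the encoding, while a different seed gives a different assignment.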

There is no diversity in the dynamically generated code at this point; i.e., every time the program is run, the same code is generated.

 

Usage Prior to 4.0.9

In earlier versions of Tigress, in order to invoke this feature you had to add one of the following directives to the top of your C file, depending on which target you were compiling for:

The files can be found in Tigress' distribution directory. Starting with 4.0.9 this is no longer necessary.

If you set the option --JitFrequency=n for n>0, the function will be jitted again every nth time it is called. This means that every nth time the function is called, its address trace (but not its instruction trace) will change. With --JitFrequency=0, the function is jitted only the first time it is called.
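
The effect of --JitFrequency can be sketched as a call counter that decides when the code is regenerated. The names fac_state, generate_fac, and fac_call are hypothetical, the generated code is replaced by an ordinary C function, and the choice of exactly which call triggers regeneration is an assumption; the Tigress runtime may count differently.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fac_fn)(int);

/* Hypothetical bookkeeping for one jitted function. */
struct fac_state {
    fac_fn        code;         /* most recently generated code, NULL at first */
    unsigned long calls;        /* number of calls seen so far */
    unsigned      frequency;    /* the n from --JitFrequency=n */
    int           generations;  /* how many times the code was (re)generated */
};

/* Stand-in for the generated machine code. */
static int fac_body(int x)
{
    int r = 1;
    for (int i = 2; i <= x; i++)
        r *= i;
    return r;
}

/* Stand-in for the jit_* call sequence that builds the function. */
static fac_fn generate_fac(struct fac_state *s)
{
    s->generations++;
    return fac_body;
}

/* Regenerate on the first call, and then on every frequency-th call
   when frequency > 0 (assumed schedule). */
static int fac_call(struct fac_state *s, int x)
{
    assert(s != NULL);
    if (s->code == NULL || (s->frequency > 0 && s->calls % s->frequency == 0))
        s->code = generate_fac(s);
    s->calls++;
    return s->code(x);
}
```

Under this schedule, with frequency set to 3, seven calls trigger three generations (on calls 1, 4, and 7); with frequency 0, only the first call generates code.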

To disrupt dynamic taint analysis, the option --JitImplicitFlow inserts implicit flow between the point where the function is generated and the point where it is invoked:

int fac(int x ) { 
  ...
  fac_foo1 = IMPLICIT_FLOW(fac_foo);
  result58 = (*fac_foo1)(x);
  return (result58);
}
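
To give a feel for what such an implicit-flow copy does, here is a minimal sketch of two of the copy kinds listed under --JitCopyKinds, counter and bitcopy_loop. The function names are hypothetical and the code works on a small 32-bit value rather than a function pointer; the code Tigress actually emits for IMPLICIT_FLOW will differ.

```c
#include <assert.h>
#include <stdint.h>

/* counter: copy a value by counting up to it. dst depends on src only
   through the loop condition, never through a direct assignment. */
static uint32_t copy_by_counter(uint32_t src)
{
    uint32_t dst = 0;
    for (uint32_t i = 0; i < src; i++)
        dst++;
    assert(dst == src);  /* sanity check for the sketch only */
    return dst;
}

/* bitcopy_loop: loop over the bits and set each one in dst
   inside an if-statement that tests the matching bit of src. */
static uint32_t copy_by_bit_loop(uint32_t src)
{
    uint32_t dst = 0;
    for (unsigned bit = 0; bit < 32; bit++) {
        if (src & (UINT32_C(1) << bit))
            dst |= UINT32_C(1) << bit;
    }
    assert(dst == src);  /* sanity check for the sketch only */
    return dst;
}
```

In both cases the copied value reaches dst through control flow only, which is the kind of dependency a taint tracker that follows data assignments alone will miss.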

As usual, this transformation needs to be combined with others to break up the predictable static structure. The Split and Virtualize transformations are particularly appropriate.

Option Arguments Description
--Transform Jit Turn a function into a sequence of instructions that dynamically builds up the function at runtime.
--JitEncoding hard, soft How the jitted instructions are encoded. Default=hard.
  • hard = The jitted instructions are encoded as code.
  • soft = The jitted instructions are encoded as data (not implemented).
--JitFrequency INTSPEC How often to jit the code at runtime. 0=only the first time; n>0=every nth time the function is called. Default=0.
--JitOptimizeBinary INTSPEC Optimize the jitted binary code. 1=omit frame pointer, 2=omit unused assignments, 4=merge ADDs and MULs. Default=1|4=5.
--JitImplicitFlow S-Expression The type of implicit flow to insert. See --AntiTaintAnalysisImplicitFlow for a description. Default=none.
--JitCopyKinds counter, counter_signal, bitcopy_unrolled, bitcopy_loop, bitcopy_signal, * Comma-separated list of the kinds of implicit flow to insert. counter_signal and bitcopy_signal require that --Transform=InitImplicitFlow --InitImplicitFlowCount=... has been called to create the signal handlers. Default=all options.
  • counter = Copy a variable by counting up to its value.
  • counter_signal = Copy a variable by counting up to its value in a signal handler.
  • bitcopy_unrolled = Copy a variable bit-by-bit, each bit tested by an if-statement.
  • bitcopy_loop = Loop over the bits in a variable and copy each bit by testing in an if-statement.
  • bitcopy_signal = Loop over the bits in a variable and copy each bit in a signal handler.
  • * = Same as all options turned on.
--JitObfuscateHandle BOOLSPEC Add an opaque predicate to the generated function handle. Default=false.
--JitObfuscateArguments BOOLSPEC Add bogus arguments and opaque predicates to the jit_add_op function calls. Default=false.
--JitDumpOpcodes INTSPEC Print the jitter's bytecode. OR the numeric arguments together, or 0 for no dumping. Default=0.
  • 0x01 = JIT_DEBUG_OPS
  • 0x02 = JIT_DEBUG_CODE
  • 0x04 = JIT_DEBUG_COMBINED
  • 0x08 = JIT_DEBUG_COMPILABLE
  • 0x100 = JIT_DEBUG_LOADS
  • 0x200 = JIT_DEBUG_ASSOC
  • 0x400 = JIT_DEBUG_LIVENESS
--JitTrace INTSPEC Insert runtime tracing of instructions. Set to 1 to turn it on. Default=0.
--JitTraceExec BOOLSPEC Annotate each instruction, showing from where it was generated, and the results of execution. Default=false.
--JitDumpTree BOOLSPEC Print the tree representation of the function, prior to generating the jitting code. Default=false.
--JitDumpCFG BOOLSPEC Print the jitter's Control Flow Graph. Default=false.
--JitAnnotateTree BOOLSPEC Annotate the generated code with the corresponding intermediate tree code instructions. Default=false.
--JitDumpIntermediate BOOLSPEC Print the generated intermediate code at translation time. Default=false.
--JitRandomizeBlocks BOOLSPEC Randomize the order of basic blocks. Default=true.
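
Several of the options above take values that are OR-ed together. The sketch below spells out that arithmetic. The JIT_DEBUG_* names and values are the ones listed for --JitDumpOpcodes; the OPT_* names and both helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Values as listed for --JitDumpOpcodes. */
enum {
    JIT_DEBUG_OPS        = 0x01,
    JIT_DEBUG_CODE       = 0x02,
    JIT_DEBUG_COMBINED   = 0x04,
    JIT_DEBUG_COMPILABLE = 0x08,
    JIT_DEBUG_LOADS      = 0x100,
    JIT_DEBUG_ASSOC      = 0x200,
    JIT_DEBUG_LIVENESS   = 0x400
};

/* Hypothetical names for the --JitOptimizeBinary bits. */
enum {
    OPT_OMIT_FRAME_POINTER      = 1,
    OPT_OMIT_UNUSED_ASSIGNMENTS = 2,
    OPT_MERGE_ADDS_AND_MULS     = 4
};

/* The documented default for --JitOptimizeBinary, 1|4. */
static int default_optimize_flags(void)
{
    return OPT_OMIT_FRAME_POINTER | OPT_MERGE_ADDS_AND_MULS;
}

/* Return 1 if every bit of flag is set in value. */
static int has_flag(int value, int flag)
{
    assert(flag != 0);
    return (value & flag) == flag;
}
```

So the default optimization value is 5, and dumping both the operations and the liveness information would correspond to 0x01 | 0x400 = 0x401.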
 

Issues

  • Soft encoding is not yet fully implemented.
  • The jit library contains much extra code. All jitting functions have a jit_ prefix. These functions also need to be obfuscated in order to protect the jitted function itself.
  • The jitting library contains debug routines that can give away much information. Future versions of Tigress will remove these routines from production code.
  • While opcodes are randomized, the frequency of instructions can give away much information. For example, if opcode 0x42 is very common, it's more likely to be an ADD than, say, a DIV instruction.
  • Note that once you've included a jitting transformation your code is no longer portable: your program will generate runtime code only for one particular target.
  • There seems to be some loss in floating point accuracy when using float types. I think this might be because there are extra conversion instructions generated by MyJit. The fix is to use double instead.
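
The frequency issue in the list above can be illustrated with a simple histogram. This is a sketch under simplifying assumptions: opcodes are taken to be single bytes in a flat array, which is not how the jit_add_op calls are laid out, and most_frequent_opcode is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_OPCODE = 256 };

/* Return the most frequent opcode literal in code[0..n-1].
   Ties are resolved in favor of the smaller literal. */
static int most_frequent_opcode(const unsigned char *code, size_t n)
{
    size_t count[MAX_OPCODE] = {0};
    int best = 0;

    assert(code != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        count[code[i]]++;
    for (int op = 1; op < MAX_OPCODE; op++) {
        if (count[op] > count[best])
            best = op;
    }
    return best;
}
```

Renaming every opcode consistently changes which literal is reported but not how often it occurs, so an attacker can still guess that the most common literal stands for a common operation such as ADD.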
 

Permanent Issues

These are issues with jitted code that will probably never be resolved:

  • Functions that call __builtin_* functions cannot be jitted since these have to be called directly. This includes functions that call, for example, alloca implicitly (this is taken from gcc's torture test 920721-2.c):
    f(){}
    main(){int n=2;double x[n];f();exit(0);}
    
  • Functions that pass structures by value as arguments to non-jitted functions cannot themselves be jitted. The reason is that the MyJit library does not support this functionality. Smaller structs (those no larger than 2 longs) may work. Sometimes. Maybe.
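
As an illustration of the struct restriction, the sketch below shows a caller of the affected shape, next to a variant that passes a pointer instead. All names are hypothetical. Whether the pointer variant can actually be jitted is an assumption that has not been checked against Tigress or MyJit.

```c
#include <assert.h>
#include <stddef.h>

/* Larger than 2 longs, so it falls outside the "may work" case. */
struct triple { long a, b, c; };

/* Stands for a non-jitted function that takes a struct by value. */
static long sum_by_value(struct triple t)
{
    return t.a + t.b + t.c;
}

/* A caller of this shape is covered by the restriction above. */
static long caller_by_value(long x)
{
    struct triple t = { x, x + 1, x + 2 };
    return sum_by_value(t);
}

/* Same computation with the struct passed by pointer. */
static long sum_by_pointer(const struct triple *t)
{
    assert(t != NULL);
    return t->a + t->b + t->c;
}

/* Assumption: a caller that only passes a pointer avoids the
   by-value argument; this is unverified. */
static long caller_by_pointer(long x)
{
    struct triple t = { x, x + 1, x + 2 };
    return sum_by_pointer(&t);
}
```

Both callers compute the same result; only the way the structure crosses the call boundary differs.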
 

Acknowledgments

This feature of Tigress is based on the MyJit library by Petr Krajča. We are indebted to Petr for his hard work modifying MyJit to fit our needs.