One of the uses of Tigress is as an educational tool. The --Transform=RandomFuns
option will generate a random function that can subsequently be transformed using any
combination of Tigress obfuscations, and then given to students as a cracking target.
You can use this transformation to generate a program for the students to reverse engineer.
By using different seeds you can generate unique challenges for each student.
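As a rough sketch of how per-student seeds might be derived, the following C helpers hash a student identifier into a seed and format a Tigress command line for that student. The hash (32-bit FNV-1a), the helper names `seed_for_student` and `format_challenge_command`, and the output file naming are assumptions made for illustration; only the `--Seed`, `--Transform=RandomFuns`, `--RandomFunsName`, and `--out` options are taken from this page.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: derive a reproducible seed from a student ID
   using the 32-bit FNV-1a hash. Any stable hash would do. */
uint32_t seed_for_student(const char *student_id) {
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)student_id; *p; p++) {
        h ^= *p;
        h *= 16777619u;                       /* FNV prime */
    }
    /* Assumption: keep the seed nonzero in case 0 is treated specially. */
    return h == 0 ? 1u : h;
}

/* Hypothetical helper: write the Tigress invocation for one student into buf.
   Returns the number of characters snprintf wanted to write. */
int format_challenge_command(char *buf, size_t len, const char *student_id) {
    return snprintf(buf, len,
                    "tigress --Seed=%u --Transform=RandomFuns "
                    "--RandomFunsName=SECRET --out=%s-input.c empty.c",
                    (unsigned)seed_for_student(student_id), student_id);
}
```

Because the seed depends only on the identifier, rerunning the generator reproduces the same challenge for the same student, which may be convenient when grading.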
Depending on the sophistication of your students, you can vary the length of the transformation sequence, the difficulty of the transformations, the options to the transformations, and the complexity of the generated challenge function. You can give them either source to untangle (a good way to learn about particular transformations) or stripped compiled code (for a more real-world challenge).
We typically give each student 3 programs, of different levels of difficulty, to attack. Thanks to the diversity of Tigress-generated programs, each challenge is unique, making it harder for students to cheat.
Below is part of the script we use to generate take-home exams for our students. It contains two assets, a password check and an expired time check, and it's the students' job to disable these.
Start by creating this file, empty.c:
#include "tigress.h"
#include <stdio.h>
#include <time.h>
#include <pthread.h>
Generate the cleartext challenge program. This is hidden from the students:
tigress --Verbosity=1 --Seed=$seed6 --Environment=x86_64:Darwin:Clang:5.1 \
--Transform=RandomFuns \
--RandomFunsName=SECRET \
--Verbosity=5 \
--RandomFunsFunctionCount=1 \
--RandomFunsTrace=0 \
--RandomFunsType=long \
--RandomFunsInputSize=1 \
--RandomFunsLocalStaticStateSize=1 \
--RandomFunsGlobalStaticStateSize=0 \
--RandomFunsLocalDynamicStateSize=0 \
--RandomFunsGlobalDynamicStateSize=0 \
--RandomFunsBoolSize=2 \
--RandomFunsLoopSize=1000 \
--RandomFunsCodeSize=100 \
--RandomFunsOutputSize=1 \
--RandomFunsTimeCheckCount=1 \
--RandomFunsActivationCodeCheckCount=1 \
--RandomFunsActivationCode=42 \
--RandomFunsPasswordCheckCount=1 \
--RandomFunsPassword=secret \
--RandomFunsFailureKind=segv \
--out=6-input.c empty.c
Next, generate an empty program with the same interface as the challenge program for the students to fill out:
tigress --Verbosity=1 --Seed=$seed6 \
--Transform=RandomFuns --RandomFunsName=SECRET \
--RandomFunsType=long \
--RandomFunsInputSize=1 \
--RandomFunsLocalStaticStateSize=1 \
--RandomFunsGlobalStaticStateSize=0 \
--RandomFunsLocalDynamicStateSize=0 \
--RandomFunsGlobalDynamicStateSize=0 \
--RandomFunsOutputSize=1 \
--RandomFunsCodeSize=0 \
--out=6-answer.c empty.c
Finally, obfuscate the challenge program:
tigress --Verbosity=1 --Seed=$seed6 --FilePrefix=obf \
--Transform=InitEntropy \
--Functions=main \
--Transform=InitOpaque \
--Functions=main --InitOpaqueCount=1 --InitOpaqueStructs=list,array \
--Transform=InitBranchFuns \
--InitBranchFunsCount=2 \
--Transform=EncodeLiterals \
--Functions=SECRET --EncodeLiteralsKinds=string --EncodeLiteralsEncoderName=STRINGS \
--Transform=Virtualize \
--Functions=STRINGS --VirtualizeDispatch=switch --VirtualizeOperands=stack,registers \
--VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 \
--Transform=AddOpaque \
--Functions=SECRET --AddOpaqueKinds=call,bug,true --AddOpaqueCount=4 \
--Transform=Virtualize \
--Functions=SECRET --VirtualizeDispatch=indirect --VirtualizeOperands=stack,registers \
--VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 \
--Transform=Virtualize \
--Functions=SECRET --VirtualizeDispatch=ifnest --VirtualizeOperands=stack,registers \
--VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 --VirtualizeNumberOfBogusFuns=1 \
--Transform=EncodeLiterals \
--Functions=SECRET --EncodeLiteralsKinds=integer \
--Transform=BranchFuns \
--Functions=SECRET --BranchFunsFlatten=true \
--Transform=CleanUp \
--CleanUpKinds=annotations,constants,names \
--out=6-challenge.c 6-input.c
To fulfill our needs, a random program generator must have certain characteristics:
The generated challenge programs all take the following simple form:
#include <stdio.h>
#include <stdlib.h>
void SECRET(unsigned long input[1] , unsigned long output[1] ) {
...
}
int main(int argc, char** argv) {
unsigned long input[1] ;
unsigned long output[1] ;
int i5 ;
unsigned long value6 ;
int i7 ;
/* Parse each command line argument into the input array. */
i5 = 0;
while (i5 < 1) {
value6 = strtoul(argv[i5 + 1], 0, 10);
input[i5] = value6;
i5 ++;
}
SECRET(input, output);
/* Print each element of the output array. */
i7 = 0;
while (i7 < 1) {
printf("%lu\n", output[i7]);
i7 ++;
}
}
That is, a challenge program reads one or more longs from the command line (or from standard input if --RandomFunsInputKind=stdin is set) and prints one or more longs to standard output.
Internally, generated SECRET functions share the same basic structure: an expansion phase, a mixing phase, and a compression phase.
Here is an example:
void SECRET(unsigned long input[1] , unsigned long output[1] ) {
unsigned long state[4] ;
unsigned long local1 ;
/* Expansion phase */
state[0UL] = input[0UL] + 762537946UL;
state[1UL] = input[0UL] | ((16601096UL << (state[0UL] % 16UL | 1UL)) | (16601096UL >> (64 - (state[0UL] % 16UL | 1UL))));
state[2UL] = (input[0UL] ^ 643136481UL) ^ (state[0UL] + 292656718UL);
state[3UL] = (input[0UL] << (((state[1UL] >> 4UL) & 15UL) | 1UL)) | (input[0UL] >> (64 - (((state[1UL] >> 4UL) & 15UL) | 1UL)));
/* Mixing phase */
local1 = 0UL;
while (local1 < 3UL) {
state[1UL] |= (state[2UL] & 15UL) << 3UL;
state[local1 + 1UL] = state[local1];
local1 += 2UL;
}
if ((state[0UL] | state[1UL]) > (state[2UL] | state[3UL])) {
state[3UL] |= (state[1UL] & 31UL) << 3UL;
} else {
state[2UL] = state[0UL];
state[3UL] |= (state[2UL] & 15UL) << 3UL;
}
state[0UL] = state[2UL];
/* Compression phase */
output[0UL] = (state[0UL] << (state[1UL] % 8UL | 1UL)) << ((((state[2UL] << (state[3UL] % 8UL | 1UL)) >> 1UL) & 7UL) | 1UL);
}
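The expansion phase above uses expressions of the form `(x << n) | (x >> (64 - n))`, which rotate a 64-bit value left by `n` bits; the `| 1UL` in the shift amount keeps `n` nonzero, so neither shift is by 0 or by 64 bits. A minimal sketch of that idiom as standalone helpers follows. It assumes a 64-bit `unsigned long` as in the example and uses `uint64_t` to make the width explicit; the names `rotl64` and `rotation_amount` are ours, not something Tigress generates.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by n bits.
   Precondition: 1 <= n <= 63, so neither shift count is 0 or 64
   (shifting a 64-bit value by 64 is undefined behavior in C). */
uint64_t rotl64(uint64_t x, unsigned n) {
    return (x << n) | (x >> (64u - n));
}

/* Mirror of the shift-amount computation in the example:
   (v % 16) | 1 is always odd and lies in the range 1..15. */
unsigned rotation_amount(uint64_t v) {
    return (unsigned)((v % 16u) | 1u);
}
```

With these helpers, the right-hand side of the `state[3UL]` assignment in the example corresponds to `rotl64(input[0], rotation_amount(state[1] >> 4))`.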
Option | Arguments | Description |
---|---|---|
--Transform | RandomFuns | Generate a random function useful as an attack target. |
--RandomFunsFunctionCount | INTSPEC | Number of functions to generate. Default=1. |
--RandomFunsTrace | INTSPEC | Insert tracing in the generated program. 0=no tracing. 1=trace function entry and exit. 2=as 1, but also trace control points. 3=as 2, but also trace individual statements. 4=as 3, but also trace the values of the STATE variable at each statement. Default=0. |
--RandomFunsTraceAssets | BOOLSPEC | Trace the state at each asset instance. Default=0. |
--RandomFunsInputSize | INTSPEC | Size of input. Default=1. |
--RandomFunsLocalStaticStateSize | INTSPEC | Size of function-local static state. Default=1. |
--RandomFunsLocalDynamicStateSize | INTSPEC | Size of function-local dynamic state, i.e. the number of linked structures (lists, trees, etc) that can be built. Default=0. |
--RandomFunsGlobalStaticStateSize | INTSPEC | Size of global static state. Default=0. |
--RandomFunsGlobalDynamicStateSize | INTSPEC | Size of global dynamic state, i.e. the number of linked structures (lists, trees, etc) that can be built. Default=0. |
--RandomFunsOutputSize | INTSPEC | Size of output. Default=1. |
--RandomFunsCodeSize | INTSPEC | Size of the generated code. This is the number of nodes in generated Abstract Syntax Tree. Default=10. |
--RandomFunsLoopSize | INTSPEC | Maximum count of loop iterations. Used to control the length of execution. Default=None. |
--RandomFunsBoolSize | INTSPEC | Size of boolean expressions, specifically the number of conjunctions and disjunctions (&& and ||) that are used to build up expressions. Default=1. |
--RandomFunsType | char, short, int, long, float, double | Type of input/output/state. Default=long. |
--RandomFunsName | string | The name of the generated function. Default=SECRET. |
--RandomFunsFailureKind | message, abort, segv, random, assign | The manner in which a triggered asset may fail. Comma-separated list. Default=segv. |
--RandomFunsInputKind | argv, stdin | How inputs are read by the program, through the command line or stdin. Default=argv. |
--RandomFunsInputType | int, float, string | The type of the input being read from the user. Default=int. |
--RandomFunsDummyFailure | BOOLSPEC | Generates exactly the same code whether true or false, except the failure code is rendered impotent. In other words, --RandomFunsDummyFailure=true will have the failure code inserted, but inactive. Default=false. |
--RandomFunsTimeCheckCount | int | The number of checks for expired time (gettimeofday() > someTimeInThePast) to be inserted in the program. Default=0. |
--RandomFunsActivationCodeCheckCount | int | The number of checks for correct activation code to be inserted in the program. Default=0. |
--RandomFunsActivationCode | int | The code the user has to enter (as the first command line argument) to be allowed to run the program. Default=42. |
--RandomFunsSecurityCheckCount | int | The number of security checks to be inserted in the program. Default=0. |
--RandomFunsSecurityCheckValues | S-Expression | List of ((asset# state index value) (asset# state index value) ...) where 'state' is one of [input,output,local,global]. (5 local 3 42) specifies that at the point where asset number 5 is in the code, local[3] may have the value 42. Default=0. |
--RandomFunsPasswordCheckCount | int | The number of checks for correct password to be inserted in the program. Probably only 0 and 1 make sense here, since the user will be prompted for a password once for every check. Default=0. |
--RandomFunsPassword | string | The password the user has to enter (read from standard input) to be allowed to run the program. Default="42". |
--RandomFunsControlStructures | S-Expression | If set, will define the nested control structures of the generated function. Otherwise, a random structure will be generated. The argument is an S-Expression, where each subexpression has one of the forms (bb INTSPEC) (for a basic block consisting of a certain number of assignment statements), (for S-expression) (for a for-loop with a given body), (if S-expression S-expression) (for an if-statement with a given then and else part), or (switch S-expression S-expression) (for a switch-statement with a given list of cases and a default case). For example, --RandomFunsControlStructures='(for ((bb 4)))' will generate a body consisting of a for-loop with 4 assignment statements in the body. (for ((bb 4) (if ((bb 1)) ((bb 2))))) also puts an if-statement inside the loop. (for ((bb 4) (if ((bb 1)) ((for (bb 2)))))) puts a for-loop inside the else-part of the if-statement. (switch ((bb 4) (if ((bb 1)) (bb 4))) (bb 4)) creates a switch-statement with two cases (a basic block and an if-statement), and a basic block as the default case. Default=none. |
--RandomFunsBasicBlockSize | INTSPEC | The size of basic blocks, when control structures are not explicitly specified using RandomFunsControlStructures. Default=3. |
--RandomFunsForBound | constant, input, boundedInput, boundedAny | The allowable upper bound in a for-statement. Comma-separated list. Default=constant. |
--RandomFunsOperators | PlusA, MinusA, Mult, Div, Mod, Shiftlt, Shiftrt, Lt, Gt, Le, Ge, Eq, Ne, BAnd, BXor, BOr, * | The allowable operators in expressions. Comma-separated list. Default=all. |
--RandomFunsPointTest | BOOL | Add if (output[0] == 4242424242U) printf("You win!\n"); after the call to the generated function. The idea is to replace (by hand) 4242424242U with one actual output of the function. This can be used as another reverse engineering challenge: "Find an input for which the program prints 'You win!'". Default=false. |
If you set --RandomFunsPluginADTCount=..., we will include calls to your own Abstract Data Types (ADTs).
You call Tigress like this:
tigress --Verbosity=1 --Seed=$seed6 --FilePrefix=obf \
--Transform=InitPlugins \
--InitPluginsContainerPrefix=Set \
--InitPluginsDictionaryPrefix=HashMap \
--Transform=RandomFuns \
--RandomFunsPluginADTCount=4 \
... \
--out=result.c plugins.c
where plugins.c looks like this:
#include "tigress.h"
#include <stdio.h>
#include <time.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
// Types for Sets.
typedef ... Set_TYPE;
typedef ... Set_ELEMENT;
typedef ... Set_ITER;
// Create and destroy a set
Set_TYPE Set_CREATE(){ ... }
// Insert and delete elements from the set.
void Set_INSERT(Set_TYPE a, Set_ELEMENT b) { ... }
// Query the set
int Set_MEMBER(Set_TYPE a, Set_ELEMENT b){ ... }
int Set_SIZE(Set_TYPE a){ ... }
// Iterate through the elements of the set. Here's the pattern to use:
// Set_ITER i = Set_FIRST(s);
// while not Set_DONE(s, i) {
// Set_ELEMENT e = Set_GET(s,i);
// Do something with e;
// i = Set_NEXT(s,i);
// }
Set_ITER Set_FIRST(Set_TYPE a){ ... }
Set_ITER Set_NEXT(Set_TYPE a, Set_ITER n){ ... }
int Set_DONE(Set_TYPE a, Set_ITER n){ ... }
Set_ELEMENT Set_GET(Set_TYPE a, Set_ITER n){ ... }
This will insert random calls to the Set data structure in the generated code.
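For readers who want something concrete behind the elided bodies, the following is one possible way to fill in the Set interface: a fixed-capacity, array-backed set of longs. Everything here beyond the function and type names from the skeleton is an assumption made for illustration (the `Set_STORAGE` struct, the capacity of 64, the choice of `long` elements, and the omission of `tigress.h` so that the sketch compiles on its own).

```c
#include <stdlib.h>

/* Hypothetical array-backed set of longs with a fixed capacity. */
#define SET_CAPACITY 64

typedef struct {
    long items[SET_CAPACITY];
    int count;
} Set_STORAGE;

typedef Set_STORAGE *Set_TYPE;  /* a handle, so inserts are visible to callers */
typedef long Set_ELEMENT;
typedef int Set_ITER;           /* index into items */

/* Create an empty set; returns NULL if allocation fails. */
Set_TYPE Set_CREATE(void) {
    return calloc(1, sizeof(Set_STORAGE));
}

/* Query the set by linear search. */
int Set_MEMBER(Set_TYPE a, Set_ELEMENT b) {
    for (int i = 0; i < a->count; i++)
        if (a->items[i] == b) return 1;
    return 0;
}

/* Insert an element; duplicates are ignored, and elements are
   silently dropped once the set is full. */
void Set_INSERT(Set_TYPE a, Set_ELEMENT b) {
    if (!Set_MEMBER(a, b) && a->count < SET_CAPACITY)
        a->items[a->count++] = b;
}

int Set_SIZE(Set_TYPE a) { return a->count; }

/* Iteration follows the FIRST / DONE / GET / NEXT pattern. */
Set_ITER Set_FIRST(Set_TYPE a) { (void)a; return 0; }
Set_ITER Set_NEXT(Set_TYPE a, Set_ITER n) { (void)a; return n + 1; }
int Set_DONE(Set_TYPE a, Set_ITER n) { return n >= a->count; }
Set_ELEMENT Set_GET(Set_TYPE a, Set_ITER n) { return a->items[n]; }
```

Making `Set_TYPE` a pointer matters because `Set_INSERT` receives the set by value; with a plain struct, insertions would be lost when the call returns.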
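Generated functions can run for a very long time. One mitigation is to set --RandomFunsLoopSize=1000, which bounds the number of loop iterations. If you specify loop-free control structures explicitly, for example --RandomFunsControlStructures="(if (bb 5) (bb 2))", then the problem goes away.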
When these are generated automatically, however, there's always some risk that you'll wind up with --RandomFunsControlStructures="(for (for (for (for (bb 4)))))" which is likely to run forever. I try to mitigate this by reducing the loop size (set by --RandomFunsLoopSize=1000) by a factor of 10 for every nesting level, but this doesn't help if looping is interprocedural (I don't currently check for this).
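The effect of that mitigation can be estimated with a small model. The sketch below is not Tigress's implementation; it assumes that the outermost loop keeps the full loop size, that each deeper level divides the bound by 10 using integer division, and that the bound never drops below 1. The function names are hypothetical.

```c
/* Hypothetical model: the loop bound shrinks by a factor of 10 per
   nesting level. Depth 0 is the outermost loop. Assumption: the bound
   never drops below 1. */
long loop_bound_at_depth(long loop_size, int depth) {
    long bound = loop_size;
    for (int i = 0; i < depth && bound > 1; i++)
        bound /= 10;
    return bound < 1 ? 1 : bound;
}

/* Worst-case number of executions of the innermost body for
   `levels` perfectly nested loops. */
long total_iterations(long loop_size, int levels) {
    long total = 1;
    for (int d = 0; d < levels; d++)
        total *= loop_bound_at_depth(loop_size, d);
    return total;
}
```

Under these assumptions, four nested loops with a loop size of 1000 get bounds of 1000, 100, 10, and 1, for at most 1,000,000 executions of the innermost block, compared with 10^12 if every level used the full bound.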