The goal of these transformations is to make it harder for automatic analysis tools (such as disassemblers) to determine the target of branches.
This transformation implements a simplistic version of Linn and Debray's branch functions. We don't use perfect hash tables, as suggested in Linn and Debray's paper, since this is hard to do as a source-to-source transformation. Rather, we simply pass the offset to jump to as an argument to the branch function.
The generated code looks like this, where the call to the branch function bf actually results in a direct jump to lab2:
void bf(unsigned long offset) {
__asm__ volatile ("addq %0, 8(%%rbp)": : "r" (offset));
}
int main() {
bf((unsigned long)(&& lab2) - (unsigned long)(&& lab3));
lab3:
__asm__ volatile (".byte 0x76,0x9b,0x8e,0x1b,0x4d":);
...
lab2: ...;
}
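The offset arithmetic can be illustrated without inline assembly. The following is a minimal, portable sketch, not code that Tigress generates: it models code addresses as plain indices, and the hypothetical `bf_model` function adds the offset to a modeled return-address slot, the way `addq %0, 8(%%rbp)` patches the saved return address. The names `frame_model`, `bf_model`, `call_site_model`, `LAB2`, and `LAB3` are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: code addresses are plain indices.
   LAB3 is the address right after the call; LAB2 is the real target. */
enum { LAB3 = 4, LAB2 = 9 };

/* Models the stack slot that holds the return address pushed by a call. */
typedef struct {
    size_t return_address;
} frame_model;

/* Models bf: add the offset to the saved return address, then "return". */
size_t bf_model(frame_model *frame, size_t offset) {
    frame->return_address += offset; /* corresponds to addq %0, 8(%%rbp) */
    return frame->return_address;    /* where execution resumes */
}

/* Models the call site bf(&&lab2 - &&lab3), with lab3 as the return address. */
size_t call_site_model(void) {
    frame_model frame = { LAB3 };
    return bf_model(&frame, (size_t)LAB2 - (size_t)LAB3);
}
```

In this model, `call_site_model()` resumes at `LAB2` rather than at `LAB3`. Because the arithmetic is unsigned, a backward jump works as well: the subtraction wraps around and the addition wraps back.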
By default, a function is flattened prior to direct jumps being replaced by calls to a branch function. This creates more direct jumps and hence more opportunities to apply the branch function transformation. Turn this off with --BranchFunsFlatten=false.
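To see why flattening yields more direct jumps, consider the following hand-written sketch. It is an illustration of control-flow flattening in general, not Tigress output; the functions `sum_to` and `sum_to_flat` are hypothetical. Each basic block of the flattened version ends in an unconditional jump back to a dispatcher, and each of those jumps is a candidate for replacement by a branch function call.

```c
#include <assert.h>

/* Original: structured control flow, no explicit direct jumps in the source. */
int sum_to(int n) {
    int s = 0;
    for (int i = 1; i <= n; i++) {
        s += i;
    }
    return s;
}

/* Hand-flattened: every block ends in a direct jump back to the dispatcher. */
int sum_to_flat(int n) {
    int s = 0, i = 1, next = 0;
dispatch:
    switch (next) {
    case 0: /* loop test */
        next = (i <= n) ? 1 : 2;
        goto dispatch;
    case 1: /* loop body */
        s += i;
        i++;
        next = 0;
        goto dispatch;
    default: /* exit block */
        return s;
    }
}
```

Both functions compute the same result; the flattened one simply exposes two `goto dispatch` statements where the original had none.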
Before branches can be replaced by calls to a branch function, at least one such function needs to be constructed, using the --Transform=InitBranchFuns transformation.
The branch function is not obfuscated and hence trivial to find. It's therefore a good idea to merge it with other functions in the program.
Option | Arguments | Description |
---|---|---|
--Transform | InitBranchFunctions | Initialize so that branch functions can be inserted at a later time. |
--InitBranchFunsOpaqueStructs | list, array, input, env, * | Comma-separated list of the kinds of opaque constructs to use when obfuscating the branch function. Default=list,array. |
--InitBranchFunsCount | INTSPEC | How many branch functions to add. |
--InitBranchFunsObfuscate | BOOLSPEC | Whether to obfuscate the branch function. Default=false. |
We implement two standard branch obfuscations used by many packers:
push target
call lab
ret
lab:
ret
and
push target
ret
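The effect of both sequences can be traced on a toy stack machine. This is a minimal sketch under stated assumptions: the opcodes, the `insn` type, and the functions `run_model`, `push_call_ret_model`, and `push_ret_model` are hypothetical and model only the push/call/ret stack discipline, not real x86 encoding. A `MARK` instruction reports which location was reached.

```c
#include <assert.h>
#include <stddef.h>

/* Opcodes of a hypothetical toy machine (not x86). */
typedef enum { OP_PUSH, OP_CALL, OP_RET, OP_MARK } opcode;

typedef struct {
    opcode op;
    int arg; /* PUSH: value to push; CALL: index to jump to; MARK: id to report */
} insn;

#define STACK_MAX 16

/* Execute prog from index 0 and return the id of the first MARK reached,
   or -1 on stack misuse, running off the end, or exceeding the step limit. */
int run_model(const insn *prog, size_t len) {
    int stack[STACK_MAX];
    size_t sp = 0, pc = 0;
    for (int steps = 0; steps < 100 && pc < len; steps++) {
        insn cur = prog[pc];
        switch (cur.op) {
        case OP_PUSH:
            if (sp == STACK_MAX) return -1;
            stack[sp++] = cur.arg;
            pc++;
            break;
        case OP_CALL: /* push the index of the next instruction, then jump */
            if (sp == STACK_MAX) return -1;
            stack[sp++] = (int)(pc + 1);
            pc = (size_t)cur.arg;
            break;
        case OP_RET: /* pop an index and jump to it */
            if (sp == 0) return -1;
            pc = (size_t)stack[--sp];
            break;
        case OP_MARK:
            return cur.arg;
        }
    }
    return -1;
}

int push_call_ret_model(void) {
    static const insn prog[] = {
        { OP_PUSH, 5 }, /* 0: push target */
        { OP_CALL, 3 }, /* 1: call lab (pushes return index 2) */
        { OP_RET, 0 },  /* 2: ret pops 5 and jumps to target */
        { OP_RET, 0 },  /* 3: lab: ret pops 2 and resumes at index 2 */
        { OP_MARK, 0 }, /* 4: never reached */
        { OP_MARK, 1 }, /* 5: target */
    };
    return run_model(prog, sizeof prog / sizeof prog[0]);
}

int push_ret_model(void) {
    static const insn prog[] = {
        { OP_PUSH, 3 }, /* 0: push target */
        { OP_RET, 0 },  /* 1: ret pops 3 and jumps to target */
        { OP_MARK, 0 }, /* 2: fall-through code, never reached */
        { OP_MARK, 1 }, /* 3: target */
    };
    return run_model(prog, sizeof prog / sizeof prog[0]);
}
```

In both models control ends up at the location marked as the target, even though no instruction names it as a jump destination.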
The --AntiBranchAnalysisKinds=goto2nopSled switch turns this code
goto L
...
L:
into this code
goto *(R+expression)
...
R:
nop
nop
...
nop
L:
The expression is opaque such that the branch falls somewhere within the nop sled. The intention is to combine this transformation with input-dependent opaque predicates so that the actual jump address will be random and input dependent:
tigress --Input=... \
--Transform=InitOpaque \
--InitOpaqueKind=Input \
--Transform=AntiBranchAnalysis \
--AntiBranchAnalysisKinds=goto2nopSled \
--AntiBranchAnalysisOpaqueStructs=Input
The current nop-sled is trivial, consisting of random lists of x86 bytes that have no effect:
cmc
std
cld
nop
stc
cmc
clc
stc
wait
...
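The reason any landing point inside the sled is acceptable can be shown with a small model. This is a sketch under stated assumptions, not the code Tigress emits: the sled is modeled as `SLED_LEN` slots starting at index 0 (label R) with label L at index `SLED_LEN`, and the hypothetical `opaque_offset` function stands in for an input-dependent opaque expression whose value always lies between 0 and `SLED_LEN`.

```c
#include <assert.h>
#include <stddef.h>

#define SLED_LEN 8u /* number of no-effect slots between R and L (assumed) */

/* Hypothetical stand-in for an input-dependent opaque expression:
   the value varies with the input but always lies in [0, SLED_LEN]. */
size_t opaque_offset(unsigned input) {
    return (size_t)((input * 2654435761u) % (SLED_LEN + 1u));
}

/* Model of goto *(R + expression): start somewhere in the sled,
   then let each no-effect instruction fall through to the next slot. */
size_t jump_into_sled(unsigned input) {
    size_t pc = opaque_offset(input); /* R is index 0 */
    while (pc < SLED_LEN) {
        pc++;                         /* a nop does nothing but advance */
    }
    return pc;                        /* index of L */
}
```

Whatever the input, `jump_into_sled` ends at the index of L, while the initial jump target differs from input to input, which is what makes the branch target hard to determine statically.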
Option | Arguments | Description |
---|---|---|
--Transform | AntiBranchAnalysis | Replace branches with other constructs. |
--AntiBranchAnalysisKinds | branchFuns, goto2call, goto2push, goto2nopSled, * | Comma-separated list of the kinds of constructs branches can be replaced with. Default=branchFuns. |
--AntiBranchAnalysisOpaqueStructs | list, array, input, env, * | Comma-separated list of the kinds of opaque constructs to use. Default=list,array. |
--AntiBranchAnalysisObfuscateBranchFunCall | BOOLSPEC | Obfuscate the body of the branch function. Default=false. |
--AntiBranchAnalysisBranchFunFlatten | BOOLSPEC | Flatten before replacing jumps. This opens up more opportunities for replacing unconditional branches. Default=false. |
--AntiBranchAnalysisBranchFunAddressOffset | integer | The offset (in bytes) of the return address on the stack, for branch functions. May differ based on operating system, word size, and compiler. Default=8 on x86_64, 0 on Arm. |
This transformation has many issues, and should only be used with great care:
- goto2push and goto2call will often cause clang to generate the wrong code.
- gcc 4.6 appears to do the right thing.
- gcc 4.8 appears to occasionally hang when compiling our generated code.
- gcc has an asm goto construct which ought to help with this. Clang lacks this feature.
- Set the --Environment=... option appropriately if you are going to use goto2push and goto2call, and test the generated code thoroughly.
- goto2push and goto2call are turned off by default.

The Branch Function transformation implements a simplistic version of Linn and Debray's Obfuscation of Executable Code to Improve Resistance to Static Disassembly. Linn and Debray's algorithm replaces direct jumps with calls to a special branch function, which sets the return address to the target of the original branch and then returns.
There are many published attacks on branch functions, including Static Disassembly of Obfuscated Binaries by Christopher Kruegel, William Robertson, Fredrik Valeur and Giovanni Vigna, and Deobfuscation: Reverse engineering obfuscated code by Sharath Udupa, Saumya Debray, and Matias Madou.
Kevin A. Roundy and Barton P. Miller's survey paper Binary-code obfuscations in prevalent packer tools is a good source of information on techniques used by current obfuscation tools.