February 19, 2021. Opaque Predicate is maybe the best known code obfuscation primitive. It's an easy concept to explain, it makes perfect sense as "something that makes code hard to understand," and it is supported by most code obfuscation tools (including Tigress). In practice opaque predicates may be mostly useless in thwarting real attacks but, at least to academics, they have instant appeal: they're easy to formalize, they have a mathematical flavor, and you can first write a paper on how to construct them and then another one on how to defeat them!
But, where does the term "opaque predicate" come from?
Well, for my PhD thesis (Flexible Encapsulation, get your copy now - only $95 on Amazon!) I designed a language I called Zuse, in which I examined the concept of Opaque Types. (Side note: I wrote to Konrad Zuse in my best high school German and asked permission to use his name for my language. But, by the time he responded with a "No, because I don't want people to confuse this language with the Plankalkül!" the thesis was already in print. Ooops... Unfortunately, I lost the letter in one of the many intercontinental moves.) I had learned about opaque types from Niklaus Wirth's Modula-2, my favorite imperative language at the time. But, where did Wirth get the term? Well, during his sabbatical at Xerox PARC he had learned about the Mesa language and brought the opaque type concept back home with him. The joke at the time was that Modula-2 was what Wirth remembered of Mesa after his transatlantic flight home to Switzerland, with too much turbulence and one too many cocktails. Yes, those were simpler times, with lamer jokes.
Anyway, when I started thinking about code obfuscation (this was back at my first job at the University of Auckland, New Zealand), I needed a term for an "expression that always evaluates to the same value, is easy for a defender to construct, but hard for an attacker to (statically) determine." I'm not entirely sure that, at the time, I properly understood the literal meaning of "opaque" (I was still working on my English), but it somehow seemed to fit the bill - hence, Opaque Predicates! This led first to TR148 and eventually to the first published paper on the topic, Manufacturing cheap, resilient, and stealthy opaque constructs.
So, from Mesa, to Modula-2, to Zuse, to the fundamental term of code obfuscation! But, maybe the term Opaque Type predates Mesa? Was it used in the C community prior to the Mesa design? Does anyone know?
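For readers who have not met the construct, here is a minimal sketch in C of an opaquely true predicate, matching the definition quoted above. It uses the textbook number-theoretic identity that x*(x+1) is always even. The function names opaque_true and protected_add are hypothetical, chosen for illustration; this is not the code Tigress generates, and a predicate this simple would likely not survive a determined static analysis.

```c
/* Opaquely true predicate: x*(x+1) is always even, because one of two
   consecutive integers is even. Unsigned arithmetic wraps modulo 2^N,
   which preserves parity, so the result is 1 for every input. */
static int opaque_true(unsigned int x) {
    return (x * (x + 1u)) % 2u == 0u;
}

/* Hypothetical function to protect. The real computation sits on the
   branch that is always taken; the else branch is decoy code that an
   attacker must prove dead before discarding it. */
static int protected_add(int a, int b, unsigned int x) {
    if (opaque_true(x)) {
        return a + b;        /* real computation */
    } else {
        return a - b + 42;   /* bogus, never executed */
    }
}
```

Whatever value is passed for x, protected_add(a, b, x) returns a + b; the extra argument exists only to feed the predicate.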
February 13, 2020. Tigress version 3.1 is now available for download. It contains numerous bugfixes.
December 28, 2019. I sometimes get asked "which software protection tool/algorithm/product should I be using?" That is the wrong question to ask, or, at least, the wrong end in which to start asking the question. Software protection isn't a thing, it's a process. This process starts with a detailed attack model: what are your attackers after, how good are they, what tools are they likely to use, etc. Then you need to think about what overhead you are willing to absorb; the more protection you add, the more slowdown/space increase you can expect. And, even when you have settled on a particular set of protection tools, the process doesn't end there; you have to put a plan in place for when the attackers still break through your protections. A vendor who tells you to "set it and forget it, just apply our tool and you're done," is simply lying. A serious and trustworthy vendor is one who instead tells you that "all our techniques will eventually succumb to a serious adversary, but we monitor hacker groups to learn what techniques are about to become obsolete, and when they do, we have new tricks in our back pocket to roll out."
December 28, 2019. I would like to maintain an up-to-date list of all companies in the software protection space. This list has been moved here.
December 27, 2019. Getting started with Tigress can seem like a daunting task, given that there are some 250 different options to choose from! I added a page with "recipes" here. Hopefully this should give you some ideas on how to get started. If you use Tigress for something useful and would like to share your experiences, please send me your script and I will add it to the recipes page!
December 14, 2019. Because software protection in industry is mostly clouded in secrecy, it is hard for those of us in academia to get a grip on what are acceptable overheads and typical levels of protection afforded by industrial-strength tools. Here are two pieces of information, one from a Wikileaked document (a memo from Cloakware to Sony), and one from a presentation by my good friend Gu Yuan, formerly of Cloakware/IRDETO:
December 10, 2019. We managed to obfuscate and compile a small test program for Android NDK. No guarantees it will work well on a real platform for now, but we'd love to have some Android developers try it out! Have a look under Platforms/Android for more details.
December 2, 2019. In Problems in Cryptocurrency: Five Years Later, Vitalik Buterin writes: "... we want to come up with a way to 'encrypt' a program so that the encrypted program would still give the same outputs for the same inputs, but the 'internals' of the program would be hidden. ... A solution to code obfuscation would be very useful to blockchain protocols. ... Unfortunately this continues to be a hard problem. ... these paths are still quite far from creating something viable and known to be secure."
It is interesting to compare the goals of cryptographically secure obfuscation with those of language-based obfuscation. Those of us who work in the latter are plagued by ridiculous security requirements paired with ridiculous performance requirements. For example, in A compiler-based infrastructure for software-protection, my friends Liem, Gu, and Johnson at IRDETO/Cloakware state that in one typical case, their performance budget (CPU and memory) was 50% over baseline. Now, we get a lot of flak from the crypto community (in the paper In Pursuit of Clarity In Obfuscation cited by Buterin, for example, the author writes: "Well, maybe those commercial products for program obfuscation work in practice. Let me try to break one. ... ten minutes later... Oh, nevermind."), but in practice it is hard to protect a program for more than 10 minutes when you're only allowed a minuscule reduction in performance.
I'm not aware of any work in language-based obfuscation that examines what happens when you substantially loosen the performance requirements. This makes sense since most of the applications have been in areas where performance matters, such as DRM. So, this is the question: if you were allowed not 50% overhead, but, say, 5 orders of magnitude overhead, what could you accomplish? Would you be able to supply useful levels of security for high-value assets, such as smart contracts?
November 28, 2019. The old site tigress.cs.arizona.edu has now been deprecated in favor of our new site, tigress.wtf. At the same time, we have completely reorganized the code and fixed numerous problems. Tigress is now built on top of the latest version of CIL, CIL 1.7.3 goblint, by Gabriel Kerneis. Tigress should now be easier to install - there is now just one (fat) package to download, containing binaries for all platforms.
November 20, 2019. We have added a transformation, SelfModify, which transforms a function into one that modifies itself at runtime. This transformation is currently only available for X86/64 targets. It can, of course, be combined with other transformations, but it is probably best run at the very end of the transformation chain. One interesting aspect of this transformation is that when combined with virtualization with a direct or indirect threaded dispatch, the indirect jumps to the instruction handlers are replaced with (self-modified) direct jumps! This should confuse analyses that rely on indirect branches to locate the instruction handlers. For dynamic analyses, this transformation shouldn't do very much; the instructions that are executed will be the same, except for the ones that modify the code. Analyses that assume that instruction addresses uniquely identify unique code will, of course, also be confused when a code location is reused for different instructions.
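To make the dispatch discussion concrete, here is a minimal sketch of an interpreter with indirect threaded dispatch, meaning that the bytecode holds opcode indices that are looked up in a table of handler addresses. The three-instruction stack machine, the opcode names, and the function run are hypothetical and are not Tigress output. The sketch relies on the GNU C "labels as values" extension, so it assumes GCC or Clang. It does not modify itself; it only shows the indirect jumps (the goto *handlers[...] statements) that, per the description above, SelfModify replaces with direct jumps.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny stack machine (illustration only). */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Interprets `code` and returns the value on top of the stack at OP_HALT.
   No bounds checking is done; the bytecode is assumed well formed. */
static int run(const int *code) {
    /* Table of handler addresses, indexed by opcode (GNU C extension). */
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    int stack[16];
    size_t sp = 0;
    const int *pc = code;

    /* Every handler ends with an indirect jump to the next handler.
       These indirect branches are what an analysis can look for when
       trying to locate the instruction handlers. */
#define DISPATCH() goto *handlers[*pc++]

    DISPATCH();

do_push:                      /* push the immediate operand */
    stack[sp++] = *pc++;
    DISPATCH();
do_add:                       /* pop two values, push their sum */
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
do_halt:                      /* return the top of the stack */
    return stack[sp - 1];

#undef DISPATCH
}
```

For example, running the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT } through run computes 2 + 3.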
November 13, 2019. Tigress can now obfuscate C code that can then be compiled to WebAssembly using the Emscripten compiler. This means that if you want to protect the confidentiality or integrity of code running in your clients' browsers, you can write it in C and obfuscate it with Tigress, which provides much higher levels of protection than obfuscated JavaScript and, sometimes, lower overhead. With my student Yang Yang Lu and colleague Jasvir Nagra, we're working on a paper on applications and performance aspects of this technology.
October 4, 2019. My PhD student Claire Taylor presented her paper Getting RevEngE: A System for Analyzing Reverse Engineering Behavior at MALCON 2019. She also received an Outstanding Paper Award. Congratulations, Claire! RevEngE uses Tigress to generate Reverse Engineering challenges.