Umar edited this page Sep 28, 2013 · 31 revisions

Download Jar Files

The first step is to download the .jar files. You can always download the latest release version of Soot from the official Soot download page. There are several options to choose from, but you will usually need the following files:

  • sootclasses-x.y.z.jar (the main Soot distribution)
  • jasminclasses-x.y.z.jar (the bytecode assembler that Soot uses to create .class files)
  • polyglotclasses-a.b.c.jar (the compiler front-end that Soot uses to parse .java files)

To download these files please click on the link below:

http://www.sable.mcgill.ca/soot/soot_download.html

For the really brave among you, Ondrej Lhotak provides a nightly build that is drawn directly from our Subversion repository. Usually the latest nightly build is the most stable version of Soot because we tend to test code before we commit it. However, this may not always be true. For nightly build downloads please follow the link below:

http://vandyk.st.informatik.tu-darmstadt.de/abc/

Once these three files are downloaded, we are ready to give Soot a try.

Using Soot from the Command Line

1. Processing single files

Soot in general processes a bunch of classes. These classes can come in one of three formats:

  • Java source code, i.e. .java files,
  • Java bytecode, i.e. .class files, and
  • Jimple source, i.e. .jimple files.

In case you don’t know yet, Jimple is Soot’s primary intermediate representation: a three-address code that is essentially a simplified version of Java requiring only around 15 different kinds of statements. You can instruct Soot to convert .java or .class files to .jimple files or the other way around. You can even generate a .jimple file from a .java file, modify the .jimple with a normal text editor, and then convert it to a .class file, virtually hand-optimizing your program. Soot can make such tasks considerably simpler.

For brevity, in the following, we will abbreviate the classpath… as, sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar

The principal way to have Soot process two classes A and B is simply to name them on the command line, which makes them “application classes”.
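The original command listing is not preserved on this page, so here is a minimal sketch of what such an invocation might look like. It assumes the three jars named above sit in the current directory; `soot_cmd` is a hypothetical helper that only prints the command, so the sketch runs even without Soot installed.

```shell
# Jar names taken from the classpath mentioned above; adjust to your versions.
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# A and B are named on the command line, so they become application classes.
soot_cmd A B
```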

Doing only that produces an error, because an important detail is missing: Soot has its own classpath, so we have to define it.

2. Soot’s classpath

Soot has its own classpath and will load files only from JAR files or directories on that path. By default, this path is empty, and therefore in the above example Soot does not “see” the classes A and B although they exist. So let’s just add the current directory “.” to Soot’s classpath using the -cp option.
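A sketch of that invocation, under the same assumptions as the rest of this tutorial (jar names as above, classes A and B in the current directory). Note the two different -cp options: the first belongs to the JVM, the second to Soot. `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# The -cp after soot.Main sets Soot's own classpath to the current directory.
soot_cmd -cp . A B
```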

What’s wrong now? Apparently Soot was able to find A and B (at least it doesn’t complain about these any more) but now it’s missing java.lang.Object.

Why does Soot care about java.lang.Object anyway? To do anything meaningful with your program, Soot needs typing information. In particular, it needs to reconstruct types for local variables, and to do so it needs to know the complete type hierarchy of the classes you want to process.

Regarding the exception, there are three ways to resolve it:

  • add rt.jar to your classpath
  • add the -pp option, provided your CLASSPATH variable contains rt.jar or JAVA_HOME is set correctly
  • use the -allow-phantom-refs option (not recommended)

In the first option you add your JDK’s rt.jar to Soot’s classpath (not the JVM’s classpath!). This JAR file contains the class java.lang.Object.
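As a sketch, the first option could look as follows. The rt.jar location is an assumption: it follows the ${JAVA_HOME}/lib/rt.jar layout used elsewhere in this tutorial and presumes an older JDK that still ships rt.jar; `/path/to/jdk` is a placeholder and `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Soot's classpath: current directory plus rt.jar (assumed location).
soot_cmd -cp ".:${JAVA_HOME:-/path/to/jdk}/lib/rt.jar" A B
```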

This seems to have worked. Soot successfully processed the two .java files and placed the resulting .class files into the sootOutput folder. Note that in general, Soot will process all classes you name on the command line and all classes referenced by those classes.

Beware, though, of a common mistake: using “~” in Soot’s classpath. What goes wrong? Well, you tried to use “~” because that points to your home directory, no? Well yes, but the problem is that although “~” is usually expanded by the shell, it is not in this case. Soot gets the raw “~” string as a command line option, and currently Soot is unable to expand that string into the path of your home directory. So always use full or relative paths in Soot’s classpath.
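The effect can be reproduced without Soot. The sketch below uses a hypothetical helper, `show_arg`, that prints the argument a program would receive; the `jdk/lib/rt.jar` path under the home directory is made up for illustration. Because the tilde does not start the word, common shells such as bash and dash leave it untouched.

```shell
# Hypothetical helper: prints its first argument exactly as a program would receive it.
show_arg() {
  printf '%s\n' "$1"
}

# The tilde follows ".:" and is therefore passed on literally.
show_arg .:~/jdk/lib/rt.jar          # prints .:~/jdk/lib/rt.jar

# Using $HOME instead lets the shell substitute the home directory.
show_arg ".:$HOME/jdk/lib/rt.jar"
```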

The second option is to use -pp. That is much easier than adding the whole classpath every time, and that is exactly why we added this option. -pp stands for “prepend path” and means that Soot automatically adds the following to its own classpath (in that order):

  • the contents of your current CLASSPATH variable,
  • ${JAVA_HOME}/lib/rt.jar,
  • if you are in whole-program mode (i.e. the -w option is enabled; more to come), then it also adds ${JAVA_HOME}/lib/jce.jar
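A sketch of the -pp variant, assuming JAVA_HOME is set correctly or CLASSPATH already contains rt.jar. As before, the jar names are those used in this tutorial and `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# -cp . names the application directory; -pp adds CLASSPATH and rt.jar for us.
soot_cmd -cp . -pp A B
```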

The third way (not recommended) to make Soot sort of happy is the option -allow-phantom-refs. So what does that do? Basically this option tells Soot: “Well, I really don’t want to give you the classes you are missing (maybe because I just don’t have those classes), but please make a best effort even without them.” Soot creates a “phantom class” for each class that it cannot resolve and tells you about it. Note that this approach is very limited and in many cases does not lead to the results you need. Only use this option if you know what you are doing.

3. Processing entire directories

You can also process entire directories or JAR files with Soot, using the -process-dir option. To process a JAR file, just use the same option but provide a path to a JAR instead of a directory. Nice, eh? Be careful, though: if you apply the very same command again to the very same folder, you will run into a problem.

What happened? Well, as noted earlier, Soot places the generated .class files into the folder sootOutput, which resides in the current directory “.”. Therefore Soot now processed the previously generated files, at the same time complaining that a class named “A” resides at location ./sootOutput/A and therefore should actually have the name sootOutput.A, i.e. be in the sootOutput package. Therefore, when using the -process-dir option, also use the -d option to redirect Soot’s output, e.g. -d /tmp/sootout.

This redirects Soot’s output to /tmp/sootout, which is not a sub-directory of the current directory. Voila.
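Put together, processing the current directory with redirected output might look like the sketch below. It assumes the classes live in “.” and that /tmp/sootout is writable; `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Process every class in "." and write results outside the processed directory.
soot_cmd -cp . -pp -process-dir . -d /tmp/sootout
```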

4. Processing certain types of files (.class / .java / .jimple)

Assume you have a directory that contains both A.java and A.class and you invoke Soot as before. In this case Soot will load the definition of A from the file A.class. This may not always be what you want. The -src-prec option tells Soot which input type it should prefer over others. There are four options:

  • c or class (default): favour class files as Soot source,
  • only-class: use only class files as Soot source,
  • J or jimple: favour Jimple files as Soot source, and
  • java: favour Java files as Soot source.

So e.g. -src-prec java will load A.java in the above example.
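As a sketch, preferring the source file over the class file could be requested like this. The surrounding options (-cp . -pp) are assumptions carried over from the earlier sections, and `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Load A from A.java even though A.class is also present.
soot_cmd -cp . -pp -src-prec java A
```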

5. Application classes vs. library classes

Classes that Soot actually processes are called “application classes”. This is opposed to “library classes”, which Soot does not process but only uses for type resolution. Application classes are usually those explicitly stated on the command line or those classes that reside in a directory referred to via -process-dir.

When you use the -app option, however, Soot also processes all classes referenced by these classes. It will not, however, process any classes in the JDK, i.e. classes in one of the java.* and com.sun.* packages. If you wish to include those too, you have to use the special -i option, for example -i java. with the trailing dot. See the guide for this and other command line options.
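The two variants can be sketched as follows. The class name A and the -cp . -pp options are assumptions carried over from earlier sections; `soot_cmd` is a hypothetical helper that only prints the commands.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Application mode: A and the non-JDK classes it references are processed.
soot_cmd -cp . -pp -app A

# Additionally treat classes in packages starting with "java." as application classes.
soot_cmd -cp . -pp -app -i java. A
```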

6. Output of .jimple or .java files

Soot can produce not only .class files but also .jimple and .java files, among others. You can select the output format using the -f option. If you use -f dava to decompile to Java, please make sure that the file /lib/jce.jar is on Soot’s classpath.
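For example, producing Jimple instead of class files might look like the sketch below. The format value J is the one listed for Jimple in the output options further down; the other options are assumptions carried over from earlier sections, and `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Write A.jimple (by default into ./sootOutput) instead of A.class.
soot_cmd -cp . -pp -f J A
```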

7. Phase options

Soot supports hundreds of very fine grained options that allow you to tune all the analyses and optimizations to your needs, directly from the command line.

The general format of these command line options is -p PHASE OPT:VAL. For the complete documentation of all phase options, follow the link below:

http://www.sable.mcgill.ca/soot/tutorial/phase/

For instance, let’s say that we want to preserve the names of local variables (if possible) when performing an analysis within Soot. Then we can add the command line option -p jb use-original-names:true. A shortcut is -p jb use-original-names, where the true is implicitly assumed.
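In a full command line, the phase option might be combined with Jimple output so the preserved names are visible. This is a sketch: the class name A and the surrounding options are assumptions, and `soot_cmd` is a hypothetical helper that only prints the command.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Ask the jb phase to keep original local variable names where possible.
soot_cmd -cp . -pp -f J -p jb use-original-names:true A
```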

Here are the most commonly used command-line options:

Soot is invoked as follows:

java javaOptions soot.Main [ sootOption* ] classname*

1. General Options

-h, -help

Display the textual help message and exit immediately without further processing.

-pl, -phase-list

Print a list of the available phases and sub-phases, then exit.

-ph phase, -phase-help phase

Print a help message about the phase or sub-phase named phase, then exit. To see the help message of more than one phase, specify multiple phase-help options.
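A sketch of how the two phase-related help options might be used together. The phase names jb and jb.a are the ones mentioned elsewhere on this page; `soot_cmd` is a hypothetical helper that only prints the commands.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# List all phases and sub-phases.
soot_cmd -pl

# Show help for two phases by repeating the option.
soot_cmd -ph jb -ph jb.a
```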

-version

Display information about the version of Soot being run, then exit without further processing.

-v, -verbose

Provide detailed information about what Soot is doing as it runs.

-interactive-mode

Runs interactively, with Soot providing detailed information as it iterates through intra-procedural analyses.

-unfriendly-mode

With this option, Soot does not stop even if it received no command-line options. Useful when setting Soot options programmatically and then calling soot.Main.main() with an empty list.

-app

Run in application mode, processing all classes referenced by argument classes.

-w, -whole-program

Run in whole program mode, taking into consideration the whole program when performing analyses and transformations. Soot uses the Call Graph Constructor to build a call graph for the program, then applies enabled transformations in the Whole-Jimple Transformation, Whole-Jimple Optimization, and Whole-Jimple Annotation packs before applying enabled intraprocedural transformations.

  • Note that the Whole-Jimple Optimization pack is normally disabled (and thus not applied by whole program mode), unless you also specify the Whole Program Optimize option.

-ws, -whole-shimple

Run in whole shimple mode, taking into consideration the whole program when performing Shimple analyses and transformations. Soot uses the Call Graph Constructor to build a call graph for the program, then applies enabled transformations in the Whole-Shimple Transformation and Whole-Shimple Optimization before applying enabled intraprocedural transformations.

  • Note that the Whole-Shimple Optimization pack is normally disabled (and thus not applied by whole shimple mode), unless you also specify the Whole Program Optimize option.

-validate

Causes internal checks to be done on bodies in the various Soot IRs, to make sure the transformations have not done something strange. This option may degrade Soot's performance.

-debug

Print various debugging information as Soot runs, particularly from the Baf Body Phase and the Jimple Annotation Pack Phase.

-debug-resolver

Print debugging information about class resolving.

2. Input Options

-cp path, -soot-class-path path, -soot-classpath path

Use path as the list of directories in which Soot should search for classes. path should be a series of directories, separated by the path separator character for your system.

If no classpath is set on the command line, but the system property soot.class.path has been set, Soot uses its value as the classpath.

If neither the command line nor the system properties specify a Soot classpath, Soot falls back on a default classpath consisting of the value of the system property java.class.path followed by java.home/lib/rt.jar, where java.home stands for the contents of the system property java.home and / stands for the system file separator.
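The precedence described above can be illustrated with a small shell function. This is only a model of the rule as stated here, not Soot's actual code: `resolve_soot_classpath` is a hypothetical name, its four arguments stand in for the -cp value, soot.class.path, java.class.path and java.home, and the Unix separators ":" and "/" are assumed.

```shell
# Illustration only: mirrors the precedence described above, not Soot's implementation.
# $1 = -cp value, $2 = soot.class.path, $3 = java.class.path, $4 = java.home
resolve_soot_classpath() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"                  # command line wins
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"                  # then the soot.class.path property
  else
    printf '%s\n' "$3:$4/lib/rt.jar"    # else java.class.path followed by rt.jar
  fi
}

# Nothing set explicitly, so the default is used.
resolve_soot_classpath "" "" "app.jar" "/opt/jre"   # prints app.jar:/opt/jre/lib/rt.jar
```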

-pp, -prepend-classpath

Instead of replacing the default Soot classpath with the classpath given on the command line, prepend it with that classpath. The default classpath holds whatever is set in the CLASSPATH environment variable, followed by rt.jar (resolved through the JAVA_HOME environment variable). If whole-program mode is enabled, jce.jar is also appended at the end.

-process-path dir, -process-dir dir

Add all classes found in dir to the set of argument classes which is analyzed and transformed by Soot. You can specify the option more than once, to add argument classes from multiple directories. You can also state JAR files.

If subdirectories of dir contain .class or .jimple files, Soot assumes that the subdirectory names correspond to components of the classes' package names. If dir contains subA/subB/MyClass.class, for instance, then Soot assumes MyClass is in package subA.subB.

-ast-metrics

If this flag is set and Soot converts Java to Jimple, then AST metrics will be computed.

-src-prec format (default value: c)

Sets format as Soot's preference for the type of source files to read when it looks for a class.

Possible values:

c, class

Try to resolve classes first from .class files found in the Soot classpath. Fall back to .jimple files only when unable to find a .class file.

only-class

Try to resolve classes first from .class files found in the Soot classpath. Do not try any other types of files even when unable to find a .class file.

J, jimple

Try to resolve classes first from .jimple files found in the Soot classpath. Fall back to .class files only when unable to find a .jimple file.

java

Try to resolve classes first from .java files found in the Soot classpath. Fall back to .class files only when unable to find a .java file.

-full-resolver

Normally, Soot resolves only the application classes and any classes that they refer to, along with any classes it needs for the Jimple typing, but it does not transitively resolve references in these additional classes that were resolved only because they were referenced. This switch forces full transitive resolution of all references found in all classes that are resolved, regardless of why they were resolved.

In whole-program mode, class resolution is always fully transitive. Therefore, in whole-program mode, this switch has no effect, and class resolution is always performed as if it were turned on.

-allow-phantom-refs

Allow Soot to process a class even if it cannot find all classes referenced by that class. This may cause Soot to produce incorrect results.

-no-bodies-for-excluded

Prevents Soot from loading method bodies for all excluded classes (see exclude option), even when running in whole-program mode. This is useful for computing a shallow points-to analysis that does not, for instance, take into account the JDK. Of course, such analyses may be unsound. You get what you are asking for.

-j2me

(default value: false)

Use J2ME mode. J2ME does not have class Cloneable nor Serializable, so we have to change type assignment to not refer to those classes.

-main-class class

By default, the first class encountered with a main method is treated as the main class (entry point) in whole-program analysis. This option overrides this default.

-polyglot

(default value: false)

Use Java 1.4 Polyglot frontend instead of JastAdd, which supports Java 5 syntax.

3. Output Options

-d dir, -output-dir dir

(default value: ./sootOutput) Store output files in dir. dir may be relative to the working directory.

-f format, -output-format format

(default value: c) Specify the format of output files Soot should produce, if any.

  • Note that while the abbreviated formats (jimp, shimp, b, and grimp) are easier to read than their unabbreviated counterparts (jimple, shimple, baf, and grimple), they may contain ambiguities. Method signatures in the abbreviated formats, for instance, are not uniquely determined.

Possible values:

J, jimple

Produce .jimple files, which contain a textual form of Soot's Jimple internal representation.

j, jimp

Produce .jimp files, which contain an abbreviated form of Jimple.

S, shimple

Produce .shimple files, containing a textual form of Soot's SSA Shimple internal representation. Shimple adds Phi nodes to Jimple.

s, shimp

Produce .shimp files, which contain an abbreviated form of Shimple.

B, baf

Produce .baf files, which contain a textual form of Soot's Baf internal representation.

b

Produce .b files, which contain an abbreviated form of Baf.

G, grimple

Produce .grimple files, which contain a textual form of Soot's Grimp internal representation.

g, grimp

Produce .grimp files, which contain an abbreviated form of Grimp.

X, xml

Produce .xml files containing an annotated version of the Soot's Jimple internal representation.

n, none

Produce no output files.

jasmin

Produce .jasmin files, suitable as input to the jasmin bytecode assembler.

c, class

Produce Java .class files, executable by any Java Virtual Machine.

d, dava

Produce .java files generated by the Dava decompiler.

t, template

Produce .java files with Jimple templates.

-outjar, -output-jar

Saves output files into a Jar file instead of a directory. The output Jar file name should be specified using the Output Directory (output-dir) option. Note that if the output Jar file exists before Soot runs, any files inside it will first be removed.

-xml-attributes

Save in XML format a variety of tags which Soot has attached to its internal representations of the application classes. The XML file can then be read by the Soot plug-in for the Eclipse IDE, which can display the annotations together with the program source, to aid program understanding.

-print-tags, -print-tags-in-output

Print in output files (either in Jimple or Dava) a variety of tags which Soot has attached to its internal representations of the application classes. The tags will be printed on the line succeeding the statement that they are attached to.

-no-output-source-file-attribute

Don't output Source File Attribute when producing class files.

-no-output-inner-classes-attribute

Don't output inner classes attribute in class files.

-dump-body phaseName

Specify that phaseName is one of the phases to be dumped. For example -dump-body jb -dump-body jb.a would dump each method before and after the jb and jb.a phases. The pseudo phase name “ALL” causes all phases to be dumped.

Output files appear in subdirectories under the soot output directory, with names like className/methodSignature/phasename-graphType-number.in and className/methodSignature/phasename-graphType-number.out. The “in” and “out” suffixes distinguish the internal representations of the method before and after the phase executed.

-dump-cfg phaseName

Specify that any control flow graphs constructed during the phaseName phases should be dumped. For example -dump-cfg jb -dump-cfg bb.lso would dump all CFGs constructed during the jb and bb.lso phases. The pseudo phase name “ALL” causes CFGs constructed in all phases to be dumped.

The control flow graphs are dumped in the form of a file containing input to the dot graph visualization tool. Output dot files are stored beneath the soot output directory, in files with names like className/methodSignature/phasename-graphType-number.dot, where number serves to distinguish graphs in phases that produce more than one (for example, the Aggregator may produce multiple ExceptionalUnitGraphs).

-show-exception-dests (default value: true)

Indicate whether to show exception destination edges as well as control flow edges in dumps of exceptional control flow graphs.

-gzip (default value: false)

This option causes Soot to compress output files of intermediate representations with GZip. It does not apply to class files output by Soot.

4. Processing Options

-p phase opt:val, -phase-option phase opt:val

Set phase's run-time option named opt to val.

This is a mechanism for specifying phase-specific options to different parts of Soot. See Soot phase options for details about the available phases and options.

-O, -optimize

Perform intraprocedural optimizations on the application classes.

-W, -whole-optimize

Perform whole program optimizations on the application classes. This enables the Whole-Jimple Optimization pack as well as whole program mode and intraprocedural optimizations.

-via-grimp

Convert Jimple to bytecode via the Grimp intermediate representation instead of via the Baf intermediate representation.

-via-shimple

Enable Shimple, Soot's SSA representation. This generates Shimple bodies for the application classes, optionally transforms them with analyses that run on SSA form, then turns them back into Jimple for processing by the rest of Soot. For more information, see the documentation for the shimp, stp, and sop phases.

-throw-analysis arg (default value: pedantic)

This option specifies how to estimate the exceptions which each statement may throw when constructing exceptional CFGs.

Possible values:

pedantic

Says that any instruction may throw any Throwable whatsoever. Strictly speaking this is correct, since the Java libraries include the Thread.stop(Throwable) method, which allows other threads to cause arbitrary exceptions to occur at arbitrary points in the execution of a victim thread.

unit

Says that each statement in the intermediate representation may throw those exception types associated with the corresponding Java bytecode instructions in the JVM Specification. The analysis deals with each statement in isolation, without regard to the surrounding program.

-omit-excepting-unit-edges

When constructing an ExceptionalUnitGraph or ExceptionalBlockGraph, include edges to an exception handler only from the predecessors of an instruction which may throw an exception to the handler, and not from the excepting instruction itself, unless the excepting instruction has potential side effects.

Omitting edges from excepting units allows more accurate flow analyses (since if an instruction without side effects throws an exception, it has not changed the state of the computation). This accuracy, though, could lead optimizations to generate unverifiable code, since the dataflow analyses performed by bytecode verifiers might include paths to exception handlers from all protected instructions, regardless of whether the instructions have side effects. (In practice, the pedantic throw analysis suffices to pass verification in all VMs tested with Soot to date, but the JVM specification does allow for less discriminating verifiers which would reject some code that might be generated using the pedantic throw analysis without also adding edges from all excepting units.)

-trim-cfgs

When constructing CFGs which include exceptional edges, minimize the number of edges leading to exception handlers by analyzing which instructions might actually be executed before an exception is thrown, instead of assuming that every instruction protected by a handler has the potential to throw an exception the handler catches.

-trim-cfgs is shorthand for -throw-analysis unit -omit-excepting-unit-edges -p jb.tt enabled:true.
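Spelled out, the shorthand and its expansion would be written as the two command lines below. This is a sketch: the class name A and the -cp . -pp options are assumptions, and `soot_cmd` is a hypothetical helper that only prints the commands.

```shell
SOOT_JARS="sootclasses-2.3.0.jar:jasminclasses-2.3.0.jar:polyglotclasses-1.3.5.jar"

# Hypothetical dry-run helper: prints the command instead of running it.
# To execute for real, use: java -cp "$SOOT_JARS" soot.Main "$@"
soot_cmd() {
  printf '%s\n' "java -cp $SOOT_JARS soot.Main $*"
}

# Short form.
soot_cmd -cp . -pp -trim-cfgs A

# Long form, as given in the description of -trim-cfgs.
soot_cmd -cp . -pp -throw-analysis unit -omit-excepting-unit-edges -p jb.tt enabled:true A
```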

5. Application Mode Options

-i pkg, -include pkg

Designate classes in packages whose names begin with pkg (e.g. java.util.) as application classes which should be analyzed and output. This option allows you to selectively analyze classes in some packages that Soot normally treats as library classes.

You can use the include option multiple times, to designate the classes of multiple packages as application classes.

If you specify both include and exclude options, first the classes from all excluded packages are marked as library classes, then the classes from all included packages are marked as application classes.

-x pkg, -exclude pkg

Excludes any classes in packages whose names begin with pkg from the set of application classes which are analyzed and output, treating them as library classes instead. This option allows you to selectively exclude classes which would normally be treated as application classes.

You can use the exclude option multiple times, to designate the classes of multiple packages as library classes.

If you specify both include and exclude options, first the classes from all excluded packages are marked as library classes, then the classes from all included packages are marked as application classes.

-include-all

Soot uses a default list of packages (such as java.) which are deemed to contain library classes. This switch removes the default packages from the list of packages containing library classes. Individual packages can then be added using the exclude option.

-dynamic-class class

Mark class as a class which the application may load dynamically. Soot will read it as a library class even if it is not referenced from the argument classes. This permits whole program optimizations on programs which load classes dynamically if the set of classes that can be loaded is known at compile time.

You can use the dynamic class option multiple times to specify more than one dynamic class.

-dynamic-dir dir

Mark all class files in dir as classes that may be loaded dynamically. Soot will read them as library classes even if they are not referenced from the argument classes.

You can specify more than one directory of potentially dynamic classes by specifying multiple dynamic directory options.

-dynamic-package pkg

Marks all class files belonging to the package pkg or any of its subpackages as classes which the application may load dynamically. Soot will read all classes in pkg as library classes, even if they are not referenced by any of the argument classes.

To specify more than one dynamic package, use the dynamic package option multiple times.

6. Input Attribute Options

-keep-line-number (default value: true)

Preserve line number tables for class files throughout the transformations.

-keep-bytecode-offset, -keep-offset

Maintain bytecode offset tables for class files throughout the transformations.

7. Annotation Options

-annot-purity

Purity analysis implemented by Antoine Mine and based on the paper A Combined Pointer and Purity Analysis for Java Programs by Alexandru Salcianu and Martin Rinard.

-annot-nullpointer

Perform a static analysis of which dereferenced pointers may have null values, and annotate class files with attributes encoding the results of the analysis. For details, see the documentation for Null Pointer Annotation and for the Array Bounds and Null Pointer Check Tag Aggregator.

-annot-arraybounds

Perform a static analysis of which array bounds checks may safely be eliminated and annotate output class files with attributes encoding the results of the analysis. For details, see the documentation for Array Bounds Annotation and for the Array Bounds and Null Pointer Check Tag Aggregator.

-annot-side-effect

Enable the generation of side-effect attributes.

-annot-fieldrw

Enable the generation of field read/write attributes.

8. Miscellaneous Options

-time

Report the time required to perform some of Soot's transformations.

-subtract-gc

Attempt to subtract time spent in garbage collection from the reports of times required for transformations.
