This repository contains the artifact for the following paper:
- A Precise Framework for Source-Level Control-Flow Analysis, Idriss Riouak 🔗, Christoph Reichenbach 🔗, Görel Hedin 🔗 and Niklas Fors 🔗. To appear at the 21st IEEE International Working Conference on Source Code Analysis & Manipulation, 2021 (SCAM 2021 🔗).
The repository contains:
- A snapshot of IntraCFG (192c02c), i.e., a language-independent framework for building control-flow graphs (CFGs) using reference attribute grammars, as described in section 2 in the paper.
- A snapshot of IntraJ (479e927), i.e., a tool that applies IntraCFG to construct control-flow graphs (CFGs) for Java source programs, as described in section 3 in the paper. IntraJ is built as an extension to the ExtendJ Java compiler, which is in turn implemented using JastAdd (a metacompiler supporting reference attribute grammars).
- In addition to CFG construction, IntraJ contains two example client analyses that make use of the CFG, as described in section 4 in the paper: DAA - detection of Dead Assignments in the codebase, and NPA - detection of occurrences of Null Pointer exceptions.
- Test cases and evaluation scripts are also included. Using this artifact you can rerun the experiments presented in section 5 in the paper. This includes running IntraJ on a suite of subject codebases, and running competing tools (JastAddJ-Intraflow and SonarQube) on the same codebases.
You can reuse this artifact in various ways. For example:
- You can run IntraJ on other Java codebases (in Java-4, Java-5, Java-6, and Java-7) in order to construct CFGs and get DAA and NPA analysis results. More can be read about reusability here.
We provide three different ways of getting and running IntraJ:
- Download the pre-built Docker image (recommended).
- Build your own Docker image using the Dockerfile.
- Download and build IntraJ from the artifact source code.
We provide a Docker image that contains IntraJ and evaluation scripts, packaged together with all the necessary dependencies. To run such an image, make sure to install the relevant tools:
- For Windows and OS X systems, follow the guidelines on the Docker Desktop download site.
- On Linux-based systems, install the docker command-line tool. This tool may be provided by the docker.io and/or docker-ce packages. If your distribution does not provide these packages, follow the steps here.
Download the pre-built image here. Then, anywhere in your workspace, run:
docker load < Downloads/intraj_scam21.tar.gz
Clone the IntraJSCAM2021 repository by running the following command:
git clone https://github.com/lu-cs-sde/IntraJSCAM2021.git
Once you have cloned the repository, build the image:
cd IntraJSCAM2021/Docker
docker build -t intraj:scam21 .
It might take several minutes to build the Docker image.
Run the image using:
docker run -it --network="host" --expose 9000 --expose 9001 --memory="10g" --memory-swap="16g" intraj:scam21
❗️ Very Important ❗️ |
---|
SonarQube requires a large amount of memory. We tested the container with 10 GB of memory and 10 GB of swap memory. If you are running the container on Windows or Mac, the command-line options that limit the memory available to the container (i.e., --memory="10g", --memory-swap="16g") are ignored. Please set these two parameters from the GUI instead. Read more about it here: Windows - Mac |
You will be logged in with the user SCAM21. Once logged in, run the following commands to launch the evaluation:
cd workspace/intraj/
./eval.sh 50 50
The results are saved in: ~/workspace/intraj/evaluation/YYYYMMDD_HHMMSS
The command eval.sh 50 50 will run IntraJ 2500 times for each analysis. Therefore, computing the evaluation can take several hours.
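The figure of 2500 is the product of the two arguments, assuming they are the outer and inner iteration counts (named N_iter_outerloop and N_iter_innerloop in the evaluation instructions). The helper below is a hypothetical illustration of that arithmetic and is not part of the artifact:

```bash
# Hypothetical helper (not part of the artifact): number of IntraJ runs
# per analysis for a given outer and inner iteration count.
total_runs() {
    echo $(( $1 * $2 ))
}

total_runs 50 50   # prints 2500
```

Smaller arguments give a proportionally shorter run, which can help when you only want to check that the setup works.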
❗️ Very Important ❗️ |
---|
Do not close the shell or kill the container, or the results will be lost! |
To save the results on your own machine, run the following commands in a new shell:
docker ps
This will print:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4d882c86b5ab intraj:scam21 "bash" x Up x seconds random_name
With your CONTAINER ID run the following command:
docker cp 4d882c86b5ab:workspace/intraj/evaluation/YYYYMMDD_HHMMSS /PATH/IN/YOUR/MACHINE
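If you prefer not to copy the CONTAINER ID by hand, it can be extracted from the docker ps listing. The sketch below assumes the column layout shown above (ID in the first column, image in the second); container_id_for_image is a hypothetical helper name:

```bash
# Hypothetical helper: read `docker ps` output on stdin and print the ID of
# the first container whose IMAGE column equals the given image name.
container_id_for_image() {
    # NR > 1 skips the header row; $1 is CONTAINER ID, $2 is IMAGE.
    awk -v image="$1" 'NR > 1 && $2 == image { print $1; exit }'
}

# Usage (requires a running container):
#   id=$(docker ps | container_id_for_image intraj:scam21)
#   docker cp "$id":workspace/intraj/evaluation/YYYYMMDD_HHMMSS /PATH/IN/YOUR/MACHINE
```

Docker can also do the filtering itself with docker ps -q --filter ancestor=intraj:scam21, which prints only the matching container IDs.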
We have run IntraJ on the following Java version:
- Java SDK version 7 (tested with SDK 7.0.292-zulu. See sdkman).
If you also want to run the competing tool SonarQube (for checking the evaluation section), you will additionally need the following Java version:
- Java SDK version 11 (tested with SDK 11.0.9.fx-zulu. See sdkman).
It is possible to generate PDFs that show the CFGs visually. For this you need:
- Dot (Graphviz) - PDF generation
- Vim - PDF generation
- Python3.x with the following dependencies:
- PyPDF2 v1.26.0 - PDF generation
- numpy v1.20.1 - Evaluation and Plots generation
- pandas v1.2.4 - Evaluation and Plots generation
- matplotlib v3.3.4 - Evaluation and Plots generation
- seaborn v0.11.1 - Evaluation and Plots generation
- ipython v7.26.0 - Evaluation and Plots generation
The evaluation script uses sdkman.
To run the evaluation you need:
- The scripts eval.sh and evaluation/run_eval.sh use sdkman. If you don't have sdkman installed but have Java SDK 7 installed, you can comment out all the lines starting with sdk in eval.sh and in evaluation/run_eval.sh. You install sdkman by running the following commands:
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install java 7.0.292-zulu
sdk use java 7.0.292-zulu
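For the workaround without sdkman, the lines starting with sdk can be commented out mechanically. This is a minimal sketch, assuming the sdk calls each sit on their own line; comment_out_sdk is a hypothetical helper and writes to standard output so the original scripts stay untouched:

```bash
# Hypothetical helper: print a copy of script $1 in which every line that
# starts with the `sdk` command (optionally indented) is commented out.
comment_out_sdk() {
    # Matches "sdk" followed by whitespace or end of line, so names such
    # as "sdkman_dir=..." are left alone.
    sed -E 's/^([[:space:]]*)(sdk([[:space:]]|$))/\1# \2/' "$1"
}

# Usage (review the result before replacing the original):
#   comment_out_sdk eval.sh > eval_nosdk.sh
```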
To install all the necessary Python dependencies, you can run the instruction described in the next section.
To clone the IntraJ code, run, in your working directory:
git clone https://github.com/lu-cs-sde/IntraJSCAM2021.git
Move to the IntraJ directory:
cd IntraJSCAM2021
To generate all the JARs necessary for the evaluation, execute
./gradlew build
To run all the tests, execute:
./gradlew test
To install Python dependencies, you can execute the following instruction:
cd resources
pip3 install -r requirements.txt
The top-level structure of the repository:
.
├── build # Compiled files
├── evaluation # Scripts and dependencies for evaluation
├── extendj # ExtendJ source code
├── resources # Scripts and logo
├── src # IntraJ source code
| ├── jastadd
| | ├── CFG # CFG spec in Jastadd
| | └── DataFlow # Data flow analyses spec
| └── java
| ├── utils # General helpers for visualisation
| └── test # JUnit test spec
├── tools # Build tools
| └── jastadd-2.3.6-custom # Custom version of Jastadd
├── testfiles # Automated test files
| ├── DataFlow
| └── CFG
├── eval.sh # Evaluation entry point
├── LICENSE
└── README.md
The entry point of IntraJ (main) is defined in extendj/src/fronted-main/org/extendj/IntraJ.java.
The evaluation directory is structured as follows:
.
├── antlr-2.7.2 # ANTLR Benchmark (Paper §5)
├── pmd-4.2.5 # PMD Benchmark (Paper §5)
├── jfreechart-1.0.0 # JFC Benchmark (Paper §5)
├── fop-0.95 # FOP Benchmark (Paper §5)
├── Results.xlsx # Analyses results in Excel (Paper §5)
├── Results.htm # Analyses results in HTML
├── plots.py # Script that generates plots
├── run_eval.sh # Called by ../eval.sh
└── YYYYMMDD_HHMMSS # Evaluation results
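Each evaluation run adds a new timestamped directory next to the benchmarks. Because YYYYMMDD_HHMMSS names sort chronologically as plain text, the most recent run can be found by sorting. The sketch below uses a hypothetical helper name and takes the evaluation directory as its argument:

```bash
# Hypothetical helper: print the most recent results directory under $1.
# Relies on YYYYMMDD_HHMMSS names sorting chronologically as plain text.
latest_results() {
    # The glob only matches timestamp-shaped names: 8 digits, '_', 6 digits.
    for d in "$1"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9]; do
        if [ -d "$d" ]; then
            echo "$d"
        fi
    done | sort | tail -n 1
}

# Usage:
#   latest_results ~/workspace/intraj/evaluation
```

The helper prints nothing when no results directory exists yet.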
.
└── jastadd
├── CFG
| ├── IntraCFG
| | ├── CFG.ast # Lang-independent nodes
| | └── IntraCFG.jrag # IntraCFG spec in Jastadd (Paper §2.b)
| ├── java4 # (Paper §3)
| | ├── Cache.jrag # Cache settings
| | ├── Exception.jrag # Exception spec (Paper §3.c)
| | ├── Initializer.jrag # Initializers spec (Paper §3.b)
| | ├── Java4.jrag # Java4 spec
| | └── ImplictNodes.ast # Reified nodes
| ├── java5 # (Paper §3)
| | └── Java5.jrag # Java5 spec
| └── java7 # (Paper §3)
| └── Java7.jrag # Java7 spec
└── DataFlow # Data flow analyses spec (Paper §4)
├── Analysis.jrag # Collection attributes
├── DeadAssignment.jrag # DAA spec (Paper §4.c)
├── LiveVariableAnalysis.jrag # LVA spec (Paper §4.b)
└── NullAnalysis.jrag # NPA spec (Paper §4.a)
There is no subdirectory for java6, since features introduced in Java 6 do not affect the construction of the CFG.
- -help: prints all the available options.
- -genpdf: generates a pdf with AST structure of all the methods in the analysed files. It can be used combined with -succ, -pred.
- -succ: generates a pdf with the successor relation for all the methods in the analysed files. It can be used combined with -pred.
- -pred: generates a pdf with the predecessor relation for all the methods in the analysed files. It can be used combined with -succ.
- -statistics: prints the number of CFGRoots, CFGNodes and CFGEdges in the analysed files.
- -nowarn: the warning messages are not printed.
-------------- ANALYSIS OPTIONS --------------------
Available analyses:
- DAA: Detects unused dead assignments
- NPA: Detects occurrences of Null Pointer Dereferencing
Options (where id corresponds to one of the analyses above):
- -Wid: enable a given analysis, e.g., -WDAA
- -Wall: enables all the available analyses
- -Wexcept=id: enable all the available analyses except id, e.g., -Wexcept=DAA
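To make the meaning of the three option forms concrete, the sketch below computes which analyses a given command line would enable. It only mirrors the descriptions above and is not IntraJ's actual option handling. How IntraJ combines several -W options is not documented here, so treating -Wall and -Wexcept=id as replacing the earlier selection is an assumption, and enabled_analyses is a hypothetical name:

```bash
# Illustrative only: mimics the documented meaning of -Wid, -Wall and
# -Wexcept=id for the analyses DAA and NPA. Not IntraJ's actual code.
enabled_analyses() (
    all="DAA NPA"
    enabled=""
    for arg in "$@"; do
        case "$arg" in
            -Wall)
                enabled="$all" ;;
            -Wexcept=*)
                # Assumption: replaces the selection with "all except id".
                enabled=""
                for a in $all; do
                    if [ "$a" != "${arg#-Wexcept=}" ]; then enabled="$enabled $a"; fi
                done ;;
            -W*)
                # Unknown ids are ignored in this sketch.
                for a in $all; do
                    if [ "$a" = "${arg#-W}" ]; then enabled="$enabled $a"; fi
                done ;;
        esac
    done
    # Print each enabled analysis once, one per line, in a fixed order.
    for a in $all; do
        case " $enabled " in *" $a "*) printf '%s\n' "$a" ;; esac
    done
)

enabled_analyses -Wexcept=DAA -succ   # prints NPA
```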
Suppose you would like to analyze a file Example.java located in your workspace:
public class Example {
    int example() {
        Integer m = null;
        m.toString();
        int x = 0;
        x = 1;
        return x;
    }
}
By running the following command:
java -jar intraj.jar PATH/TO/Example.java -Wall -succ -statistics
IntraJ will print the following information:
[NPA - PATH/TO/Example.java:4,4] The object 'm' may be null at this point.
[DAA - PATH/TO/Example.java:5,9] The value stored in 'x' is never read.
[INFO]: CFG rendering
[INFO]: DOT to PDF
[INFO]: PDF file generated correctly
[STATISTIC]: Elapsed time (CFG + Dataflow): 0.11s
[STATISTIC]: Total number
[STATISTIC]: Number roots:3
[STATISTIC]: Number CFGNodes:16
[STATISTIC]: Number Edges:13
[STATISTIC]: Largest CFG in terms of nodes:12
[STATISTIC]: Largest CFG in terms of edges:11
IntraJ also generates a PDF showing the successor relation for the methods in the analysed file.
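When IntraJ is run on a larger codebase, it can be handy to summarise the warnings per analysis. The sketch below assumes every warning line has the shape shown in the sample output, "[ID - file:line,column] message", and that [INFO] and [STATISTIC] lines never contain " - " directly after the tag. count_warnings is a hypothetical helper:

```bash
# Hypothetical helper: read IntraJ output on stdin and print one
# "<analysis> <count>" line per analysis that reported warnings.
count_warnings() {
    # Keep only the analysis id of lines like "[NPA - file:4,4] message",
    # then count how often each id occurs.
    sed -n -E 's/^\[([A-Z]+) - .*/\1/p' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   java -jar intraj.jar PATH/TO/Example.java -Wall | count_warnings
```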
- Follow the instructions in "Prerequisites" and "Build" above.
- Run the command ./gradlew build. This generates the following jar files:
  - intraj.jar
  - intraj_bl.jar
  - intraj_cfg.jar
  - intraj_cfgdda.jar
  - intraj_dda.jar
- Start the evaluation by executing "zsh eval.sh N_iter_outerloop N_iter_innerloop". For the paper we used N_iter_outerloop = N_iter_innerloop = 50.
All the results are stored in evaluation/YYYYMMDD_HHMMSS.
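Before starting a long evaluation, it may be worth checking that the build produced all five jar files listed above. The sketch below takes the directory to check as an argument, because the output location of the jars is not stated here. missing_jars is a hypothetical helper:

```bash
# Hypothetical helper: report which of the expected jar files are missing
# from directory $1. Prints nothing and returns 0 when all are present.
missing_jars() (
    status=0
    for jar in intraj.jar intraj_bl.jar intraj_cfg.jar intraj_cfgdda.jar intraj_dda.jar; do
        if [ ! -f "$1/$jar" ]; then
            echo "missing: $jar"
            status=1
        fi
    done
    exit "$status"
)

# Usage (replace . with the directory that contains the jars):
#   missing_jars . && zsh eval.sh 50 50
```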
- 🗄 IntraJ: main repository for IntraJ (control-flow analysis for Java)
- 🗄 IntraCFG: main repository for IntraCFG (language-independent framework for control-flow analysis)
- 🔗 JastAdd: meta-compilation system that supports Reference Attribute Grammars. We used a custom JastAdd version which better supports interfaces.
- 🔗 ExtendJ: extensible Java compiler built using JastAdd. We built IntraJ as a static analysis extension of ExtendJ. More can be found here.
- 🔗 SonarQube: platform developed by SonarSource for continuous inspection of code quality
- 🗄 JastAddJ-Intraflow: An earlier approach to implementing intra-procedural control flow, dataflow, and dead assignment analysis for Java, also using JastAdd.