BuiScout is an implementation of BCIA, presented in our paper, Understanding the Implications of Changes to Build Systems (ASE'24).
BCIA is an approach to detect the locations across a CMake
build system that are affected by a change in the build specifications.
BuiScout relies on three components that need to be set up and configured to work with each other. For ease of use, we have made a Docker image of the working environment (at the version used for our ASE'24 paper) available online. Please refer to our replication package for instructions on how to use this image.
If you need to clone this repository for extension and run it locally, you may choose to either run BuiScout in a docker container (recommended) or run it in your local environment.
Note: If you intend to extend BuiScout and run it in a container, you must rebuild the image after you make changes to the code base.
Note: If your changes lead to adding a new directory in the root directory of the project, you must add this directory to the list of the Allowed files and directories in the
.dockerignore
file. To do so, simply add a line containing !/[Name_of_the_directory]
to the list.
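As a minimal sketch of the note above, the following appends the allow-list entry for a new top-level directory. The directory name my_extension is hypothetical, and the snippet assumes that it is run from the root directory of the BuiScout project and that appending the entry at the end of the file is acceptable for your copy of .dockerignore.

```shell
# Hypothetical name of the new top-level directory; replace with your own.
NEW_DIR="my_extension"
# Assumes the current directory is the root directory of the BuiScout project.
DOCKERIGNORE=".dockerignore"

# Single quotes keep an interactive Bash from treating "!" as history expansion.
ENTRY='!/'"$NEW_DIR"

# Append the allow-list entry only if an identical line is not already present.
touch "$DOCKERIGNORE"
if ! grep -qxF -- "$ENTRY" "$DOCKERIGNORE"; then
  printf '%s\n' "$ENTRY" >> "$DOCKERIGNORE"
fi
```

Running the snippet a second time leaves the file unchanged, so it is safe to keep in a helper script. Remember to rebuild the image afterwards.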
Because this project relies on other repositories, you must take the following steps before you start setting up BuiScout:
-
Create a directory that will be the top-level directory of the setup. You can run the following command in your Bash command line at your preferred working directory to create this directory with the name
BCIA
:
$ mkdir BCIA/
-
Download the
gumtree-ASE2024.zip
file from this version of this repository and unzip it into the
BCIA
folder.
Note: Do NOT use the upstream repository. The changes made in this version, although minor, are essential for BuiScout's successful execution.
Note: Alternatively, you can clone the latest version of this repository in the
BCIA
directory to use the latest version. Note, however, that you will have to build this project from source code and unzip the built project in the same directory as the zip file.
-
Download the
tree-sitter-parser-ASE2024.zip
file from this version of this repository and unzip it into the
BCIA
folder.
Note: Do NOT use the upstream repository. This version uses an improved and more robust CMake Tree-Sitter parser.
Note: Alternatively, you can clone the latest version of this repository in the
BCIA
directory to use the latest version. Note, however, that you will have to clone it recursively and then set up the
tree-sitter-cmake
submodule accordingly.
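The preparation steps above can be sketched as a short script. It assumes that both archives have already been downloaded into the current working directory; it does not download anything itself, and it only reports an archive that is missing.

```shell
# Top-level directory name used throughout this README.
TOP_DIR="BCIA"
mkdir -p "$TOP_DIR"

# Assumption: both archives were downloaded into the current directory.
for archive in gumtree-ASE2024.zip tree-sitter-parser-ASE2024.zip; do
  if [ -f "$archive" ]; then
    # -q: quiet, -o: overwrite without prompting, -d: extraction directory.
    unzip -q -o "$archive" -d "$TOP_DIR"
  else
    echo "Missing $archive; download it before continuing." >&2
  fi
done

# List what ended up in the top-level directory.
ls "$TOP_DIR"
```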
-
Make sure you have Docker installed and running.
Note: We used Docker version 27.1.1, build 6312585 for our ASE'24 submission.
-
In your Bash command line, change your working directory to the root directory of the BuiScout project by running:
$ cd [Path_to_BuiScout_Location]/BuiScout
-
Make sure the files
BuiScout_build_image.sh
,
BuiScout_run_container.sh
,
process.sh
, and
convert.sh
are executable. To make these files executable, in your Bash command line and in the same working directory, run:
$ chmod +x ./*.sh
-
Run the
BuiScout_build_image.sh
script, located in the root directory of the project, by running the following command in your Bash command line and in the same working directory:
$ ./BuiScout_build_image.sh [tag]
Note: You can pass an optional argument to this script to select a specific tag for the Docker image to be built. If tag is passed, the image
buiscout:tag
is built. By default, it uses the
latest
tag.
Note: This script first builds the
buiscout
image and then runs it. Once the script has run successfully, you will be logged into the container.
Note: This script creates a directory named
_BuiScout_mountpoint
in the parent directory of the BuiScout project, if it does not already exist. You must place your config.json file in this directory and, if you choose to analyze a local subject repository, clone it into this directory.
Note: You can run this script from any working directory, as long as you point to the script file. For example, if your present working directory is the parent directory of the BuiScout project, running the following will yield the same outcome:
$ ./BuiScout/BuiScout_build_image.sh
-
For future executions, rebuilding the image is not necessary unless changes are made to BuiScout. To run the docker container without rebuilding the image, run the
BuiScout_run_container.sh
script, located in the root directory of the project, by running the following command in your Bash command line and in the same working directory:
$ ./BuiScout_run_container.sh [tag]
Note: You can pass an optional argument to this script to select a specific tag for the Docker image to be run. If tag is passed, the image
buiscout:tag
is run. By default, it uses the
latest
tag.
Note: Once the script has run successfully, you will be logged into the container.
Note: This script creates a directory named
_BuiScout_mountpoint
in the parent directory of the BuiScout project, if it does not already exist. You must place your config.json file in this directory and, if you choose to analyze a local subject repository, clone it into this directory.
Note: You can run this script from any working directory, as long as you point to the script file. For example, if your present working directory is the parent directory of the BuiScout project, running the following will yield the same outcome:
$ ./BuiScout/BuiScout_run_container.sh
Note: When using the scripts
BuiScout_build_image.sh
and
BuiScout_run_container.sh
, you can keep your
config.json
file in the root directory of the BuiScout project for ease of modification. The scripts first attempt to copy a
config.json
file from this directory into the
../_BuiScout_mountpoint/
directory, which will be mounted as a volume in the container. The scripts print messages to remind you of this.
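To make the copy behavior described above concrete, here is a rough sketch of the idea. It is not the code of the actual scripts: the function name sync_config and the messages are hypothetical, and only the directory name _BuiScout_mountpoint and the file name config.json come from this README.

```shell
# Hypothetical helper that mirrors the behavior described above.
# $1: path to the root directory of the BuiScout project.
sync_config() {
  project_root="$1"
  # The mountpoint lives in the parent directory of the BuiScout project.
  mountpoint="$(dirname "$project_root")/_BuiScout_mountpoint"
  mkdir -p "$mountpoint"

  if [ -f "$project_root/config.json" ]; then
    cp "$project_root/config.json" "$mountpoint/config.json"
    echo "Copied config.json to $mountpoint"
  else
    echo "No config.json in $project_root; place one in $mountpoint manually." >&2
  fi
}
```

Calling sync_config "$PWD" from the root directory of the BuiScout project would refresh the copy in the mountpoint, which shows why edits to the file in the project root only take effect after the scripts are run again.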
Note: Running BuiScout in your local environment is not recommended, because you must set up your local environment for all underlying projects.
Note: The commands on this page were tested on
Ubuntu 22.04
. You will need to follow different instructions on different operating systems/architectures.
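Before starting, you can check which of the required tools are already present, so that you know which of the installation steps still apply. The sketch below is only a convenience: the function name check_tools is hypothetical, and the executable names java, dot, and node are assumed to be the ones provided by the openjdk-11-jre, graphviz, and Node.js packages.

```shell
# Report which of the given commands are available on PATH.
# Returns a non-zero status if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Assumed executables: java (openjdk-11-jre), dot (graphviz), node and npm (Node.js).
check_tools python3 pip java dot node npm \
  || echo "Install the missing tools before running BuiScout." >&2
```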
-
Make sure you have
python3
and
pip
installed. It is recommended to create a Python environment for the project.
-
Run the following commands to install
openjdk-11-jre
and
graphviz
.
-
System updates:
$ apt update -y
$ apt upgrade -y
-
Installations:
$ apt install -y openjdk-11-jre graphviz graphviz-dev
-
-
Download the
gumtree-ASE2024.zip
file from this version of this repository and follow the instructions on the page to install GumTree.
Note: Do NOT use the upstream repository. The changes made in this version, although minor, are essential for BuiScout's successful execution.
Note: Alternatively, you can clone the latest version of this repository to use the latest version. Note, however, that you will have to build this project from source code, unzip the built project, and add the
/bin
directory in the unzipped directory to your PATH.
Note: The location of GumTree on your system does not matter for a local run.
-
Run the following commands to install
Node.js
and
NPM
.
- Preparing your system:
$ apt install -y curl
$ curl -fsSL https://deb.nodesource.com/setup_16.x
- Installation:
$ apt install -y nodejs
$ apt install -y npm
-
Download the
tree-sitter-parser-ASE2024.zip
file from this version of this repository and follow the instructions on the page to prepare the parser.
Note: Do NOT use the upstream repository. This version uses an improved and more robust CMake Tree-Sitter parser.
Note: Alternatively, you can clone the latest version of this repository in the
BCIA
directory to use the latest version. Note, however, that you will have to clone it recursively and then install this tool AND the
tree-sitter-cmake
submodule accordingly.
-
Download this version of BuiScout or clone its latest version.
-
In your Bash command line at the root directory of BuiScout, run the following command:
$ pip install -r requirements.txt
-
Make sure the files
process.sh
and
convert.sh
are executable. To make these files executable, in your Bash command line and in the same working directory, run:
$ chmod +x ./*.sh
-
[Optional] Run the following command to create a persistent alias for BuiScout:
$ echo 'alias scout="python3 <absolute path to the directory BuiScout is cloned into>/BuiScout/scout.py"' >> $HOME/.bashrc
$ source $HOME/.bashrc
Note: By running these commands, you will be able to run BuiScout from any working directory by simply running the command
scout
in your terminal. If you choose to skip this step, you can run BuiScout by running the following command in your terminal instead of
scout
:
$ python3 <absolute or relative path to the directory BuiScout is cloned into>/BuiScout/scout.py
Note: If you skipped Setup > Setup Local Environment to Run Locally > Setting Up Local Environment for BuiScout > step 4, you must replace the
scout
command with python3 <path to BuiScout/scout.py>.
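Bash does not expand aliases in non-interactive scripts by default, so an alias will not help if you want to call BuiScout from your own scripts. One possible alternative, sketched below, is a shell function. The variable BUISCOUT_DIR and its default value are assumptions for illustration; point it at the directory that contains scout.py on your system.

```shell
# Assumption: BUISCOUT_DIR points at your BuiScout checkout; the default is a placeholder.
BUISCOUT_DIR="${BUISCOUT_DIR:-$HOME/BuiScout}"

# A function works in scripts as well as in interactive shells
# and forwards all of its arguments to scout.py unchanged.
scout() {
  python3 "$BUISCOUT_DIR/scout.py" "$@"
}
```

With this definition sourced, scout how-to behaves the same as python3 followed by the path to scout.py and how-to.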
Once you complete the setup of your choice, you will be in a command line where you can access BuiScout using the scout
command.
Running scout how-to
will print out detailed instructions on initializing, testing, and running BuiScout.
The scout
command supports the following options:
-
scout --help
: Prints supported options.
-
scout how-to
: Prints instructions to successfully run/test BuiScout.
-
scout init
: Initializes the environment by creating the
<BuiScout_ROOT_DIR_PARENT>/_BuiScout_mountpoint/
directory.
-
scout test
: Runs BuiScout on the configurations in the
<BuiScout_ROOT_DIR>/test/
directory.
-
scout run
: Runs BuiScout on the configurations and repositories available to it in the
<BuiScout_ROOT_DIR_PARENT>/_BuiScout_mountpoint/
directory. This will be the volume you mount on the container if you are running BuiScout in a Docker container.
This section explains how to set up your config.json file.
Note:
<BuiScout_ROOT_DIR>/test/config.json
provides an example.
The JSON object in your config.json file must include the following configurations:
-
OPTIONS
(Dictionary(Boolean)
, Required)
To enable any option, set its value to
true
, and to disable it, set its value to
false
. The options are:
-
If enabled, assigns each commit to a process thread so that once the commit is processed, all resources are released before moving to the next commit.
Note: No parallelization is provided, therefore this can be used together with
COMMIT_SERIES
(not mutually exclusive).
Note: This is under test and is not compatible with
arm
architectures.
Hint: Disable this option if you encounter maximum recursion errors.
-
If enabled, the commits listed in
COMMITS
are treated as a series of consecutive build commits, and the source code and AST differencing information is cached from each previous commit for the next commit. In this case, only the modified files in the next commit are re-processed.
If disabled, computes AST differences for all files in each commit independently.
Note: This is mutually exclusive with
AST_DIFFS_REUSE
and overrides that option if both are selected.
-
If enabled, uses the pre-existing GumTree output. Assumes that there was a GumTree error if the data does not exist.
Note: This can only be used if BuiScout has previously analyzed a commit without
COMMIT_SERIES
being enabled and
STORE
is the same path for both runs.
Note: This is incompatible with
COMMIT_SERIES
and will be disabled if
COMMIT_SERIES
is enabled.
-
If enabled, clears the recorded progress through the list
COMMITS
and re-processes all listed commits.
-
If enabled, the verbose logs will be printed to the terminal.
-
If enabled, only the modified files are analyzed instead of the build system as a whole, and the detected impact is confined to the modified files.
-
If enabled, no differencing is applied and the build system in the updated version is analyzed (data-flow and IKG construction; the IKG produced in this mode is currently invalid).
-
If enabled, the body of each user-defined callable entity will be imported to each call site.
-
If enabled, looks for a folder named
PROJECT
in the
project_specific_support
folder, where an extension of BuiScout is implemented to support specific features for the subject project.
-
Opt10:
INITIALIZE_WITH_BUILD_COMMITS
Warning: Not recommended due to errors in underlying libraries.
If enabled, only commits with build specifications are selected from the repository. This might throw an error as we access parent commits and they might be excluded.
-
-
RELATIVE_RESULT_PATH
(String(Path)
, Required)
The path where the results must be stored. It must be inside the Docker mountpoint and relative to the mountpoint.
-
PROJECT
(String
, Required)
The name of the subject project.
-
REPOSITORY
(String(Path)
or
String(URL)
, Required)
The local path or URL pointing to the subject repository. If a local path is provided, it must be inside the Docker mountpoint and relative to the mountpoint.
Note: A path to a local clone of the repository is recommended for faster analysis.
-
BRANCH
(String
, Required)
The branch to consider the commits from. Use the keyword
ALL
to consider all branches.
Note: Use of the default branch is recommended.
-
COMMITS
(List[String(Commit Hash)]
or keyword
ALL
, Required)
List of commit hashes to analyze. Use the keyword 'ALL' to analyze all commits.
-
EXCLUDED_COMMITS
(List[String(Commit Hash)]
, Required)
List of commit hashes to exclude from the analysis. Pass an empty list to exclude no commits.
-
BUILD_TECHNOLOGY
(String
, Required)
The build technology supporting the build system, e.g.,
CMake
or
Maven
.
-
ENTRY_FILES
(List[String(Path)]
, Required)
The list of paths to the build system's entry points. Paths must be relative to the root directory of the subject project, and at least one path is required.
-
PROJECT_SPECIFIC_INCLUDES
(Dictionary(Dictionary(List[String]))
, Required)
Use the following template and fill the lists as needed. Simply pass an empty dictionary to skip this configuration.
{
  "include": {
    "starts_with": [
      "List of the beginnings of the paths (relative to the root directory of the subject project, no / at the beginning) to include in the analysis (added to the default patterns for naming conventions in a given build technology)."
    ],
    "ends_with": [
      "List of the endings of the paths (treated similarly to a file extension) to include in the analysis (added to the default patterns for naming conventions in a given build technology)."
    ]
  }
}
-
PROJECT_SPECIFIC_EXCLUDES
(Dictionary(Dictionary(Dictionary(List[String])))
, Required)
Use the following template and fill the lists as needed. Simply pass an empty dictionary to skip this configuration.
{
  "<INSERT_LANGUAGE_LOWER_CASE>": {
    "exclude": {
      "starts_with": [
        "List of the beginnings of the paths (relative to the root directory of the subject project, no / at the beginning) to exclude from the analysis (even if they qualify based on the default patterns for naming conventions in a given build technology)."
      ],
      "ends_with": [
        "List of the endings of the paths (treated similarly to a file extension) to exclude from the analysis (even if they qualify based on the default patterns for naming conventions in a given build technology)."
      ]
    }
  }
}
-
PROJECT_SPECIFIC_PATH_RESOLUTION
(List[Dictionary]
, Required)
Use the following template and populate the list as needed. Simply pass an empty list to skip this configuration.
Note: The specified resolutions override the default file resolution techniques.
[
  {
    "caller_file_path": "Path to the file in which the caller command resides. Use '*' to apply the same resolutions everywhere in the code.",
    "callee_file_path": "Path specified in the caller command to refer to the callee file.",
    "callee_resolved_path": [
      "A list of relative paths to the callee file (relative to the root directory of the subject project). The list can have >= 1 path. Use the keyword 'SKIP' (without a list) to skip."
    ]
  }
]
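To show how the keys described in this section fit together, the sketch below writes an illustrative configuration to a file named config.example.json, so that an existing config.json is not overwritten. Every value is a placeholder chosen for illustration: the project name, repository path, branch, result path, the cmake language key, and all path patterns are assumptions. Only the option names that appear by name in this README are listed under OPTIONS. Whether a partial OPTIONS dictionary is accepted is not stated here, so compare the result against <BuiScout_ROOT_DIR>/test/config.json before using it.

```shell
# Write an illustrative configuration; all values below are placeholders.
# The quoted 'EOF' keeps the shell from expanding ${...} inside the JSON.
cat > config.example.json <<'EOF'
{
  "OPTIONS": {
    "COMMIT_SERIES": true,
    "AST_DIFFS_REUSE": false,
    "INITIALIZE_WITH_BUILD_COMMITS": false
  },
  "RELATIVE_RESULT_PATH": "results",
  "PROJECT": "example_project",
  "REPOSITORY": "example_project",
  "BRANCH": "main",
  "COMMITS": "ALL",
  "EXCLUDED_COMMITS": [],
  "BUILD_TECHNOLOGY": "CMake",
  "ENTRY_FILES": ["CMakeLists.txt"],
  "PROJECT_SPECIFIC_INCLUDES": {
    "include": {
      "starts_with": ["cmake/"],
      "ends_with": [".cmake.in"]
    }
  },
  "PROJECT_SPECIFIC_EXCLUDES": {
    "cmake": {
      "exclude": {
        "starts_with": ["third_party/"],
        "ends_with": []
      }
    }
  },
  "PROJECT_SPECIFIC_PATH_RESOLUTION": [
    {
      "caller_file_path": "*",
      "callee_file_path": "${PROJECT_SOURCE_DIR}/cmake/helpers.cmake",
      "callee_resolved_path": ["cmake/helpers.cmake"]
    }
  ]
}
EOF
```

COMMIT_SERIES and AST_DIFFS_REUSE are not both enabled here, because the two options are described above as mutually exclusive.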