Project Website | Conference | Paper
The code is designed to facilitate the analysis of open-source projects. The instructions below help reproduce the results of RQ1-RQ3 from the paper. The code can also be adapted to analyze other open-source projects with minimal modifications to the configuration file.
To run the scripts locally, we suggest installing Python 3.8 and Go 1.21.5 (or newer) on the system.
git clone https://github.com/fish98/CRS_Bugs.git
cd CRS_Bugs
mkdir Source_Code Test Sample Collection # For storing experiment data
pip install -r requirements.txt
- Replace ALL occurrences of the placeholder {ROOT_DIR} in the source with the path to the source code of this project (e.g., root_dir = "/root/CRS_Bugs").
- The project config file can be changed to analyze a different open-source project. An example is shown below:
{
"access_token": "ghp_xxxxxxxxxxxxxxxxxx", // GitHub personal access token.
"project_author": "opencontainers", // GitHub project author/organization name.
"project_name": "runc", // GitHub project name.
"query_type": "commits", // Type of data to collect from GitHub (commits here; PRs and issues could also be supported).
"page_num": 1, // Page number at which data collection starts.
"per_page": 100, // Number of commits per page.
"google_doc_sheet": 7, // Number of the sheet in the Google Sheet to update.
"limit_time_d": "2021-06-01", // Only data later than this date is collected.
"limit_time_u": "2023-06-01" // Only data earlier than this date is collected.
}
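Before running the scripts, it may help to sanity-check the config. The sketch below is not part of the repository; `validate_config` and `load_config` are hypothetical helper names, and it assumes `config.json` is read with Python's standard `json` module. That module rejects `//` comments, so under this assumption the explanatory comments in the example above would need to be removed from the real file.

```python
import json
from datetime import date

# Keys taken from the example config above.
REQUIRED_KEYS = {
    "access_token", "project_author", "project_name", "query_type",
    "page_num", "per_page", "google_doc_sheet",
    "limit_time_d", "limit_time_u",
}


def validate_config(cfg: dict) -> dict:
    """Hypothetical helper: check that all keys exist and the date window is sane."""
    missing = REQUIRED_KEYS - set(cfg)
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    # Dates in the example use the YYYY-MM-DD form, which fromisoformat parses.
    lower = date.fromisoformat(cfg["limit_time_d"])
    upper = date.fromisoformat(cfg["limit_time_u"])
    if lower >= upper:
        raise ValueError("limit_time_d must be earlier than limit_time_u")
    return cfg


def load_config(path: str = "config.json") -> dict:
    """Read the config from disk (assumes plain JSON without comments)."""
    with open(path, encoding="utf-8") as fh:
        return validate_config(json.load(fh))
```

Calling `load_config()` from the project root would raise a `ValueError` early if a key is missing or the two dates are swapped, rather than failing partway through data collection.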
To collect all the commit information from GitHub:
- Change the project config in config.json, especially the GitHub access token setting "access_token": "ghp_xxxxxxxxxxxxxxxxxx". See GitHub's documentation on how to get a personal access token.
- Run the following command:
python src/collect_github_commits.py
The collection speed is subject to the GitHub API access limit. For convenience in reproducing our experiment data, you can directly download the collected commit data from our anonymous Google Drive and extract the content into the Collection directory for further analysis.
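To illustrate why collection is bound by the API limit, here is a minimal sketch of how a commit query and a rate-limit wait could be derived from the config. It is not the code in `src/collect_github_commits.py`; the function names are hypothetical, and mapping `limit_time_d` / `limit_time_u` to the `since` / `until` query parameters of GitHub's "list commits" endpoint is an assumption. The `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers are the ones GitHub's REST API reports.

```python
import time
from typing import Mapping, Optional, Tuple

API_ROOT = "https://api.github.com"


def build_commit_request(cfg: Mapping) -> Tuple[str, dict, dict]:
    """Hypothetical helper: URL, query params and headers for one page of commits."""
    url = f"{API_ROOT}/repos/{cfg['project_author']}/{cfg['project_name']}/commits"
    params = {
        # Assumption: the config's date bounds map to since/until (ISO 8601, UTC).
        "since": f"{cfg['limit_time_d']}T00:00:00Z",
        "until": f"{cfg['limit_time_u']}T00:00:00Z",
        "per_page": cfg["per_page"],
        "page": cfg["page_num"],
    }
    headers = {
        "Authorization": f"Bearer {cfg['access_token']}",
        "Accept": "application/vnd.github+json",
    }
    return url, params, headers


def seconds_until_reset(response_headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """How long to sleep before the next request, based on rate-limit headers.

    Header names are looked up with this exact casing; pass a
    case-insensitive mapping if the HTTP client does not normalise them.
    """
    remaining = int(response_headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(response_headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

The returned tuple can be passed to any HTTP client; after each response, sleeping for `seconds_until_reset(response.headers)` seconds would keep a long collection run within the limit.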
- Run the following command:
python src/filter_commits.py
- Replace the repoName (e.g., runc) in AST/extract_ast.go with the corresponding project name.
- Pull the project source code with Git into Source_Code/{project_name}
- Run the following command:
python -u AST/bug2test.py 1 > bug2test-{project_name}.log 2>&1
- Note that {project_name} is just a placeholder for the real project name.
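The placeholder substitution in the steps above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the clone URL pattern `https://github.com/{author}/{name}.git` is an assumption based on the `project_author` and `project_name` config fields.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List, Tuple


def clone_command(author: str, name: str, root: str = "Source_Code") -> List[str]:
    """Hypothetical helper: git command that pulls the project into Source_Code/{name}."""
    target = PurePosixPath(root) / name
    return ["git", "clone", f"https://github.com/{author}/{name}.git", str(target)]


def bug2test_invocation(name: str) -> Tuple[List[str], str]:
    """Command and log file name with {project_name} filled in."""
    return ["python", "-u", "AST/bug2test.py", "1"], f"bug2test-{name}.log"


def run_bug2test(name: str) -> None:
    """Run the analysis, sending stdout and stderr to the log (like `> log 2>&1`)."""
    cmd, log_path = bug2test_invocation(name)
    with open(log_path, "w", encoding="utf-8") as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
```

For runc, `clone_command("opencontainers", "runc")` yields a command that clones into `Source_Code/runc`, and `run_bug2test("runc")` writes its output to `bug2test-runc.log`.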
- Run the following command:
python read_sample_commits.py
- Change sheet_name in src/update_gdoc.py and prepare the Google API account
- Set the shared user permission of the Google API, and prepare the Google Sheet number with the official JSON file (e.g., test_google_api.json)
- Set google_doc_sheet in config.json to the Google Sheet id
- Run the following command:
python src/update_gdoc.py
If you find this repository helpful to your project, please cite it as below:
@inproceedings {yu2024bugs,
title = {Bugs in Pods: Understanding Bugs in Container Runtime Systems},
author = {Jiongchi Yu and Xiaofei Xie and Cen Zhang and Sen Chen and Yuekang Li and Wenbo Shen},
booktitle = {Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA)},
year = {2024}
}