This repository contains the labels and code for the paper Silent Vulnerable Dependency Alert Prediction with Vulnerability Key Aspect Explanation. Only part of the training data is included here because the full dataset is too large; please contact the authors to obtain the full data.
| File | Description |
|---|---|
| code/clasifier | Code for the classifier (BERT-CodeBERT model with a cross-model self-attention layer) |
| code/generator | Code for the generator (BART-CodeBERT model with a cross-model self-attention layer) |
| data/clasifier | Labels for training the classifier |
| data/generator | Labels for training the generator |
We combine the pre-trained BART model with the pre-trained CodeBERT model to test the capability of a model that captures both natural-language and code information. One drawback of pre-trained Transformer models is that their input dimensions and parameters are fixed, so we cannot simply concatenate the two encoders' outputs: the concatenated representation would be incompatible with the dimensions of the pre-trained decoder's input. To solve this problem, we use a cross-model self-attention layer that takes the query states (Q) from the BART encoder and the key (K) and value (V) states from the CodeBERT encoder, and computes the importance of each CodeBERT output token to each BART encoder output token. The input of the BART decoder is the output of a residual connection between the BART encoder output and the cross-model self-attention layer.
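The sketch below is a minimal illustration of such a cross-model attention layer in PyTorch; the hidden size, head count, use of `nn.MultiheadAttention`, and variable names are assumptions for illustration, and the actual implementation is in code/generator.

```python
import torch
import torch.nn as nn

class CrossModelAttention(nn.Module):
    """Illustrative cross-model attention: queries come from the BART encoder,
    keys and values come from the CodeBERT encoder."""

    def __init__(self, hidden_size=768, num_heads=12):
        super().__init__()
        # batch_first=True expects tensors of shape (batch, seq_len, hidden_size)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, bart_enc_out, codebert_out, codebert_padding_mask=None):
        # Q from the BART encoder, K/V from the CodeBERT encoder: the attention
        # weights score each CodeBERT token against each BART encoder token.
        attn_out, _ = self.attn(
            query=bart_enc_out,
            key=codebert_out,
            value=codebert_out,
            key_padding_mask=codebert_padding_mask,
        )
        # Residual connection with the BART encoder output; the result keeps the
        # BART hidden size, so it remains compatible with the pre-trained decoder.
        return bart_enc_out + attn_out
```

The fused output can then be fed to the BART decoder in place of the plain BART encoder output (for example, as the `encoder_hidden_states` argument when calling the decoder directly in Hugging Face Transformers).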
The code is based on the CodeBERT project. The file paths in all scripts need to be changed to match your local environment.
According to Section 4.1 of the paper, the classification results are shown below:
| | CodeBERT | BERT | BERT-CodeBERT | Transformer | LSTM |
|---|---|---|---|---|---|
| AUC | 0.91 | 0.89 | 0.57 | 0.80 | 0.71 |
CodeBERT achieves the best result (0.91 AUC), followed by BERT (0.89 AUC). Both perform much better than the non-pre-trained models (Transformer and LSTM).
| | Commit Message | Added & Deleted Code Segments | All Code Segments | Commit Message & Added & Deleted Code Segments | Commit Message & All Code Segments |
|---|---|---|---|---|---|
| AUC | 0.55 | 0.67 | 0.62 | 0.80 | 0.91 |
Inputs combining commit messages and code segments give better results than inputs with only a commit message or only code segments (0.80-0.91 AUC versus 0.55-0.67 AUC), indicating that both commit messages and code segments are useful for silent dependency alert detection.
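For reference, the AUC values reported above can be computed from the classifier's predicted probabilities, for example with scikit-learn; this is an illustrative snippet, and `y_true` and `y_score` are hypothetical placeholders rather than names from this repository.

```python
from sklearn.metrics import roc_auc_score

# y_true: 1 if the commit is a silent vulnerability fix, 0 otherwise
# y_score: the classifier's predicted probability for the positive class
y_true = [1, 0, 1, 1, 0]
y_score = [0.92, 0.30, 0.75, 0.64, 0.41]

print(f"AUC: {roc_auc_score(y_true, y_score):.2f}")
```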
According to Section 4.2 of the paper, the generation results are shown below:
The pre-trained models (BART, CodeBERT and BART-CodeBERT) achieve much better results than the non-pre-trained Transformer and LSTM models on all four key aspects, demonstrating the advantage of model pre-training.
Inputs with both commit messages and code content achieve much better results than those using only commit messages or only code content, indicating that both commit messages and code content are important for the generation task.