Use HTML directly for correctly displaying links
Maspital committed May 31, 2024
1 parent 2650aa8 commit 4b34662
Showing 1 changed file with 10 additions and 9 deletions.
19 changes: 10 additions & 9 deletions content/statistics.md
@@ -10,29 +10,30 @@ This figure presents the distribution of currently surveyed datasets over time,
Datasets containing data from more than one year are represented accordingly.
Additionally, data sources and label availability are shown:
Data sources are grouped into "Network Data" (e.g., packet captures or network flows), "Host Data" (e.g., system logs or syscalls), and "Both" (any combination of the previous two);
label availability for each dataset has been classified into either "Labeled" (explicit labels for at least a subset of data), "Ground Truth" (meta-information allowing for manual labeling), or "No Labels".
label availability for each dataset has been classified into either "Labeled" (explicit labels for at least a subset of data), "Ground Truth" (meta-information allowing for manual labeling), or "No Labels".

Even though this simplifies certain aspects, the figure provides a reasonably broad overview of the current landscape of IDS-related datasets.
As an example, while the [DARPA '98](/intrusion-detection-datasets/content/datasets/darpa98) and [CSE-CIC-IDS2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) datasets contain both network and host data and are visualized as such, only their network data is labeled and thus typically used by other publications.
Still, declaring these datasets to contain only network data would go beyond the purpose of a survey, as it is up to other researchers to decide whether the (in this case host) data can be utilized for their purposes.

![Figure 1: Distribution of datasets in time]({{ "/assets/data/plots/datasets_over_years.png" | relative_url }})
<p style="text-align: center;">
<img src="{{ "/assets/data/plots/datasets_over_years.png" | relative_url }}" alt="Figure 1: Distribution of datasets in time" />
</p>

<p style="text-align: center;font-size:0.8em;">
[Download PDF]({{ site.baseurl }}/assets/data/plots/datasets_over_years.pdf) / [Download PNG]({{ site.baseurl }}/assets/data/plots/datasets_over_years.png)
<a href="{{ site.baseurl }}/assets/data/plots/datasets_over_years.pdf" download>Download PDF</a>
</p>


### Dataset characteristics

This figure lists various characteristics of surveyed datasets, grouped into five categories: Source of network data, source of host data, how benign activity was generated, which operating systems were included, and how many systems in total were part of the scenario.
Except for the final category, these classifications are not mutually exclusive -- consequently, the sum of a specific category might not align with the total number of datasets surveyed.
Except for the final category, these classifications are not mutually exclusive -- consequently, the sum of a specific category might not align with the total number of datasets surveyed.
This discrepancy occurs because some datasets, for example, do not include network data, while others may include multiple operating systems, affecting the sums respectively.

![Figure 2: Characteristics of surveyed datasets, grouped into categories.]({{ "/assets/data/plots/datatypes_count.png" | relative_url }})

[Download PDF]({{ site.baseurl }}/assets/data/plots/datatypes_count.pdf)
<p style="text-align: center;">
<img src="{{ "/assets/data/plots/datatypes_count.png" | relative_url }}" alt="Figure 2: Characteristics of surveyed datasets, grouped into categories." />
</p>

<p style="text-align: center;font-size:0.8em;">
[Download PDF]({{ site.baseurl }}/assets/data/plots/datatypes_count.pdf) / [Download PNG]({{ site.baseurl }}/assets/data/plots/datatypes_count.png)
<a href="{{ site.baseurl }}/assets/data/plots/datatypes_count.pdf" download>Download PDF</a>
</p>
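
The link changes in this commit follow one pattern: inline Markdown links inside the `<p>` blocks are replaced by plain `<a>` elements, presumably because Markdown syntax inside a raw HTML block is not rendered by default. As a rough illustration only, the same rewrite could be scripted as below. The helper name `md_links_to_html` and the regular expression are hypothetical and are not part of this commit; the sketch does not escape HTML and only handles simple inline links without nested brackets or parentheses.

```python
import re

# Hypothetical pattern: matches a simple inline Markdown link, e.g.
# [Download PDF]({{ site.baseurl }}/assets/data/plots/datatypes_count.pdf)
# Group 1 is the link text, group 2 is the target (Liquid tags are kept as-is).
MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def md_links_to_html(line: str, download: bool = True) -> str:
    """Rewrite inline Markdown links in `line` as HTML anchor elements.

    When `download` is true, the `download` attribute is added, mirroring
    the anchors introduced in this commit.
    """
    attr = " download" if download else ""
    return MD_LINK.sub(
        lambda m: f'<a href="{m.group(2)}"{attr}>{m.group(1)}</a>',
        line,
    )


if __name__ == "__main__":
    old = "[Download PDF]({{ site.baseurl }}/assets/data/plots/datatypes_count.pdf)"
    print(md_links_to_html(old))
```

Applied to the old Figure 2 download line, this sketch yields the same anchor markup as the new line in the diff. Note that the commit also drops the "Download PNG" links, which a mechanical conversion like this would keep.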
