Minerva-OSN

Official repository of the paper "Elephant in the Room: Surveying (and Rescuing) Online Social Network Research".

In this repo, you can find:

The MINERVA-OSN dataset, compressed in a zip file (MINERVA-OSN.zip).
A copy of the survey we shared among OSN researchers (survey_example_anonym.pdf).
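
As a starting point for exploring the dataset, the following minimal sketch lists what a zip archive contains and reads one member as text, using only the Python standard library. The internal layout of MINERVA-OSN.zip is not described in this README, so the helper names `list_archive_members` and `read_member_text` are hypothetical and the code makes no assumption about file names or formats inside the archive.

```python
import zipfile


def list_archive_members(archive):
    """Return the names of all files stored in a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Skip directory entries so that only real files are reported.
        return [info.filename for info in zf.infolist() if not info.is_dir()]


def read_member_text(archive, member, encoding="utf-8"):
    """Read one archive member as text without extracting it to disk."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member).decode(encoding)
```

For example, `list_archive_members("MINERVA-OSN.zip")` would show which files to load next; the UTF-8 default in `read_member_text` is an assumption and may need adjusting.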

Minerva-OSN info

This repository contains the Minerva-OSN dataset, which provides metadata (such as Title, Year, Venue, Authors, and Abstract) for scientific articles related to Online Social Networks (OSNs). Starting from 1 million articles published after 2006 in Computer Science and Engineering venues (conferences and journals), we searched each Abstract for mentions of OSNs (e.g., Facebook, Twitter). The Minerva-OSN dataset is the result of this process: it contains every article with at least one such mention. Consequently, each paper in Minerva-OSN is linked to one or more Online Social Networks.
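
The selection step described above can be illustrated with a minimal sketch. It is not the code used to build the dataset: the two-name list, the whole-word case-insensitive matching, the function names, and the `OSNs` output field are all assumptions made for illustration. The repository's osn_list.csv presumably holds the actual list of OSN names.

```python
import re

# Hypothetical excerpt of OSN names, for illustration only.
OSN_NAMES = ["Facebook", "Twitter"]


def build_osn_pattern(names):
    """Compile a case-insensitive, whole-word pattern for the given names."""
    # Longer names first, so a longer name wins over a shorter prefix.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def find_osn_mentions(abstract, names=OSN_NAMES):
    """Return the sorted list of OSN names mentioned in an abstract."""
    canonical = {name.lower(): name for name in names}
    pattern = build_osn_pattern(names)
    found = {canonical[m.group(0).lower()] for m in pattern.finditer(abstract)}
    return sorted(found)


def select_osn_articles(articles, names=OSN_NAMES):
    """Keep articles whose Abstract mentions at least one OSN.

    Each kept article gets an extra "OSNs" field listing the linked networks.
    """
    selected = []
    for article in articles:
        mentions = find_osn_mentions(article.get("Abstract", ""), names)
        if mentions:
            selected.append({**article, "OSNs": mentions})
    return selected
```

Each article is represented here as a plain dictionary keyed by the metadata fields named above, which keeps the sketch independent of any particular file format.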

NOTE: we do not guarantee that our heuristic for gathering OSN scientific articles is perfect. Some relevant articles might be missing and, conversely, some included articles might fall outside the scope of this research. Two experts manually evaluated the heuristic by inspecting the retrieved articles. This process allowed us to spot and address pitfalls such as homonyms, where the name of an OSN has multiple meanings (e.g., Faces). We kept articles containing homonyms if they also contained keywords related to OSNs (e.g., social). The manual inspection phase therefore reduced the number of erroneous articles in our dataset, making our contribution more robust.
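
The homonym rule can be sketched as a small filter. The sets below contain only the two examples given in this note (Faces, social); the real lists, the function names, and the choice to always keep an article that mentions at least one unambiguous OSN name are assumptions of this sketch, not a description of the actual pipeline.

```python
import re

# Hypothetical values taken from the examples in the note above.
AMBIGUOUS_NAMES = {"Faces"}
CONTEXT_KEYWORDS = {"social"}


def has_context_keyword(abstract, keywords=CONTEXT_KEYWORDS):
    """Return True if the abstract contains any OSN-related keyword as a word."""
    words = set(re.findall(r"[a-z]+", abstract.lower()))
    return any(keyword in words for keyword in keywords)


def keep_article(abstract, matched_names,
                 ambiguous=AMBIGUOUS_NAMES, keywords=CONTEXT_KEYWORDS):
    """Decide whether an article stays in the dataset.

    `matched_names` is the list of OSN names found in the abstract.
    """
    # Assumption: one unambiguous OSN name is enough to keep the article.
    if any(name not in ambiguous for name in matched_names):
        return True
    # Only homonyms (or nothing) matched: require an OSN-related keyword.
    return bool(matched_names) and has_context_keyword(abstract, keywords)
```

With these values, an abstract about recognizing faces in images is dropped, while one that describes Faces as a social platform is kept.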

Reproducibility

You can find the code to reproduce the topic modeling in the notebook ICSWN_TopicModel.ipynb. We executed the code on the following machine:

OS: Ubuntu 22.04 LTS
Machine Model: DELL ALIENWARE AURORA R15
Processor: 13th Gen Intel(R) Core(TM) i7-13700F 2.10 GHz
GPU: NVIDIA GeForce RTX 4070 Ti

The code runs in approximately 10 minutes on this machine. Note that we use LLAMA-2, which requires requesting access through HuggingFace (https://huggingface.co/docs/transformers/main/model_doc/llama2).
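
Once access has been granted, the notebook needs a HuggingFace access token to download the gated weights. The sketch below shows one possible way to supply it, assuming the token is stored in the `HF_TOKEN` environment variable and that the `transformers` package is installed. The checkpoint name is an assumption, since this README does not state which LLAMA-2 variant the notebook uses, and the helper names are hypothetical.

```python
import os

# Assumption: one of the gated Llama 2 repositories on the HuggingFace Hub;
# replace it with the checkpoint actually referenced in the notebook.
MODEL_ID = "meta-llama/Llama-2-7b-chat-hf"


def resolve_hf_token(env=None):
    """Return the access token from HF_TOKEN, or None if it is not set."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def load_llama2_tokenizer(model_id=MODEL_ID, token=None):
    """Load the tokenizer of a gated checkpoint using an access token."""
    token = token or resolve_hf_token()
    if token is None:
        raise RuntimeError("Set HF_TOKEN to a HuggingFace access token first.")
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_id, token=token)
```

Loading only the tokenizer is a quick way to confirm that access works before starting the full notebook run.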

NOTE: all our analyses involve unsupervised machine learning techniques. We set the various hyperparameters by following state-of-the-art techniques and through manual analyses by two experts in the field.

CollectedVenues.csv
ICSWN_TopicModel.ipynb
MINERVA-OSN.zip
README.md
osn_list.csv
osn_occurrences.csv
survey_example_anonym.pdf
topic_model_viz.html
topics_over_timev2_full.pdf
