Repository for the experiments described in "SOBR: A Corpus for Stylometry, Obfuscation, and Bias on Reddit" to be presented at LREC-COLING 2024. Code is released under the MIT license. If you use anything related to the corpus, repository or paper, please cite the following work:
@inproceedings{emmery-etal-2024-sobr,
    title = "{SOBR}: A Corpus for Stylometry, Obfuscation, and Bias on Reddit",
    author = "Emmery, Chris and
      Miotto, Maril\`{u} and
      Kramp, Sergey and
      Kleinberg, Bennett",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources, and Evaluation",
    month = may,
    year = "2024",
    address = "Turin, Italy",
    publisher = "European Language Resources Association"
}
To be updated.
For the time being, please contact @cmry if you are interested in the data and have read the Ethical Considerations section of the paper.