This codebase contains the Python scripts for HYPHEN, the model from the ACL 2022 paper "HYPHEN: Hyperbolic Hawkes Attention For Text Streams". This work was done with the FinTech lab at Georgia Tech. The FinTech lab aims to be a hub for finance education, research, and industry in the Southeast. The lab serves as a platform that brings together faculty and students across Georgia Tech with the financial services industry and FinTech entrepreneurs.
Dependencies are listed in requirements.txt. Run the shell script below to set up your environment. Please ensure that the data is in the same directory as the train_hyphen.py file.
./set_env.sh
We provide YAML files for the datasets we experimented with; refer to them when running the different scripts. Please modify the YAML files as needed for the train/test scripts.
python train_hyphen_suicide.py
python test_hyphen_suicide.py
Please consider citing our work if you use our codebase:
@inproceedings{Agarwal-etal-2022,
title = "HYPHEN: Hyperbolic Hawkes Attention For Text Streams",
author = "Agarwal, Shivam and
Sawhney, Ramit and
Ahuja, Sanchit and
Soun, Ritesh and
Chava, Sudheer",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
month = may,
year = "2022",
address = "Dublin",
publisher = "Association for Computational Linguistics"}
We use the dataset released by [1], which consists of parliamentary debates, the date of each debate, the vote of the MP, etc. We use a static dump of BERT embeddings for the parliamentary debates, stored in .npy format, along with the europarl dataset.
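As a rough illustration of working with a static embedding dump in .npy format, the sketch below writes a small dummy array to disk and reads it back with NumPy. The file name `debate_embeddings.npy`, the array shape, and the 768-dimensional vector size are assumptions made for this example only; they are not taken from the repository, so adjust them to match the actual dump.

```python
import os
import tempfile

import numpy as np

# Assumed layout for illustration: 4 debates, one 768-dimensional vector each.
dummy_embeddings = np.random.rand(4, 768).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    # "debate_embeddings.npy" is a placeholder name, not a file shipped with the repo.
    path = os.path.join(tmp_dir, "debate_embeddings.npy")
    np.save(path, dummy_embeddings)

    # mmap_mode="r" maps the file read-only so slices are read on demand
    # instead of loading the whole dump into memory at once.
    embeddings = np.load(path, mmap_mode="r")
    loaded_shape = embeddings.shape
    first_batch = np.asarray(embeddings[:2])  # view of the first two rows as an ndarray

    # Release the memory map before the temporary directory is cleaned up.
    del embeddings

print(loaded_shape, first_batch.shape)
```

Memory mapping is optional; for a dump that fits comfortably in memory, calling `np.load(path)` without `mmap_mode` is simpler.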
For Suicide Ideation, we followed exactly the same data preprocessing and preparation as done here.
We used two datasets for stock price prediction: the Chinese Stock Exchange dataset and the S&P dataset. We follow exactly the same preprocessing as in [7].
- Abercrombie, Gavin (2020), “ParlVote: Corpora for Sentiment Analysis of Political Debates”, Mendeley Data, V2, doi: 10.17632/czjfwgs9tm.2
- Ramit Sawhney, Harshit Joshi, Saumya Gandhi, and Rajiv Ratn Shah. 2021. Towards Ordinal Suicide Ideation Detection on Social Media. Proceedings of the 14th ACM International Conference on Web Search and Data Mining. Association for Computing Machinery, New York, NY, USA, 22–30. DOI:https://doi.org/10.1145/3437963.3441805
- Kochurov, Max, Rasul Karimov, and Serge Kozlukov. "Geoopt: Riemannian optimization in pytorch." arXiv preprint arXiv:2005.02819 (2020).
- Hyrnn code: https://github.com/ferrine/hyrnn
- Manifolds and RAdam optimizer: https://github.com/HazyResearch/hgcn
- Ramit Sawhney, Shivam Agarwal, Megh Thakkar, Arnav Wadhwa, and Rajiv Ratn Shah. 2021. Hyperbolic Online Time Stream Modeling. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, New York, NY, USA, 1682–1686. DOI:https://doi.org/10.1145/3404835.3463119
- https://github.com/midas-research/fast-eacl