mohit2b/MO-Sarcation

Your tone speaks louder than your face! Modality Order Infused Multi-modal Sarcasm Detection: MO-Sarcation

This repository contains the code for our ACM Multimedia 2023 paper "Your tone speaks louder than your face! Modality Order Infused Multi-modal Sarcasm Detection", published in the Proceedings of the 31st ACM International Conference on Multimedia (MM ’23), October 29-November 3, 2023, Ottawa, ON, Canada.

Figurative language is an essential component of human communication, and detecting sarcasm in text has become a challenging yet highly popular task in natural language processing. As humans, we rely on a combination of visual and auditory cues, such as facial expressions and tone of voice, to comprehend a message. Our brains are implicitly trained to integrate information from multiple senses to form a complete understanding of the message being conveyed, a process known as multi-sensory integration. Combining modalities not only provides additional information but also amplifies the information each modality conveys in relation to the others. The order in which modalities are infused therefore also plays a significant role in multimodal processing. In this paper, we investigate the impact of different modality infusion orders on identifying sarcasm in dialogues. We propose MO-Sarcation, a transformer network with an integrated modality order-driven module that fuses modalities in an ordered manner. Our model outperforms several state-of-the-art models by 1-3% across various metrics, demonstrating the crucial role of modality order in sarcasm detection. The improvements and our detailed analysis show that, to identify sarcasm efficiently, audio tone should be infused with the textual content first, followed by visual information.
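To make the idea of an infusion order concrete, here is a minimal, illustrative sketch. It is not the MO-Sarcation module from the paper or from this repository: the function names (`cross_attend`, `ordered_fusion`), the toy feature vectors, and the use of plain scaled dot-product cross-attention with a residual connection are all assumptions chosen for illustration. It only shows that fusing audio into text before visual features generally yields a different representation than the reverse order.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, context):
    """Hypothetical fusion step: each query vector attends over the context
    vectors (scaled dot-product) and the attended summary is added back to
    the query as a residual."""
    dim = len(queries[0])
    fused = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in context]
        weights = softmax(scores)
        summary = [sum(w * c[i] for w, c in zip(weights, context)) for i in range(dim)]
        fused.append([qi + si for qi, si in zip(q, summary)])
    return fused

def ordered_fusion(text, modalities, order):
    """Infuse the non-text modalities into the text representation one at a
    time, in the given order. The output of each step is the input of the next."""
    hidden = text
    for name in order:
        hidden = cross_attend(hidden, modalities[name])
    return hidden

# Toy 2-dimensional features (made up for illustration only).
text = [[1.0, 0.0], [0.0, 1.0]]
modalities = {
    "audio": [[0.5, 0.5], [1.0, -1.0]],
    "visual": [[-1.0, 0.2], [0.3, 0.9]],
}

audio_then_visual = ordered_fusion(text, modalities, order=("audio", "visual"))
visual_then_audio = ordered_fusion(text, modalities, order=("visual", "audio"))
print(audio_then_visual)
print(visual_then_audio)
```

Because each attention step is nonlinear and conditions on the output of the previous step, the two orders produce different fused representations, which is why the infusion order can be treated as a design choice in its own right.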

If you find this work useful, please cite it as:

@inproceedings{tomar2023your,
  title={Your tone speaks louder than your face! Modality Order Infused Multi-modal Sarcasm Detection},
  author={Tomar, Mohit and Tiwari, Abhisek and Saha, Tulika and Saha, Sriparna},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={3926--3933},
  year={2023}
}

This code is adapted from the following GitHub repository: https://github.com/LCS2-IIITD/MAF

Contact

For any queries, feel free to contact Mohit Tomar ([email protected])
