The official data repository for the COLING 2022 paper "Are Visual-Linguistic Models Commonsense Knowledge Bases?".
We release the two datasets used in the commonsense knowledge probing experiments:
(1) CWWV_IMG and (2) CWWV_CLIP.
CWWV_IMG (Download):
is automatically generated by following the procedure proposed by Ma et al. (2019).
Additionally, we rely on an efficient image retrieval process to compensate for missing image sources (please refer to our paper for details).
| Dimension | Count |
| --- | --- |
| part-whole | 1,165 |
| taxonomic | 1,323 |
| distinctness | 828 |
| similarity | 644 |
| quality | 1,840 |
| utility | 2,090 |
| creation | 100 |
| temporal | 1,889 |
| spatial | 1,599 |
| desire | 1,781 |
| total | 13,259 |
CWWV_CLIP (Download):
is a subset of CWWV_IMG that contains higher-quality image-word pairs according to CLIPScore.
| Dimension | Count |
| --- | --- |
| part-whole | 170 |
| taxonomic | 85 |
| distinctness | 86 |
| similarity | 188 |
| quality | 143 |
| utility | 120 |
| creation | 8 |
| temporal | 154 |
| spatial | 144 |
| desire | 91 |
| total | 1,189 |
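As a minimal sketch of CLIPScore-based filtering: CLIPScore rescales the cosine similarity between a CLIP image embedding and a CLIP text embedding as `w * max(cos, 0)` with `w = 2.5` (Hessel et al., 2021). The embeddings, pair names, and threshold below are illustrative placeholders, not the values used to build CWWV_CLIP; real features would come from a CLIP encoder.

```python
import math

def clip_score(image_emb, text_emb):
    """CLIPScore-style similarity: w * max(cos(image, text), 0), w = 2.5."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = (math.sqrt(sum(a * a for a in image_emb))
            * math.sqrt(sum(b * b for b in text_emb)))
    return 2.5 * max(dot / norm, 0.0)

# Hypothetical precomputed embeddings for (image, word) pairs.
pairs = {
    "dog/leash":   ([0.9, 0.1, 0.4], [0.8, 0.2, 0.5]),
    "dog/turbine": ([0.9, 0.1, 0.4], [-0.7, 0.6, 0.1]),
}

THRESHOLD = 1.5  # illustrative cut-off; the actual value is not given here
kept = {k for k, (img, txt) in pairs.items()
        if clip_score(img, txt) >= THRESHOLD}
print(kept)  # only the high-similarity pair survives
```

Only pairs whose image and word embeddings point in a similar direction pass the cut-off, which is the intuition behind selecting the higher-quality subset.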
If you find this dataset useful for your research, please cite:
@inproceedings{yang-2022,
  title     = {Are Visual-Linguistic Models Commonsense Knowledge Bases?},
  author    = {Hsiu-Yu Yang and Carina Silberer},
  booktitle = {Proceedings of the 29th International Conference on Computational
               Linguistics, {COLING} 2022, Gyeongju, Republic of Korea,
               October 12-17, 2022},
  year      = {2022}
}