title | venue | names | tags | link | author | categories | layout | |
---|---|---|---|---|---|---|---|---|
Word Sketches for Turkish |
LREC |
Bharat Ram Ambati, Siva Reddy, A. Kilgarriff |
|
Siva Reddy |
Publications |
paper |
{{ page.names }}
{{ page.venue }}
{% include display-publication-links.html pub=page %}
Word sketches are one-page, automatic, corpus-based summaries of a word's grammatical and collocational behaviour. In this paper we present word sketches for Turkish. Until now, word sketches have been generated using purpose-built finite-state grammars. Here, we use an existing dependency parser. We describe the process of collecting a 42-million-word corpus, parsing it, and generating word sketches from it. We evaluate the word sketches in comparison with word sketches from a language-independent sketch grammar on an external evaluation task called topic coherence, using Turkish WordNet to derive an evaluation set of coherent topics.
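
As a rough illustration of the step from parsed text to a word sketch, the following minimal sketch groups a headword's collocates by dependency relation and ranks them by association strength. It is not the paper's implementation: the `(head, relation, dependent)` triple format, the function name `build_word_sketch`, the relation labels, and the toy data are all hypothetical, and the logDice score is an assumption here (the abstract does not say which salience measure is used).

```python
import math
from collections import Counter, defaultdict


def build_word_sketch(triples, headword):
    """Group a headword's collocates by relation, ranked by logDice.

    `triples` is an iterable of (head_lemma, relation, dependent_lemma)
    tuples, assumed to come from a dependency parser's output.
    """
    triples = list(triples)
    pair_freq = Counter(triples)                          # f(head, rel, dep)
    head_freq = Counter((h, r) for h, r, _ in triples)    # f(head, rel, *)
    dep_freq = Counter((r, d) for _, r, d in triples)     # f(*, rel, dep)

    sketch = defaultdict(list)
    for (head, rel, dep), f_xy in pair_freq.items():
        if head != headword:
            continue
        # logDice = 14 + log2(2 * f_xy / (f_x + f_y)); assumed scoring choice
        denom = head_freq[(head, rel)] + dep_freq[(rel, dep)]
        score = 14 + math.log2(2 * f_xy / denom)
        sketch[rel].append((dep, f_xy, round(score, 2)))

    # Strongest collocates first within each grammatical relation
    for rel in sketch:
        sketch[rel].sort(key=lambda item: item[2], reverse=True)
    return dict(sketch)


# Toy, hand-made triples for illustration only (not data from the paper)
triples = [
    ("iç", "OBJECT", "çay"),
    ("iç", "OBJECT", "çay"),
    ("iç", "OBJECT", "su"),
    ("al", "OBJECT", "su"),
    ("iç", "SUBJECT", "çocuk"),
]

sketch = build_word_sketch(triples, "iç")
for relation, collocates in sketch.items():
    print(relation, collocates)
```

Each relation in the returned dictionary corresponds to one column of a word sketch; a real pipeline would also apply frequency thresholds and map the parser's relation labels to sketch relations.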