2012-01-01-L12-1332.md

| title | venue | names | tags | link | author | categories | layout |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Word Sketches for Turkish | LREC | Bharat Ram Ambati, Siva Reddy, A. Kilgarriff | LREC | | Siva Reddy | Publications | paper |

{{ page.names }}

{{ page.venue }}

{% include display-publication-links.html pub=page %}

Abstract

Word sketches are one-page, automatic, corpus-based summaries of a word's grammatical and collocational behaviour. In this paper we present word sketches for Turkish. Until now, word sketches have been generated using purpose-built finite-state grammars. Here, we use an existing dependency parser. We describe the process of collecting a 42-million-word corpus, parsing it, and generating word sketches from it. We evaluate these word sketches against word sketches from a language-independent sketch grammar on an external evaluation task called topic coherence, using Turkish WordNet to derive an evaluation set of coherent topics.