Chris Dyer - "Should linguistic models reflect structure?" (2016) - talk notes
-------------------------------------------------------------------------------
- something to watch out for: RNNs overfit easily
- CharLSTM > Lookup/word2vec for dependency parsing
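To make that comparison concrete, here is a minimal sketch (assuming PyTorch; names and hyperparameters are illustrative, not from the talk) of the two word-representation strategies: a lookup table keyed by word id versus a char-LSTM that composes each word's characters into a vector.

import torch
import torch.nn as nn

class LookupWordEmbedder(nn.Module):
    """word2vec-style lookup: one stored vector per known word id."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.table = nn.Embedding(vocab_size, dim)

    def forward(self, word_ids):                  # (batch,)
        return self.table(word_ids)               # (batch, dim)

class CharLSTMWordEmbedder(nn.Module):
    """Compose a word vector from its character sequence with a BiLSTM."""
    def __init__(self, n_chars, char_dim, word_dim):
        super().__init__()
        self.chars = nn.Embedding(n_chars, char_dim)
        self.lstm = nn.LSTM(char_dim, word_dim // 2,
                            bidirectional=True, batch_first=True)

    def forward(self, char_ids):                  # (batch, max_word_len)
        _, (h_n, _) = self.lstm(self.chars(char_ids))
        # h_n: (2, batch, word_dim // 2); concatenate the two directions
        return torch.cat([h_n[0], h_n[1]], dim=-1)  # (batch, word_dim)

The lookup table only has vectors for words seen in training; the char-LSTM can build a vector for any character string, which is what the later points about unseen words rely on.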
Morphological Typology
https://en.wikipedia.org/wiki/Morphological_typology#Synthetic_languages
------------------
1. Agglutinative (charRNN seems to win by a lot) - very rich morphology
Turkish, Finnish, Basque, etc.
2. Fusional https://en.wikipedia.org/wiki/Fusional_language (models are equivalent)
English
3. Analytic https://en.wikipedia.org/wiki/Analytic_language (models are equivalent)
Vectors from charRNN
------------------
- can query with words that never appeared in training and get reasonable-looking results
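Continuing the hypothetical CharLSTMWordEmbedder sketched earlier (again only an illustration), this is all that querying with unseen words requires: embed the new string from its characters and rank known words by cosine similarity.

import torch
import torch.nn.functional as F

def embed_word(model, word, char2id):
    ids = torch.tensor([[char2id[c] for c in word]])
    return model(ids)[0]                          # (word_dim,)

def nearest_known_words(model, query, known_words, char2id, k=5):
    q = embed_word(model, query, char2id)
    scored = [(w, F.cosine_similarity(q, embed_word(model, w, char2id), dim=0).item())
              for w in known_words]
    return sorted(scored, key=lambda s: -s[1])[:k]

# e.g. nearest_known_words(embedder, "unfriendliest", vocabulary, char2id)
# can return plausible neighbours even if "unfriendliest" was never in the data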
Language/character models
-------------------
He highlighted work from:
1. Google Brain
2. Harvard/NYU - Kim, Rush, Sontag
3. NYU/FB - document representation "from scratch"
4. CMU - Dyer/Ruslan/Cohen - Twitter, morphologically rich languages
Finite state transducers - structure-aware words (plural, etc.)
---------------------
- root + affixes
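A toy illustration of that root + affixes view (hand-written rules, not a real FST toolkit such as OpenFst; the rules are made-up examples):

SUFFIX_RULES = [              # (surface suffix, morphological tag)
    ("ed", "+PAST"),
    ("ing", "+PROG"),
    ("s", "+PL"),
]

def analyze(word):
    """Rewrite a surface form into root + affix tag, transducer-style."""
    for suffix, tag in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[:-len(suffix)] + tag
    return word + "+BASE"

# analyze("dogs") -> "dog+PL"; analyze("walked") -> "walk+PAST"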
Open Vocabulary Models
---------------------
- still an RNN, but without word2vec-style lookup embeddings; the word's morphological structure / character sequence is the input
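A rough sketch of that architecture, reusing the hypothetical CharLSTMWordEmbedder from the earlier sketch (PyTorch assumed; the closed output vocabulary is a simplification):

import torch.nn as nn

class OpenVocabRNNLM(nn.Module):
    """Word-level RNN LM whose input embeddings are composed from characters
    instead of being looked up in a word table."""
    def __init__(self, n_chars, char_dim, word_dim, hidden_dim, out_vocab):
        super().__init__()
        self.embedder = CharLSTMWordEmbedder(n_chars, char_dim, word_dim)
        self.rnn = nn.LSTM(word_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, out_vocab)

    def forward(self, char_ids):                  # (batch, seq_len, max_word_len)
        b, t, l = char_ids.shape
        words = self.embedder(char_ids.view(b * t, l)).view(b, t, -1)
        h, _ = self.rnn(words)                    # (batch, seq_len, hidden_dim)
        return self.out(h)                        # logits for the next word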
Summary of charRNN vs lookup word2vec
-------------------
- for morphologically "tame" languages like English and Chinese, perplexity metrics are equivalent
- charRNN would be more useful in settings with a lot more word-form variability
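For reference, the perplexity being compared is just exp of the average per-token negative log-likelihood; a minimal computation sketch:

import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities the LM assigned to each test token."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# run both models over the same test set; "the metrics are equivalent" above
# means the two models come out with roughly the same perplexity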
#######################################################################################################################################
Modeling Syntax
-----------------
RNN Grammar
------------------
- extension of Socher's work
- words and constituents embedded in the same space
- backprop "through structure"
- captures the linguistic notion of "headedness"
- supports any number of children
- Intuition:
1. generate symbols using an RNN
2. add control symbols that periodically rewrite the history
- periodically "compress" symbols into a single "constituent"
- augment the RNN with an operation that compresses recent history into a single vector ("reduce", as in shift-reduce parsing)
- the RNN generates symbols based on the history of compressed constituents and uncompressed terminals ("shift"/generate)
- the RNN must also predict the "control symbols" that decide how big each constituent is (toy sketch below)
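A toy sketch of that control flow (the action names and string composition are illustrative, not Dyer et al.'s exact formulation; the real model composes the children into one vector with a neural network where this only builds a bracketed string):

def rnng_generate(actions):
    stack = []                                   # words, finished constituents, open NTs
    for act in actions:
        if act[0] == "NT":                       # open a constituent, e.g. ("NT", "NP")
            stack.append(("OPEN", act[1]))
        elif act[0] == "GEN":                    # generate a terminal, e.g. ("GEN", "the")
            stack.append(act[1])
        elif act[0] == "REDUCE":                 # compress back to the nearest open NT
            children = []
            while not (isinstance(stack[-1], tuple) and stack[-1][0] == "OPEN"):
                children.append(stack.pop())
            _, label = stack.pop()
            # the real model would compose the children into one vector here
            stack.append("(" + label + " " + " ".join(reversed(children)) + ")")
    return stack

# rnng_generate([("NT","S"), ("NT","NP"), ("GEN","the"), ("GEN","hungry"),
#                ("GEN","cat"), ("REDUCE",), ("NT","VP"), ("GEN","meows"),
#                ("REDUCE",), ("REDUCE",)])
# -> ["(S (NP the hungry cat) (VP meows))"]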