in my research, i work a lot with word alignments – links between the index of a word in one sentence, and the index of the equivalent word in a translation of the same sentence.
勇気 は どこ に ? きみ の 胸 に !
2 1 0 -1 3 5 5 6 4 7
where is courage ? in your heart !
these alignments are meant to link two translations of the same sentence together, but could just as well be pointed to a different sentence entirely.
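as a minimal sketch of how the example above can be read: the snippet below assumes each number is a 0-based index into the english sentence, one number per japanese token, and that -1 marks a token with no counterpart. those conventions are inferred from the example rather than stated, and `aligned_pairs` is a hypothetical helper name.

```python
# Tokens and alignment copied from the example; the reading of the numbers
# (0-based target indices, -1 = unaligned) is an assumption.
source = "勇気 は どこ に ? きみ の 胸 に !".split()
target = "where is courage ? in your heart !".split()
alignment = [int(i) for i in "2 1 0 -1 3 5 5 6 4 7".split()]


def aligned_pairs(source, target, alignment):
    """Pair each source word with its aligned target word (None if unaligned)."""
    if len(source) != len(alignment):
        raise ValueError("expected one alignment index per source word")
    return [
        (word, target[j] if j >= 0 else None)
        for word, j in zip(source, alignment)
    ]


pairs = aligned_pairs(source, target, alignment)
for word, match in pairs:
    print(word, "->", match)
```

under this reading, 勇気 links to "courage", both きみ and の link to "your", and the first に has no link.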
for nanogenmo 2024, i want to write a little program that traverses a corpus of aligned sentences, starting at the first word of the first sentence and following its alignment to the next sentence in the corpus, weaving fragments of each sentence into a connected string of words.
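one possible reading of that traversal, sketched under several assumptions: each sentence contributes a single word (the idea above speaks of fragments, which may be longer), an unaligned word (-1) keeps the current position, and a link that overshoots a shorter next sentence wraps around. the function name `weave` and the toy corpus are made up for illustration and are not real aligned data.

```python
def weave(corpus, start=0):
    """Follow alignment links from sentence to sentence, collecting one word each.

    corpus is a list of (words, alignment) pairs, where alignment[i] is the
    index that word i links to. The link is deliberately applied to the NEXT
    sentence in the corpus instead of to the sentence's own translation.
    """
    woven = []
    position = start
    for words, alignment in corpus:
        if not words:
            continue  # nothing to pick from an empty sentence
        position %= len(words)  # assumption: wrap around in shorter sentences
        woven.append(words[position])
        link = alignment[position]
        if link >= 0:  # assumption: -1 (unaligned) keeps the same position
            position = link
    return woven


# Toy corpus with invented alignments, only to show the mechanics.
toy_corpus = [
    (["where", "is", "courage", "?"], [2, 1, 0, 3]),
    (["in", "your", "heart", "!"], [1, 0, 3, 2]),
    (["the", "road", "goes", "on"], [0, 1, 2, 3]),
]

print(" ".join(weave(toy_corpus)))  # prints: where heart on
```

the wrap-around and the handling of -1 are arbitrary choices here; skipping the sentence or restarting at index 0 would work just as well.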
since all aligned corpora are (at least) bilingual, two weaves can be made at the same time – woven in different languages, but formed by fragments from the same sentences in the same order – starting from the same point, diverging quicker the more different the two languages are, sometimes briefly woven back together by chance.
the resulting weaves will end up being mostly ungrammatical and incomprehensible on their own, but maybe more meaningful together.