This repository contains structured datasets of the most popular given names and surnames in several languages. Ukrainian and Russian are available now; Belarusian and others are planned. Each dataset includes Polish transliterations, Polish and English equivalents, informal diminutives, and other variants. The primary goal is to support accurate transliteration, typo detection, and cross-linguistic studies.
The repository currently includes:
- Ukrainian Names and Surnames
  - Female Names
  - Male Names
  - Surnames
- Russian Names and Surnames
  - Female Names
  - Male Names
  - Surnames

Planned additions:
- Additional languages such as Belarusian, Vietnamese, and more.
- Frequency of occurrence of the given names and surnames.

Each dataset contains approximately 50 female names, 50 male names, and 50 surnames. The data is organized into tables with the following columns:
- Original Name: The name or surname in its original script.
- Polish Transliteration: Transliteration based on Polish orthography.
- Alt Polish Trans: Alternative Polish transliterations.
- Polish Equivalent: Direct equivalent in Polish.
- English Equivalent: Standard English equivalent.
- Informal Diminutive: Common informal or diminutive forms.
- Other Variants: Additional regional or historical forms.
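
As an illustration of this column layout, one row could be modeled in Python roughly as follows. This is a minimal sketch, not code shipped with the repository: the class name `NameRecord`, its field names, and the helper `record_from_cells` are hypothetical, and the padding of missing trailing cells is an assumption about how incomplete rows might be handled.

```python
from dataclasses import dataclass, fields


@dataclass
class NameRecord:
    """One table row; field names are illustrative, mirroring the columns above."""
    original: str
    polish_transliteration: str
    alt_polish_trans: str = ""
    polish_equivalent: str = ""
    english_equivalent: str = ""
    informal_diminutive: str = ""
    other_variants: str = ""


def record_from_cells(cells):
    """Build a NameRecord from a list of cell values.

    Empty or None cells become "", extra cells are dropped, and missing
    trailing cells are padded with "" (an assumption, not a documented rule).
    """
    width = len(fields(NameRecord))
    cleaned = [(cell or "").strip() for cell in cells][:width]
    cleaned += [""] * (width - len(cleaned))
    return NameRecord(*cleaned)


# A complete row and a row with a missing trailing cell.
olena = record_from_cells(
    ["Олена", "Olena", "Olona", "Helena", "Helen", "Olenka", "Alyona"]
)
melnyk = record_from_cells(
    ["Мельник", "Melnyk", "Melnik", "Mielnik", "Melnyk", "Melnykov"]
)
print(olena.polish_equivalent, repr(melnyk.other_variants))
```

The cell values could come from any spreadsheet reader that yields one list per row; the sketch deliberately leaves the XLSX loading step out.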
Each file is formatted as an XLSX table. Example structures for given names and for surnames are shown below:
| Original Name | Polish Transliteration | Alt Polish Trans | Polish Equivalent | English Equivalent | Informal Diminutive | Other Variants |
|---|---|---|---|---|---|---|
| Олена | Olena | Olona | Helena | Helen | Olenka | Alyona |
| Наталія | Natalia | Nataliya | Natalia | Natalie | Nata | Natasha |

| Original Name | Polish Transliteration | Alt Polish Trans | Polish Equivalent | English Equivalent | Informal Diminutive | Other Variants |
|---|---|---|---|---|---|---|
| Шевченко | Szewczenko | Szevczenko | Szewczyk | Shevchenko | Chevtchenko | |
| Мельник | Melnyk | Melnik | Mielnik | Melnyk | Melnykov | |
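
To show how such tables might support typo detection, the following sketch builds a lookup from every Latin-script form to its original name and falls back to fuzzy matching with the standard-library `difflib` module. It is an illustration only: the `ROWS` sample is copied from the example rows above, while the function names `build_index` and `suggest` and the `cutoff` value of 0.8 are assumptions, not part of the repository.

```python
import difflib

# Sample rows taken from the example tables: (original, Latin-script forms).
ROWS = [
    ("Олена", ["Olena", "Olona", "Helena", "Helen", "Olenka", "Alyona"]),
    ("Наталія", ["Natalia", "Nataliya", "Natalia", "Natalie", "Nata", "Natasha"]),
    ("Шевченко", ["Szewczenko", "Szevczenko", "Szewczyk", "Shevchenko", "Chevtchenko"]),
    ("Мельник", ["Melnyk", "Melnik", "Mielnik", "Melnyk", "Melnykov"]),
]


def build_index(rows):
    """Map each lowercased Latin-script form to its original-script name."""
    index = {}
    for original, forms in rows:
        for form in forms:
            if form:
                # Keep the first mapping if the same form appears twice.
                index.setdefault(form.lower(), original)
    return index


def suggest(query, index, cutoff=0.8):
    """Return (known_form, original) pairs for an exact or near match.

    An exact (case-insensitive) hit is returned alone; otherwise up to three
    close matches are returned, best first. The cutoff is an assumed value.
    """
    key = query.strip().lower()
    if key in index:
        return [(key, index[key])]
    matches = difflib.get_close_matches(key, list(index), n=3, cutoff=cutoff)
    return [(match, index[match]) for match in matches]


index = build_index(ROWS)
print(suggest("Shevchenko", index))  # exact match
print(suggest("Shevchenka", index))  # likely typo, resolved by fuzzy matching
```

Lowercasing the keys makes the lookup case-insensitive. A stricter or looser `cutoff` trades missed typos against false suggestions, so it would need tuning on the full datasets.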