Долой неоднозначность!
Down with ambiguity!
Regardless of the language pair, a bilingual dictionary must bring order to the inherently messy behavior of words and present it in a way that is easy to understand. That mess includes categories like:
- ambiguous inflections: English read, lay
- homographs, especially across parts of speech: lie, lead, produce, round
- inflectional subtleties related to the above: hung vs. hanged as the past tense of hang
This is all the more true for a language like Russian, whose inflectional system is orders of magnitude more complex than that of English, and that is why I created Slovarish.
I wanted to help learners of this language (including myself!) tackle some of its most confusing aspects, and this page lists a series of test cases that demonstrate how the dictionary handles them.
I should add here that none of this is the result of fine-tuning a neural network, LLM output, or anything like that. Word meanings (from both the Wiktionary and Smirnitsky datasets) were matched with ambiguous inflection sets entirely by hand, and any further fine-tuning was done by literally editing the text of the definitions in the database, so you can be confident in the dictionary data.
Test cases: Nouns
Completely homographic in all cases with different stress
Completely identical in all cases, except one form has different stress
One form is the nominative plural of two different nouns
Homographic inflections of a single noun
Inflections of a single noun that are homographic when ё is not used (let’s call them “ё-mographic” inflections)
Ё-mographic nouns that are otherwise identical
Polysemous nouns with different plural forms for different meanings
Homographic nouns distinguished by only one case
Homographic nouns of different genders
Nouns whose stress is affected by the presence of a preposition
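To make the terms "homographic" and "ё-mographic" concrete, here is a minimal Python sketch of how two stress-marked forms can be compared as they would appear in ordinary print. It is only an illustration, not how Slovarish itself works (its data was matched by hand); the function names are made up for this example, and it assumes stress is marked with the combining acute accent (U+0301).

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, assumed here to mark stress
YO = "\u0451"     # ё
YE = "\u0435"     # е


def strip_stress(form: str) -> str:
    """Lowercase the form and drop stress marks, keeping ё as written."""
    return unicodedata.normalize("NFC", form).lower().replace(ACUTE, "")


def written_key(form: str) -> str:
    """The form as commonly printed: no stress marks, ё replaced by е."""
    return strip_stress(form).replace(YO, YE)


def classify(a: str, b: str) -> str:
    """Describe how two stress-marked forms relate once written down."""
    if unicodedata.normalize("NFC", a).lower() == unicodedata.normalize("NFC", b).lower():
        return "identical"
    if strip_stress(a) == strip_stress(b):
        return "homographic"   # the forms differ only in stress
    if written_key(a) == written_key(b):
        return "ё-mographic"   # the forms differ in ё vs. е
    return "distinct"


print(classify("за\u0301мок", "замо\u0301к"))  # castle vs. lock
print(classify("не\u0301бо", "нёбо"))          # sky vs. palate
```

The first pair differs only in stress, so it is reported as homographic; the second pair collapses to the same spelling only once ё is written as е, so it is reported as ё-mographic.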
Test cases: Adjectives
Homographic adjectives, distinguished by stress
Homographic adjectives, distinguished by comparative
Homographic adjectives, distinguished by short form spelling
One form is the nominative plural (and possibly genitive feminine singular) of two different adjectives
Test cases: Multiple parts of speech
Homographs across uninflected parts of speech
Homographs across inflected parts of speech
Ё-mographs across different parts of speech
Test cases: Verbs and participles
Homographic verbs in the infinitive, distinguished by stress
Homographic verbs in the infinitive, distinguished by nonpast forms
Homophonous verbs in the infinitive, distinguished by nonpast stress class
Verbs whose aspect partners are homographs of themselves in the infinitive
Verbs that are homophones but of different aspects, with different aspect partners (or none)
Homographic verbs in the infinitive (possibly homophonous, and also possibly homographic in nonpast tense), distinguished by aspect partner
Given verb pairs A1 A2 and B1 B2, where A1 and A2 are identical in the infinitive and A2 and B2 are identical in the infinitive, but B1 and B2 have different stress in either the nonpast or the past tense
Nonpast forms that could be an inflection of one verb or of its aspect partner if stress is not marked or ё is not used
Homographic inflected forms of unrelated verbs
A verb has a past passive participle P that is homographic (or ё-mographic) with an adjective A (whose meaning is related but not identical), and adjective A follows the pattern -ен(ный), -енна while participle P follows the pattern -ён(ный), -ена́, or vice versa (let’s call this the yenny-yonny problem)
Similar to the above, but the stress changes instead of the form
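The yenny-yonny problem can be illustrated with a small lookup sketch in Python. This is a hypothetical data layout, not the actual Slovarish database: the entry structure and the names `ENTRIES`, `build_index`, and `lookup` are assumptions made for the example. It uses the adjective соверше́нный ("perfect") and the participle совершённый ("committed"), which look the same in print in the long form but diverge in the short forms.

```python
import unicodedata
from collections import defaultdict

ACUTE = "\u0301"  # combining acute accent, assumed here to mark stress


def written_key(form):
    """The form as commonly printed: no stress marks, ё replaced by е."""
    form = unicodedata.normalize("NFC", form).lower()
    return form.replace(ACUTE, "").replace("\u0451", "\u0435")


# Hypothetical entries: (part of speech, a few stress-marked forms)
ENTRIES = [
    ("adjective", ["соверше\u0301нный", "соверше\u0301нен", "соверше\u0301нна"]),
    ("participle", ["соверш\u0451нный", "соверш\u0451н", "совершена\u0301"]),
]


def build_index(entries):
    """Map each printed spelling to every (part of speech, form) it may be."""
    index = defaultdict(list)
    for pos, forms in entries:
        for form in forms:
            index[written_key(form)].append((pos, form))
    return index


INDEX = build_index(ENTRIES)


def lookup(query):
    """Return all candidate readings for a word typed without marks."""
    return INDEX.get(written_key(query), [])


print(lookup("совершенный"))  # ambiguous: adjective and participle
print(lookup("совершенна"))   # adjective only (double н)
print(lookup("совершена"))    # participle only (single н)
```

Keying the index on the printed spelling means an unmarked long form surfaces both readings, while the short forms point to exactly one, which is the distinction this test case is about.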