Introduction to the NooJ module for the Romanian language

Author: Maria-Diana MANESCU
Contact: diana.manescu@gmail.com
Institution: Politehnica University of Bucharest, Romania, in partnership with Grenoble Alpes University, France

This module was developed in 2018. It contains a corpus of contemporary Romanian and a morphological dictionary covering verbs and the invariable parts of speech (adverbs, conjunctions, prepositions, and interjections).
It can be used for teaching, for developing NLP applications (for instance, a conjugator), and even for rudimentary autocorrection modules.

The Lexical Analysis folder contains:
- a ro_lexique.dic file listing 7,522 verbs, 80 conjunctions, 158 prepositions, 637 interjections, and 1,885 adverbs
- a ro_lexique-flx.dic file listing 263,705 inflected forms
- a ro_lexique.nof file listing 192 inflection paradigms for Romanian verbs categorized by conjugation group (108 for the 1st group, 10 for the 2nd group, 47 for the 3rd group, 27 for the 4th group, and 3 for the auxiliary verbs)
- a ro_lexique.nod file: the dictionary compiled to the NooJ format, which recognizes 263,705 word forms
- a ro_corpus_all_files.noc file: a corpus of 160 text files comprising 441,825 word forms
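
As an illustration of how the per-category entry counts listed above could be checked, here is a minimal Python sketch that tallies entries by part-of-speech label. It assumes that each line of ro_lexique.dic follows the usual NooJ pattern `lemma,CATEGORY+properties` and that lines starting with `#` are comments or directives. The sample lines and the paradigm name `HYPOTHETICAL_PARADIGM` are invented for the example and are not copied from the module.

```python
from collections import Counter


def count_categories(lines):
    """Count dictionary entries per part-of-speech label.

    Assumption: each entry looks like 'lemma,CATEGORY+property+...'.
    Lines that are empty, start with '#', or contain no comma are skipped.
    """
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        _lemma, sep, info = line.partition(",")
        if not sep:
            continue
        # The category is whatever precedes the first '+' after the comma.
        category = info.split("+", 1)[0]
        counts[category] += 1
    return counts


# Hypothetical sample lines, written only to exercise the function.
SAMPLE = [
    "# invented sample, not taken from ro_lexique.dic",
    "cânta,V+FLX=HYPOTHETICAL_PARADIGM",
    "merge,V+FLX=HYPOTHETICAL_PARADIGM",
    "și,CONJC",
    "pe,PREP",
    "acum,ADV",
]

if __name__ == "__main__":
    for category, total in sorted(count_categories(SAMPLE).items()):
        print(category, total)
```

To run it on the real file, the lines could be read with `open("ro_lexique.dic", encoding="utf-8")`; the encoding is also an assumption and may need adjusting.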

Syntactic Analysis folder:
No syntactic work has been carried out yet; developing disambiguation grammars would be an essential contribution.


Meaning of the labels used in the description:

Parts of speech:
ADV = adverb
CONJC = coordinating conjunction
CONJS = subordinating conjunction
INTJ = interjection 
PREP = preposition
V = verb

Persons & Numbers:
1 = first person
2 = second person
3 = third person 
sg = singular
pl = plural

Genders:
m = masculine 
f = feminine
n = neuter

Conjugation groups:
gr1 = 1st group
gr2 = 2nd group
gr3 = 3rd group
gr4 = 4th group

Verbal moods and tenses:
G = Gerund
P = Participle
PR = Present indicative
IP = Affirmative imperative
PS = Simple perfect (preterite)
INF = Infinitive
IMP = Imperfect
PQP = Pluperfect
S_PR = Present subjunctive

Word structure:
smpl = simple
MOTCOMPOSE = compound
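
The label tables above can be turned into a small lookup for reading annotations. The following Python sketch expands a string of labels into plain-English descriptions. It assumes that labels are joined with `+`, as is usual in NooJ annotations; the example string `V+gr1+PR+1+sg` is hypothetical and only illustrates the idea.

```python
# Label meanings copied from the tables in this description.
LABELS = {
    "ADV": "adverb",
    "CONJC": "coordinating conjunction",
    "CONJS": "subordinating conjunction",
    "INTJ": "interjection",
    "PREP": "preposition",
    "V": "verb",
    "1": "first person",
    "2": "second person",
    "3": "third person",
    "sg": "singular",
    "pl": "plural",
    "m": "masculine",
    "f": "feminine",
    "n": "neuter",
    "gr1": "1st group",
    "gr2": "2nd group",
    "gr3": "3rd group",
    "gr4": "4th group",
    "G": "gerund",
    "P": "participle",
    "PR": "present indicative",
    "IP": "affirmative imperative",
    "PS": "simple perfect (preterite)",
    "INF": "infinitive",
    "IMP": "imperfect",
    "PQP": "pluperfect",
    "S_PR": "present subjunctive",
    "smpl": "simple",
    "MOTCOMPOSE": "compound",
}


def describe(tags):
    """Expand a '+'-separated label string into readable descriptions.

    Labels not found in the table are returned unchanged.
    """
    return [LABELS.get(tag, tag) for tag in tags.split("+")]


if __name__ == "__main__":
    # Hypothetical annotation, used only as an example.
    print(", ".join(describe("V+gr1+PR+1+sg")))
```

Returning unknown labels unchanged keeps the sketch usable on annotations that carry properties not documented here, such as a paradigm name.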







