LiRA-NLG Workshop @ INLG 2017

Programme

Monday, September 4, 2017, at the School of Engineering (ETSE), Universidade de Santiago de Compostela. Location: Campus Vida, Lope Gómez de Marzoa Street, Room A2

Time	Authors	Paper
8:30 - 9:00	Registration (Entrance Hall)
Chair	Peter Machonis
9:00 - 9:10	LiRA @ NLG Welcomes you
9:10 - 9:35	Max Silberztein	From FOAF to English: Linguistic Contribution to Web Semantics Abstract This paper presents a linguistic module capable of generating a set of English sentences that correspond to a Resource Description Framework (RDF) statement; I discuss how a generator can control the linguistic module, as well as the various limitations of a pure linguistic framework.
9:35 - 10:00	Silvia García Méndez, Milagros Fernández Gavilanes, Enrique Costa Montenegro, Jonathan Juncal Martínez and Francisco Javier González Castaño	Lexicon for Natural Language Generation in Spanish Adapted to Alternative and Augmentative Communication Abstract In this paper we present Elsa, the first lexicon for Spanish with morphological, syntactic and semantic information automatically generated from a well-known pictogram resource and especially tailored for Augmentative and Alternative Communication (AAC). This lexicon, focusing on that specific icon set widely used within AAC applications, is motivated by the need to improve Natural Language Generation (NLG) systems to aid people who have been diagnosed to suffer from communication disorders. In addition, we design an automatic lexicon extension procedure by means of a training process to complete the linguistic data. For this we used a dataset composed of novels and tales in Spanish, with pictogram representations, since the lexicon is meant for AAC applications for children with disabilities. Moreover, we provide the algorithms used to build our lexicon and a use case of Elsa within an NLG system to observe the usability of our proposal.
10:00 - 10:25	Essia Bessaies, Slim Mesfar and Henda Ben Ghezala	Generating Answering Patterns from Factoid Arabic Questions Abstract This work deals with Arabic factoid Question Answering (QA) systems. Commonly, the task of QA is divided into three phases: question analysis, answer pattern generation, and answer extraction. Each phase plays a crucial role in overall performance. In this paper, we focus on the first two phases: Question Analysis and Answer Pattern Generation. We used the NooJ platform which represents a valuable linguistic development environment. The first evaluations show that the actual results are encouraging and could be deployed for more types of questions other than factoid ones.
10:25 - 10:50	Kristina Kocijan, Božo Bekavac and Krešimir Šojat	Language Generation from DB Query Abstract This paper demonstrates how to generate natural language sentences from the pieces of data found in databases in the domain of flight tickets. By using NooJ to add context to specific customer data found in customer data sets, we are able to produce sentences that give a short textual summary of each customer, providing a list of possible suggestions how to proceed. In addition, due to the rich morphology of Croatian, we are giving special attention to matching gender, number and case information where appropriate. Thus we are able to provide individualized and grammatically correct text in spite of the customer gender or the number of tickets bought and inquiries made. We believe that such short NL overviews can help ticket sellers get a quicker assessment of the type of a customer and allow for the exchange of information with more confidence and greater speed.
11:00 - 11:30	Coffee break
Chair	Kristina Kocijan
11:30 - 11:55	Peter Machonis	Using Electronic Dictionaries and NooJ to Generate Sentences Containing English Phrasal Verbs Abstract This paper attempts to explore NooJ’s “generation” mode to automatically produce transformations of sentences containing English Phrasal Verbs (PV). We exploit the same electronic dictionary and grammar previously used to recognize PV in large corpora (Machonis 2010, 2012), but have had to design a specific grammar for generating sentences, following the examples in Silberztein (2016), which showed how NooJ could generate over two million transformations or parallel sentences from the simple sentence Joe likes Lea. We created a grammar that can generate variations of a single phrase containing one of the PV found in the NooJ PV dictionary. For the moment the grammar only handles singular nouns in the present and past tense, but it is capable of applying a succession of transformations – particle movement, preterit, negation, clefting, modal insertion, aspect introduction, question formation, and passive voice, along with various combinations of these transformations – to over 1,200 PV from the electronic dictionary.
11:55 - 12:20	Hela Fehri and Sondes Dardour	Generating Text with Correct Verb Conjugation: Proposal for a New Automatic Conjugator with NooJ Abstract This paper describes a system that generates texts with correct verb conjugation. The proposed system integrates a conjugator developed using a linguistic approach. This latter is based on dictionaries and transducers built with the NooJ linguistic platform. The conjugator treats three languages: Arabic, French and English. It recognizes all verbs and allows their conjugation in different tenses. The results obtained are satisfactory and can easily be improved upon by processing other forms, such as the negative.
12:20 - 12:45	Jouda Ghorbel	Formalization of Speech Verbs with NooJ for Machine Translation: the French Verb accuser Abstract The mediocrity of sentences generated by online translators prompts us to try to find a solution to have more reliable translations. This is a very difficult task due to the ambiguity of natural languages and especially the deficiencies of translation systems in terms of syntactic and semantic knowledge. How can we make automatic translation more reliable and unambiguous? Our main objective will be to generate a text where the translation of French verbs into Arabic will be without ambiguities. In this contribution, we attempt to formalize a particular class of verbs, namely the so-called verbs of speech. We shall limit ourselves to the treatment of the verb accuser ‘to accuse’ as presented in the Dubois & Dubois-Charlier (1997) electronic dictionary, Les verbes français. We shall take this verb as a prototype to show how NooJ can perform a reliable machine translation and generate a good text without ambiguities.
12:45 - 13:10	Ikram Bououd and Rania Fafi	Using Serious Games to Correct French Dictations: Proposal for a New Unity3D/NooJ Connector Abstract The remarkable growth in serious game use has gradually pushed them to be present in every single domain. However, in language learning we did not find any reliable games developed for dictation exercises, commonly used for the teaching of French. This involves natural language processing in the form of an interactive game that can automatically generate corrections and assess game users. In order to fill this research gap, we propose to take advantage of the assets provided by the NooJ platform and develop a game combining NooJ and the 3D game platform Unity3D.
13:10 - 13:35	Ritamari Bucciarelli and Raffaele Marcone	Linguistic Resources for Automatic Natural Sign Language Generation Abstract Work-Tools (WT) is software for automatic textual analysis that describes and transforms language into morphemes, lexemes, and fixed phrases. It involves building a communicative model of switching non-verbal natural languages L1 to verbal L2. The WT software is structured for complex activities in natural language such as "parsing" to recognize and generate texts. It provides a man-machine interaction in the production, questioning, and evaluation of the construction of texts. It is used for didactic purposes and aids in the transformation of languages. WT consists of: (1) a search engine or database for data entry, in which the data described for cognitive areas are transformed into acronyms and indexed; (2) a writing corpus in which we organize the text in free sentences and fixed phrases, in accordance with the grammatical rules and the syntactic relations of the natural language that we propose with the data transfer from L1 to L2; (3) a transcoding corpus and faithful translation of text or textual parts from L1 to L2; and (4) a scroll bar where new text is transmitted in real time.
13:35	Max Silberztein	LiRA@NLG Workshop Wrap-Up

Call for papers

In conjunction with INLG 2017 in Santiago de Compostela, we are organizing a half-day workshop, LiRA-NLG (Linguistic Resources for Automatic Natural Language Generation).

This workshop aims to bring together linguists who are interested in developing large-coverage linguistic resources and researchers with an interest in developing real-world NLG software. These two communities have been working separately for many years: NLG researchers are typically more focused on technical issues specific to text generation, where good performance (e.g. recall and precision) is crucial, whereas linguists tend to focus on problems related to the development of exhaustive and precise resources that are mainly 'neutral' vis-a-vis any NLP application (e.g. parsing or generating sentences), using various grammatical formalisms such as NooJ, TAG or HPSG.

However, recent progress in both fields is reducing many of these differences, with large-coverage linguistic resources increasingly used by robust NLP software. For instance, NLG researchers can now use large dictionaries of multiword units and expressions, and several linguistic experiments have shown the feasibility of using large phrase-structure grammars (a priori used for text parsing) in 'generation' mode, to automatically produce paraphrases of sentences that are described by grammars.

By encouraging members of both communities to discuss work on related topics with each other, we hope to move towards a better joint understanding of the problems involved. This workshop focuses on the following questions:

How to develop 'neutral' linguistic resources (dictionaries, morphological, phrase-structure and transformational grammars) that can be used both to parse and generate texts automatically.
Is it possible to generate grammatical sentences by using linguistic data alone, i.e. with no statistical methods to remove ambiguities? What are the limitations of rule-based systems, as opposed to stochastic ones?

Topics can relate to any aspect of NLG, such as:

large-coverage linguistic resources
lexicalization
machine translation
NLG for real-world applications
paraphrase generation
phraseology of specialized languages
rule-based approaches to generation
comparison between rule-based and statistical approaches to NLG
surface realization
text-to-text generation and summarization
transformational analysis and generation.

We encourage participants to submit papers to the general INLG 2017 conference as well.

Submission Information

Authors are invited to submit short papers describing original, unpublished work, be it completed or in progress. Papers should have at most 2 pages of main content, with additional pages allowed for references and appendices. All accepted papers will be presented as talks.

Abstract submission will be electronic in PDF format through the EasyChair conference management system.

The abstract submission page will close on June 2nd, 2017 at 23:00 Standard European Time.

For full papers, please use the INLG Text Formatting Style.

Workshop Proceedings are included in the ACL Anthology.

Reviewing Policy

Reviewing will be single-blind, so authors do not need to conceal their identity. The paper should include the authors' names and affiliations. Self-references are also allowed.

Registration

Regular: EUR 75 (early); EUR 125 (late); EUR 150 (on-site)
Student: EUR 50 (early); EUR 75 (late); EUR 100 (on-site);
free registration for student helpers

For more information on registration fees and how to become a student helper, refer to the INLG website.

Workshop Organizers

Kristina Kocijan, Assistant Professor of Information and Communication Sciences, University of Zagreb (Croatia)
Peter Machonis, Professor of French and Linguistics, Florida International University (USA)
Max Silberztein, Professor of Computer Science and Linguistics, Université de Franche-Comté (France)

Scientific Committee

Héla Fehri (University of Gabes, Tunisia)
Yuras Hetsevich (United Institute of Informatics Problems, Belarus)
Kristina Kocijan (University of Zagreb, Croatia)
Elena Lloret Pastor (Universidad de Alicante, Spain)
Peter Machonis (Florida International University, USA)
Slim Mesfar (University of Carthage, Tunisia)
Simon Mille (Universitat Pompeu Fabra, Spain)
Max Silberztein (Université de Franche-Comté, France)

LiRA-NLG 2017

Linguistic Resources for Automatic Natural Language Generation Workshop

@ INLG 2017 Conference Santiago de Compostela, Spain

Important dates