Volume 3, Issue 1 (1-2017)                   ITCMS 2017, 3(1): 1-11

Doumi N, Lehireche A, Maurel D, Toumouh A, Khelifa C. FSM-based Free Resources and Tools for MSA Processing. ITCMS. 2017; 3(1): 1-11
URL: http://itcms.europeansp.org/article-11-169-en.html
Computer Science Dept., University of Saïda, Algeria
Abstract:
This paper presents a set of resources and tools designed and implemented with finite-state machine technology for processing Modern Standard Arabic (MSA) textual corpora. The resources are electronic lexical dictionaries containing lemmas, fully diacritized word forms, and their morpho-syntactic features; the features can be extended to include semantic categories. The tools comprise a tokenizer, a concordance tool, a lemmatizer, a POS tagger, a morphological analyzer, a sentence segmenter, and local grammars. Some of these tools are hard-coded in a programming language; the rest are designed as finite-state transducers and recursive transition networks. All of these resources and tools are freely available on the web as the Arabic package of the Unitex/GramLab platform.
Type of Study: Research
Received: 2019/08/8 | Accepted: 2019/08/8 | Published: 2019/08/8
