Authors: Yagi, Sane M.; Mansour, Youssef; Kamalov, Firuz; Elnagar, Ashraf
Date available: 2022-05-19
Copyright year: © 2021
Citation: Yagi, S. M., Mansour, Y., Kamalov, F., & Elnagar, A. (2021). Evaluation of Arabic-based contextualized word embedding models. 2021 International Conference on Asian Language Processing (IALP), pp. 200-206. https://doi.org/10.1109/IALP54817.2021.9675208
ISBN: 978-166548311-7
DOI: https://doi.org/10.1109/IALP54817.2021.9675208
Handle: http://hdl.handle.net/20.500.12519/645
Availability: This conference paper is not available in the CUD collection. The version of scholarly record of this paper is published in the 2021 International Conference on Asian Language Processing (IALP) (2021), available online at: https://doi.org/10.1109/IALP54817.2021.9675208

Abstract: The distributed representation of words, as in Word2Vec, FastText, and GloVe, produces a single vector for each word type, regardless of the polysemy or homonymy that many words may have. Context-sensitive representation, as implemented in deep learning neural networks, on the other hand, produces different vectors for the multiple senses of a word. Several contextualized word embeddings have been produced for the Arabic language (e.g., AraBERT, QARiB, and AraGPT). The majority of these were tested on a few NLP tasks, but there has been no direct comparison between them; as a result, we do not know which of these models is most efficient, or for which tasks. This paper is a first step toward establishing evaluation criteria for them. It describes 24 such embeddings, then conducts an exploratory intrinsic and extrinsic evaluation of them. Afterwards, it tests the relational knowledge encoded in them, covering four semantic relations: colors of fruits, capitals of countries, causation, and general information. It also evaluates the utility of these models in Named Entity Recognition and Sentiment Analysis tasks. AraBERTv02 and MARBERT are demonstrated to be the best on both types of evaluation; both are therefore recommended for fine-tuning on Arabic NLP tasks. The ultimate conclusion is that it is feasible to test higher-order reasoning relations in these embeddings. © 2021 IEEE

Language: en-US
Permissions: Permission to reuse the abstract has been secured from the Institute of Electrical and Electronics Engineers Inc.
Keywords: BERT; Extrinsic evaluation; Intrinsic evaluation; Language Models
Title: Evaluation of Arabic-Based Contextualized Word Embedding Models
Type: Conference Paper
Copyright: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
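The relational probes mentioned in the abstract (e.g., capitals of countries) can be illustrated with a minimal, self-contained sketch of the classic vector-offset analogy method over toy vectors. This is an illustrative assumption, not the paper's actual procedure: the toy 3-dimensional vectors and the vocabulary below are invented for demonstration, whereas the paper evaluates real contextualized models with far higher-dimensional representations.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d "embeddings" (hypothetical values for illustration only;
# real models such as AraBERT use hundreds of dimensions).
emb = {
    "france": [0.9, 0.1, 0.0],
    "paris":  [0.8, 0.3, 0.1],
    "egypt":  [0.1, 0.9, 0.0],
    "cairo":  [0.0, 0.8, 0.2],
}

def analogy(a, b, c, vocab):
    # Vector-offset probe: b - a + c ≈ ?  ("capital-of" relation)
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    return max((w for w in vocab if w != c),
               key=lambda w: cosine(emb[w], target))

print(analogy("france", "paris", "egypt", emb))  # prints "cairo"
```

A model encodes the relation consistently to the extent that such probes recover the expected word; for contextualized models, analogous probes are typically phrased as cloze-style prompts rather than raw vector arithmetic.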