Evaluation of Arabic-Based Contextualized Word Embedding Models
dc.contributor.author | Yagi, Sane Mo | |
dc.contributor.author | Mansour, Youssef | |
dc.contributor.author | Kamalov, Firuz | |
dc.contributor.author | Elnagar, Ashraf | |
dc.date.accessioned | 2022-05-19T09:57:04Z | |
dc.date.available | 2022-05-19T09:57:04Z | |
dc.date.copyright | © 2021 | |
dc.date.issued | 2021 | |
dc.description | This conference paper is not available in the CUD collection. The version of record of this paper is published in 2021 International Conference on Asian Language Processing (IALP) (2021), available online at: https://doi.org/10.1109/IALP54817.2021.9675208. | |
dc.description.abstract | The distributed representation of words, as in Word2Vec, FastText, and GloVe, results in the production of a single vector for each word type regardless of the polysemy or homonymy that many words may have. Context-sensitive representation as implemented in deep learning neural networks, on the other hand, produces different vectors for the multiple senses of a word. Several contextualized word embeddings have been produced for the Arabic language (e.g., AraBERT, QARiB, AraGPT, etc.). The majority of these were tested on a few NLP tasks but there was no direct comparison between them. As a result, we do not know which of these is most efficient and for which tasks. This paper is a first step in an endeavor to establish evaluation criteria for them. It describes 24 such embeddings, then conducts exploratory intrinsic and extrinsic evaluation of them. Afterwards, it tests relational knowledge in them, covering four semantic relations: colors of fruits, capitals of countries, causation, and general information. It also evaluates the utility of these models in Named Entity Recognition and Sentiment Analysis tasks. It has been demonstrated here that AraBERTv02 and MARBERT are the best on both types of evaluation; therefore, both are recommended for fine-tuning Arabic NLP tasks. The ultimate conclusion is that it is feasible to test higher order reasoning relations in these embeddings. © 2021 IEEE | |
dc.identifier.citation | Yagi, S. M., Mansour, Y., Kamalov, F., & Elnagar, A. (2021). Evaluation of Arabic-based contextualized word embedding models. 2021 International Conference on Asian Language Processing (IALP), pp. 200–206. https://doi.org/10.1109/IALP54817.2021.9675208 | |
dc.identifier.isbn | 978-166548311-7 | |
dc.identifier.uri | https://doi.org/10.1109/IALP54817.2021.9675208 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12519/645 | |
dc.language.iso | en_US | |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
dc.relation | Author affiliations: Yagi, S.M., Dept. of Foreign Languages, University of Sharjah, Sharjah, United Arab Emirates; Mansour, Y., Dept. of Computer Science, University of Sharjah, Sharjah, United Arab Emirates; Kamalov, F., Dept. of Electrical Engineering, Canadian University of Dubai, Dubai, United Arab Emirates; Elnagar, A., Dept. of Computer Science, University of Sharjah, Sharjah, United Arab Emirates | |
dc.relation.ispartofseries | 2021 International Conference on Asian Language Processing (IALP) | |
dc.rights | Permission to reuse abstract has been secured from Institute of Electrical and Electronics Engineers Inc. | |
dc.rights.holder | Copyright: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
dc.rights.uri | https://www.ieee.org/publications/rights/rights-policies.html | |
dc.subject | BERT | |
dc.subject | Extrinsic evaluation | |
dc.subject | Intrinsic evaluation | |
dc.subject | Language Models | |
dc.title | Evaluation of Arabic-Based Contextualized Word Embedding Models | |
dc.type | Conference Paper | |
dspace.entity.type |
Files
Original bundle
- Name: Access Instruction 645.pdf
- Size: 56.32 KB
- Format: Adobe Portable Document Format
- Description: