Dealing with randomness and concept drift in large datasets

Mwitondi, Kassim S.; Said, Raed A.

dc.contributor.author	Mwitondi, Kassim S.
dc.contributor.author	Said, Raed A.
dc.date.accessioned	2021-08-10T13:56:31Z
dc.date.available	2021-08-10T13:56:31Z
dc.date.copyright	© 2021
dc.date.issued	2021-07
dc.description.abstract	Data-driven solutions to societal challenges continue to bring new dimensions to our daily lives. For example, while good-quality education is a well-acknowledged foundation of sustainable development, innovation and creativity, variations in student attainment and general performance remain commonplace. Developing data-driven solutions hinges on two fronts: technical and application. The former relates to the modelling perspective, where two of the major challenges are the impact of data randomness and general variations in definitions, typically referred to as concept drift in machine learning. The latter relates to devising data-driven solutions to address real-life challenges such as identifying potential triggers of pedagogical performance, which aligns with the Sustainable Development Goal (SDG) #4, Quality Education. A total of 3145 pedagogical data points were obtained from the central data collection platform for the United Arab Emirates (UAE) Ministry of Education (MoE). Using simple data visualisation and machine learning techniques via a generic algorithm for sampling, measuring and assessing, the paper highlights research pathways for educationists and data scientists to attain unified goals in an interdisciplinary context. Its novelty derives from embedded capacity to address data randomness and concept drift by minimising modelling variations and yielding consistent results across samples. Results show that intricate relationships among data attributes describe the invariant conditions that practitioners in the two overlapping fields of data science and education must identify. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.	en_US
dc.description.sponsorship	Assess United Arab Emirates United Nations World Data Forum Education Principal Component Polar Environment Data Science Centre	en_US
dc.identifier.citation	Mwitondi, K. S., & Said, R. A. (2021). Dealing with Randomness and Concept Drift in Large Datasets. Data, 6(7), 77. https://doi.org/10.3390/data6070077	en_US
dc.identifier.issn	2306-5729
dc.identifier.uri	https://doi.org/10.3390/data6070077
dc.identifier.uri	http://hdl.handle.net/20.500.12519/422
dc.language.iso	en	en_US
dc.publisher	MDPI AG	en_US
dc.relation	Authors Affiliations : Mwitondi, K.S., College of Business, Technology & Engineering, Sheffield Hallam University, Industry & Innovation Research Institute, 9410 Cantor Building, City Campus, 153 Arundel Street, Sheffield, S1 2NU, United Kingdom; Said, R.A., Faculty of Management, Canadian University Dubai, Al Safa Street-Al Wasl, City Walk Mall, P.O. Box 415053, Dubai, United Arab Emirates
dc.relation.ispartofseries	Data;Volume 6, Issue 7
dc.rights	Creative Commons Attribution 4.0 International (CC BY 4.0) License
dc.rights.holder	Copyright : © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Artificial neural networks (ANNs)	en_US
dc.subject	Big Data	en_US
dc.subject	Concept drift	en_US
dc.subject	Data science	en_US
dc.subject	Supervised modelling	en_US
dc.subject	Sustainable development goals	en_US
dc.subject	Unsupervised modelling	en_US
dc.title	Dealing with randomness and concept drift in large datasets	en_US
dc.type	Article	en_US

Files

Original bundle

Name: Access Instruction 422.pdf
Size: 56.77 KB
Format: Adobe Portable Document Format

Name: 422.pdf
Size: 1.7 MB
Format: Adobe Portable Document Format

Collections

International Business