Abstract:
In the modern learning landscape, educational videos and Massive Open Online Courses (MOOCs) have become central tools for delivering content, offering a vibrant and effective way to engage learners. These videos provide an interactive medium for disseminating educational content and accommodate a variety of learning styles and preferences. However, as the availability and accessibility of educational videos on digital platforms continue to expand, ensuring their quality and effectiveness becomes a critical challenge for students, educators, and instructional staff. Currently, there is no standardized criterion for evaluating and rating educational videos. The absence of such a criterion can lead to inconsistent quality in educational material, makes evaluation time-consuming, and can undermine the learning experience. This paper proposes a novel data-driven framework for evaluating and scoring educational videos based on their metadata (number of views, likes and dislikes, comment sentiment, etc.) and the content of their transcripts (the spoken content of the video). The goal is to enable automated evaluation of video content, giving learners, educators, and content creators a more comprehensive understanding of video quality so that the learning experience can be improved. To support this analysis, specialized datasets are created for both metadata and transcripts, focusing on these key factors of educational videos. A scoring mechanism is derived from user feedback and supported by statistical techniques to establish a baseline score for educational videos. Regression-based machine learning models and deep learning models are then used to predict the scores, and their performance is evaluated using Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared. XGBoost emerged as the most effective model for metadata-based predictions, while the Support Vector Regressor (SVR) performed best on transcript-based data.
After the metadata and transcript scores are predicted for each video, an overall score is computed by averaging the two values. This overall score serves as an indicator of the educational value of the video, reflecting both its popularity and the strength of its content. These results highlight the potential of machine learning models to predict and rate educational video quality, offering a framework for more objective assessment of digital educational content and contributing to the improvement of global education standards.
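
As an illustration of the evaluation metrics and the score combination described above, the following is a minimal Python sketch rather than the implementation used in this work. The function names, the sample values, and the equal weighting of the two scores are assumptions made for illustration; the abstract states only that the overall score is obtained by averaging.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared prediction errors."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute prediction errors."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def overall_score(metadata_score, transcript_score):
    """Unweighted mean of the two predicted scores (assumed equal weights)."""
    return (metadata_score + transcript_score) / 2.0


if __name__ == "__main__":
    # Hypothetical baseline scores and model predictions, not data from the paper.
    y_true = [1.0, 2.0, 3.0]
    y_pred = [1.0, 2.0, 4.0]
    print("MSE:", mse(y_true, y_pred))
    print("MAE:", mae(y_true, y_pred))
    print("R-squared:", r_squared(y_true, y_pred))
    print("Overall:", overall_score(4.0, 3.0))
```

In practice the same metrics are available in standard libraries; the plain-Python versions here only make the definitions explicit.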