A Novel Multimodal Online News Popularity Prediction Model based on
Ensemble Learning
Abstract
Predicting news popularity is of substantial importance to the digital
advertising community for selecting and engaging users. Traditional
approaches rely on empirical data collected through surveys and apply
statistical measures to test a hypothesis. However, predicting news
popularity from statistical measures applied to past data is of
questionable reliability. Therefore, in this
paper, we predict news popularity using machine learning classification
models and deep residual neural network models. Articles usually
consist of textual content and, in many cases, images as well.
Although a sufficient amount of textual data is clearly required to
extract features and build models, image data also provides useful
information. In this paper, we present a novel
multimodal online news popularity prediction model based on ensemble
learning. This work serves as a guide to extensive feature
engineering, feature extraction, feature selection, and effective
modeling for building a robust news popularity prediction model. Three
kinds of features (meta features, text features, and image features)
are used to design an effective and robust model. The performance
measure root mean squared logarithmic error (RMSLE) is used to validate
the outcome of the proposed model. Furthermore, the most important
features of the proposed model are identified to verify its dependence
on text and image features.
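
The abstract names RMSLE without defining it. As a point of reference, the sketch below uses the commonly cited definition, the square root of the mean of (log(1 + predicted) - log(1 + actual))^2, and assumes non-negative popularity counts. The function name `rmsle` and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (common definition).

    Assumes both sequences hold non-negative values, such as share counts.
    This is an illustrative helper, not the paper's implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) computes log(1 + x), which keeps zero counts well defined.
    squared_log_diffs = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_diffs) / len(squared_log_diffs))


# Hypothetical popularity counts, for illustration only.
actual = [120, 3500, 40]
predicted = [100, 4200, 55]
print(rmsle(actual, predicted))
```

Because the error is computed on a logarithmic scale, it reflects relative rather than absolute differences, which is a common reason for choosing it when targets span several orders of magnitude.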