Motivated by the common goals of two areas, neural network compression and feature selection for dimensionality reduction, this research sought a single method that addresses both problems: one that simplifies feature selection while strengthening the information-flow control of compression techniques, particularly clustering-based compression. Specifically, the research focused on a novel and effective framework that transforms the weight matrix between the input layer and the first hidden layer of a neural network so that the matrix's structure itself becomes optimal for information extraction. The main aim was satisfied by proposing a simple yet powerful method combining weight clipping with a Gaussian mixture model (GMM), called the In-and-Out Weight Box, which intrinsically acts like a filter-based selector while increasing the likelihood of better compression results. Evaluated on the Glioma Grading dataset from the UCI Repository, the In-and-Out Weight Box achieved significantly better compression results in terms of weight sharing via clustering. This research also proposes a new feature selection method based on the In-and-Out Weight Box constraint, called IOW-FI, which addresses known limitations of filter-based techniques, such as having to preset the number of features to select and neglecting joint distributions over the feature space.
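As a minimal sketch of the weight-sharing-via-clustering setup described above, the snippet below clips a toy first-layer weight matrix into an assumed box and then shares weights through a fitted GMM. The box bounds and component count are illustrative assumptions, not the In-and-Out Weight Box constraint itself.

```python
# Illustrative sketch: weight sharing via GMM clustering after a simple
# clip-to-box step. The bounds (lo, hi) and n_components=4 are assumed
# for the demo; the paper's In-and-Out Weight Box chooses these differently.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
W = rng.normal(0.0, 1.0, size=(20, 8))  # toy input-to-hidden weight matrix

# 1) Clip weights into an assumed box prior to clustering.
lo, hi = -1.5, 1.5
W_clipped = np.clip(W, lo, hi)

# 2) Fit a GMM to the clipped weights and replace each weight with its
#    component mean, i.e. weight sharing via clustering.
flat = W_clipped.reshape(-1, 1)
gmm = GaussianMixture(n_components=4, random_state=0).fit(flat)
labels = gmm.predict(flat)
W_shared = gmm.means_[labels].reshape(W.shape)

print(np.unique(W_shared).size)  # at most 4 distinct shared values remain
```

Clipping before clustering keeps extreme weights from pulling centroids outward, so the shared values concentrate where most of the weight mass lies.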
Dimensionality reduction in deep learning architectures has become a crucial task in data science due to the growing computational cost and complexity of processing massive datasets. To handle this issue, methodologies such as clustering-based compression of neural network (NN) weight matrices and information-criterion-based weight pruning were introduced to reduce the memory footprint and inefficiency of fitted networks. However, although various compression methods have been proposed, limitations remain in optimizing a compressed NN to avoid over-fitting and to achieve robustness on unobserved data. To overcome these limitations, this research constructs a novel weight clipping method called Bayesian Weight Clipping based Neural Network Compression Optimization (BaWcNN), which guides existing compression methods toward optimal solutions by providing information boundaries within the fitted weight matrix prior to compression. Specifically, by fitting a double-changepoint model via MCMC sampling to probe for weight information bounds, and clipping controversial weights to those boundaries based on Euclidean distance, BaWcNN was found to substantially improve compression results in both classification performance and regression analysis on unobserved data from two UCI Repository datasets: the Heart Disease data and the Parkinson's Disease Telemonitoring data. Through BaWcNN weight clipping, network compression for deep learning models is expected to gain efficiency by avoiding irrelevant weight centroids in results deduced through weight sharing via clustering, and to gain robustness by eliminating controversial weights from the parameter space.
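The boundary-estimation-then-clip pipeline above can be sketched as follows. Note the hedges: the Bayesian double-changepoint model sampled via MCMC is replaced here by a toy grid search that picks two cut values minimizing within-segment variance, and the weight matrix is synthetic; both are stand-in assumptions, not the paper's estimator.

```python
# Toy stand-in for BaWcNN's two stages: (1) estimate lower/upper
# "information boundaries" from the weight distribution, (2) clip
# weights outside the boundaries to the nearest bound.
import numpy as np

def two_changepoint_bounds(w, ngrid=50):
    """Grid-search stand-in for the Bayesian double-changepoint model:
    choose two cuts (lower, upper) minimizing total within-segment
    variance of the three segments they induce. The real method samples
    the changepoints via MCMC; this is only an illustration."""
    w = np.sort(np.ravel(w))
    grid = np.linspace(w[0], w[-1], ngrid)
    best = (np.inf, grid[0], grid[-1])
    for i in range(ngrid):
        for j in range(i + 1, ngrid):
            lo, hi = grid[i], grid[j]
            segs = [w[w < lo], w[(w >= lo) & (w <= hi)], w[w > hi]]
            cost = sum(s.var() * s.size for s in segs if s.size > 1)
            if cost < best[0]:
                best = (cost, lo, hi)
    return best[1], best[2]

def boundary_clip(W, lower, upper):
    """Move weights outside [lower, upper] to the nearest boundary; for
    scalar weights, Euclidean distance reduces to |w - bound|, so this
    is an ordinary clip."""
    return np.clip(W, lower, upper)

rng = np.random.default_rng(1)
# Toy weight matrix: a dense core plus a few outlying "controversial" weights.
W = np.concatenate([rng.normal(0, 0.3, 95), rng.normal(0, 4.0, 5)]).reshape(10, 10)
lo, hi = two_changepoint_bounds(W)
W_clipped = boundary_clip(W, lo, hi)
```

After clipping, a downstream clustering-based compressor sees no extreme outliers, so it need not spend centroids on isolated controversial weights.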