Efficient Discrete Cosine Transform (DCT) coefficients for Images Using Deep Learning

Khan, Zirak

URI: http://10.250.8.41:8080/xmlui/handle/123456789/36185

Date: 2019

Abstract:

Digital audio, video, and images correspond to huge amounts of data. Storing and transmitting this data requires significant memory and bandwidth, respectively. Compression of digital data has been researched for decades, and many state-of-the-art compression algorithms have been proposed to meet its storage and transmission requirements. For digital image compression, the Discrete Cosine Transform (DCT) is widely used to expose the frequencies present in an image. During the quantization step, the less significant frequencies are discarded and only the more important ones are retained. Quantization thus yields a reduced representation of the image, and compression is achieved. The image reconstructed from this reduced frequency set is an approximation of the original, so the compression is lossy. Lately, a significant amount of research has been conducted on the use of neural networks for image compression. This thesis presents a detailed literature review that analyzes the existing methods. We target a deep neural network that estimates the most important DCT coefficients of an image, and we then use these most significant coefficients for a classification task. The coefficients are estimated with a Multi-Layer Perceptron (MLP) model and a Deep Convolutional Neural Network (DCNN) model. The experiments showed promising results and revealed that the MLP models have a relatively lower error between actual and predicted values than the DCNN models. The MNIST image dataset is then applied to the proposed deep learning models to predict its most significant DCT coefficients, and the predicted coefficients are used for digit classification. The experimental results support DCT-based digit classification with an accuracy of 95%, which is quite promising.
In the future, the proposed technique could also leverage compressed images to tackle other image classification and regression problems. Moreover, the proposed deep neural networks can be further generalized to support videos and color representations.
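
The following is a minimal sketch of the general idea described in the abstract, namely that an image block can be approximated from a small set of its most significant DCT coefficients. It is not the thesis's method: the neural-network estimation step is omitted, and the function names (`dct_matrix`, `dct2`, `idct2`, `keep_largest`), the 8x8 test block, and the choice of keeping 8 coefficients are all assumptions made for illustration. The coefficients here are computed directly with an orthonormal DCT-II built in NumPy.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return c

def dct2(block):
    """2-D DCT: transform the rows and columns of the block."""
    rows, cols = block.shape
    return dct_matrix(rows) @ block @ dct_matrix(cols).T

def idct2(coeffs):
    """Inverse 2-D DCT (the basis is orthonormal, so inverse = transpose)."""
    rows, cols = coeffs.shape
    return dct_matrix(rows).T @ coeffs @ dct_matrix(cols)

def keep_largest(coeffs, k):
    """Zero every coefficient except the k largest in magnitude."""
    if k <= 0:
        return np.zeros_like(coeffs)
    flat = np.abs(coeffs).ravel()
    keep = np.argsort(flat)[-k:]     # indices of the k largest magnitudes
    mask = np.zeros(flat.shape, dtype=bool)
    mask[keep] = True
    return coeffs * mask.reshape(coeffs.shape)

# Hypothetical 8x8 block: a smooth diagonal gradient, not real image data.
x = np.arange(8)
block = np.add.outer(x, x).astype(float) * 8.0

coeffs = dct2(block)                 # full set of 64 coefficients
reduced = keep_largest(coeffs, 8)    # assumed budget of 8 coefficients
approx = idct2(reduced)              # lossy reconstruction
rmse = np.sqrt(np.mean((block - approx) ** 2))
print("non-zero coefficients kept:", np.count_nonzero(reduced))
print("reconstruction RMSE:", rmse)
```

Because the basis is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, which is why dropping only the smallest ones gives a close approximation for smooth blocks.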