Abstract:
Cloud computing has become an important part of both industry and academia
in the last few years. With the aim of providing flexible services to users
in a transparent manner, all services are located in the "Cloud", which is a
collection of software and hardware resources accessible over the internet. One
of the prevalent services provided by the Cloud is data storage. Even before
the advent of the Cloud, data storage had been established as one of the main
concerns in information technology. Network-based applications have further
extended this concept from single-server storage to distributed storage. Data security
is the foundation of information security, and research in data security in the
Cloud is still in its infancy. In addition, traditional data security systems are
not robust enough to secure data in distributed storage applications,
specifically in the Cloud computing environment. There is no effective approach to
verify that data host nodes are under complete security protection in a
distributed database environment. Furthermore, activity on the data is not
under the data owner's own control in the Cloud, which leaves the data exposed
to attackers once a data node is compromised. Therefore, the confidentiality
and integrity of data are violated when an internal or external malicious
entity controls a node.
Moreover, advancements in database technology have led to new data
storage paradigms such as NoSQL (Not only SQL) and object-oriented databases.
NoSQL data stores are non-relational databases specially designed to
provide high availability, reliability and scalability with big data processing
capabilities. Additionally, sharding is one of the main advantages of NoSQL
databases. Database sharding is a highly scalable approach for improving
the throughput and overall performance of high-transaction, large
database-centric business applications, especially those deployed on a Cloud
platform. The main idea behind sharding is to partition a database or
collection horizontally among
various nodes known as shards. When deployed on the Cloud under the Database
as a Service (DBaaS) service model, these sharded NoSQL databases face
further security and privacy challenges in addition to managing their own core
functionalities. In the DBaaS model, clients' sensitive data has to reside in the
cloud provider's domain. This calls for extra security measures on the
provider's side to guard clients' data privacy, as most big clients
(corporations, enterprises, etc.) view their data as a valuable asset.
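
The horizontal partitioning described above can be illustrated with a minimal
sketch. It assumes a simple hash-based placement rule; the function names, the
"patient_id" shard key and the shard count are hypothetical and chosen only for
illustration. This is not MongoDB's own sharding mechanism and not the
architecture proposed in this thesis.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo
    # the number of shards so the result is always a valid index.
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(documents, shard_key, num_shards):
    """Split a collection horizontally: each document goes to one shard."""
    shards = defaultdict(list)
    for doc in documents:
        index = shard_for(str(doc[shard_key]), num_shards)
        shards[index].append(doc)
    return dict(shards)


# Hypothetical collection of ten documents spread over three shards.
docs = [{"patient_id": i, "name": f"p{i}"} for i in range(10)]
shards = partition(docs, "patient_id", 3)
```

Because the placement depends only on the shard-key value, a read for a given
key can be routed to a single shard instead of being broadcast to all nodes.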
Encryption and authorization are the two primary ways of ensuring data
confidentiality in any sharded database environment. When encryption is
used to ensure data confidentiality, management of cryptographic keys
also becomes a challenging security function. These security
challenges become more complex in a Cloud environment, where data is
distributed among various physical and logical storage media, Cloud actors and
Cloud administrators. In our thesis, we have explored and addressed the
above-mentioned issues by first structuring the knowledge in the domain of
Cloud databases into a well-organized taxonomy. Secondly, we have made use
of the concept of secure NoSQL sharding to horizontally scale large amounts
of data securely among various nodes or shards on a Cloud platform. Sharded
data needs to be protected using various security controls for confidentiality
and data protection. In particular, security controls such as data
encryption, key management and access control are critical requirements of
many regulatory compliance regimes such as HIPAA/HITECH, PCI DSS, FERPA and
the EU Data Protection Directive. In this thesis, we have proposed a secure
sharding architecture for the NoSQL database MongoDB, which uses an
encryption scheme to encrypt chunks of data before saving them into the shard
while keeping track of their encryption keys efficiently. Moreover,
fine-grained access control is implemented to guarantee data integrity and
authorized access in the Cloud environment. Our model also performs automatic
management of the cryptographic keys produced and saved on our platform.
Hence, our proposed method helps NoSQL databases provide data
confidentiality to their users while maintaining high throughput due
to sharding capabilities.
Rigorous testing of our proposed architecture has also been carried out
using qualitative and quantitative criteria based on NIST and CSA (Cloud
Security Alliance) standards. These standards are used to analyze the security
and functionality aspects of our system. The functionality of the proposed
system is certified through user-defined test cases, while the security
analysis is done using the CSET tool (Cyber Security Evaluation Tool). The
results of our evaluation have confirmed a significant increase in the
security and functionality of the proposed system.