Abstract:
The ongoing integration of the Internet of Things (IoT) with evolving wireless communication
technology leads to the generation of multimodal data. The rising popularity of cloud
computing is instrumental in providing on-demand and ubiquitous storage and computation
resources for handling this multimodal Big Data. The growing acceptance of cloud
computing results from its inherent flexibility and the economic advantages gained through outsourcing. This inclination is not a matter of preference but a deliberate design decision for
various organizational and personal-level applications. Nevertheless, in the context of cloud
computing, privacy and security emerge as significant concerns when
individuals and organizations entrust their private data to public cloud servers for access by
authenticated users. The cloud server is typically viewed as an untrusted entity, leading to
potential data exposure risks to third parties or even the cloud service provider.
While searchable encryption, as a cryptographic primitive, has effectively addressed security concerns by enabling searches on encrypted data, it introduces certain limitations.
Firstly, it imposes a challenge for non-expert users by requiring precise recall of the keyword. Secondly, in today’s multimodal data scenario, this primitive operates exclusively on a
single data type. Deep learning natural language processing (NLP) models can enable
semantic-aware searches, interpreting queries based on context and intent rather than just
exact keywords, even for those without domain expertise. However, contemporary schemes
integrating deep learning-based models into searchable encryption (SE) predominantly operate
within single-owner/single-user settings. This prevalent approach significantly restricts the
adaptability and versatility of cloud storage solutions. At the same time, a solidified SE
framework for multimodal data remains a challenging issue in the multi-owner/multi-user (M/M)
setting.
This thesis presents a generalized and solidified semantic-aware SE framework using
attribute-based encryption. The proposed framework enables information retrieval from
search queries formulated by laypersons who lack specialized domain knowledge. It is the first to achieve the M/M setting with a deep learning model, using a secure
transfer learning technique. Furthermore, by choosing the deep learning model according to
the underlying data structure, our proposed framework can seamlessly transition between
structured (i.e., text/documents) and unstructured (i.e., graph and image data) data.
Based on the proposed framework, this thesis presents four SE schemes. The first proposed
SE scheme addresses domain-specific jargon and conceptual overlap between
words, which ordinarily require expert interpretation to search. It utilizes the deep learning-based Doc2Vec model with access control and captures the intuition common to data users behind
their search queries in the multi-user setting. Considering the ability of graph-structured data to
model complicated structural data, the second scheme solves the problem of structure-aware
full graph similarity search in privacy-preserving cloud computing for the first time, with
access control, by utilizing the neural Graph2Vec model. Our third solution deploys a convolutional neural network (CNN)
for better accuracy and fine-grained access control, providing a secure mechanism to ensure
an identical feature space for image users in the verifiable multi-owner multi-user setting.
Our last solution leverages the capabilities of blockchain’s smart contract mechanism to establish a multi-attribute authority SE scheme. Integrating smart contracts avoids dependence
on a single trusted entity within an attribute-based encryption (ABE) infrastructure. Instead, it facilitates the consensus-based generation of user private keys and system-wide global parameters in a mutual distrust
scenario (M/M setting).