Fuzzy Keyword Search overEncrypted Data in Cloud Computing Pranjuli Yavatkar, NikitaPatil, Sneha KaleDepartment of ComputerEngineering, Bharati Vidyapeeth College of Engineering, Navi Mumbai,Maharashtra  Abstract -As Cloud Computing is highly dominating technology in recent days, entiresensitive information is being stored onto the cloud.

For maintaining data confidentiality,sensitive data are generally encrypted, which makes effective data utilizationa very complex task. The Existing searchable encryption schemes provides a wayfor secure search over encrypted data using keywords and retrieving thenecessary files. Whereas these techniques support only exact keyword search.That is, there is no acceptance of slight typos and format inconsistencieswhich are typical user searching behavior. Because of this drawback, theexisting techniques becomes incompatible in cloud computing, affecting thesystem usability. This makes the user searching experiences very frustratingand results in low system efficiency. This paper includes the formalization andsolution of the problem of effective fuzzy keyword search over encrypted clouddata as well as preserving keyword privacy. Fuzzy keyword search helps toenhance the system usability by generating the matching files when users’searching inputs exactly match the predefined keywords or the closest possiblematching files based on keyword similarity semantics, when exact match fails.

 KEYWORDS: Encryption, Fuzzy Keyword, Cloud Computing I.                   INTRODUCTION As Cloud Computingis highly dominating technology in recent days, entire sensitive information isbeing stored onto the cloud, such as emails, healthrecords, government documents, personal data etc. By putting away theirinformation into the cloud, the data owners can be relieved from the burden ofdata storage and maintenance so as to enjoy the on-demand high quality datastorage service. However, the cloud server may no longer be fully trusted. The sensitivedata usually should be encrypted prior to outsourcing for data privacy andpreventing unsolicited accesses. However, data encryption makes effective datautilization a very challenging task given that there could be a large amount ofoutsourced data files.

Besides, in Cloud Computing, information owners mayshare their outsourced information with countless. The individual users mightwant to only retrieve certain specific data files they are interested in duringa given session. One of the most popular ways is to selectively retrieve filesthrough keyword-based search instead of retrieving all the encrypted files backwhich is completely impractical in cloud computing scenarios. Suchkeyword-based search technique allows users to selectively retrieve files ofinterest and has been widely applied in plaintext search scenarios, such asGoogle search. Unfortunately, data encryption restricts user’s ability toperform keyword search and thus makes the traditional plaintext search methodsunsuitable for Cloud Computing. Besides this, data encryption also demands theprotection of keyword privacy since keywords usually contain importantinformation related to the data files. Although encryption of keywords can protectkeyword privacy, it further renders the traditional plaintext search techniquesuseless in this scenario. To securely search over encrypted data, searchableencryption techniques have been developed in recent years.

Best services for writing your paper according to Trustpilot

Premium Partner
From $18.00 per page
4,8 / 5
4,80
Writers Experience
4,80
Delivery
4,90
Support
4,70
Price
Recommended Service
From $13.90 per page
4,6 / 5
4,70
Writers Experience
4,70
Delivery
4,60
Support
4,60
Price
From $20.00 per page
4,5 / 5
4,80
Writers Experience
4,50
Delivery
4,40
Support
4,10
Price
* All Partners were chosen among 50+ writing services by our Customer Satisfaction Team

Searchableencryption schemes usually build up an index for each keyword of interest andassociate the index with the files that contain the keyword. By integrating thetrapdoors of keywords within the index information, effective keyword searchcan be realized while both file content and keyword privacy are well-preserved.Although allowing for performing searches securely and effectively, theexisting searchable encryption techniques do not suit for cloud computingscenario since they support only exact keyword search. That is, there is no toleranceof minor typos and format inconsistencies.It is quite common that users’ searching input mightnot exactly match those pre-set keywords due to the possible typos,representation inconsistencies, and/or her lack of exact knowledge about thedata. The naive way to support fuzzy keyword search is through simple spellcheck mechanisms. However, this approach does not completely solve the problemand sometimes can be ineffective due to the following reasons: on the one hand,it requires additional interaction of user to determine the correct word fromthe candidates generated by the spell check algorithm, which unnecessarilycosts user’s extra computation effort; on the other hand, in case that useraccidentally types some other valid keywords by mistake (for example, searchfor “hat” by carelessly typing “cat”), the spell check algorithm would not evenwork at all, as it can never differentiate between two actual valid words.Thus, the drawbacks of existing schemes signifies the important need for new techniquesthat support searching flexibility, tolerating both minor typos and formatinconsistencies.

In this paper, we focus on enabling effective yetprivacy-preserving fuzzy keyword search in Cloud Computing. To the best of ourknowledge, we formalize for the first time the problem of effective fuzzykeyword search over encrypted cloud data while maintaining keyword privacy.Fuzzy keyword search greatly enhances system usability by returning thematching files when users’ searching inputs exactly match the predefinedkeywords or the closest possible matching files based on keyword similaritysemantics, when exact match fails. More specifically, we use edit distance toquantify keywords similarity and develop a novel technique, i.e., awildcard-based technique, for the construction of fuzzy keyword sets. Thistechnique eliminates the need for enumerating all the fuzzy keywords and theresulted size of the fuzzy keyword sets is significantly reduced. Based on theconstructed fuzzy keyword sets, we propose an efficient fuzzy keyword searchscheme.

 II.                RELATED WORK Plaintext fuzzykeyword search: Recently, the importance of fuzzysearch has received attention in the context of plaintext searching ininformation retrieval community 11–13. They addressed this problem in thetraditional information access paradigm by allowing user to search withoutusing try-and-see approach for finding relevant information based onapproximate string matching.

At the first glance, it seems possible for one todirectly apply these string matching algorithms to the context of searchableencryption by computing the trapdoors on a character base within an alphabet.However, this trivial construction suffers from the dictionary and statisticsattacks and fails to achieve the search privacy. Searchableencryption: Traditional searchable encryption hasbeen widely studied in the context of cryptography.

Among those works, most arefocused on efficiency improvements and security definition formalizations. Thefirst construction of searchable encryption was proposed by Song et al., inwhich each word in the document is encrypted independently under a specialtwo-layered encryption construction. Goh proposed to use Bloom filters toconstruct the indexes for the data files. To achieve more efficient search,Chang et al. and Curtmola et al.

both proposed similar “index” approaches,where a single encrypted hash table index is built for the entire filecollection. In the index table, each entry consists of the trapdoor of akeyword and an encrypted set of file identifiers whose corresponding data filescontain the keyword. As a complementary approach, Boneh et al. presented apublic-key based searchable encryption scheme.

Note that all these existingschemes support only exact keyword search, and thus are not suitable for CloudComputing. III.             PROBLEM FORMULATION A. System Model:    In this paper, weconsider a cloud data system consisting of data owner, data user and cloudserver. Given a collection of n encrypted data files C = (F1, F2,..

., FN)stored in the cloud server, a predefined set of distinct keywords W = {w1, w2,…, wp}, the cloud server provides the search service for the authorized usersover the encrypted data C.

We assume the authorization between the data ownerand users is appropriately done. An authorized user types in a request toselectively retrieve data files of his/her interest. The cloud server isresponsible for mapping the searching request to a set of data files, whereeach file is indexed by a file ID and linked to a set of keywords. The fuzzykeyword search scheme returns the search results according to the followingrules: 1) if the user’s searching input exactly matches the pre-set keyword,the server is expected to return the files containing the keyword1; 2) if thereexist typos and/or format inconsistencies in the searching input, the serverwill return the closest possible results based on pre-specified similaritysemantics. An architecture of fuzzy keyword search is shown in the Fig. 1. B.

Threat Model  We consider asemi-trusted server. Even though data files are encrypted, the cloud server maytry to derive other sensitive information from users’ search requests whileperforming keyword-based search over C. Thus, the search should be conducted ina secure manner that allows data files to be securely retrieved while revealingas little information as possible to the cloud server. More specifically, it isrequired that nothing should be leaked from the remotely stored files and indexbeyond the outcome and the pattern of search queries. C.Design Goals  In this paper, weaddress the problem of supporting efficient yet privacy-preserving fuzzykeyword search services over encrypted cloud data. Specifically, we have thefollowing goals: i) to explore new mechanism for constructing storage efficientfuzzy keyword sets; ii) to design efficient and effective fuzzy search schemebased on the constructed fuzzy keyword sets; iii) to validate the security ofthe proposed scheme. IV.

CONSTRUCTIONS OF EFFECTIVEFUZZY KEYWORD SEARCH IN CLOUD The key idea behindour secure fuzzy keyword search is two-fold: 1) building up fuzzy keyword setsthat incorporate not only the exact keywords but also the ones differingslightly due to minor typos, format inconsistencies, etc.; 2) designing anefficient and secure searching approach for file retrieval based on the resultedfuzzy keyword sets. A.Advanced Technique for Constructing Fuzzy Keyword Sets  To provide morepractical and effective fuzzy keyword search constructions with regard to bothstorage and search efficiency, we now propose an advanced technique to improvethe straightforward approach for constructing the fuzzy keyword set. Withoutloss of generality, we will focus on the case of edit distance d = 1 toelaborate the proposed advanced technique. For larger values of d, thereasoning is similar. Note that the technique is carefully designed in such away that while suppressing the fuzzy keyword set, it will not affect the searchcorrectness.

 Wildcard-basedFuzzy Set Construction  In the abovestraightforward approach, all the variants of the keywords have to be listedeven if an operation is performed at the same position. Based on the aboveobservation, we proposed to use a wildcard to denote edit operations at thesame position. The wildcard-based fuzzy set of wi with edit distanced is denoted as S wi,d ={S’wi,0, S’wi,2, ··· ,S’wi,d }, where S’ wi ,?  denotes the set of words wi with ? wildcards.

Note each wildcard represents an edit operation on wi. For example,for the keyword CASTLE with the pre-set edit distance 1, its wildcard-basedfuzzy keyword set can be constructed as SCASTLE,1 = {CASTLE, *CASTLE, *ASTLE,C*ASTLE, C*STLE, ··· , CASTL*E, CASTL*, CASTLE*}. The total number of variantson CASTLE constructed in this way is only 13 + 1, instead of 13 × 26 + 1 as inthe above exhaustive enumeration approach when the edit distance is set to be1. Generally, for a given keyword wi with length l,the size of S wi,1 will be only 2 l +1+1,as compared to (2 l + 1) × 26 + 1 obtained in thestraightforward approach. The larger the pre-set edit distance, the morestorage overhead can be reduced: with the same setting of the example in the straightforwardapproach, the proposed technique can help reduce the storage of the index from30GB to approximately 40MB.

In case the edit distance is set to be 2 and 3, thesize of S wi,2 and S wi,3 will be C1 l +1+C1 l ·C1 l+2C2 l +2 andC1 l+ C3 l + 2C2 l +2C2 l· C1 l . In other words, the number is only O( l d)for the keyword with length land edit distance d.  B.

The Efficient Fuzzy Keyword Search Scheme  Based on thestorage-efficient fuzzy keyword sets, we show how to construct an efficient andeffective fuzzy keyword search scheme. The scheme of the fuzzy keyword searchgoes as follows: 1) To build an index for wi with edit distance d, the dataowner first constructs a fuzzy keyword set Swi,d using the wildcardbased technique. Then he computes trapdoor set {Tw’i} for each w’i? Swi,d with a secret key sk shared between data owner and authorized users.The data owner encrypts FIDwi as Enc(sk, FIDwi  || wi). The index table {({ Tw’i}w’i? Swi,d , Enc(sk, FIDwi || wi ))} wi ?W and encrypted data files areoutsourced to the cloud server for storage;2) To search with(w, k), the authorized user computes the trapdoor set {Tw’}w’? Sw,k , where Sw,k isalso derived from the wildcard-based fuzzy set construction.

He then sends {Tw’}w’? Sw,k to the server; 3) Upon receivingthe search request {Tw’}w’ ? Sw,k, the servercompares them with the index table and returns all the possible encrypted fileidentifiers {Enc(sk, FIDwi || wi)} according to the fuzzykeyword definition in section III-D. The user decrypts the returned results andretrieves relevant files of interest. In thisconstruction, the technique of constructing search request for w is the same asthe construction of index for a keyword. As a result, the search request is atrapdoor set based on Sw,k , instead of a single trapdoor as in thestraightforward approach.

In this way, the searching result correctness can beensured. V. CONCLUSION In this paper, we formalize and solve the problem ofsupporting efficient yet privacy-preserving fuzzy search for achievingeffective utilization of remotely stored encrypted data in Cloud Computing. Wedesign an advanced technique (i.e., wildcard-based technique) to construct thestorage-efficient fuzzy keyword sets by exploiting a significant observation onthe similarity metric of edit distance. Based on the constructed fuzzy keywordsets, we further propose an efficient fuzzy keyword search scheme. Throughrigorous security analysis, we show that our proposed solution is secure andprivacy-preserving, while correctly realizing the goal of fuzzy keyword search.

 REFERENCES1 D. Song, D. Wagner, and A. Perrig, “Practicaltechniques for searches on encrypted data,” in Proc. of IEEE Symposium onSecurity and Privacy’00, 2000. 2 E.

-J. Goh, “Secure indexes,” Cryptology ePrintArchive, Report 2003/216, 2003, http://eprint.iacr.

org/.3 Fuzzy keyword search over encrypted data in cloudcomputing”, Illinois Institute of Technology, ISSN: 2321- 8134.4 Y.-C. Chang and M.

Mitzenmacher, “Privacypreserving keyword searches on remote encrypted data,” in Proc. of ACNS’05,2005. 5 D. Boneh, G.

D. Crescenzo, R. Ostrovsky, and G.Persiano, “Public key encryption with keyword search,” in Proc. of EUROCRYP’04,2004.

6 R. Curtmola, J. A. Garay, S. Kamara, and R.Ostrovsky, “Searchable symmetric encryption: improved definitions and efficientconstructions,” in Proc.

of ACM CCS’06, 2006. 7 D. Boneh and B.

Waters, “Conjunctive, subset, andrange queries on encrypted data,” in Proc. of TCC’07, 2007, pp. 535–554. 8 REVIEW PAPER ON FUZZY SEARCH OVER ENCRYPTED DATAIN CLOUD COMPUTING by Neel Gala ISSN:2393-98429 F. Bao, R. Deng, X. Ding, and Y.

Yang, “Privatequery on encrypted data in multi-user settings,” in Proc. of ISPEC’08, 2008. 10 C. Li, J.

Lu, and Y. Lu, “Efficient merging andfiltering algorithms for approximate string searches,” in Proc. of ICDE’08,2008.