The String Similarity Query Processing in Cloud Computing System

LiaoYuanLai

Source information

Security Engineering Research Support Center (IJGDC), International Journal of Grid and Distributed Computing, Vol.8 No.2, 2015.04, pp.25-36

Abstract

This paper targets string similarity search in cloud systems. Existing work focuses on query processing within a single server, which incurs main-memory and external-memory overflow when dealing with big data. To address these problems, the paper proposes a distributed index that supports string similarity search in cloud environments. To provide efficient search within a single node, an external-memory index is designed that adopts multiple filtering techniques and optimization strategies. This external-memory-resident index supports the length filter and the positional filter on disk, and the paper proposes a construction method for it. During query processing, asymmetric q-grams are used to reduce the number of inverted lists read from disk. An adaptive algorithm is given to choose inverted lists and to seek a tradeoff between two aspects of query cost. The global index partitions the entire string dataset according to the content of the strings, using a proposed char vector space partition method. Under this method, similar strings are assigned to the same computing node, so the number of computing nodes involved in a single query is reduced. The partition method is also used to determine the set of computing nodes that a query must access. Simulation results validate the efficiency and effectiveness of the proposed index.
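
The length and positional filters named in the abstract can be illustrated with a small in-memory sketch. It is not the paper's index. It assumes edit distance as the similarity measure, a threshold `tau`, and plain (symmetric) q-grams with `q = 2`. It uses a Python dictionary in place of the disk-resident inverted lists. All function names are hypothetical. The count bound used is the standard q-gram lemma: two strings within edit distance `tau` share at least max(|s|, |query|) - q + 1 - q * tau positional q-grams.

```python
from collections import defaultdict


def positional_qgrams(s, q=2):
    """Return (position, gram) pairs for every length-q substring of s."""
    return [(i, s[i:i + q]) for i in range(len(s) - q + 1)]


def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def build_index(strings, q=2):
    """Inverted index (assumed in memory): gram -> list of (string id, position)."""
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for pos, gram in positional_qgrams(s, q):
            index[gram].append((sid, pos))
    return index


def search(query, strings, index, tau=1, q=2):
    """Return the strings within edit distance tau of query, in dataset order."""
    counts = defaultdict(int)
    for qpos, gram in positional_qgrams(query, q):
        for sid, pos in index.get(gram, ()):
            if abs(len(strings[sid]) - len(query)) > tau:
                continue  # length filter: lengths differ by more than tau
            if abs(pos - qpos) > tau:
                continue  # positional filter: gram shifted by more than tau
            counts[sid] += 1

    # If the smallest possible count bound is positive, only strings that share
    # at least one gram can qualify; otherwise every string must be checked.
    if len(query) - q + 1 - q * tau > 0:
        candidates = sorted(counts)
    else:
        candidates = range(len(strings))

    results = []
    for sid in candidates:
        s = strings[sid]
        if abs(len(s) - len(query)) > tau:
            continue
        bound = max(len(s), len(query)) - q + 1 - q * tau  # count filter
        if counts.get(sid, 0) >= bound and edit_distance(query, s) <= tau:
            results.append(s)
    return results
```

A call such as `search("cloud", strings, build_index(strings), tau=1)` verifies with the full edit-distance computation only those strings that survive the three filters. Counting every matching pair of positions may overcount when a gram repeats, which keeps the filter safe: it can admit extra candidates but never discards a true answer.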

Author information

LiaoYuanLai, Heyuan Polytechnic, HeYuan 517000, China

Data provided by: Naver Academic Information
