Big vector spatial data storage, indexing and query processing using the NoSQL Accumulo data store
Keywords:
Accumulo, big data, spatial, vector, index
Abstract
The recent advent of connected objects, the omnipresence of social networks, the success of the smartphone market, and the generalization of mobile telecommunications technology all contribute to generating more and more spatial data with big data characteristics. Relational databases are widely used to store, retrieve, and manage spatial data, but they reach their limits under big data constraints. NoSQL key-value stores have emerged as an approach that better supports big data. In this work, we focus on Accumulo, a NoSQL database built on a key-value store model, and propose a spatial version of it to handle big vector spatial data. First, we analyse Accumulo's storage and indexing logic. Next, we design a storage and indexing model specific to vector spatial data. Finally, we demonstrate the new model through example spatial queries.
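The abstract does not describe the proposed storage and indexing model, so the sketch below is not the authors' design. It only illustrates one common way to index point geometries in a sorted key-value store: encode each coordinate pair as a Z-order (Morton) row key, answer a bounding-box query with a single range scan, then filter out false positives. Every name here (`row_key`, `SortedStore`, `bbox_query`, the 16-bit resolution) is a hypothetical chosen for illustration, and `SortedStore` is an in-memory stand-in, not the Accumulo client API.

```python
from bisect import bisect_left, bisect_right

BITS = 16  # bits per axis; an assumed resolution, not taken from the paper


def quantize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    return min(cells, max(0, int((value - lo) / (hi - lo) * cells)))


def interleave(x, y, bits=BITS):
    """Interleave the bits of x and y into one Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def row_key(lon, lat):
    """Fixed-width hex key, so lexicographic order matches numeric order."""
    z = interleave(quantize(lon, -180.0, 180.0), quantize(lat, -90.0, 90.0))
    return format(z, "08x")


class SortedStore:
    """In-memory stand-in for a table sorted by row key (hypothetical)."""

    def __init__(self):
        self._keys, self._values = [], []

    def put(self, key, value):
        i = bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def scan(self, start, end):
        """Return all (key, value) pairs with start <= key <= end."""
        i = bisect_left(self._keys, start)
        j = bisect_right(self._keys, end)
        return list(zip(self._keys[i:j], self._values[i:j]))


def bbox_query(store, min_lon, min_lat, max_lon, max_lat):
    """Scan between the corner keys, then drop points outside the box."""
    start, end = row_key(min_lon, min_lat), row_key(max_lon, max_lat)
    return [
        name
        for _, (name, lon, lat) in store.scan(start, end)
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


store = SortedStore()
for name, lon, lat in [
    ("Rabat", -6.84, 34.02),
    ("Casablanca", -7.59, 33.57),
    ("Paris", 2.35, 48.86),
    ("Tokyo", 139.69, 35.69),
]:
    store.put(row_key(lon, lat), (name, lon, lat))

in_box = bbox_query(store, -10.0, 30.0, 0.0, 36.0)
```

The scan is correct because a Morton code never decreases when either coordinate increases, so every point inside the box has a key between the keys of the two corners. The range can also contain points outside the box, which is why the final filter is needed. A production design would typically split the box into several tighter key ranges to reduce that overscan.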
License
Copyright (c) 2022 ELhassane Nassif, Hajji Hicham, Reda Yaagoubil, Hassan Badir
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright on any article in the International Journal of Computer Engineering and Data Science (IJCEDS) is retained by the author(s) under the Creative Commons license, which permits unrestricted use, distribution, and reproduction provided the original work is properly cited.
License agreement
Authors grant IJCEDS a license to publish the article and identify IJCEDS as the original publisher.
Authors also grant any third party the right to use, distribute and reproduce the article in any medium, provided the original work is properly cited.