Big vector spatial data storage, indexing and query processing using the NoSQL Accumulo data store

Authors

  • ELhassane Nassif, Ecole ESGIT, IAV H2, Rabat, Morocco
  • Hajji Hicham, Ecole ESGIT, IAV H2, Rabat, Morocco
  • Reda Yaagoubi, Ecole ESGIT, IAV H2, Rabat, Morocco
  • Hassan Badir, System and Data Engineering Team, Abdelmalek Essaadi University, Tangier, Morocco

Keywords:

Accumulo, big data, spatial, vector, index

Abstract

The recent advent of connected objects, the omnipresence of social networks, the success of the smartphone market, and the spread of mobile telecommunications technology all contribute to generating ever more spatial data with big data characteristics. Relational databases are widely used to store, retrieve, and manage spatial data, but they reach their limits under big data constraints. NoSQL key-value stores have emerged as an approach designed to handle big data efficiently. In this work, we focus on Accumulo, a NoSQL database built on a key-value store model, and propose a spatial version of it to handle big vector spatial data. First, we analyse Accumulo’s storage and indexing logic. Next, we design a storage and indexing model specific to vector spatial data. Finally, we apply the new model to an example of spatial queries.
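
The abstract does not detail the proposed storage and indexing model. As a purely illustrative sketch, and not the authors' design, the Python below shows one common way to index point locations in a store that keeps its keys in sorted order, as Accumulo does: quantize longitude and latitude to grid cells, interleave their bits into a Z-order (Morton) code, and use that code as the row-key prefix. The function names, the 16-bit resolution, and the `code_featureid` key layout are all assumptions made for this example.

```python
def quantize(value, lo, hi, bits=16):
    """Map a coordinate in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    ratio = (value - lo) / (hi - lo)
    return min(cells, max(0, int(ratio * cells)))


def interleave(x, y, bits=16):
    """Build a Morton code: bits of x go to even positions, bits of y to odd ones."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def row_key(lon, lat, feature_id, bits=16):
    """Hypothetical row key: zero-padded hex Morton code, then the feature id."""
    code = interleave(
        quantize(lon, -180.0, 180.0, bits),
        quantize(lat, -90.0, 90.0, bits),
        bits,
    )
    width = (2 * bits + 3) // 4  # hex digits needed for 2*bits bits
    return f"{code:0{width}x}_{feature_id}"


# Keys sort lexicographically, so features in the same grid cell share a
# prefix and end up adjacent in a sorted key-value table.
keys = sorted(
    [
        row_key(-6.84, 34.02, "rabat"),
        row_key(-5.80, 35.77, "tangier"),
        row_key(139.69, 35.68, "tokyo"),
    ]
)
```

Because the hex prefix has a fixed width, lexicographic order of the keys matches numeric order of the Morton codes, so a bounding-box query can be served by scanning a small number of key ranges. Non-point vector geometries (lines, polygons) would need an extra step, such as indexing every cell a geometry covers, which this sketch leaves out.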

Published

2022-04-01 — Updated on 2022-04-15

How to Cite

Nassif, E., Hajji, H., Yaagoubi, R., & Badir, H. (2022). Big vector spatial data storage, indexing and query processing using the NoSQL Accumulo data store. International Journal of Computer Engineering and Data Science (IJCEDS), 2(1), 17–27. Retrieved from https://ijceds.com/ijceds/article/view/29