Authors:
Ricardo Jimenez 1; Marta Patino 2; Valerio Vianello 2; Ivan Brondino 1; Ricardo Vilaca 1; Jorge Teixeira 3; Miguel Biscaia 3; Giannis Drossis 4; Damien Michel 4; Chryssi Birliraki 4; George Margetis 4; Antonis Argyros 4; Constantine Stephanidis 4; Luigi Sgaglione 5; Gaetano Papale 5; Giavanni Mazzeo 5; Ferdinando Campanile 6; Marc Sole 7; Victor Muntés-Mulero 7; David Solans 7; Alberto Huelamo 7; Pavlos Kranas 8; Dora Varvarigou 8; Vrettos Moulos 8 and Fotis Aisopos 8
Affiliations:
1 LeanXcale, Spain;
2 Universidad Politecnica de Madrid, Spain;
3 Altice Labs, Portugal;
4 Institute of Computer Science, Foundation for Research and Technology Hellas & Computer Science Department, University of Crete, Greece;
5 University of Naples “Parthenope”, Italy;
6 Sync Lab srl, Italy;
7 CA Technologies, Spain;
8 National Technical University of Athens & ICCS, Greece
Keyword(s):
Big Data, Real-Time Big Data, SQL, OLTP, OLAP.
Abstract:
One of the major problems in enterprise data management lies in the
separation of databases between operational databases and data warehouses. This
separation is motivated by the different capabilities of OLTP and OLAP data management
systems. Because of this separation, data must be copied periodically from the
operational databases to the data warehouses. These copies are performed
by a process called Extract-Transform-Load (ETL), which turns out to account for
80% of the budget of performing business analytics. LeanBigData's main goal has
been to address this major pain point by providing a real-time big data platform
that offers both functions, OLTP and OLAP, in a single data management solution. The
way to achieve this goal has been to leverage an ultra-scalable OLTP database,
LeanXcale, and develop a new OLAP engine that works directly over the operational
data. The platform is based on a novel storage engine that provides extreme
levels of efficiency. The platform also has an integrated parallel-distributed
complex event processing (CEP) engine
that scales the processing of streaming data and that can be combined with the
processing of data at rest in the new OLTP+OLAP database to address a wide
variety of data management problems. LeanBigData has a bigger vision and aims
at providing an end-to-end analytics platform. This platform provides a visual
workbench that enables data scientists to discover new insights. The
platform is also enriched with a subsystem for anomaly detection and
root cause analysis that works with the newly developed system and can
perform this analysis over streaming data. The LeanBigData platform has been
validated in four real-world use case scenarios: cloud data centre monitoring,
fraud detection in direct debit operations, sentiment analysis in social networks,
and targeted advertisement.