The need for Big Data has become apparent in recent years, as the volume of data stored in data warehouses has grown exponentially and it became clear that these large volumes of data can be mined for valuable knowledge that translates into profit. Large organizations in Israel and overseas have realized the advantage held by access to data and the ability to analyze it in real time.
Enormous volumes of data are collected every day and every hour in the worlds of mobile, internet, finance, gaming, social networks, media, meter reading and biotechnology. Many customers whose data warehouses were built on conventional databases found they could not handle massive load runs while simultaneously serving queries - or, in some cases, either task on its own. The solution proposed to these customers - buy larger servers, expensive disk arrays, additional licensing and costly database-administration staff - has reached its limits. (And we have not even mentioned energy consumption and environmental concerns.)
On top of that came the need to handle other, equally important requirements: survivability, high availability, alternate sites, ever-expanding maintenance, scalability, constant query optimization, history handling and aggregation. The whole field of Business Intelligence (BI) has become more complex, with many queries running for long - sometimes never-ending - periods of time.
Professor Michael Stonebraker, who is behind many of today's best-known database systems, identified a fundamental problem: database architectures conceived over 30 years ago are not up to 21st-century needs. Rather than relying on any existing technology, he designed the Vertica database from scratch to address these issues - keeping administration simple and operating and management costs low - and to meet the demands of the coming decade.
The Vertica database was created as an infrastructure that lets customers grow linearly in data volume, number of users and computing power. The product was designed as a grid of standard servers with local disks, with scalability achieved simply by adding servers to the system. It uses a shared-nothing configuration with no master server: any server can run loads and queries, and the failure of one or more servers does not bring the system down (built-in high availability).
Vertica is a column-oriented database that uses standard ANSI SQL with GIS extensions and advanced analytical functions. The system works with all existing ETL and BI tools and integrates directly with frameworks such as Hadoop. Data compression is built in - data stays compressed even while queries run - typically reducing data size by 80%-90%.
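To give an intuition for why column-oriented storage compresses so well, here is a toy sketch in Python. It is not Vertica's actual codec - just a simple run-length encoding (RLE) of a sorted column, the kind of repetitive data a columnar store sees when values of one column are stored contiguously:

```python
# Illustrative sketch only, NOT Vertica's implementation: a toy run-length
# encoder showing why a sorted, column-oriented layout compresses well.

def rle_encode(column):
    """Compress a list of values into (value, run_length) pairs."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            # Extend the current run of identical values.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # Start a new run.
            runs.append((value, 1))
    return runs

# A hypothetical sorted "country" column, stored contiguously as a
# columnar engine would store it:
column = ["DE"] * 500 + ["IL"] * 300 + ["US"] * 200

encoded = rle_encode(column)
print(encoded)   # [('DE', 500), ('IL', 300), ('US', 200)]
print(f"{len(column)} values -> {len(encoded)} runs")
```

Because each column is stored and sorted separately, long runs of identical values are common, and queries can often operate on the compressed runs directly instead of the raw values.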
Vertica's biggest advantage is its very high data-loading rate (over 9 TB per hour), with query execution up to 1,000 times faster than other databases (including Count Distinct queries) and, most importantly, the ability to do both at once: loading has no effect on running queries.
The issues described above are a sore point for many customers in Israel whose database technology prevents them from moving forward, leaving the whole enterprise behind the times. Vertica is a leading column-oriented database in the international market, with over 600 world-leading customers. It may fairly be said that Vertica has fully met customer expectations - even for very large customers, some with 1-10 petabytes of data loaded in real time while analytics and other queries are running.
Any company wishing to migrate an existing system into the world of Big Data should know that the migration may be simple or complex, depending on the configuration of the current data warehouse. Migrating the system as-is can be a mistake, as it may negate the new system's advantages. ETL and staging should be carried out outside the data warehouse, and column-oriented systems deliver their enormous benefits only when they are properly designed and used, with the schema optimized for Big Data.
We'd be happy to meet and discuss Vertica and Big Data - Twingo's expert team!