Database/Business Intelligence/Datawarehousing: NoSQL Introductory

NoSQL

Data volumes grow daily, driven by heavy use of mobile, social, and cloud computing resources. As a result, data management requirements increasingly go beyond the effective scope of traditional relational databases.

Companies are therefore turning to alternatives such as NoSQL and Hadoop: NoSQL to build operational applications that drive their business through systems of engagement, and Hadoop to build applications that analyze their data retrospectively and help deliver powerful insights.

MongoDB

It is written in C++.

MongoDB has aimed for a balanced approach suited to a wide variety of applications. While the functionality is close to that of a traditional relational database, MongoDB allows users to capitalize on the benefits of cloud infrastructure with its horizontal scalability and to easily work with the diverse data sets in use today thanks to its flexible data model.
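To make the idea of a flexible data model concrete, here is a minimal sketch in plain Python. It does not use the MongoDB API; the collection, the field names, and the two helper functions are made up for illustration. It only shows that documents in one collection need not share the same fields.

```python
# Plain-Python illustration of a flexible document model (not the MongoDB API).
# The collection and field names below are hypothetical.
customers = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"]},
    {"_id": 3, "name": "Chen", "address": {"city": "Pune", "zip": "411001"}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def find_with_field(collection, field):
    """Return documents that contain the given field at all."""
    return [doc for doc in collection if field in doc]
```

Each document carries its own structure, so a new field can be added to some records without altering the others.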

MongoDB is mainly designed for OLTP workloads. It can do complex queries, but it’s not necessarily the best fit for reporting-style workloads, and it’s not a good choice if you need complex transactions. However, MongoDB’s simplicity makes it a great place to start.

Cassandra

It is written in Java.

There are at least two kinds of database simplicity: development simplicity and operational simplicity. While MongoDB rightly gets credit for an easy out-of-the-box experience, Cassandra earns full marks for being easy to manage at scale.

As for adding capacity to a cluster, “You simply boot up a new machine and tell Cassandra where the other nodes are, and it takes care of the rest.”

This ease of scaling, coupled with exceptional write performance (“All you’re doing is appending to the end of a log file”) and predictable query performance, adds up to a high-performance workhorse in Cassandra.
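The “appending to the end of a log file” idea can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra’s actual implementation: the class name, the in-memory list standing in for the log file, and the dictionary standing in for the in-memory table are all assumptions made for this example.

```python
class AppendOnlyStore:
    """Toy write path: every write is an append, never an in-place update."""

    def __init__(self):
        self.log = []        # stands in for the on-disk log file
        self.memtable = {}   # in-memory view of the latest value per key

    def write(self, key, value):
        # Appending is cheap: no existing record is located or rewritten.
        self.log.append((key, value))
        self.memtable[key] = value

    def read(self, key):
        return self.memtable.get(key)

    @classmethod
    def recover(cls, log):
        """Rebuild the in-memory state by replaying a log from the start."""
        store = cls()
        for key, value in log:
            store.write(key, value)
        return store


store = AppendOnlyStore()
store.write("user:1", "asha")
store.write("user:2", "ben")
store.write("user:1", "asha k")   # an update is just another append
```

Because an update is only another appended record, the write cost does not depend on how much data is already stored, and the log can be replayed to rebuild state after a restart.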

The replication and read and write paths are purposefully simple. You can learn the core internals of Cassandra in a few hours. That can bring a lot of confidence as you deploy new technology because there are fewer “black box” details that introduce complex failure modes.

This means that the price of admission to effective Cassandra development is understanding the data model and how it will work with your application.

CQL3 is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs and only limited aggregate functions).

MapReduce is possible through integration with Apache Hadoop.

All nodes play the same role, as opposed to Hadoop/HBase, which rely on master nodes.
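Since CQL has no JOINs, a common way to work with the data model is to store data already shaped for the query that will read it. The plain-Python sketch below illustrates that idea only; it is not CQL or a Cassandra driver, and the sample data, the `orders_by_user` structure, and the `orders_for` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, user_name, order_id, item)
orders = [
    (1, "asha", 101, "keyboard"),
    (1, "asha", 102, "monitor"),
    (2, "ben", 103, "mouse"),
]

# A "table" keyed by the value the query filters on. The user name is copied
# into every row so that reading needs no join against a separate users table.
orders_by_user = defaultdict(list)
for user_id, user_name, order_id, item in orders:
    orders_by_user[user_id].append(
        {"user_name": user_name, "order_id": order_id, "item": item}
    )


def orders_for(user_id):
    """Single-key lookup; all data the query needs is stored together."""
    return orders_by_user.get(user_id, [])
```

The trade-off is duplicated data (the user name appears in every order row) in exchange for reads that touch a single key.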

HBase

It is written in Java.

HBase, like Cassandra, is a column-oriented key-value store, and it gets a lot of use in large part because of its common pedigree with Hadoop. HBase provides a record-based storage layer that enables fast, random reads and writes to data, complementing Hadoop, which emphasizes high throughput at the expense of low-latency I/O. Changes are efficiently cataloged in memory for fast access while the data is persisted to HDFS. This design enables a Hadoop-based EDH [enterprise data hub] to serve random reads and writes to users and applications in real time, yet still enjoy the fault tolerance and durability of HDFS.

HBase’s roots as an open source implementation of Google’s Bigtable translate into the database being highly scalable by design.

Because it can utilize the storage, memory, and CPU resources of any number of servers, and has scale-out features like automatic sharding, HBase can keep scaling as load and performance demands increase simply by adding server nodes. HBase was designed from the ground up to provide optimal performance when consistency is critical.
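Automatic sharding can be pictured as splitting a sorted key range into pieces once a piece grows too large. The Python sketch below is a toy model under that assumption, not HBase code: the class name, the `max_rows_per_region` threshold, and the split-in-half rule are invented for illustration.

```python
import bisect


class RangeShardedTable:
    """Toy table that splits a sorted key range when it grows too large."""

    def __init__(self, max_rows_per_region=4):
        self.max_rows = max_rows_per_region
        # Each region is a sorted list of (row_key, value) pairs;
        # regions themselves are kept in key order.
        self.regions = [[]]

    def _region_index(self, row_key):
        # Start keys of every region after the first one.
        starts = [region[0][0] for region in self.regions[1:]]
        return bisect.bisect_right(starts, row_key)

    def put(self, row_key, value):
        idx = self._region_index(row_key)
        region = self.regions[idx]
        keys = [k for k, _ in region]
        pos = bisect.bisect_left(keys, row_key)
        if pos < len(region) and region[pos][0] == row_key:
            region[pos] = (row_key, value)      # overwrite existing row
        else:
            region.insert(pos, (row_key, value))
        if len(region) > self.max_rows:
            # Split the oversized region into two halves.
            mid = len(region) // 2
            self.regions[idx:idx + 1] = [region[:mid], region[mid:]]

    def get(self, row_key):
        region = self.regions[self._region_index(row_key)]
        keys = [k for k, _ in region]
        pos = bisect.bisect_left(keys, row_key)
        if pos < len(region) and region[pos][0] == row_key:
            return region[pos][1]
        return None
```

In a real cluster each piece could then be served by a different node, which is what lets capacity grow by adding servers; this sketch keeps everything in one process to stay short.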

But scale isn’t its only utility. “Thanks to its tight integration with the rest of the Hadoop ecosystem, data is readily available to users and applications via SQL queries (using Cloudera Impala, Apache Phoenix, or Apache Hive) or even faceted free-text search (using Cloudera Search).” Thus, HBase gives developers a way to leverage existing expertise with SQL while building on a more modern, distributed database.

Limitations of NoSQL

Many NoSQL alternatives are still at a nascent or pre-production stage, and many key features are yet to be implemented.

Customer support is also generally better for established RDBMS products, whose vendors provide a higher level of enterprise support. In contrast, support for NoSQL systems is often provided by small start-up companies.

NoSQL systems offer few facilities for ad-hoc query and analysis. It is much easier to write an SQL query, and commonly used BI tools often do not provide connectivity to NoSQL.

Friday, February 19, 2016