Overview: NoSQL databases are a popular topic in the software industry and are widely adopted, but they are NOT a replacement for the traditional relational database management system (RDBMS), which stores data in relational tables. Put simply, NoSQL exists to fill the gaps found in traditional RDBMS.
In this article, I will discuss the NoSQL database and its various aspects.
Introduction: NoSQL, usually interpreted as ‘Not only SQL’, is a class of database that stores and retrieves data in a manner different from the traditional RDBMS, which depends heavily on tabular relations. The approach gained acceptance for the following reasons –
- Design Simplicity/Performance – In NoSQL the data structure is typically simple, such as key-value pairs, rather than a set of normalized tables linked by joins. Because the structure is simple and easy to manage, NoSQL can be faster than its relational counterparts for the access patterns it is designed for, so performance is a major differentiator.
- Horizontal Scalability – NoSQL database implementations can be scaled out or in easily, by adding or removing servers as and when required.
So the two most influential factors behind NoSQL databases are ‘Performance’ and ‘Scalability’. NoSQL databases are designed to address the drawbacks of the relational model.
Different types of NoSQL database:
There are different types of NoSQL databases available in the market. Let us have a look to get an idea.
- Key Value paired database – This is the simplest and one of the most commonly used types of NoSQL database. Each item is stored as an attribute name, called the key, together with its value, so every entry is basically a key-value pair.
- Graph Stores – This category of NoSQL is used to store information about networks, such as social networking data. Examples include Neo4J and HyperGraphDB.
- Document Database – This is an extended form of the key value paired database in which every key is associated with a complex data structure known as a document. Documents can contain key-value pairs or even nested documents.
- Wide Column storage – These databases are optimized for queries over large data sets and store columns of data together instead of rows. Examples include Cassandra and HBase.
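To make the four categories more concrete, here is a minimal sketch that uses plain Python structures to stand in for each data model. It does not use any real database client, and every key, name and value in it is invented for illustration.

```python
# Illustrative only: plain Python structures standing in for each NoSQL data model.
# All keys, names and values below are made up for the example.

# Key-value: a value looked up by a single key.
key_value_store = {"user:42": "Alice"}

# Document: the value is a structured, possibly nested, document.
document_store = {
    "user:42": {
        "name": "Alice",
        "address": {"city": "Pune", "zip": "411001"},
        "phones": ["555-0100", "555-0101"],
    }
}

# Wide column: each row key maps to column families, each holding its own columns.
wide_column_store = {
    "user:42": {
        "profile": {"name": "Alice", "city": "Pune"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph: nodes plus labelled edges between them.
graph_nodes = {"alice": {"type": "person"}, "bob": {"type": "person"}}
graph_edges = [("alice", "FOLLOWS", "bob")]


def followers_of(person):
    """Return every node that has a FOLLOWS edge pointing at `person`."""
    return [src for src, label, dst in graph_edges
            if label == "FOLLOWS" and dst == person]
```

Calling `followers_of("bob")` walks the edge list, which is the kind of relationship query a graph store is built to answer directly.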
Advantages of NoSQL:
Compared with traditional relational databases, NoSQL databases are generally more scalable and can offer better performance. Relational databases are often said to struggle in the following scenarios –
- Relational databases often struggle with very large volumes of data, whether structured, semi-structured or unstructured.
- Relational databases fit poorly into agile environments, which are sprint based and require rapid iteration and frequent code releases.
- Relational databases are not designed to be compatible with object-oriented programming, which many developers find simple, flexible and easy to use.
- If you want to store hierarchical objects with query capabilities, an RDBMS is usually not the recommended solution. NoSQL databases tend to perform better here.
- For a cloud deployment, which is a distributed environment, a traditional single-server RDBMS is often not a good fit.
So in the above scenarios NoSQL fills the gaps. The NoSQL data model is efficient and has a scalable architecture, whereas the relational model tends to be expensive to scale and follows a monolithic architecture.
NoSQL allows us to have dynamic schema for the database: In a relational database, we need to define the schema at the very beginning. A relational database needs to know in advance what data we want to store, e.g. an employee’s record with name, department, phone number, address etc., along with the data type and likely size of each field. This presents challenges in agile development, because every time we add a new feature we need to modify the schema, which may make the application unstable. For example, if we decide to add the spouse and children details of every employee, we have to add a few more columns and then migrate the old data into the new table. If the database is large, this migration can take a significant amount of time and may result in long downtime. If such changes are frequent, managing these downtimes becomes quite problematic.
NoSQL databases are designed to handle these kinds of situations. In a NoSQL database we can insert data without a pre-defined schema, which makes changes at the database level much easier. This helps rapid development and also makes code integration easier.
So in NoSQL, the ‘Dynamic Schema’ advantage gives us a lot of flexibility in managing the ever-changing demands of web applications.
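The employee example can be sketched in a few lines of Python. This is only an in-memory illustration, with a list of dictionaries standing in for a schemaless collection; the record contents and the `children_of` helper are hypothetical.

```python
# A list of dicts stands in for a schemaless collection (in-memory, hypothetical).
employees = []

# Original record: no schema was declared up front.
employees.append({"name": "Asha", "department": "Finance", "phone": "555-0100"})

# Later feature: spouse and children details. New records simply carry the extra
# fields; existing records are left untouched, so no migration step is needed.
employees.append({
    "name": "Ravi",
    "department": "Sales",
    "phone": "555-0101",
    "spouse": "Meera",
    "children": ["Anu", "Kiran"],
})


def children_of(record):
    """Read an optional field, tolerating older records that lack it."""
    return record.get("children", [])


all_children = [child for e in employees for child in children_of(e)]
```

The trade-off is that the application, rather than the database, now has to cope with records of different shapes, which is why `children_of` supplies a default.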
Sharding Mechanism: Because of the way they are structured, relational databases usually scale vertically, i.e. to scale the database of an application we host the entire database on a single, more powerful server. This is done to ensure data availability, but it is relatively expensive, and the single server is also a single point of failure. To get past this bottleneck it is advisable to scale horizontally rather than vertically. Sharding spreads the database across multiple server instances. It can be done with SQL based databases, but it is usually accomplished with the help of Storage Area Networks (SANs) and similar arrangements. Since relational databases generally don’t provide this feature natively, it becomes the responsibility of the developer to deploy multiple relational databases across different systems, with each instance storing its own portion of the data. The developer needs to write application code to distribute the data, distribute the queries and collate the results across all the database instances. In addition, code has to be developed to handle resource failures, to perform joins across the different databases, and to take care of data rebalancing and replication. On top of this, many benefits of the relational database, such as transactional integrity, are compromised when employing manual sharding.
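To give a feel for the application code that manual sharding requires, here is a minimal sketch in Python. Three dictionaries stand in for three separate database servers, and the function names (`shard_for`, `put`, `get`, `find_by_department`) are hypothetical. It only covers routing and result collation; failure handling, cross-shard joins and rebalancing are left out.

```python
import hashlib

# Three dicts stand in for three separate database servers (hypothetical).
SHARDS = [{}, {}, {}]


def shard_for(key):
    """Pick a shard from a stable hash of the key.

    hashlib is used because Python's built-in hash() for strings is salted
    per process and would route the same key differently on each run.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


def put(key, row):
    # Each row lives on exactly one shard.
    shard_for(key)[key] = row


def get(key):
    return shard_for(key).get(key)


def find_by_department(department):
    """Scatter-gather: query every shard, then collate the partial results."""
    results = []
    for shard in SHARDS:
        results.extend(row for row in shard.values()
                       if row["department"] == department)
    return sorted(results, key=lambda row: row["name"])


put("emp:1", {"name": "Asha", "department": "Finance"})
put("emp:2", {"name": "Ravi", "department": "Sales"})
put("emp:3", {"name": "Meera", "department": "Finance"})
```

A lookup by key touches one shard, while `find_by_department` has to visit all of them. Note also that changing the number of shards changes `% len(SHARDS)` for most keys, which is one reason rebalancing is hard to do by hand.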
On the other hand, NoSQL databases generally support automatic sharding, i.e. they can spread data across any number of database instances automatically. The application does not even need to be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down it can be replaced quickly and transparently, with little or no disruption to the application.
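One technique commonly used to spread keys over a changing pool of servers is consistent hashing. The sketch below is an assumption made for illustration, not a description of how any particular NoSQL product works internally; the `HashRing` class, the server names and the choice of 50 virtual nodes per server are all hypothetical.

```python
import bisect
import hashlib


def _position(value):
    """Map a string to a stable position on the hash ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=50):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, server) pairs
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        # Several virtual nodes per server spread its share of keys around the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_position(f"{server}#{i}"), server))

    def server_for(self, key):
        # A key belongs to the first virtual node at or after its position,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[index % len(self._ring)][1]


keys = [f"user:{n}" for n in range(1000)]
ring = HashRing(["server-a", "server-b", "server-c"])
before = {key: ring.server_for(key) for key in keys}

# Grow the pool by one server and see which keys change owner.
ring.add_server("server-d")
after = {key: ring.server_for(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
```

Only a fraction of the keys end up in `moved`, and each of them now belongs to the new server. The remaining keys stay where they were, which is what lets a pool grow without reshuffling all of the data.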
With cloud computing, this approach becomes significantly easier. Cloud providers such as Amazon Web Services (AWS) can provide virtually unlimited capacity on demand and also take care of many important database administration tasks. Developers are no longer required to build complicated and expensive platforms to support their applications, and are free to concentrate on writing application code, which deserves more attention given the complexity of the business. This approach is also cost effective.
Data Replication: Commonly used NoSQL databases support automatic data replication. This gives us high availability of data and recovery from disasters without involving separate applications to manage these tasks.
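The idea behind replication can be shown with a toy in-memory store. This is a simplified sketch under stated assumptions: the `ReplicatedStore` class and the node names are hypothetical, every write is copied synchronously to all live nodes, and real concerns such as consistency levels or bringing a recovered node up to date are ignored.

```python
class ReplicatedStore:
    """Toy replicated store: every write is copied to all live nodes."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.down = set()

    def write(self, key, value):
        # Copy the value to every node that is currently up.
        for name, data in self.nodes.items():
            if name not in self.down:
                data[key] = value

    def read(self, key):
        # Serve the read from the first node that is still available.
        for name, data in self.nodes.items():
            if name not in self.down:
                return data.get(key)
        raise RuntimeError("no replicas available")

    def fail(self, name):
        # Simulate a node outage.
        self.down.add(name)


store = ReplicatedStore(["node-1", "node-2", "node-3"])
store.write("order:7", {"total": 120})
store.fail("node-1")
value_after_failure = store.read("order:7")
```

Because the write reached all three nodes, the read still succeeds after `node-1` is marked as failed.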
Implementing NoSQL database:
Most organizations start with a trial implementation of a NoSQL database, which helps them develop an understanding of the software and the technology, since the NoSQL approach can be difficult for traditional DBAs to digest. Most NoSQL databases are open source, which allows developers to download the software and start proof-of-concept (POC) development without worrying about licensing. Since the development cycles are shorter and faster, developers can take the opportunity to innovate and explore new areas which might produce better results.
We have discussed the NoSQL database and its various aspects. It is now clear that NoSQL is not a replacement for the traditional RDBMS. Rather, it serves a different set of use cases for which an RDBMS is not well suited. NoSQL databases are continuously evolving and will gain more features in the near future. To conclude the discussion, let’s have a quick look at the following bullets.
- NoSQL stands for ‘Not Only SQL’.
- NoSQL based databases differ from traditional databases in their approach to storing and retrieving data.
- NoSQL based databases can be much faster than their relational counterparts for suitable workloads.
- Different types of NoSQL databases are –
- Key Value Paired
- Graph Stores