As old as it is, we still have the debate: what kind of database should be used for my system? The most common answer is usually: “it depends”. We know that it does depend on many different factors. Therefore I would like to cover some of them, which can help you in identifying and selecting the proper one, based on the requirements for your project.
DOCUMENT VS ROW BASED DATA
At the beginning it might be worth mentioning in what way we store the data within these two concepts.
When you have a Sql database, it keeps the row-based-data in tables. For instance, you may have a users table, where it stores the data concerning users in rows with each row having strict, predefined fields. A NoSql database, keeps data in collections of documents. So what we call a table in a Sql database, is now called a collection in NoSql. Following the naming, what we call a record in Sql, is called a document in the latter one. Each document has dynamic fields, so it can store whatever data you want to put in there, with limitation to its size.
There is one more concept in the NoSql database. Some databases do not have collections or documents so they store the data in a key-value way.
STRUCTURED VS UNSTRUCTURED DATA
The first thing which may come to your mind is stored data. Actually, as you think about this, it is all about the data. So the question can be: what kind of data do you want to store? Is it the data which is well-structured, e.g. it comes from predefined forms, filled in by users, or is it the unstructured one or even of dynamic types from external services?
As you consider this, you will see that well-structured data plays well with a SQL database, where it has a predefined schema for data. When you deal with the integration with the external services and you want to store a lot from them, maintaining schema can be a painful thing. In the latter case it is better to use a database which handles unstructured of dynamic data; a NoSQL database plays well in this case.
Also, when you have a hierarchical type of data, e.g. an order with order lines together with its prices, it can be very well-placed in a document of a NoSql database. In case of a NoSql database we keep data as documents, and these documents can be nested in each other. For instance, you may have a patient document with his treatments history and examination as sub-documents.
On the other hand, you may have well-structured data like bills or invoices, from which you generate a beautiful report. The best choice for that is Sql as you can easily generate these reports based on multiple tables joined together.
SCHEMA VS SCHEMA-LESS
Question about schema is almost like… “Quo-vadis project?”. You should consider that. When you know up-front what kind of data you want to store in your project, and you are sure that it will not change too much in the future, you may safely go for a Sql database.
When you are aware that you do not know upfront too much about the data and it should be dynamic, it is better to go for NoSql. You may want to have an easy access to add new documents to the already existing collections with new, not defined fields, it will be easier to have a schema-less kind of database – a NoSql one.
HORIZONTAL VS VERTICAL SCALING
When you think about scalability, it is important to understand how these two database concepts scale. Taking into consideration the cases where you want to have easy scalability of the database, as you system grows and handles more and more users and their data. Depending on the database type, scaling is handled in a slightly different way. Both ways are obviously valid, however, it is good to see the difference between them.
When you consider a Sql database, it scales vertically which means scaling it by using a more powerful machine. Instead of scaling it across multiple machines, you scale it with the one by adding better and more powerful hardware.
The second concept – a NoSql database scales horizontally. It means you can add new machines and it will scale across all of them as it is prepared for this type of scaling.
You will see it on the later step of you project, what kind of scalability you need, but it is a good thing, to have some predictions concerning where the project may go. Based on these predictions, plan and prepare for the future.
CONSISTENT VS CONSISTENT-LESS DATA
Consistency of the database is really important. It might be really important yet that is not true for all cases.
In a NoSql database you come across less consistent data as it does not have a predefined schema. Initially, you may not consider it as a good idea. Actually, it is not a bad one, either. Especially, as I mentioned earlier, you may sometimes want to have inconsistent data, e.g. in your warehouse or in a database which takes data from multiple places. You may want it not to be consistent as you intend to handle it from many different sources. In this case, a NoSql database appears to be a better choice.
On the other side of the spectrum, you have consistent databases. For instance, you want to have strict validations and be sure that the data is kept in a predefined way, that it is actually consistent across an entire table and database. There you had better go for a Sql database.
COMPLEX VS SIMPLE TRANSACTIONS
An issue related to consistency refers to transactions. For certain projects, transactions are important, while for the others they are placed somewhere in a shadow.
Consider that you need to ensure that a user puts his data and it is valid, so later you can create additional records for the user at the same time. Create it only, when the data is valid, in the other case, do not insert the user data at all. In order to ensure complex transactions across multiple tables, it is only handled by Sql databases.
So far, NoSql databases allow you to define transactions just for a single document. Therefore you are not able to revert insertions into multiple tables, if a data is wrong at a certain stage. Also, again that might be a benefit when you find that you do not care about consistency, and transactions are not taken into the consideration.
Whether you need a database for storing unstructured data or the one for handling high load for inserts and reads, consider making a conscious choice. Do not attempt to make a choice based on what is currently new and shiny, just because it is that.
It is crucial to make a choice based on requirements, numbers and valid arguments. Remember that the selection cannot be a golden hammer for every problem. You may consider mixing SQL and NoSQL for appropriate use-case.
If you still encounter any trouble on that matter, I am open to help you making the appropriate and the most reasonable choice.