MongoDB Introduction

Welcome to a new tutorial on MongoDB. Here you will learn about the features and history of MongoDB. 

 

Introduction to NoSQL

NoSQL refers to a class of databases typically used to manage huge sets of unstructured data, in which data is not stored in tabular relations as it is in relational databases. The most commonly used relational databases have struggled to solve some complex modern problems. These problems include the following:

The continuously changing nature of data, which may be structured, semi-structured, unstructured, or polymorphic.

Software applications now serve millions of users in different geo-locations and time zones, and therefore have to be up and running all the time with data integrity maintained.

Software applications are more distributed, with many moving towards cloud computing.

NoSQL has contributed immensely to enterprise applications that need to access and analyze massive sets of data made available on multiple virtual (or remote) servers in cloud infrastructure, especially when the data set is not structured.

NoSQL databases are therefore designed to overcome the performance, scalability, data modelling, and distribution limitations that are often encountered with relational databases.

 

What is Structured Data?

Structured data follows a fixed format, for example text files with defined column titles and data in rows. Such data can be visualized in the form of charts and processed using data mining tools.

 

What is Unstructured Data?

Unstructured data includes image files, video files, emails, PDFs, and so on. These files have no common format. Structured information can be extracted from unstructured data, although the process is time-consuming. Most modern data is unstructured, and the need to store it paved the way for NoSQL.

 

NoSQL Database Types

The following are the NoSQL database types.

Document Databases: In this database, each key is paired with a complex data structure called a Document. E.g. MongoDB.

Graph stores: This database is used to store networked data, in which records are related to one another through explicit connections.

Key-Value stores: This is the simplest type of NoSQL database, as every item is stored as a value together with a key that identifies it. In some key-value databases, such as Redis, the type of the data can also be saved along with the value.

Wide-column stores: This database is used to store large data sets and keeps columns of data together. E.g. Cassandra (originally developed at Facebook), HBase, and so on.
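To make the four models more concrete, here is a minimal sketch that uses plain Python structures as stand-ins for each store. The names and values are invented for illustration and do not come from any real database API.

```python
# Illustrative only: plain Python structures standing in for each NoSQL model.
# All names and values here are made up for the example.

# Document database: a key paired with a nested, JSON-like document.
document_store = {
    "student:1": {"name": "Asha", "subjects": ["C++", "Core Java"]},
}

# Key-value store: an opaque value looked up by its key.
key_value_store = {"session:42": "logged_in"}

# Graph store: nodes plus edges that relate them.
graph_store = {
    "nodes": {"asha", "ravi"},
    "edges": [("asha", "FRIEND_OF", "ravi")],
}

# Wide-column store: each row key maps to its own set of columns.
wide_column_store = {
    "row:1": {"name": "Asha", "city": "Pune"},
    "row:2": {"name": "Ravi"},  # rows need not share the same columns
}


def friends_of(graph, person):
    """Return everyone the given person has a FRIEND_OF edge to."""
    return [dst for src, rel, dst in graph["edges"]
            if src == person and rel == "FRIEND_OF"]
```

Calling friends_of(graph_store, "asha") follows the stored edge directly, which is the kind of relationship lookup a graph store is built around.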

 

Advantages of NoSQL Databases

Some of the major advantages of NoSQL databases are discussed below with examples.

 

Dynamic Schemas

You may be wondering what a dynamic schema means. In relational databases such as Oracle or MySQL, we define table structures first. For instance, to save student records we need to create a table named Student and add columns to it, such as student_id, student_name, and so on. This is referred to as the schema: we define the structure before saving any data. If we later want to store more related data in the Student table, we have to add a new column to the table.

Now consider a table that already holds millions of records: migrating all of them to the updated schema would be a hectic job. NoSQL databases solve this problem, because they do not require a schema definition up front, and new fields can be added to new records without altering the existing ones.
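The idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a list of dictionaries stands in for a schemaless collection, and the insert helper and sample records are hypothetical.

```python
# A list of dicts stands in for a schemaless collection (hypothetical data).
students = []


def insert(collection, document):
    """Store a copy of the document; no schema check is performed."""
    collection.append(dict(document))


insert(students, {"student_id": 1, "student_name": "Asha"})
# A later record carries an extra field; earlier records are left untouched.
insert(students, {"student_id": 2, "student_name": "Ravi",
                  "email": "ravi@example.com"})

# Readers handle the missing field instead of relying on a migration.
emails = [s.get("email") for s in students]
```

The trade-off is that the application, rather than the database, has to cope with records that lack the newer field.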

 

Sharding

Sharding makes it possible to partition a large database into smaller, faster, and more easily manageable databases.

Relational databases traditionally follow a vertical architecture, in which a single server holds all the related data. Most relational databases do not provide sharding by default, and achieving it takes considerable effort, because transactional integrity (inserting or updating data in transactions), multi-table JOINs, and so on are hard to achieve in a distributed architecture.

Many NoSQL databases, by contrast, support sharding natively with little additional effort. They spread the data across servers automatically and route each query to the server that holds the requested data, while maintaining the integrity of the data.
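As a rough sketch of how data can be spread across servers, the following Python example routes each record to a shard by hashing its key. The shard count, the in-memory dictionaries, and the function names are assumptions made for the example; real databases may also use other strategies, such as range-based partitioning.

```python
import hashlib

NUM_SHARDS = 3  # assumed shard count for the example
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict plays one server


def shard_for(key):
    """Pick a shard from a stable hash of the shard key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def put(key, document):
    """Write the document to the shard that owns its key."""
    shards[shard_for(key)][key] = document


def get(key):
    """Read from the one shard that can hold the key."""
    return shards[shard_for(key)].get(key)


for student_id in range(1, 11):
    put(student_id, {"student_id": student_id})
```

Because the same key always hashes to the same shard, a read such as get(7) only has to contact one server instead of searching all of them.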

 

Replication

Many NoSQL databases support automatic data replication. If a database server goes down, the data can be served or restored from a copy kept on another server.
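The following toy model illustrates the idea. It is a simplified sketch, not how any particular database implements replication: writes are copied synchronously to every node, and the class and node names are hypothetical.

```python
import copy


class ReplicaSet:
    """Toy model: every write is copied to all nodes in the set."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.primary = node_names[0]

    def write(self, key, value):
        # Simplification: copy synchronously to every node. Real systems
        # usually replicate from the primary, often asynchronously.
        for data in self.nodes.values():
            data[key] = copy.deepcopy(value)

    def fail_primary(self):
        """Drop the primary and promote the next surviving node."""
        del self.nodes[self.primary]
        self.primary = next(iter(self.nodes))

    def read(self, key):
        return self.nodes[self.primary].get(key)


rs = ReplicaSet(["node-a", "node-b", "node-c"])
rs.write("student:1", {"name": "Asha"})
rs.fail_primary()  # node-a goes down, node-b takes over
```

After the failure, rs.read("student:1") is answered by the promoted node, so the data written earlier is still available.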

 

Integrated Caching

Most NoSQL databases support integrated caching, in which frequently requested data is kept in a cache to make queries faster.
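The effect of a cache can be sketched with Python's standard functools.lru_cache. The DATABASE dictionary and the read counter are hypothetical stand-ins for a real backing store; the point is only that repeated requests for the same item skip the slower lookup.

```python
from functools import lru_cache

# Hypothetical backing store and a counter to show how often it is hit.
DATABASE = {1: {"name": "Asha"}, 2: {"name": "Ravi"}}
stats = {"db_reads": 0}


@lru_cache(maxsize=128)
def find_name(student_id):
    """Look up a name, going to the 'database' only on a cache miss."""
    stats["db_reads"] += 1
    return DATABASE[student_id]["name"]


find_name(1)  # cache miss: reads the database
find_name(1)  # cache hit: served from memory
find_name(2)  # cache miss: reads the database
```

A real cache also has to deal with stale entries when the underlying data changes, which this sketch leaves out.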

 

MongoDB - NoSQL Database

MongoDB is a NoSQL database written in C++, and some of its drivers are built on the C programming language. It is a document-oriented database that stores data in collections of documents rather than in tables of rows. Drivers are available for almost all popular programming languages.

In today's highly competitive technology landscape, many companies host their enterprise applications in the cloud in order to expand the business globally, provide faster services, and personalize the customer's experience with the application and the overall business. As a result, NoSQL has become a popular choice of database technology for developing such applications.

 

Features and History of MongoDB

As you may know, MongoDB is an open source NoSQL document database that stores data as documents made up of key-value pairs. It provides high performance and scalability, along with data modeling and data management for very large sets of data in an enterprise application. It also provides auto-scaling.

More importantly, MongoDB is a cross-platform database and can be installed on different platforms such as Linux, Windows, and so on.

 

What is Document based storage?

A Document is simply a data structure made up of name-value pairs, as in JSON. It is easy to map a custom object from almost any programming language to a MongoDB Document. E.g. a Student object has attributes such as name, rollno, and subjects, where subjects is a List.

Document for Student in MongoDB is shown below:

{
	name : "Programming Language",
	rollno : 1,
	subjects : ["C Language", "C++", "Core Java"]
}

As you can see, Documents are typically JSON representations of custom Objects, and excessive JOINS can be avoided by saving data in form of Arrays, as well as Documents (Embedded) inside a Document.
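The mapping between a custom object and a document can be sketched in Python using only the standard library. The Student class below is a hypothetical example that mirrors the document shown above; it produces a plain dictionary and JSON text rather than talking to a MongoDB server.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class Student:
    name: str
    rollno: int
    subjects: List[str] = field(default_factory=list)


student = Student("Programming Language", 1,
                  ["C Language", "C++", "Core Java"])

# Object -> document: a plain dict with the same field names.
document = asdict(student)

# Document -> JSON text and back to an object.
as_json = json.dumps(document)
restored = Student(**json.loads(as_json))
```

The list of subjects travels inside the document itself, so reading a student back does not require a second lookup in a separate table.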

Now let's look at the history of MongoDB.

 

Brief History of MongoDB

In 2007, Eliot Horowitz and Dwight Merriman began developing MongoDB after experiencing scalability issues with relational databases while building enterprise web applications at DoubleClick. Dwight Merriman, one of the developers of MongoDB, has said that the name was coined from the word "humongous" to reflect the idea of processing large amounts of data.

Later, in 2009, MongoDB was made an open source project, with the company offering commercial support services. Afterward, numerous companies began to use MongoDB for its features.

The New York Times newspaper made use of MongoDB to build a web-based application to submit photos. 

The company was officially named MongoDB Inc. in 2013.

 

Key Features of MongoDB

MongoDB has many useful features in addition to the common NoSQL features. These are highlighted below.

  • MongoDB provides high performance, as it needs fewer Input/Output operations than relational databases thanks to its support for embedded documents (or data models). Its read queries are also faster because they can use indexes.
  • It has a rich Query Language that supports all the major CRUD operations. The Query Language also provides Text Search and Aggregation features.
  • Its Auto Replication feature leads to High Availability. It also provides an automatic failover mechanism, in which a replica copy takes over when a server fails.
  • Sharding is a major feature of MongoDB, which provides Horizontal Scalability.
  • MongoDB also supports multiple Storage Engines. Whether data is saved in the form of documents (NoSQL) or tables (RDBMS), it is the Storage Engine that saves this data. The Storage Engine manages how data is saved in memory as well as on disk.
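To illustrate why indexes speed up reads, here is a small Python sketch that compares a full scan with an index lookup. The dictionary used as an index is a simplification chosen for the example (MongoDB's own indexes are B-tree based), and the function names and sample data are hypothetical.

```python
students = [
    {"rollno": 1, "name": "Asha"},
    {"rollno": 2, "name": "Ravi"},
    {"rollno": 3, "name": "Meera"},
]


def find_by_scan(collection, rollno):
    """Without an index: inspect documents one by one."""
    for doc in collection:
        if doc["rollno"] == rollno:
            return doc
    return None


def build_index(collection, field_name):
    """Map each field value to the position of its document."""
    return {doc[field_name]: pos for pos, doc in enumerate(collection)}


def find_by_index(collection, index, value):
    """With an index: jump straight to the matching document."""
    pos = index.get(value)
    return collection[pos] if pos is not None else None


rollno_index = build_index(students, "rollno")
```

Both lookups return the same document, but the scan grows slower as the collection grows, while the index lookup stays roughly constant. The cost is that the index has to be kept up to date on every write.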

 

Organizations that use MongoDB

Companies that use MongoDB as a database for most of their business applications include but are not limited to the following:

  • Adobe
  • LinkedIn
  • McAfee
  • FourSquare
  • eBay
  • MetLife
  • SAP