MongoDB Architecture

Introduction to MongoDB:

MongoDB is an open-source document database and a leading NoSQL database, written in C++. It is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and documents.

You can leverage MongoDB if you expect a lot of read and write operations from your application and can tolerate losing some data in a server crash.

You can also use MongoDB when you are planning to integrate hundreds of different data sources, since the document-based model of MongoDB is a great fit for providing a single unified view of your data. You can even use it to store clickstream data and analyze customer behavior.

Architecture of MongoDB:

MongoDB table designing

1. Database:

A database is a physical container for collections. In addition to grouping documents by collection, MongoDB groups collections into databases. A single MongoDB server can host multiple databases. Each database has its own permissions and its own set of files on the file system. A good rule of thumb is to store all data for a single application in the same database. Separate databases are useful when storing data for several applications or users on the same MongoDB server.

Like collections, databases are identified by name. Database names can be any UTF-8 string, with the following restrictions:

Naming Rules:

1. The empty string is not a valid database name.

2. A database name cannot contain special characters such as /, \, ., ", $, the space character, or the null character.

3. Database names are limited to a maximum of 64 bytes.

4. Database names are case-sensitive, even on non-case sensitive file systems.
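The four rules above can be sketched as a small check. This is only an illustration: `is_valid_db_name` and `same_database` are hypothetical helpers, not part of any MongoDB driver, and the forbidden-character set is an assumption, since the exact set depends on the server version and operating system.

```python
# Hypothetical helpers that mirror the naming rules listed above.
# FORBIDDEN_CHARS is an assumed set: / \ . space " $ and the null character.
FORBIDDEN_CHARS = set('/\\. "$\0')
MAX_NAME_BYTES = 64


def is_valid_db_name(name: str) -> bool:
    # Rule 1: the empty string is not valid.
    if name == "":
        return False
    # Rule 2: no special characters.
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    # Rule 3: at most 64 bytes once encoded as UTF-8.
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        return False
    return True


def same_database(a: str, b: str) -> bool:
    # Rule 4: names are compared case-sensitively.
    return a == b


print(is_valid_db_name("blog"))       # True
print(is_valid_db_name("my.db"))      # False
print(same_database("Blog", "blog"))  # False
```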

There are also several reserved database names, which you can access but which have special semantics. These are as follows:

1. admin:

This is the root database in terms of authentication. If a user is added to the admin database, the user automatically inherits permissions for all databases. There are also certain server side commands that can be run only from the admin database, such as listing all of the databases or shutting down the server.

2. local:

This database will never be replicated and can be used to store any collections that should be local to a single server.

3. config:

When MongoDB is being used in a sharded setup, it uses the config database to store information about the shards.

By concatenating a database name with a collection in that database you can get a fully qualified collection name called a namespace.

2. Collection:

A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Documents within a collection can have different fields, but typically all documents in a collection serve a similar or related purpose.

Naming Rules:

A collection is identified by its name. Collection names can be any UTF-8 string, with the following restrictions:

1. The empty string ("") is not a valid collection name.

2. A collection name may not contain the null character (\0), because this character delineates the end of a collection name.

3. Users should not create collections whose names start with system., a prefix reserved for internal collections. For example, the system.users collection contains the database’s users, and the system.namespaces collection contains information about all of the database’s collections.

4. Users should not create collections with the reserved character $ in the name. The various drivers available for the database do support using $ in collection names, because some system-generated collections contain it, but it should not be used in the names of your own collections.
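As a rough sketch, the collection naming rules above and the idea of a namespace can be expressed in a few lines. Both functions are hypothetical helpers written for illustration, and the names "cms" and "posts" are made up.

```python
def is_valid_collection_name(name: str) -> bool:
    """Check a name against the four collection naming rules listed above."""
    if name == "":                    # rule 1: not empty
        return False
    if "\0" in name:                  # rule 2: no null character
        return False
    if name.startswith("system."):    # rule 3: reserved prefix
        return False
    if "$" in name:                   # rule 4: reserved character
        return False
    return True


def namespace(db_name: str, collection_name: str) -> str:
    """Join a database name and a collection name with a dot."""
    return f"{db_name}.{collection_name}"


print(is_valid_collection_name("posts"))         # True
print(is_valid_collection_name("system.users"))  # False
print(namespace("cms", "posts"))                 # cms.posts
```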

3. Document:

A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection’s documents may hold different types of data.
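A dynamic schema can be pictured with plain Python dictionaries standing in for documents. The field names and values below are invented for illustration, and a Python list stands in for a collection.

```python
# Two documents that could live in the same collection.
doc_a = {"_id": 1, "name": "Tejas", "age": 28}
doc_b = {"_id": 2, "name": "Asha", "age": "twenty-nine", "city": "Mathura"}

collection = [doc_a, doc_b]


def field_types(docs, field):
    """Return the set of Python type names seen for one field."""
    return {type(d[field]).__name__ for d in docs if field in d}


# The shared field "age" holds different types of data.
print(sorted(field_types(collection, "age")))  # ['int', 'str']
# doc_b has a field that doc_a does not have.
print(sorted(set(doc_b) - set(doc_a)))         # ['city']
```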

Terms related with RDBMS and MongoDB

RDBMS	MongoDB
Database	Database
Table	Collection
Tuple/Row	Document
Column	Field
Table Join	Embedded Documents
Primary Key	Primary Key (default key _id provided by MongoDB itself)

Database Server and Client

mysqld/Oracle	mongod
mysql/sqlplus	mongo

Any relational database has a typical schema design that shows the number of tables and the relationships between them. MongoDB, by contrast, has no built-in concept of relationships between collections; related data is either embedded in a document or referenced from it.

MongoDB Features:

1. Queries: It supports ad-hoc queries and document-based queries.

2. Index-support: Any field in the document can be indexed.

3. Replication: MongoDB supports replication through replica sets, in which a primary node and one or more secondary nodes maintain multiple copies of the data (older versions also offered master-slave replication). Replica sets help prevent database downtime because they are self-healing: if the primary fails, a secondary can take over.

4. Multiple Servers: The database can run over multiple servers. Data is duplicated to protect the system in the case of hardware failure.

5. Auto-sharding: This process distributes data across multiple physical partitions called shards. Due to sharding, MongoDB has an automatic load balancing feature.

6. Map-reduce: It supports MapReduce and flexible aggregation tools.

7. Failure Handling: In MongoDB, it is easy to cope with failures. Keeping a large number of replicas increases protection and data availability in the face of rack failures, multiple machine failures, data center failures, or even network partitions.

8. GridFS: Files of any size can be stored without complicating your stack. The GridFS feature divides files into smaller parts and stores them as separate documents.

9. Schema-less Database: It is a schema-less database written in C++.

10. Document-oriented Storage: It stores data as JSON-style documents (BSON internally).

11. Procedures: MongoDB uses JavaScript in place of stored procedures, and the language works well for this purpose.
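Two of the features above, auto-sharding (5) and GridFS (8), can be pictured with a toy sketch. It makes several assumptions: `shard_for` uses a simple hash-modulo rule, which is a simplification and not how MongoDB actually balances data across shards; `split_into_chunks` only mimics the idea behind GridFS, using the `files_id`, `n`, and `data` field names of GridFS chunk documents and a 255 KiB chunk size, which matches the GridFS default as far as I know. Both function names are hypothetical.

```python
import hashlib


def shard_for(shard_key: str, shard_count: int) -> int:
    """Toy hashed sharding: map a key to one of shard_count shards."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def split_into_chunks(file_id, data: bytes, chunk_size: int = 255 * 1024):
    """Split a byte string into GridFS-style chunk documents."""
    return [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


# The same key always lands on the same shard.
print(shard_for("user42", 4) == shard_for("user42", 4))  # True

# A 600,000-byte file is stored as three separate chunk documents.
chunks = split_into_chunks("report.pdf", b"x" * 600_000)
print(len(chunks))  # 3
```

Joining the `data` fields in order of `n` gives back the original bytes, which is how a file would be reassembled on read in this sketch.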

Advantages of MongoDB over RDBMS:

1. Schema-less: MongoDB is a document database in which one collection holds different documents. The number of fields, content, and size of the document can differ from one document to another. Data is stored in the form of JSON-style documents.

2. Structure of a single object is clear.

3. No complex joins.

4. Deep query ability. MongoDB supports dynamic queries on documents using a document based query language that is nearly as powerful as SQL.

5. Tuning

6. Ease of scale-out: MongoDB is easy to scale.

7. Conversion/mapping of application objects to database objects is not needed.

8. Uses internal memory for storing the working set, enabling faster access of data.

Application areas of MongoDB:

1. Big Data

2. Content management and delivery

3. Mobile and Social infrastructure

4. User Data Management

5. Data Hub

Data Model in MongoDB:

Data in MongoDB has a flexible schema. Documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection’s documents may hold different types of data.

MongoDB provides two types of data models:

1. Embedded Data Model:

In this model, the user can embed all the related data in a single document. It is also known as the de-normalized data model. For example, assume we keep the details of an employee in three different documents, namely Personal_details, Contact and Address. We can embed all three documents in a single one, like this:

{
    _id: <ObjectId101>,
    Emp_ID: "10025AE336",
    Personal_details: {
        First_Name: "Tejas",
        Last_Name: "Sharma",
        Date_Of_Birth: "1995-09-26"
    },
    Contact: {
        "e-mail": "Tejas123@gmail.com",
        phone: "9000022338"
    },
    Address: {
        city: "Mathura",
        Area: "Mathura",
        State: "Uttar Pradesh"
    }
}
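To see why embedding is convenient, the document above can be written as a nested Python dictionary and read with a dotted path, similar in spirit to MongoDB's dot notation for embedded fields. `get_path` is a hypothetical helper written for this sketch, and the `_id` field is left out because MongoDB would generate it.

```python
employee = {
    "Emp_ID": "10025AE336",
    "Personal_details": {
        "First_Name": "Tejas",
        "Last_Name": "Sharma",
        "Date_Of_Birth": "1995-09-26",
    },
    "Contact": {"e-mail": "Tejas123@gmail.com", "phone": "9000022338"},
    "Address": {"city": "Mathura", "Area": "Mathura", "State": "Uttar Pradesh"},
}


def get_path(doc: dict, path: str):
    """Follow a dotted path such as 'Contact.phone' through nested dicts."""
    value = doc
    for key in path.split("."):
        value = value[key]
    return value


# All related data is reachable from the one document, with no second lookup.
print(get_path(employee, "Contact.phone"))  # 9000022338
print(get_path(employee, "Address.State"))  # Uttar Pradesh
```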

2. Normalized Data Model:

In this model, the sub-documents are stored separately and refer back to the original document using references. For example, we can rewrite the above document in the normalized model as:

Employee:
{
    _id: <ObjectId101>,
    Emp_ID: "10025AE336"
}

Personal_details:
{
    _id: <ObjectId102>,
    empDocID: "ObjectId101",
    First_Name: "Tejas",
    Last_Name: "Sharma",
    Date_Of_Birth: "1995-09-26"
}

Contact:
{
    _id: <ObjectId103>,
    empDocID: "ObjectId101",
    "e-mail": "Tejas123@gmail.com",
    phone: "9000022338"
}

Address:
{
    _id: <ObjectId104>,
    empDocID: "ObjectId101",
    city: "Mathura",
    Area: "Mathura",
    State: "Uttar Pradesh"
}
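With the normalized model, the application has to follow the references itself. The sketch below assumes plain Python lists in place of collections and the integers 101 to 104 in place of real ObjectId values; `find_by_employee` and `load_employee` are hypothetical helpers.

```python
# One list per collection; empDocID points back to the employee's _id.
employees = [{"_id": 101, "Emp_ID": "10025AE336"}]
personal_details = [{"_id": 102, "empDocID": 101, "First_Name": "Tejas",
                     "Last_Name": "Sharma", "Date_Of_Birth": "1995-09-26"}]
contacts = [{"_id": 103, "empDocID": 101,
             "e-mail": "Tejas123@gmail.com", "phone": "9000022338"}]
addresses = [{"_id": 104, "empDocID": 101, "city": "Mathura",
              "Area": "Mathura", "State": "Uttar Pradesh"}]


def find_by_employee(collection, emp_doc_id):
    """Return the first document that references the given employee, or None."""
    return next((d for d in collection if d["empDocID"] == emp_doc_id), None)


def load_employee(emp_id: str) -> dict:
    """Resolve the references with one extra lookup per referenced collection."""
    employee = next(e for e in employees if e["Emp_ID"] == emp_id)
    return {
        **employee,
        "Personal_details": find_by_employee(personal_details, employee["_id"]),
        "Contact": find_by_employee(contacts, employee["_id"]),
        "Address": find_by_employee(addresses, employee["_id"]),
    }


print(load_employee("10025AE336")["Contact"]["phone"])  # 9000022338
```

The trade-off is visible in `load_employee`: each sub-document can be updated on its own, but reading the full employee takes four lookups instead of one.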

Designing Schema in MongoDB:

1. Design your schema according to user requirements.

2. Combine objects into one document if you will use them together. Otherwise, separate them (but make sure there is no need for joins).

3. Duplicate data (within limits), because disk space is cheap compared to compute time.

4. Do joins on write, not on read.

5. Optimize your schema for most frequent use cases.

6. Do complex aggregation in the schema.
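Guidelines 3, 4, and 6 can be illustrated together. In this sketch the user's name is copied into each comment when it is written, and a running comment count is kept in the post itself, so that reading the post needs neither a join nor an aggregation. The `users` lookup table, the `comment_count` field, and `add_comment` are all assumptions made for the example.

```python
from datetime import datetime, timezone

users = {"u1": {"name": "Asha"}}  # hypothetical user lookup table

post = {"_id": "p1", "title": "Hello", "comments": [], "comment_count": 0}


def add_comment(post: dict, user_id: str, message: str) -> None:
    """Write path: do the 'join' once, when the comment is stored."""
    post["comments"].append({
        "user": users[user_id]["name"],  # duplicated on write
        "message": message,
        "dateCreated": datetime.now(timezone.utc),
        "like": 0,
    })
    post["comment_count"] += 1           # aggregate kept in the schema


add_comment(post, "u1", "Nice post")
print(post["comment_count"], post["comments"][0]["user"])  # 1 Asha
```

The cost of this design is that a later change to the user's name would have to be propagated to every comment that copied it.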

Example:

Suppose a client needs a database design for a blog or website. Let us look at the difference between the RDBMS and MongoDB schema designs. The website has the following requirements:

- Every post has a unique title, description and url.

- Every post can have one or more tags.

- Every post has the name of its publisher and total number of likes.

- Every post has comments given by users, along with their name, message, date-time and likes.

- On each post, there can be zero or more comments.

In an RDBMS schema, the design for the above requirements will have a minimum of three tables.

In a MongoDB schema, the design will have one collection, post, with the following structure:

{
   _id: POST_ID,
   title: TITLE_OF_POST,
   description: POST_DESCRIPTION,
   by: POST_BY,
   url: URL_OF_POST,
   tags: [TAG1, TAG2, TAG3],
   likes: TOTAL_LIKES,
   comments: [
      {
         user: 'COMMENT_BY',
         message: TEXT,
         dateCreated: DATE_TIME,
         like: LIKES
      },
      {
         user: 'COMMENT_BY',
         message: TEXT,
         dateCreated: DATE_TIME,
         like: LIKES
      }
   ]
}

So, to show the data, an RDBMS needs to join three tables, whereas MongoDB reads it from a single collection.
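The difference between the two read paths can be sketched with plain Python lists. The three lists below stand in for relational tables and the last list stands in for the post collection; all sample values are invented, and `id` is used as a stand-in for the primary key in both designs so that the results can be compared directly.

```python
# Relational-style storage: three "tables" as lists of rows.
posts = [{"id": 1, "title": "Intro", "description": "First post",
          "url": "/intro", "by": "admin", "likes": 10}]
tags = [{"post_id": 1, "tag": "mongodb"}, {"post_id": 1, "tag": "nosql"}]
comments = [{"post_id": 1, "user": "Asha", "message": "Nice", "like": 2}]


def load_post_relational(post_id: int) -> dict:
    """Read path with joins: three lookups are combined into one result."""
    post = next(p for p in posts if p["id"] == post_id)
    return {
        **post,
        "tags": [t["tag"] for t in tags if t["post_id"] == post_id],
        "comments": [
            {k: v for k, v in c.items() if k != "post_id"}
            for c in comments if c["post_id"] == post_id
        ],
    }


# Document-style storage: everything for one post is already together.
post_collection = [{
    "id": 1, "title": "Intro", "description": "First post",
    "url": "/intro", "by": "admin", "likes": 10,
    "tags": ["mongodb", "nosql"],
    "comments": [{"user": "Asha", "message": "Nice", "like": 2}],
}]


def load_post_document(post_id: int) -> dict:
    """Read path without joins: a single lookup returns the whole post."""
    return next(p for p in post_collection if p["id"] == post_id)


# Both designs produce the same view of the post.
print(load_post_relational(1) == load_post_document(1))  # True
```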

Rudra Education

MongoDB Architecture

Posted by RudraEducation
