Module 3: NoSQL and Document-Oriented Databases
Module Overview
Need to deal with Big Data? You may need tools beyond standard SQL approaches. Enter NoSQL and document-oriented databases! In this module, we explore the world of NoSQL databases with a focus on MongoDB, one of the most popular document-oriented databases. We'll learn how to store, retrieve, and query data in a schema-less environment.
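To make "schema-less" concrete, here is a minimal sketch in plain Python (no database required). It shows two documents with different fields sitting side by side, plus a tiny equality filter that only imitates the idea of a document query. The sample data and the helper names matches and find are illustrative assumptions, not part of MongoDB or pymongo.

```python
import json

# Two documents that could live in the same collection even though
# their fields differ. Names and values are made up for illustration.
students = [
    {"name": "Ada", "courses": ["SQL", "NoSQL"], "address": {"city": "London"}},
    {"name": "Grace", "email": "grace@example.com"},
]


def matches(document, criteria):
    """Return True if every key in criteria equals the document's value."""
    return all(document.get(key) == value for key, value in criteria.items())


def find(documents, criteria):
    """Hypothetical pure-Python stand-in for an equality filter over documents."""
    return [doc for doc in documents if matches(doc, criteria)]


if __name__ == "__main__":
    # Documents serialize naturally to JSON, nested fields included.
    print(json.dumps(find(students, {"name": "Grace"}), indent=2))
```

With pymongo, the comparable call on a real collection would be collection.find({"name": "Grace"}), which takes the same dictionary-style filter.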
Learning Objectives
- Deploy and use a simple MongoDB instance
- Build a data pipeline between SQL (SQLite) and NoSQL (MongoDB) databases
Setting Up MongoDB in 2025
MongoDB has evolved into a leading cloud database platform, with MongoDB Atlas as the recommended solution for most users. Atlas is a fully managed cloud service that simplifies deployment, scaling, and management. While local installations are still available for on-premises or offline use, Atlas is preferred for its ease of use, global availability, and automatic scaling. Here's what you need to know to get started in 2025:
Modern Setup Process
- MongoDB Atlas Cloud: A cloud-based platform that eliminates the need for local setup in most cases, offering automated management, backups, and global deployment across AWS, Azure, and Google Cloud.
- Flex Clusters: A flexible pricing model replacing older Shared (M2/M5) and Serverless instances. Flex Clusters provide usage-based scaling for variable workloads, starting with 5GB storage and supporting up to 500 operations per second, ideal for small to medium applications.
- Enhanced Security: Includes network isolation (restricting access to specific networks), end-to-end encryption, and granular access controls (role-based permissions). Higher-tier clusters offer advanced features like LDAP integration and database auditing.
- Integration Tools: Supports popular frameworks like Node.js, Python, and Java, and integrates with cloud providers (AWS, Azure, Google Cloud) and tools like MongoDB Compass for GUI management.
Current Deployment Options (2025)
- Free Tier (M0): Offers 512MB storage and shared resources, perfect for learning, prototyping, or small projects. Note: Features like Atlas Search and advanced backups are not supported.
- Flex Clusters: Usage-based pricing with 5GB of storage. The monthly cost scales with operations per second and is capped at ~$30/month for up to 500 ops/sec (check MongoDB’s pricing page for current rates), ideal for apps with variable traffic.
- Dedicated Clusters: Start at ~$57–$60/month for M10 clusters (check MongoDB’s pricing page), offering 10GB storage, 2 vCPUs, and dedicated resources for production applications.
Note: Pricing varies by cloud provider (AWS, Azure, Google Cloud), region, and usage (e.g., compute, storage, data transfer). Always verify current rates on MongoDB’s pricing page, as costs may fluctuate.
Key Requirements
- A MongoDB Atlas account (free tier available, sign up with an email address)
- Python 3.7+ (preferably 3.10+) with the pymongo package (install via pip install pymongo; verify with python -m pip show pymongo)
- MongoDB Compass (optional, free GUI tool for visualizing and managing databases, downloadable from MongoDB’s website)
- A stable internet connection and modern web browser (e.g., Chrome, Firefox) for accessing the Atlas dashboard
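The Python-side requirements above can be checked with a short script. This is a minimal sketch using only the standard library; the helper names python_ok and package_installed are made up for this example, and the minimum version is the 3.7 floor stated in the list.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the requirements list


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def package_installed(name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("pymongo installed:", package_installed("pymongo"))
```

If the second line prints False, run pip install pymongo in the same environment and try again.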
Quick Start Guide
To set up a free MongoDB Atlas cluster:
- Sign up at MongoDB Atlas.
- Choose the Free Tier (M0), select a cloud provider (e.g., AWS), and pick a region (e.g., US East).
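- Configure your cluster’s security: Add your current IP address to the IP whitelist and create a database user with a username and password. Entering 0.0.0.0/0 instead allows access from anywhere, which is convenient for learning but exposes the cluster to the whole internet, so use it with caution.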
- Deploy the cluster (takes ~5 minutes). Copy the connection string to connect via your app or MongoDB Compass.
- Test your connection using Python: Install pymongo (pip install pymongo) and use the connection string in your code, replacing the password placeholder with your database user’s password.
Tip: Ensure your IP whitelist includes your current network to avoid connection errors. For production apps, use VPC peering or private endpoints for secure access.
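The connection test from the last step might look like the sketch below. It assumes an SRV-style connection string of the kind Atlas displays; the username, password, and host shown are placeholders, and build_uri and ping are hypothetical helper names. Credentials are percent-encoded with quote_plus so that characters such as @ or / in a password do not break the URI.

```python
from urllib.parse import quote_plus


def build_uri(username, password, host):
    """Assemble an SRV connection string with percent-encoded credentials."""
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}/?retryWrites=true&w=majority"
    )


def ping(uri, timeout_ms=5000):
    """Return True if the server answers a ping (needs pymongo and network access)."""
    # Imported here so build_uri stays usable without pymongo installed.
    from pymongo import MongoClient

    client = None
    try:
        client = MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
        client.admin.command("ping")
        return True
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        print("Connection failed:", exc)
        return False
    finally:
        if client is not None:
            client.close()


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your own Atlas dashboard.
    uri = build_uri("demo_user", "p@ss/word", "cluster0.example.mongodb.net")
    print(uri)
    # Uncomment once the placeholders are replaced with real values:
    # print("Reachable:", ping(uri))
```

A failed ping most often points to a missing IP whitelist entry or a mistyped password rather than a problem in the code.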
Additional Learning Resources
Enhance your MongoDB skills with these free resources:
- MongoDB University: Start with the “MongoDB Basics” course to learn core concepts like documents, collections, and queries.
- MongoDB Community Forums: Ask questions and get help from experts and peers.
- MongoDB Atlas Documentation: Detailed guides on setup, configuration, and best practices.
Note: As of 2025, MongoDB has deprecated M2/M5 Shared clusters and Serverless instances, transitioning to Flex Clusters for greater flexibility. Atlas is the industry standard for cloud deployments due to its automated management, global scalability, and cost predictability. Local installations (e.g., MongoDB Community Edition) remain an option for on-premises or offline environments but require manual setup and maintenance.
Guided Project
In this guided project, we'll learn how to work with NoSQL databases and build data pipelines between different database types. Open guided-project.md in the GitHub repository below to follow along with the guided project.
The GitHub repository contains valuable resources, examples, and documentation that align with the lecture content and learning objectives. Take time to review these materials as they will help reinforce your understanding of NoSQL databases and MongoDB implementation.
Module Assignment
For this assignment, you'll practice working with MongoDB, creating document-oriented databases, and building data pipelines between SQL and NoSQL systems.
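As a starting point, a SQLite-to-MongoDB pipeline can be sketched in two steps: read rows out of SQLite as dictionaries, then hand them to a MongoDB collection. The table name, columns, and sample rows below are made up for illustration, and the helpers rows_to_documents and load_into_mongo are hypothetical. The extract step runs against an in-memory SQLite database, so it works without a MongoDB connection.

```python
import sqlite3


def rows_to_documents(connection, table):
    """Read every row of a SQLite table and return a list of dicts."""
    connection.row_factory = sqlite3.Row  # rows become addressable by column name
    # The table name is interpolated directly, so pass only trusted names.
    cursor = connection.execute(f'SELECT * FROM "{table}"')
    return [dict(row) for row in cursor]


def load_into_mongo(documents, collection):
    """Insert documents into a pymongo collection and return the count inserted."""
    if not documents:
        return 0  # insert_many rejects an empty list, so skip the call
    result = collection.insert_many(documents)
    return len(result.inserted_ids)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT, level INTEGER)"
    )
    conn.executemany(
        "INSERT INTO characters (name, level) VALUES (?, ?)",
        [("Aria", 3), ("Borin", 7)],
    )
    docs = rows_to_documents(conn, "characters")
    print(docs)
    # With a live cluster: load_into_mongo(docs, client["rpg"]["characters"])
```

Each SQLite row becomes one flat document here. A fuller pipeline might nest related rows (for example, a character's items) inside the parent document, which is where the document model starts to differ from the relational one.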