Scaling MongoDB with Horizontal
and Vertical Sharding
Manosh Malai
CTO, Mydbops LLP
07th Oct 2023
Mydbops 14th Opensource Database Meetup
About Me
Manosh Malai
CTO, Mydbops LLP
‱ Interested in Open Source technologies
‱ Interested in MongoDB, DevOps & DevSecOps practices
‱ Tech Speaker/Blogger
Mydbops Services
Focuses on MySQL, MongoDB, PostgreSQL, TiDB and Cassandra
‱ Consulting Services
‱ Managed Services
Our Clients
‱ 1M+ DB transactions handled per day
‱ 3000+ servers monitored
‱ 100+ database migrations to cloud
‱ 300+ happy clients
Agenda
‱ Introduction
‱ Vertical Sharding
‱ Horizontal Sharding
INTRODUCTION
Database Sharding
Database sharding is the process of storing a large database across
multiple machines
WHEN TO SHARD?
When To Shard - I
Size of Data: If your database is becoming too large to fit on a single server,
sharding may be necessary to distribute the data across multiple servers.
Performance: Sharding can improve query performance by reducing the amount
of data that needs to be processed on a single server.
When To Shard - II
Scalability: Sharding enables you to horizontally scale out your MongoDB
database by distributing data across multiple nodes.
Availability and Redundancy: Because each shard holds only a subset of the
data, an outage on one shard leaves the data on the other shards available
for reads and writes.
When To Shard - III
Availability: Sharding can improve the overall availability of your database
by providing redundancy across multiple nodes.
Flexibility: Sharding enables you to distribute data across multiple nodes
based on your specific requirements.
Types of Sharding
‱ Vertical Sharding
‱ Horizontal Sharding
Does MongoDB Support Vertical Sharding?
Vertical Sharding
[Diagram: Session, Product Catalog, Carts, and Checkouts tables]
Distributing tables across multiple standalone instances, replica sets, or shards
Vertical Sharding Strategy - Pros
Different data access patterns:
‱ Vertical sharding may be useful when different tables are accessed at different frequencies or have different access patterns.
‱ Splitting these tables into different shards can improve the performance of queries that only need to access a subset of columns.
Better data management:
‱ Vertical sharding can provide better control over data access, as sensitive or confidential data can be stored separately from other data. This can help with compliance with regulations such as GDPR or HIPAA.
Vertical Sharding Strategy - Cons
Data Interconnectedness:
‱ Vertical sharding may not be the best solution for databases with heavily interconnected data. If there is a need for complex joins or queries across multiple columns, horizontal sharding or other scaling strategies may be more appropriate.
Limited Scalability:
‱ Only suitable for small or medium data sizes.
How Can We Achieve Vertical Sharding?
Service Discovery
‱ Consul
‱ Etcd
‱ ZooKeeper
Data Sync
‱ Mongopush
‱ mongosync
‱ mongodump & mongorestore
Vertical Sharding Strategy
Vertical Sharding: Service Discovery and Data Migration
‱ Use Consul to dynamically discover the nodes in your MongoDB cluster and route traffic to them accordingly.
‱ Mongopush syncs the data from the X1 cluster to the X2 cluster.
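As a rough illustration of the routing side of vertical sharding, the sketch below maps each collection to the cluster that owns it, which is the kind of lookup a service-discovery layer such as Consul would answer. The cluster URIs, collection names, and the `clusterFor` helper are hypothetical and not taken from the talk; a real deployment would query the discovery service instead of a hard-coded map.

```javascript
// Hypothetical routing table for vertical sharding: each collection lives on exactly one cluster.
// In practice this mapping would come from a service-discovery system, not a constant.
const COLLECTION_TO_CLUSTER = new Map([
  ["session", "mongodb://x1.example.internal:27017"],
  ["product_catalog", "mongodb://x1.example.internal:27017"],
  ["carts", "mongodb://x2.example.internal:27017"],
  ["checkouts", "mongodb://x2.example.internal:27017"],
]);

// Return the connection URI of the cluster that owns the given collection.
function clusterFor(collectionName) {
  const uri = COLLECTION_TO_CLUSTER.get(collectionName);
  if (uri === undefined) {
    throw new Error(`No cluster registered for collection "${collectionName}"`);
  }
  return uri;
}

console.log(clusterFor("carts"));
```

Keeping the lookup in one place means that moving a collection from one cluster to another only requires updating the mapping once the data sync has finished.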
Types of Sharding
‱ Vertical Sharding
‱ Horizontal Sharding
Does MongoDB Support Horizontal Sharding?
MongoDB Horizontal Sharding and Its Components
Each shard contains a subset of the sharded data
Mongos
Config Server
Shards
Shard Key
Collection Shard Key
‱ Divide and distribute the collection evenly using the shard key
‱ The shard key consists of one or more fields that exist in every document in the collection
MongoDB Shard Key
IO Scheduler
Range Sharding
Hash Sharding
Zone Sharding
Hash Sharding
Pros:
‱ Even data distribution
‱ Even read and write workload distribution
Cons:
‱ Range queries are likely to trigger expensive broadcast operations
Range Sharding
Pros:
‱ Even data distribution
‱ Targeted operations for both single and ranged queries
‱ Even read and write workload distribution
Cons:
‱ Depends on selecting a good shard key that is used in both read and write queries
Zone Sharding
Pros:
‱ Isolate a specific subset of data on a specific set of shards
‱ Keep data geographically closest to the application servers
‱ Data tiering and SLAs based on shard hardware
Cons:
‱ Depends on selecting a good shard key that is used in both read and write queries
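The difference between ranged and hashed placement is easiest to see with a monotonically increasing key. The sketch below is a toy model only: the three shard names, the fixed range boundaries, and the multiplicative hash are assumptions made for illustration and are not MongoDB's real chunk boundaries or hash function.

```javascript
// Toy model with three shards; NOT MongoDB's real chunk or hashing logic.
const SHARDS = ["Shard A", "Shard B", "Shard C"];

// Ranged placement: contiguous key ranges map to one shard each.
// Assumed boundaries: [0, 1000) -> A, [1000, 2000) -> B, [2000, +inf) -> C.
function rangeShard(key) {
  if (key < 1000) return SHARDS[0];
  if (key < 2000) return SHARDS[1];
  return SHARDS[2];
}

// Hashed placement: a hash of the key picks the shard.
// A simple multiplicative integer hash, used only for illustration.
function hashedShard(key) {
  const h = Math.imul(key, 2654435761) >>> 0; // unsigned 32-bit result
  return SHARDS[h % SHARDS.length];
}

// Count how many keys land on each shard under a placement function.
function distribution(keys, place) {
  const counts = Object.fromEntries(SHARDS.map((s) => [s, 0]));
  for (const k of keys) counts[place(k)] += 1;
  return counts;
}

// 300 monotonically increasing keys, for example newly issued order ids.
const newKeys = Array.from({ length: 300 }, (_, i) => 5000 + i);
console.log("range :", distribution(newKeys, rangeShard));
console.log("hashed:", distribution(newKeys, hashedShard));
```

In this model every new key falls into the last range and lands on one shard, while the hashed placement spreads the same keys across all three. The trade-off is that a range query over these keys touches one shard under ranged placement but all shards under hashed placement.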
Targeted and Broadcast Operations
db.collection.find({ })
Targeted Query
Broadcast Query
db.collection.find({ })
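A simplified way to think about the two cases: a query can be targeted when its filter includes the shard key or a prefix of a compound shard key, and is broadcast to every shard otherwise. The sketch below encodes only that rule; `classifyQuery` is a hypothetical helper, the field names are illustrative, and the real router also considers operators, hashed keys, and collation.

```javascript
// Simplified rule: a filter that constrains the first shard key field can be targeted;
// anything else has to be broadcast to all shards.
function classifyQuery(shardKeyFields, filter) {
  const firstField = shardKeyFields[0];
  return Object.keys(filter).includes(firstField) ? "targeted" : "broadcast";
}

// Illustrative compound shard key.
const shardKey = ["vehicle_no", "user_mnumber"];

console.log(classifyQuery(shardKey, { vehicle_no: "TN01AB1234" })); // filter on the key prefix
console.log(classifyQuery(shardKey, { user_mnumber: "9000000000" })); // suffix only
console.log(classifyQuery(shardKey, {})); // empty filter
```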
Shard Key Indexes
‱ Single-field ascending index: MongoDB 2.0+
‱ Single-field hashed index: MongoDB 2.4+
‱ Compound ascending index: MongoDB 2.0+
‱ Compound hashed index: MongoDB 4.4+
Declare Shard Key
sh.shardCollection(namespace, key, unique, options)
// unique must be false here because the shard key contains a hashed field
sh.shardCollection("db.test", { "fieldA": 1, "fieldB": "hashed" }, false, { numInitialChunks: 5, collation: { locale: "simple" } })
‱ When the collection is empty, sh.shardCollection() creates an index on the shard key if an index for that key does not already exist.
‱ If the collection is not empty, you must create the index before using sh.shardCollection().
‱ The shard key index cannot be a multikey, text, or geospatial index on the shard key fields.
‱ MongoDB can enforce a uniqueness constraint only on a ranged shard key index.
‱ For a unique compound index in which the shard key is a prefix, MongoDB enforces uniqueness across the entire key combination, not on individual components of the shard key.
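For a collection that already holds documents, the index-first rule above can be sketched as a small helper. `shardExistingCollection` is a hypothetical name; `db` and `sh` are the usual mongosh globals, passed in as parameters here so that the function can also be tried against stub objects outside a live cluster.

```javascript
// Sketch for mongosh: shard a collection that already contains documents.
// `db` and `sh` are passed in explicitly so the function can also be exercised with stubs.
function shardExistingCollection(db, sh, namespace, key) {
  // Split "database.collection" at the first dot only; collection names may contain dots.
  const dot = namespace.indexOf(".");
  const dbName = namespace.slice(0, dot);
  const collName = namespace.slice(dot + 1);

  // A non-empty collection needs an index on the shard key before it can be sharded.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(key);

  // With the index in place, declare the shard key.
  return sh.shardCollection(namespace, key);
}

// In mongosh, a call might look like this (namespace and key are illustrative):
// shardExistingCollection(db, sh, "mydb.test", { fieldA: 1, fieldB: "hashed" });
```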
Shard Key Improvements Since MongoDB v4.2
Mutable Shard key value (v4.2)
Refinable Shard Key (v4.4)
Compound Hashed Shard Key (v4.4)
Live Resharding (v5.0)
What Is a Refinable Shard Key, and Why Use It? (v4.4)
Shard Key: customer_id
Refining the Shard Key
db.adminCommand({ refineCollectionShardKey: "<database>.<collection>", key: { <existing key>, <new suffix 1>: <1|"hashed">, ... } })
[Chart: uneven data distribution across Shard A, Shard B, and Shard C, with shares of 21%, 15%, and 64%]
‱ Refine at any time
‱ No database downtime
Refining a collection's shard key improves data distribution and resolves
issues caused by insufficient cardinality, such as jumbo chunks.
Refinable Shard Key (v4.4)
Shard Key: vehicle_no
Refining the Shard Key
db.adminCommand({ refineCollectionShardKey: "mydb.test", key: { vehicle_no: 1, user_mnumber: "hashed" } })
Guidelines for Refining Shard Keys
Avoid changing the range or hashed type of any existing shard key field, as it can lead to
inconsistencies in data. For instance, do not change a shard key such as { vehicle_no: 1 }
to { vehicle_no: "hashed", order_id: 1 }.
‱ To refine shard keys, your cluster must run at least MongoDB 4.4 with a feature compatibility version of 4.4.
‱ Retain the same prefix when defining the new shard key, i.e., it must begin with the same field(s) as the existing shard key.
‱ Additional fields can only be added as suffixes to the existing shard key.
‱ Create an index that supports the refined shard key before running the command.
‱ Stop the balancer before executing the refineCollectionShardKey command.
‱ Run sh.status() to check the status.
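The prefix and suffix rules can be checked mechanically before running the command. The sketch below is a plain JavaScript check, not a MongoDB API; `isValidRefinement` is a hypothetical helper that only covers the two structural rules (same prefix with unchanged types, at least one new suffix field).

```javascript
// Check the structural rules for refining a shard key (sketch, not a MongoDB API).
function isValidRefinement(currentKey, refinedKey) {
  const current = Object.entries(currentKey);
  const refined = Object.entries(refinedKey);

  // Refining must add at least one suffix field.
  if (refined.length <= current.length) return false;

  // Every existing field must keep its position and its type (1 or "hashed").
  return current.every(
    ([field, type], i) => refined[i][0] === field && refined[i][1] === type
  );
}

// Adding a hashed suffix keeps the existing prefix intact.
console.log(isValidRefinement({ vehicle_no: 1 }, { vehicle_no: 1, user_mnumber: "hashed" }));
// Changing the type of an existing field breaks the prefix rule.
console.log(isValidRefinement({ vehicle_no: 1 }, { vehicle_no: "hashed", order_id: 1 }));
```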
Compound Hashed Shard Key (v4.4)
[Chart: uneven data distribution across Shard A, Shard B, and Shard C, with shares of 21%, 15%, and 64%]
Existing Shard Key: vehicle_no
New Shard Key: vehicle_no, user_mnumber
sh.shardCollection("test.order", { "vehicle_no": 1, "user_mnumber": "hashed" })
sh.shardCollection("test.order", { "vehicle_no": "hashed", "user_mnumber": 1 })
‱ Overcomes the problem of monotonically increasing keys
Live Resharding (v5.0)
Resharding without downtime
Any combination can change
Compound Hash Range
Range Range
Range Hash
Resharding Process Flow
‱ Storage: Before starting a resharding operation on a collection of 1 TB, it is recommended to have a minimum of 1.2 TB of free storage.
‱ I/O: Ensure that your I/O capacity is below 50%.
‱ CPU load: Ensure your CPU load is below 80%.
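These three thresholds lend themselves to a simple pre-flight check. The sketch below assumes the metrics have already been collected by whatever monitoring is in place; `reshardingPrecheck` and its field names are hypothetical, and only the 1.2x storage, 50% I/O, and 80% CPU thresholds come from the slide.

```javascript
// Pre-flight check for resharding, using the thresholds listed above.
// All inputs are assumed to be measured elsewhere; the field names are illustrative.
function reshardingPrecheck({ collectionBytes, freeBytes, ioUtilizationPct, cpuLoadPct }) {
  const problems = [];

  // Free storage should be at least 1.2x the size of the collection being resharded.
  const requiredBytes = collectionBytes * 1.2;
  if (freeBytes < requiredBytes) {
    problems.push(`free storage ${freeBytes} B is below the required ${requiredBytes} B`);
  }
  // I/O capacity in use should be below 50%.
  if (ioUtilizationPct >= 50) {
    problems.push(`I/O utilization ${ioUtilizationPct}% is not below 50%`);
  }
  // CPU load should be below 80%.
  if (cpuLoadPct >= 80) {
    problems.push(`CPU load ${cpuLoadPct}% is not below 80%`);
  }

  return { ok: problems.length === 0, problems };
}

const TB = 1e12;
console.log(
  reshardingPrecheck({ collectionBytes: 1 * TB, freeBytes: 1.3 * TB, ioUtilizationPct: 35, cpuLoadPct: 60 })
);
```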
‱ Rewrite your application's queries to use both the current shard key and the new shard key.
‱ Deploy your rewritten application.
‱ Monitor the resharding process using a $currentOp pipeline stage.
‱ Once resharding completes, rewrite your application's queries to use only the new shard key.
Resharding: Donors and Recipients
‱ Donors are shards that currently own chunks of the sharded collection.
‱ Recipients are shards that would own chunks of the sharded collection according to the new shard key and zones.
Resharding Internal Process Flow
Initialization Phase
‱ The balancer determines the new data distribution for the sharded collection.
Index Phase
‱ Each recipient shard creates a new, empty sharded collection with the same collection options as the original one. This new collection serves as the target for the data written by the recipient shards.
‱ Each recipient shard builds the necessary new indexes.
Clone, Apply, and Catch-up Phase
‱ Each recipient shard makes a copy of the initial documents that it would be responsible for under the new shard key.
‱ Each recipient shard begins applying oplog entries from operations that happened after it cloned the data.
Commit Phase
‱ When all shards have reached strict consistency, the resharding coordinator commits the resharding operation and installs the new routing table.
‱ The resharding coordinator instructs each donor and recipient shard primary, independently, to rename the temporary sharded collection. The temporary collection becomes the new resharded collection.
‱ Each donor shard drops the old sharded collection.
Resharding Process Command
Start the resharding operation
db.adminCommand({
  reshardCollection: "mydb.test",
  key: { "vehicle_no": 1, "user_mnumber": "hashed" }
})
Monitor the resharding operation
db.getSiblingDB("admin").aggregate([
  { $currentOp: { allUsers: true, localOps: false } },
  {
    $match: {
      type: "op",
      "originatingCommand.reshardCollection": "mydb.test"
    }
  }
])
Abort resharding operation
db.adminCommand({
abortReshardCollection: "mydb.test"
})
To summarize, what issues does this feature resolve?
‱ Jumbo chunks
‱ Uneven load distribution
‱ Query performance that degrades over time because of scatter-gather queries
Improvements in MongoDB 5.2 and 7.x
‱ Default chunk size of 128 megabytes (5.2)
‱ AutoMerger (7.0)
Reach Us : Info@mydbops.com
Thank You

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Database Meetup 14

  • 1. Scaling MongoDB with Horizontal and Vertical Sharding Manosh Malai CTO, Mydbops LLP 07th Oct 2023 Mydbops 14th Opensource Database Meetup
  • 2. About Me: Manosh Malai, CTO, Mydbops LLP. Interested in open source technologies; interested in MongoDB, DevOps & DevOpSec practices; tech speaker/blogger.
  • 3. Mydbops Services: Consulting Services and Managed Services. Focuses on MySQL, MongoDB, PostgreSQL, TiDB and Cassandra.
  • 4. Our Clients: 1M+ DB transactions handled per day; 3000+ servers monitored; 100+ database migrations to cloud; 300+ happy clients.
  • 7. Database Sharding: Database sharding is the process of storing a large database across multiple machines.
  • 9. When To Shard - I Size of Data: If your database is becoming too large to fit on a single server, sharding may be necessary to distribute the data across multiple servers. Performance: Sharding can improve query performance by reducing the amount of data that needs to be processed on a single server.
  • 10. When To Shard - II Scalability: Sharding enables you to horizontally scale out your MongoDB database by distributing data across multiple nodes. Availability and Redundancy: Because each shard can be deployed as a replica set, a sharded cluster keeps redundant copies of the data it distributes.
  • 11. When To Shard - III Availability: Sharding can improve the overall availability of your database by providing redundancy across multiple nodes. Flexibility: Sharding enables you to distribute data across multiple nodes based on your specific requirements.
  • 13. Will MongoDB Support Vertical Sharding?
  • 14. Vertical Sharding: Distributing tables across multiple Standalone / Replica / Shards. (Diagram labels: Session, Product Catalog, Carts, Checkouts.)
  • 15. Vertical Sharding Strategy - Pros
      ▪ Different data access patterns: Vertical sharding may be useful when different tables are accessed at different frequencies or have different access patterns. By splitting these tables into different shards, the performance of queries that only need to access a subset of columns can be improved.
      ▪ Better data management: Vertical sharding can provide better control over data access, as sensitive or confidential data can be stored separately from other data. This can help with compliance with regulations such as GDPR or HIPAA.
  • 16. Vertical Sharding Strategy - Cons
      ▪ Data Interconnectedness: Vertical sharding may not be the best solution for databases with heavily interconnected data. If there is a need for complex joins or queries across multiple columns, horizontal sharding or other scaling strategies may be more appropriate.
      ▪ Limited Scalability: Only suitable for small or medium data sizes.
  • 17. How Can We Achieve Vertical Sharding?
      ▪ Service Discovery: Consul, Etcd, ZooKeeper
      ▪ Data Sync: Mongopush, mongosync, mongodump & mongorestore
  • 19. Vertical Sharding: Service Discovery and Data Migration
      ▪ Use Consul to dynamically discover the nodes in your MongoDB cluster and route traffic to them accordingly.
      ▪ Mongopush syncs the data from the X1 cluster to the X2 cluster.
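One way to picture vertical sharding at the application layer is a lookup from collection name to the cluster that owns it. The sketch below is a minimal illustration in plain JavaScript. The cluster names X1 and X2 come from the slide; the URIs, the collection-to-cluster assignment and the `clusterFor` helper are hypothetical. In practice the addresses would be resolved through a service-discovery system such as Consul rather than a hard-coded map.

```javascript
// Hypothetical addresses; a real deployment would resolve these via service discovery.
const clusters = new Map([
  ["X1", "mongodb://x1.example.internal:27017"],
  ["X2", "mongodb://x2.example.internal:27017"],
]);

// Hypothetical assignment of collections to clusters (vertical sharding).
const placement = new Map([
  ["session", "X1"],
  ["product_catalog", "X1"],
  ["carts", "X2"],
  ["checkouts", "X2"],
]);

// Return the connection URI of the cluster that owns a collection.
function clusterFor(collection) {
  const name = placement.get(collection);
  if (name === undefined) {
    throw new Error(`no cluster assigned for collection: ${collection}`);
  }
  return clusters.get(name);
}

console.log(clusterFor("carts"));
```

Moving a collection from X1 to X2 then amounts to syncing its data and updating a single entry in the placement map.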
  • 21. Will MongoDB Support Horizontal Sharding?
  • 22. MongoDB Horizontal Sharding and Its Components: Mongos, Config Server, Shards. Each shard contains a subset of the sharded data.
  • 23. Shard Key: Divide and distribute the collection evenly using the shard key. The shard key consists of a field or fields that exist in every document in a collection.
  • 24. MongoDB Shard Key: Range Sharding, Hash Sharding, Zone Sharding
      ▪ Hash Sharding. Pros: even data distribution; even read and write workload distribution. Cons: range queries are likely to trigger expensive broadcast operations.
      ▪ Range Sharding. Pros: even data distribution; targeted operations for both single and ranged queries; even read and write workload distribution. Cons: depends on selecting a good shard key that is used in both read and write queries.
      ▪ Zone Sharding. Pros: isolates a specific subset of data on a specific set of shards; keeps data geographically closest to application servers; data tiering and SLAs based on shard hardware. Cons: depends on selecting a good shard key that is used in both read and write queries.
  • 25. Target and Broadcast Operation. Target Query: db.collection.find({ }). Broadcast Query: db.collection.find({ }).
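As a rough illustration of the distinction, mongos can route a query to a subset of shards when the filter includes the shard key or a leading prefix of a compound shard key; otherwise it has to broadcast the query to every shard. The function below is a simplified sketch of that rule in plain JavaScript. The `classifyQuery` helper and the field names `customer_id` and `order_id` are hypothetical, and the sketch ignores finer points such as range predicates on a hashed key.

```javascript
// Simplified rule: a query is targeted when its filter contains at least the
// first field of the shard key (a leading prefix); otherwise it is broadcast.
function classifyQuery(filter, shardKeyFields) {
  const firstField = shardKeyFields[0];
  const hasPrefix = Object.prototype.hasOwnProperty.call(filter, firstField);
  return hasPrefix ? "targeted" : "broadcast";
}

// Hypothetical compound shard key { customer_id: 1, order_id: 1 }.
const shardKey = ["customer_id", "order_id"];

console.log(classifyQuery({ customer_id: 42 }, shardKey)); // filter has the key prefix
console.log(classifyQuery({ order_id: 7 }, shardKey));     // filter lacks the key prefix
```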
  • 26. Shard Key Indexes: Single-field Ascending Index (2.0+); Single-field Hashed Index (2.0+); Compound Ascending Index (2.0+); Compound Hashed Index (4.4+).
  • 27. Declare Shard Key: sh.shardCollection(namespace, key, unique, options), for example sh.shardCollection("db.test", {"fieldA": 1, "fieldB": "hashed"}, false/true, {numInitialChunks: 5, collation: {locale: "simple"}})
      ▪ When the collection is empty, sh.shardCollection() generates an index on the shard key if an index for that key does not already exist.
      ▪ If the collection is not empty, you must create the index before using sh.shardCollection().
      ▪ The shard key index cannot be a multikey, text, or geospatial index on the shard key fields.
      ▪ MongoDB can enforce a uniqueness constraint on a ranged shard key index only.
      ▪ In a unique compound index where the shard key is a prefix, MongoDB ensures uniqueness across the entire key combination, rather than individual components of the shard key.
  • 28. Shard Key Improvements After MongoDB v4.2: Mutable Shard Key Value (v4.2); Refinable Shard Key (v4.4); Compound Hashed Shard Key (v4.4); Live Resharding (v5.0).
  • 29. What and Why: Refinable Shard Key (v4.4). Refining a collection's shard key improves data distribution and resolves issues caused by insufficient cardinality leading to jumbo chunks.
      ▪ Shard Key: customer_id (diagram labels: 21%, 15%, 64%; Shard A, Shard B, Shard C)
      ▪ Refining Shard Key: db.adminCommand({refineCollectionShardKey: "<database>.<collection>", key: {<existing key>, <new suffix1>: <1|"hashed">, ...}})
      ▪ Refine at any time
      ▪ No database downtime
  • 30. Refinable Shard Key (v4.4). Shard Key: vehical_no. Refining Shard Key: db.adminCommand({refineCollectionShardKey: "mydb.test", key: {vehical_no: 1, user_mnumber: "hashed"}})
    Guidelines for Refining Shard Keys:
      ▪ Avoid changing the range or hashed type of any existing shard key field, as it can lead to inconsistencies in data. For instance, refrain from changing a shard key such as { vehical_no: 1 } to { vehical_no: "hashed", order_id: 1 }.
      ▪ To refine shard keys, your cluster must run at least version 4.4 with a feature compatibility version of 4.4.
      ▪ Retain the same prefix when defining the new shard key, i.e., it must begin with the same field(s) as the existing shard key.
      ▪ Additional fields can only be added as suffixes to the existing shard key.
      ▪ Create a new index to support the modified shard key.
      ▪ Stop the balancer before executing the refineCollectionShardKey command.
      ▪ Use sh.status() to see the status.
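The prefix and type guidelines above can be checked mechanically before running the command. The following is a minimal sketch in plain JavaScript, not part of any MongoDB API: `checkRefinedKey` is a hypothetical helper that compares the current and proposed key documents and returns a description of the first problem it finds, or `null` if none is found.

```javascript
// Compare a current shard key with a proposed refined key.
// Returns null when the proposal keeps the existing fields (same order, same
// range/hashed type) as a prefix and adds at least one suffix field.
function checkRefinedKey(currentKey, refinedKey) {
  const current = Object.entries(currentKey);
  const refined = Object.entries(refinedKey);
  if (refined.length <= current.length) {
    return "refined key adds no suffix field";
  }
  for (let i = 0; i < current.length; i++) {
    const [field, type] = current[i];
    const [newField, newType] = refined[i];
    if (field !== newField) {
      return `prefix changed at position ${i}: expected ${field}, got ${newField}`;
    }
    if (type !== newType) {
      return `type of ${field} changed from ${type} to ${newType}`;
    }
  }
  return null;
}

// The refinement from the slide: keeps the prefix, adds a hashed suffix.
console.log(checkRefinedKey({ vehical_no: 1 }, { vehical_no: 1, user_mnumber: "hashed" }));
// The change the guidelines warn against: the existing field switches to hashed.
console.log(checkRefinedKey({ vehical_no: 1 }, { vehical_no: "hashed", order_id: 1 }));
```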
  • 31. Compound Hashed Shard Key (v4.4). Existing Shard Key: vehical_no. New Shard Key: vehical_no, user_mnumber. (Diagram labels: 21%, 15%, 64%; Shard A, Shard B, Shard C.)
      ▪ sh.shardCollection("test.order", {"vehical_no": 1, "user_mnumber": "hashed"})
      ▪ sh.shardCollection("test.order", {"vehical_no": "hashed", "user_mnumber": 1})
      ▪ Overcomes monotonically increasing keys
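The point about monotonically increasing keys can be illustrated with a small simulation. This is a sketch under stated assumptions: the chunk boundaries are invented, and `hashedShard` uses a toy multiplicative hash rather than MongoDB's own hash function. It only shows the general effect: with range routing, every new ever-larger key lands in the last chunk, whereas hashing scatters the same keys across shards.

```javascript
const shards = ["A", "B", "C"];

// Range routing with hypothetical chunk boundaries:
// (-inf, 100) -> A, [100, 200) -> B, [200, +inf) -> C.
function rangeShard(key) {
  if (key < 100) return "A";
  if (key < 200) return "B";
  return "C";
}

// Toy multiplicative hash (NOT MongoDB's hash function), used only to scatter keys.
function hashedShard(key) {
  const h = (key * 2654435761) >>> 0; // keep the low 32 bits
  return shards[h % shards.length];
}

// Count how many keys each shard receives under a given routing function.
function countByShard(keys, route) {
  const counts = { A: 0, B: 0, C: 0 };
  for (const key of keys) counts[route(key)] += 1;
  return counts;
}

// 100 new, monotonically increasing keys (for example, an auto-incrementing id).
const newKeys = Array.from({ length: 100 }, (_, i) => 300 + i);

const rangeCounts = countByShard(newKeys, rangeShard);
const hashedCounts = countByShard(newKeys, hashedShard);
console.log("range :", rangeCounts);
console.log("hashed:", hashedCounts);
```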
  • 32. Live Resharding (v5.0): Resharding without downtime. Any combination can be changed. (Diagram labels: Compound, Hash, Range.)
  • 33. Resharding Process Flow
      ▪ Storage: Before starting a resharding operation on a collection of 1 TB, it is recommended to have a minimum of 1.2 TB of free storage.
      ▪ I/O: Ensure that your I/O capacity is below 50%.
      ▪ CPU load: Ensure your CPU load is below 80%.
      ▪ Rewrite your application's queries to use both the current shard key and the new shard key.
      ▪ Deploy your rewritten application.
      ▪ Monitor the resharding process using a $currentOp pipeline stage.
      ▪ Rewrite your application's queries to use the new shard key without reload.
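The three resource thresholds above lend themselves to a simple pre-flight check. The sketch below is a hypothetical helper in plain JavaScript, not a MongoDB command. It assumes the 1.2 TB figure for a 1 TB collection scales proportionally (free space of at least 1.2 times the collection size) and that the caller supplies utilization figures as fractions between 0 and 1 from its own monitoring.

```javascript
// Return a list of reasons why resharding should not start yet (empty list = ok).
function reshardingPreflight({ collectionBytes, freeBytes, ioUtilization, cpuLoad }) {
  const problems = [];
  // Assumed rule: free storage >= 1.2 x collection size (integer comparison avoids rounding).
  if (freeBytes * 10 < collectionBytes * 12) {
    problems.push("free storage is below 1.2x the collection size");
  }
  if (ioUtilization >= 0.5) {
    problems.push("I/O utilization is not below 50%");
  }
  if (cpuLoad >= 0.8) {
    problems.push("CPU load is not below 80%");
  }
  return problems;
}

const TB = 1e12;
console.log(reshardingPreflight({ collectionBytes: 1 * TB, freeBytes: 1.3 * TB, ioUtilization: 0.3, cpuLoad: 0.5 }));
console.log(reshardingPreflight({ collectionBytes: 1 * TB, freeBytes: 1 * TB, ioUtilization: 0.6, cpuLoad: 0.9 }));
```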
  • 34. Resharding: Who Are the Donors and Recipients?
      ▪ Donors are shards that currently own chunks of the sharded collection.
      ▪ Recipients are shards that would own chunks of the sharded collection according to the new shard key and zones.
  • 35. Resharding Internal Process Flow
      ▪ Initialization Phase: The balancer determines the new data distribution for the sharded collection.
      ▪ Index Phase: Each recipient shard creates a new empty sharded collection with the same collection options as the original one. This new collection serves as the target for the new data written by the recipient shards. Each recipient shard then builds the necessary new indexes.
      ▪ Clone, Apply, and Catch-up Phase: Each recipient shard makes a copy of the initial documents that it would be responsible for under the new shard key. Each recipient shard then begins applying oplog entries from operations that happened after it cloned the data. When all shards have reached strict consistency, the resharding coordinator commits the resharding operation and installs the new routing table.
      ▪ Commit Phase: The resharding coordinator instructs each donor and recipient shard primary, independently, to rename the temporary sharded collection. The temporary collection becomes the new resharded collection. Each donor shard drops the old sharded collection.
  • 36. Resharding Process Command
      ▪ Start the resharding operation: db.adminCommand({ reshardCollection: "mydb.test", key: {"vehical_no": 1, "user_mnumber": "hashed"} })
      ▪ Monitor the resharding operation: db.getSiblingDB("admin").aggregate([ { $currentOp: { allUsers: true, localOps: false } }, { $match: { type: "op", "originatingCommand.reshardCollection": "mydb.test" } } ])
      ▪ Abort the resharding operation: db.adminCommand({ abortReshardCollection: "mydb.test" })
  • 37. To summarize, what issues does this feature resolve? Jumbo chunks; uneven load distribution; query performance that degrades over time because of scatter-gather queries.
  • 38. Improvements From MongoDB 5.2 and 7.x: default chunk size of 128 megabytes (5.2); AutoMerger (7.0).
  • 39. Reach Us : Info@mydbops.com Thank You