Commit
mysql performance optimization
santhalakshminarayana committed Nov 11, 2024
1 parent b2de153 commit bf62065
Showing 7 changed files with 319 additions and 113 deletions.
3 changes: 2 additions & 1 deletion components/Tags.js
@@ -20,11 +20,12 @@ const tags = [
"ai",
"system-design",
"python",
"python-performance",
"go",
"mysql",
"image-processing",
"opencv",
"concurrency",
"python-performance",
"color-science",
"react",
"next-js",
295 changes: 295 additions & 0 deletions posts/mysql-performance-optimization-techniques.mdx

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion posts/retinex-theory-of-color-vision.mdx
@@ -133,7 +133,7 @@ Reflectance values for various Munsell chips are determined under the same white

Returning to the experiment, a very close relation between reflectance and illumination is observed: the light energy reaching the eye is equal to the product of reflectance and illumination.

![Reflectance product Illumination:=:90:=:](retinex-theory/reflectance-illumination-relation.jpg)
![Reflectance product Illumination:=:90](retinex-theory/reflectance-illumination-relation.jpg)

In the above image, the left side charts denote Mondrian color paper values and the right side denotes matched Munsell chip values. The bar chart denoting reflectance has values in scaled reflectance in the range of 0.0-1.0 for each of the three narrow-bands 630nm, 530nm, and 450nm. For both Mondrian and Munsell, light energy at the eye (recorded by photometer) is equal to the product of reflectance and illuminant.

91 changes: 0 additions & 91 deletions posts/software-development-comprehensive-guide.mdx
@@ -639,15 +639,6 @@ Microservices are full cohesive small autonomous components that provide indepen

## Database Optimizations

### Performance Improvements
- Connection pool
- Query cache
- Join de-normalization
- Query re-write
- Index and views
- Open Table Cache
- Load balance operations

### Operations-Based Scaling
- Read-heavy: Horizontal scaling with multiple read-replicas.
- Write-heavy: Horizontal scaling with data partition/sharded replicas that can be vertically scaled.
@@ -843,88 +834,6 @@ Microservices are full cohesive small autonomous components that provide indepen
- https://medium.com/@mani.saksham12/raft-and-paxos-consensus-algorithms-for-distributed-systems-138cd7c2d35a


---

## MySQL Internals, Configurations & Optimizations

### Locks
- Shared
- Exclusive
- Gap (locks the gap between index records covered by a *where* range condition)
- Control (locking reads with *for share*, *for update*)

### Isolation Levels
When multiple connections try to change the same data, how can transaction isolation be maintained so that the following problems can be avoided?
- **Dirty Reads**: A query in a transaction may return inconsistent data because it sees uncommitted changes from other transactions.
- **Non-repeatable Reads**: The same query returns different data for the same rows when executed multiple times within a transaction, because other transactions committed changes in between.
- **Phantom Reads**: The set of rows returned by the same select statement differs within a transaction, because another transaction inserted new rows in between.

The following isolation levels can be set in MySQL for transaction isolation:
- **Read Uncommitted**: Transactions see other transactions' uncommitted changes, which may cause data inconsistencies. Suitable for highly frequent updates where accuracy/consistency is not critical. Ex: Dashboards, Analytics. Solves: None.
- **Read Committed**: Transactions see other transactions' changes only after they are committed. Each read is consistent, but the data can change between statements of the same transaction. Suitable for both highly frequent and data-consistent scenarios. Ex: Banking, reservations. Solves: Dirty Reads.
- **Repeatable Read**: A transaction sees the same snapshot of data throughout its life cycle and is unaffected by other transactions. Default in MySQL with InnoDB's MVCC (Multi-Version Concurrency Control). Suitable for data-consistent systems where performance can be compromised. Ex: General Web services. Solves: Dirty Reads, Non-repeatable Reads.
- **Serializable**: Transactions lock the rows they read, forcing other transactions to wait until completion. Suitable for systems that need strict data consistency. Ex: Financial services. Solves: Dirty Reads, Non-repeatable Reads, Phantom Reads.
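
The mapping above can be written down as a small lookup, sketched here in Python. The dictionary only transcribes the "Solves" entries from the list; the helper name `weakest_level_preventing` is hypothetical and not part of MySQL or any driver.

```python
# Anomalies each isolation level prevents, transcribed from the list above.
# Levels are ordered from weakest to strictest (dicts keep insertion order).
PREVENTS = {
    "READ UNCOMMITTED": set(),
    "READ COMMITTED": {"dirty read"},
    "REPEATABLE READ": {"dirty read", "non-repeatable read"},
    "SERIALIZABLE": {"dirty read", "non-repeatable read", "phantom read"},
}


def weakest_level_preventing(anomalies):
    """Return the weakest level whose guarantees cover every anomaly given."""
    required = set(anomalies)
    for level, prevented in PREVENTS.items():
        if required <= prevented:
            return level
    raise ValueError(f"no isolation level prevents {sorted(required)}")
```

For example, `weakest_level_preventing(["dirty read"])` picks Read Committed, while asking for phantom-read protection picks Serializable.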

### Connection pool
- Opening a DB connection involves opening a TCP socket, acknowledgement, authentication, authorization, network session creation, etc., so opening a new connection for every request is slow. Maintain a pool of connections to reuse instead.
- By default, a MySQL server accepts up to 151 simultaneous client connections (*max_connections*); the value can be raised up to the documented maximum of 100,000.
- Types of connection pooling:
  - Session: The connection is held until the session completes. The client can run any number of transactions until the connection timeout is reached.
  - Transaction: The connection is returned to the pool when the transaction completes.
  - Statement: A connection is used only for a single SQL statement.
- Generally, *max_pool_size* (*max_active_connections*) is set to 2x or 4x the number of CPU cores, but it varies with the type of application and the traffic.
- In a normal setup, *max_idle_connections* (ex: 80) is lower than *max_pool_size* (ex: 100), so up to 20 connections are closed after use instead of being kept idle.
- For highly concurrent systems, set *max_idle_connections* equal to *max_pool_size*. Idle connections consume some memory, but that is a reasonable trade-off against the overhead of opening connections for very frequent requests.
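
To make the interplay of *max_pool_size* and *max_idle_connections* concrete, here is a minimal Python sketch of a pool. It is an illustration under assumptions, not how any particular driver implements pooling: the `ConnectionPool` class and the `connect` factory are hypothetical, `max_idle` is assumed to be at least 1, and a connection only needs a `close()` method.

```python
import queue
import threading


class ConnectionPool:
    """Toy pool: caps checked-out connections and keeps a bounded idle set."""

    def __init__(self, connect, max_pool_size=100, max_idle=80):
        # `connect` is a hypothetical factory returning an object with .close().
        self._connect = connect
        # Each checked-out connection holds one slot, so at most
        # max_pool_size connections are in use at a time.
        self._slots = threading.BoundedSemaphore(max_pool_size)
        # Idle connections wait here for reuse (assumes max_idle >= 1).
        self._idle = queue.LifoQueue(maxsize=max_idle)

    def acquire(self, timeout=None):
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("connection pool exhausted")
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            try:
                return self._connect()  # pay the cost of a new connection
            except Exception:
                self._slots.release()
                raise

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)  # keep it for the next caller
        except queue.Full:
            conn.close()  # idle set is full, so close instead
        self._slots.release()
```

With `max_pool_size=100` and `max_idle=80`, a burst that checks out 100 connections leaves 80 idle afterwards and closes the other 20, which matches the example in the list above.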

### Scaling challenges
- Max open connections, idle connections, and pool size should be limited according to the available system resources.
- Every open connection consumes system resources such as memory, CPU cache, and data storage. If the system cannot handle more connections, further connection attempts are rejected (MySQL returns a `Too many connections` error), and the failed operations can lead to data inconsistency at the application level.
- On Linux servers, *ulimit* restricts the maximum number of open file descriptors.
- MySQL is multi-threaded and allocates one thread per connection, which adds thread management overhead on top of the system resources used.
- The *thread_cache_size* variable (default 8 + (max_connections/100), capped at 100, with threads multiplexed onto cores) defines how many threads are cached for re-use when clients disconnect. The cache requires additional memory but improves performance.
- MySQL creates a THD (Thread Handle Descriptor) for each connection with a minimum memory of ~10KB, which can grow to ~10MB for an average connection while executing queries. So, handling a huge number of parallel connections requires a lot of memory and also causes heavy thrashing.
- Connections are user threads, and *max_connections* caps them. As the number of connections grows, the user-threads-per-core ratio (recommended maximum is ~4) has to be balanced against latency. From this balance, the transactions-per-second (TPS) the server can handle can be estimated and the DB scaled for the expected load.
- Another important challenge is the underlying disk storage. With a large amount of stored data, the user threads spend most of their time waiting for data to arrive from disk. So, better storage mechanisms have to be considered, like SSDs, cache, read/write heavy disks, etc.
- In the MySQL thread pool, correctly tuning *thread_pool_size* and *max_transaction_limit* for high concurrency is very difficult, but it is still better than the default thread handling mechanism.
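
The sizing figures above lend themselves to a quick back-of-the-envelope calculation. The Python sketch below uses only the numbers quoted in the list (the *thread_cache_size* formula, ~10KB to ~10MB per connection, ~4 user threads per core); the function names are hypothetical, integer division is an assumption, and the results are rough estimates rather than measurements.

```python
def default_thread_cache_size(max_connections):
    """Autosized default from the list above: 8 + max_connections/100, capped at 100."""
    return min(100, 8 + max_connections // 100)


def connection_memory_range_mb(connections, min_kb=10, max_mb=10):
    """Rough (lower, upper) memory bounds in MB for the per-connection THD."""
    lower = connections * min_kb / 1024  # every connection at its ~10KB minimum
    upper = connections * max_mb         # every connection grown to ~10MB
    return lower, upper


def max_user_threads(cores, threads_per_core=4):
    """Upper bound on concurrently running user threads at ~4 threads per core."""
    return cores * threads_per_core
```

For instance, 1,024 parallel connections need somewhere between about 10 MB and about 10 GB for THDs alone, and a 16-core server at a ratio of 4 would aim for at most 64 concurrently running user threads.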

**Ref**:
- https://dev.mysql.com/blog-archive/mysql-connection-handling-and-scaling/
- https://dev.mysql.com/blog-archive/the-new-mysql-thread-pool/
- https://planetscale.com/blog/mysql-isolation-levels-and-how-they-work
- https://dev.to/dbvismarketing/a-guide-to-multithreading-in-sql-3hh1


---

## MySQL Monitoring and Optimizations
- A single MySQL instance can handle highly concurrent traffic if it is deployed with many CPU cores, fast storage (SSD), and ample RAM.
- Some of the variables to monitor are:
  - max_connections
  - table_open_cache
  - threads_connected
  - threads_running
  - thread_cache_size
  - innodb_buffer_pool_size
- Also monitor status variables like:
  - Com_select, Com_insert, Com_update, Com_delete
  - Queries
  - max_used_connections
- Check the reads and writes per table and whether an index was used. The read count also increases without any client operations, for reasons like background processes, cache eviction, etc. So these counts are not very accurate, but they show the proportion of operations that MySQL performs.
```text
select object_schema, object_name, count_read, count_write, index_name
from performance_schema.table_io_waits_summary_by_index_usage
order by count_read+count_write desc limit 5;
```
- Check the peak percentage of concurrent connections relative to max connections. If the value goes above 95%, increase the max_connections value.
```text
100 * threads_connected / max_connections
```
- Reduce the number of queries and use caching for data that does not change frequently.
- Increase max_connections and thread_cache_size to handle highly concurrent connections with thread reuse.
- Increase innodb_buffer_pool_size so that more data is cached and the server spends less time reading from disk.
- Monitor the p99 latency, the value below which 99% of all transactions complete, and check that it stays within the defined limits.
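
The two numeric checks in this list, connection utilization and p99 latency, can be sketched in Python as follows. The function names are hypothetical, the inputs are assumed to be collected elsewhere (for example from the status variables above and from application-side timing), and the percentile uses the nearest-rank method, which is one of several common definitions.

```python
def connection_utilization_pct(threads_connected, max_connections):
    """Percentage of max_connections in use: 100 * threads_connected / max_connections."""
    return 100 * threads_connected / max_connections


def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) computed with integers to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def needs_more_connections(threads_connected, max_connections, threshold_pct=95):
    """Apply the >95% rule of thumb from the list above."""
    return connection_utilization_pct(threads_connected, max_connections) > threshold_pct
```

A server with 145 of 151 connections in use is at roughly 96% and would trip the 95% rule, while a p99 above the agreed latency limit signals that the slowest 1% of transactions need attention.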


---

## System Monitoring and Optimizations
41 changes: 21 additions & 20 deletions public/sitemap.xml
@@ -1,23 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url><loc>https://santhalakshminarayana.github.io</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.529Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/about</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/advanced-golang-memory-model-concurrency</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/build-blog-with-nextjs-mdx-and-deploy-to-github-pages</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/chromatic-adaptation</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/color-science</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/color-theory</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/concurrency-patterns-python</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/create-a-notes-app-with-flutter</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/doppalf-rag-powered-ai-chatbot</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/go-gotchas-and-good-practices</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/retinex-image-enhancement</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/retinex-theory-of-color-vision</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/software-development-comprehensive-guide</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-cython</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-good-practices</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-multi-processing</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-numba</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-why-python-slow</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/whiteboard-image-enhancement-opencv-python</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-06T14:09:31.530Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/about</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/advanced-golang-memory-model-concurrency</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/build-blog-with-nextjs-mdx-and-deploy-to-github-pages</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/chromatic-adaptation</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/color-science</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/color-theory</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/concurrency-patterns-python</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/create-a-notes-app-with-flutter</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/doppalf-rag-powered-ai-chatbot</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/go-gotchas-and-good-practices</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/mysql-performance-optimization-techniques</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/retinex-image-enhancement</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/retinex-theory-of-color-vision</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/software-development-comprehensive-guide</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-cython</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-good-practices</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-multi-processing</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-numba</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/super-fast-python-why-python-slow</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
<url><loc>https://santhalakshminarayana.github.io/blog/whiteboard-image-enhancement-opencv-python</loc><changefreq>daily</changefreq><priority>0.7</priority><lastmod>2024-11-11T12:31:47.083Z</lastmod></url>
</urlset>
