🗃 Database Optimization

From 17 Seconds to 1 Millisecond:
MySQL on 4.7 Million Products

A thriving e-commerce company was watching their server CPU hover at 76% around the clock. Their hosting provider was pressuring them to upgrade to an $8,400/year dedicated server. We found the real problem in 2 hours and fixed it in a day.

Query speed: 17s → 0.001s
CPU usage: 76% → 15%
Upgrade prevented: $8,400
Rows optimized: 4.7M

The Challenge

The client ran a B2B e-commerce platform with a MySQL database holding 4.7 million product records. Despite having a powerful server — a Dell PowerEdge R730 with 48 cores and 504GB RAM — MySQL was pegged at 76% CPU continuously. Pages were timing out. The slow query log was being flooded.

Their hosting provider had a simple suggestion: upgrade to a larger server for $8,400/year. We had a different hypothesis: the problem wasn't hardware — it was missing indexes and poorly structured queries doing full table scans on millions of rows.

The real problem: A single "similar products" query was executing with no index, forcing MySQL to read every one of 4.7 million rows — on every product page load. With any meaningful traffic, this alone could saturate a 48-core server.

Diagnosing the Slow Queries

We started with the slow query log and SHOW PROCESSLIST. Within an hour we had a ranked list of offenders. Here's what we found:

| Query | Before | Rows Scanned | Problem |
|---|---|---|---|
| Similar products lookup | 17.0s | 4,700,000 | Full table scan — no index on mfg_id + category |
| Category product count | 17.0s | 2,400,000 | NOT IN subquery forcing full scan |
| FULLTEXT search (broad terms) | 1.2s | 4,700,000 | Terms like "support", "renewal" matching everything |
| Email domain lookup | 2.2s | 722,000 | LIKE '%@domain' — leading wildcard kills indexes |
| Tracking verification | 1.1s | 90,000 | No index on tracking_number column |
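
This triage workflow is reproducible with stock MySQL tooling. A minimal sketch; the threshold and the illustrative query below are not from the engagement itself:

```sql
-- Enable the slow query log at runtime; no restart required.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;   -- log anything slower than 1 second

-- Snapshot what is executing right now, with full query text:
SHOW FULL PROCESSLIST;

-- Check a suspect's access path (illustrative query):
EXPLAIN SELECT * FROM products WHERE mfg_id = 42;
-- type=ALL with a rows estimate in the millions means a full table scan.
```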

The Solutions

1. Composite Index for Similar Products

The "similar products" query filtered by manufacturer ID and category, then sorted by sales rank. The single-column index on mfg_id wasn't selective enough on its own, so the optimizer fell back to a full table scan. We created a composite index covering both filter columns plus the sort column:

MySQL — Creating composite index
-- Before: full scan on 4.7M rows, 17s execution time
-- EXPLAIN: type=ALL, rows=4721843, Extra=Using filesort

CREATE INDEX idx_similar_mfg
  ON products (mfg_id, category_id, top_selling)
  ALGORITHM=INPLACE LOCK=NONE;

-- After: index range scan, 96 rows examined
-- EXPLAIN: type=ref, rows=96, Extra=Using index condition
-- Query time: 17.0s → 0.001s

2. Replacing NOT IN with LEFT JOIN

The category count query used a NOT IN (SELECT ...) subquery. MySQL's optimizer handles this poorly: because NOT IN must account for possible NULLs in the subquery result, it often re-executes the subquery as a dependent subquery for every row in the outer table. Rewriting it as a LEFT JOIN ... WHERE ... IS NULL anti-join let the optimizer use indexes on both sides:

Before — 17 seconds, 2.4M rows
SELECT category_id,
  COUNT(*) as cnt
FROM products
WHERE product_id NOT IN (
  SELECT product_id
  FROM discontinued
)
GROUP BY category_id;
After — 0.008s, index scan
SELECT p.category_id,
  COUNT(*) as cnt
FROM products p
LEFT JOIN discontinued d
  ON p.product_id = d.product_id
WHERE d.product_id IS NULL
GROUP BY p.category_id;
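
An equivalent rewrite, assuming the same schema, uses NOT EXISTS. Unlike NOT IN, it is not tripped up by NULLs in discontinued.product_id, and on MySQL 8.0.17+ the optimizer can transform it into an antijoin:

```sql
-- NOT EXISTS form: same result, NULL-safe, antijoin-friendly on 8.0.17+
SELECT p.category_id,
       COUNT(*) AS cnt
FROM products p
WHERE NOT EXISTS (
  SELECT 1
  FROM discontinued d
  WHERE d.product_id = p.product_id
)
GROUP BY p.category_id;
```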

3. Fixing Broad FULLTEXT Queries

FULLTEXT search was being used for operational queries — things like finding all records related to "renewal" or "support." These terms matched hundreds of thousands of rows, making FULLTEXT's ranking overhead enormous. We replaced the FULLTEXT fallback with a category-based random selection query that used the existing category index, eliminating the 1+ second FULLTEXT scans entirely.
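
The replacement was along these lines (column names and the LIMIT are illustrative): filter by the indexed category first, so ORDER BY RAND() only shuffles the small candidate set rather than ranking millions of FULLTEXT matches:

```sql
-- Pick a few products from the same category via the category index.
SELECT product_id, name
FROM products
WHERE category_id = 123    -- the current product's category (illustrative)
ORDER BY RAND()
LIMIT 8;
```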

4. Stored Generated Column for Email Domain Lookup

The contacts table had email addresses but no domain column. Queries were using WHERE email LIKE '%@example.com' — the leading % wildcard prevents any index use. We added a stored generated column that MySQL auto-populates:

MySQL — Generated column + index
-- Add stored generated column (computed once at INSERT/UPDATE)
ALTER TABLE contacts
  ADD COLUMN email_domain VARCHAR(100)
  GENERATED ALWAYS AS (
    SUBSTRING_INDEX(email, '@', -1)
  ) STORED,
  ADD INDEX idx_sender_domain (email_domain);

-- Query now: WHERE email_domain = 'example.com'
-- Lookup time: 2.2s → 0.0003s (scans 1 row, not 722K)

5. Right-Sizing the InnoDB Buffer Pool

Perhaps the most shocking finding: the buffer pool was configured at 200GB, but the entire database was only 30GB. The oversized pool was wasting memory that could be used for OS file cache and PHP workers, and was actually increasing contention on InnoDB's internal mutex structures.

my.cnf — Buffer pool changes (size and logging applied live with SET GLOBAL)
# Before
innodb_buffer_pool_size = 200G       # 6.7x the 30GB database!
innodb_buffer_pool_instances = 16
log_queries_not_using_indexes = ON   # flooding slow log with sub-ms queries

# After
innodb_buffer_pool_size = 96G        # 3x DB size — resized online, zero downtime
innodb_buffer_pool_instances = 12    # one per 8GB of pool; takes effect on next restart
log_queries_not_using_indexes = OFF  # only log queries > long_query_time

Why 3x the database size? The rule of thumb is buffer pool = 1.5x–3x your working dataset. With 96GB, the entire 30GB dataset fits in memory with room for indexes and hot page buffers. More than 3x provides no performance benefit and wastes memory that other processes need.
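
Whether a pool is sized sensibly can be sanity-checked from server counters, and the resize itself can be done online on MySQL 5.7+. A sketch using standard status variables:

```sql
-- Hit ratio: logical reads served from the pool vs reads that hit disk.
-- On a steady workload a well-sized pool should be well above 99%.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- Innodb_buffer_pool_read_requests = reads satisfied from memory
-- Innodb_buffer_pool_reads         = reads that missed and went to disk

-- Resize online; InnoDB performs the change in chunks in the background.
SET GLOBAL innodb_buffer_pool_size = 96 * 1024 * 1024 * 1024;
```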

6. Top-Selling Facets Cached to Disk

Manufacturer pages ran an aggregation query to count top-selling products per facet — a 17-second query that ran on every page load. We cached the results to /tmp with a 10-minute TTL. The first request after expiry takes 17 seconds (usually triggered by a cron job, not a visitor). Every subsequent request reads from the flat file in under 1ms.
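
The production cache was PHP, but the pattern is language-agnostic; here is a sketch in Python (file path and function names are hypothetical, not from the client's codebase):

```python
import json
import os
import time

CACHE_TTL = 600  # seconds — the 10-minute TTL used in production


def cached(path, ttl, compute):
    """Return the cached value at `path` if fresh, else recompute and store it."""
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < ttl:
        with open(path) as f:
            return json.load(f)       # fast path: flat-file read
    result = compute()                # slow path: the expensive aggregation
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(result, f)
    os.replace(tmp, path)             # atomic swap: readers never see a partial file
    return result
```

Writing to a temp file and swapping with os.replace means a concurrent reader never sees a half-written cache; scheduling a cron hit just before TTL expiry keeps real visitors off the slow path.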

Results

| Metric | Before | After | Improvement |
|---|---|---|---|
| Similar products query | 17.0s | 0.001s | 99.99% faster |
| Category count query | 17.0s | 0.008s | 99.95% faster |
| Email domain lookup | 2.2s | 0.0003s | 99.99% faster |
| Tracking verification | 1.1s | 0.002s | 99.8% faster |
| MySQL CPU usage | 76% | 15% | 80% reduction |
| Buffer pool size | 200GB | 96GB | 104GB freed |
| Server upgrade cost | $8,400/yr | $0 | Cancelled |

Tech Stack

MySQL 8.0 • InnoDB • PHP 8.1 • AlmaLinux 8.10 • Dell PowerEdge R730 • EXPLAIN ANALYZE • information_schema.INNODB_METRICS

Key Takeaway

Hardware is rarely the real problem. Before spending money on a server upgrade, run EXPLAIN on your slowest queries. In this case, five indexes and three query rewrites eliminated the need for an $8,400 annual server upgrade. The entire engagement cost less than one month of the proposed upgrade.

Is Your Database Burning Money Right Now?

Send us your slow query log and we'll identify the top offenders — free, within 48 hours.

Get a Free Database Audit • View All Case Studies