Scaling Laravel to a Billion Daily Jobs: High-Performance Architecture at Square
A common myth suggests that
Solving Database Latency with Persistent Connections
When Square enabled
To combat this, you should use persistent connections and emulate prepared statements. This allows
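'options' => [
    // Reuse the underlying database connection across requests instead of reconnecting each time.
    PDO::ATTR_PERSISTENT => true,
    // Let PDO interpolate parameters client-side, avoiding a separate prepare round trip.
    PDO::ATTR_EMULATE_PREPARES => true,
],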
Building a Multi-Layered Caching Strategy
Square utilizes a sophisticated tree-based caching system. Instead of simply caching for a fixed time, they cache for as long as data remains valid, using
However, standard tag implementations can lead to "cache query bloat." If a response has three tags, a naive implementation might make four calls to the cache server: typically one lookup per tag, plus one for the cached value itself. Square solved this by developing a library that propagates tags up the hierarchy, ensuring only two calls are ever needed to retrieve even the most complex, nested responses. This shifted their latency distribution significantly to the left, meaning a larger share of requests completed at the fast end of the response-time range.
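The two-call idea can be illustrated with a small, framework-free sketch. This is not Square's library, whose internals the text does not describe. It assumes one plausible mechanism: each tag has a version counter, a cached entry remembers the tag versions it was built against, and a read fetches the entry (call one) and then all of its tag versions in a single multi-get (call two). The names CountingStore, TaggedCache and collectTags are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a cache server that counts round trips.
final class CountingStore
{
    public int $calls = 0;
    private array $data = [];

    public function get(string $key): mixed
    {
        $this->calls++;
        return $this->data[$key] ?? null;
    }

    // Fetch many keys in one round trip (comparable to a multi-get).
    public function many(array $keys): array
    {
        $this->calls++;
        $out = [];
        foreach ($keys as $key) {
            $out[$key] = $this->data[$key] ?? null;
        }
        return $out;
    }

    public function put(string $key, mixed $value): void
    {
        $this->data[$key] = $value;
    }
}

// Collect a node's own tags plus those of every nested child (propagation up the tree).
function collectTags(array $node): array
{
    $tags = $node['tags'] ?? [];
    foreach ($node['children'] ?? [] as $child) {
        $tags = array_merge($tags, collectTags($child));
    }
    return array_values(array_unique($tags));
}

final class TaggedCache
{
    public function __construct(private CountingStore $store) {}

    // Store the value together with the current version of every tag it depends on.
    public function put(string $key, mixed $value, array $tags): void
    {
        $tagKeys = array_map(fn (string $t) => "tag:$t", $tags);
        $versions = [];
        foreach ($this->store->many($tagKeys) as $tagKey => $version) {
            $versions[$tagKey] = $version ?? 0;
        }
        $this->store->put($key, ['value' => $value, 'versions' => $versions]);
    }

    public function get(string $key): mixed
    {
        $entry = $this->store->get($key);                               // call 1
        if ($entry === null) {
            return null;
        }
        $current = $this->store->many(array_keys($entry['versions']));  // call 2
        foreach ($entry['versions'] as $tagKey => $seen) {
            if (($current[$tagKey] ?? 0) !== $seen) {
                return null; // a tag was invalidated after this entry was written
            }
        }
        return $entry['value'];
    }

    public function invalidate(string $tag): void
    {
        $tagKey = "tag:$tag";
        $this->store->put($tagKey, ($this->store->get($tagKey) ?? 0) + 1);
    }
}

// A menu response nests a category, which nests a product.
$tree = ['tags' => ['menu_1'], 'children' => [
    ['tags' => ['category_45'], 'children' => [['tags' => ['product_123']]]],
]];
$tags = collectTags($tree);

$store = new CountingStore();
$cache = new TaggedCache($store);
$cache->put('menu:1', ['taco_cents' => 350], $tags);

$store->calls = 0;
$hit = $cache->get('menu:1');
$callsForHit = $store->calls;

$cache->invalidate('product_123');
$miss = $cache->get('menu:1');
```

Because the parent entry carries the tags of all of its descendants, changing the nested product invalidates the whole menu response, while a read still costs two round trips regardless of how many tags are involved.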
Offloading to the Edge with CDN Caching
For public-facing data like product catalogs, there's no reason every request should hit your origin server. Since these APIs don't require authentication, Square uses Surrogate-Control and Surrogate-Key headers to tell the CDN exactly how to store and purge data.
Surrogate-Control: max-age=31536000
Surrogate-Key: product_123 category_45
When the price of a "taco" changes in the database, the backend sends a single purge request to the CDN provider for that product's surrogate key (for example, product_123), which invalidates every cached response tagged with that key on edge nodes globally.
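As a rough sketch of how the origin might produce these headers and describe the matching purge, the helpers below build both as plain arrays. The function names, the purge URL shape, and the bearer-token header are placeholders chosen for illustration, since the text does not name the CDN provider or its API. A real integration would follow the provider's documented purge endpoint.

```php
<?php
declare(strict_types=1);

// Build the edge-caching headers for a public response.
// $keys lists the surrogate keys of every entity rendered in the response.
function surrogateHeaders(array $keys, int $maxAgeSeconds = 31536000): array
{
    return [
        'Surrogate-Control' => "max-age={$maxAgeSeconds}",
        'Surrogate-Key' => implode(' ', array_unique($keys)),
    ];
}

// Describe the purge request to send when an entity changes.
// The endpoint shape and auth header are placeholders, not a real provider's API.
function purgeRequest(string $key, string $baseUrl, string $token): array
{
    return [
        'method' => 'POST',
        'url' => rtrim($baseUrl, '/') . '/purge/' . rawurlencode($key),
        'headers' => ['Authorization' => "Bearer {$token}"],
    ];
}

$headers = surrogateHeaders(['product_123', 'category_45']);
$purge = purgeRequest('product_123', 'https://cdn.example.test/api/', 'secret-token');
```

Tagging a response with both the product key and the category key means a purge for either one clears it, which is what makes a year-long max-age safe to use.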
Optimizing Query Performance with Elasticsearch
As the application grew,
Square transitioned these read-heavy queries to
Advanced Queue Patterns: Fairness and Buffering
In a massive ecosystem, one large merchant can "hog" the queue by dispatching millions of jobs, causing delays for smaller users. Square solved this by implementing a Fairness pattern using the Laravel rate limiter.
They track the execution time of jobs in milliseconds. If a specific user exceeds a threshold, their subsequent jobs are automatically routed to a lower-priority "slow queue" with its own worker pool. This ensures that a single large update doesn't block the main queue, keeping the experience snappy for everyone else.
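The routing decision can be sketched without the framework. The class below is a hypothetical stand-in for the rate limiter: it sums each user's job execution time within a fixed window and returns the queue name the next job should use. The budget, window length, and queue names are assumed values, and in a Laravel application the counters would presumably live in the rate limiter's cache store, with the returned name passed to the job's onQueue() call at dispatch time.

```php
<?php
declare(strict_types=1);

final class FairnessRouter
{
    // Per-user running totals: ['windowStart' => int, 'spentMs' => int]
    private array $usage = [];

    public function __construct(
        private int $budgetMs,      // assumed: execution time allowed per window
        private int $windowSeconds, // assumed: length of the accounting window
    ) {}

    // Add a finished job's execution time to the user's total for the current window.
    public function record(string $userId, int $elapsedMs, int $now): void
    {
        $entry = $this->current($userId, $now);
        $entry['spentMs'] += $elapsedMs;
        $this->usage[$userId] = $entry;
    }

    // Users over budget are sent to the slow queue until their window expires.
    public function queueFor(string $userId, int $now): string
    {
        return $this->current($userId, $now)['spentMs'] > $this->budgetMs ? 'slow' : 'default';
    }

    private function current(string $userId, int $now): array
    {
        $entry = $this->usage[$userId] ?? null;
        if ($entry === null || $now - $entry['windowStart'] >= $this->windowSeconds) {
            return ['windowStart' => $now, 'spentMs' => 0];
        }
        return $entry;
    }
}

$router = new FairnessRouter(budgetMs: 1000, windowSeconds: 60);
$router->record('big-merchant', 1500, now: 0);
$router->record('small-merchant', 40, now: 0);

$bigQueue = $router->queueFor('big-merchant', now: 10);
$smallQueue = $router->queueFor('small-merchant', now: 10);
$afterWindow = $router->queueFor('big-merchant', now: 61);
```

Measuring time spent rather than job count means a user with many cheap jobs is treated differently from one with a few very expensive ones, which matches the millisecond-based tracking described above.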
Additionally, for third-party APIs with rate limits, Square uses Buffering. Instead of hitting an external API 1,000 times, a worker bundles jobs together and sends them as a single batch once a time or count threshold is reached.
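A minimal sketch of the buffering idea follows, assuming a buffer that flushes when either a count threshold or an age threshold is reached. JobBuffer, its thresholds, and the flush callback are illustrative names rather than anything from Square's code. The clock is passed in explicitly so the behavior is easy to follow.

```php
<?php
declare(strict_types=1);

final class JobBuffer
{
    private array $items = [];
    private ?int $firstAt = null; // time the oldest buffered item arrived

    public function __construct(
        private \Closure $flush,    // sends one batch to the external API
        private int $maxItems,      // assumed count threshold
        private int $maxAgeSeconds, // assumed time threshold
    ) {}

    public function add(mixed $item, int $now): void
    {
        $this->firstAt ??= $now;
        $this->items[] = $item;
        $this->tick($now);
    }

    // Flush when either threshold is reached; call periodically so quiet buffers still drain.
    public function tick(int $now): void
    {
        if ($this->items === []) {
            return;
        }
        $full = count($this->items) >= $this->maxItems;
        $old = $now - $this->firstAt >= $this->maxAgeSeconds;
        if ($full || $old) {
            $batch = $this->items;
            $this->items = [];
            $this->firstAt = null;
            ($this->flush)($batch);
        }
    }
}

$sent = [];
$buffer = new JobBuffer(
    function (array $batch) use (&$sent): void { $sent[] = $batch; },
    maxItems: 3,
    maxAgeSeconds: 30,
);

$buffer->add('a', now: 0);
$buffer->add('b', now: 1);
$buffer->add('c', now: 2);  // count threshold reached: one batch of three
$buffer->add('d', now: 5);
$buffer->tick(now: 40);     // time threshold reached: the leftover item is sent
```

The time threshold matters as much as the count threshold: without it, a trickle of jobs that never fills a batch would sit in the buffer indefinitely.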
