Scaling Laravel to a Billion Daily Jobs: High-Performance Architecture at Square

A common myth suggests that Laravel isn't suited for high-scale enterprise environments. Seb Armand from Square systematically deconstructs this notion, sharing how the financial giant manages hundreds of millions of requests and nearly a billion daily jobs using the framework. Scaling isn't just about adding servers; it involves optimizing database connections, clever caching hierarchies, and sophisticated queue management.

Solving Database Latency with Persistent Connections

When Square enabled TLS for database connections, they saw a 50% spike in latency. In a standard PHP-FPM environment, every request starts from scratch, tearing down database connections at the finish. For apps talking to multiple databases, the handshake overhead for secure connections becomes a massive bottleneck.

To combat this, you can enable persistent connections and emulated prepared statements in the connection's 'options' array. This allows PDO to keep the connection alive between requests and handle prepared statements in memory, saving precious network round-trips to the database server.

'options' => [
    // Reuse the established (and TLS-handshaked) connection across requests.
    PDO::ATTR_PERSISTENT => true,
    // Prepare statements client-side, skipping a server round-trip per query.
    PDO::ATTR_EMULATE_PREPARES => true,
]

Building a Multi-Layered Caching Strategy

Square utilizes a sophisticated tree-based caching system. Instead of simply caching for a fixed time, they cache for as long as data remains valid, using cache tags to manage invalidation. When a child entity (like a product topping) changes, the system clears the entire branch of the cache tree.

However, standard tag implementations can lead to "cache query bloat." If a response has three tags, a naive implementation might make four calls to the cache server. Square solved this by developing a library that propagates tags up the hierarchy, ensuring only two calls are ever needed to retrieve even the most complex, nested responses. This shifted their latency distribution significantly to the left, making most requests lightning-fast.
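Square's tag-propagation library is not named in the talk, but the baseline pattern it improves on can be sketched with Laravel's built-in cache tags (supported by the Redis and Memcached drivers). The keys and model below are illustrative:

```php
use App\Models\Menu;
use Illuminate\Support\Facades\Cache;

// Store a composed response under tags for every entity it contains,
// so invalidating any one of them clears this entry too. Caching
// "forever" works because invalidation, not TTL, keeps data fresh.
$menu = Cache::tags(['product:123', 'category:45'])
    ->rememberForever('menu:merchant:9', function () {
        return Menu::with('items.toppings')->find(9)->toArray();
    });

// A topping on product 123 changed: flush the whole branch of entries
// tagged with that product, leaving unrelated cache entries intact.
Cache::tags(['product:123'])->flush();
```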

Offloading to the Edge with CDN Caching

For public-facing data like product catalogs, there's no reason every request should hit your origin server. Since these APIs don't require authentication, Square uses a CDN to cache JSON responses at the edge. They utilize Surrogate-Control and Surrogate-Key headers to tell the CDN exactly how to store and purge data.

Surrogate-Control: max-age=31536000
Surrogate-Key: product_123 category_45

When the price of a "taco" changes in the database, the backend sends a single purge request to the CDN provider for that specific key, instantly clearing that product from edge nodes globally.
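Both halves of that flow can be sketched as follows, assuming Fastly as the provider (the Surrogate-Key convention above is Fastly's); the config keys and service ID are placeholders:

```php
use Illuminate\Support\Facades\Http;

// 1) Tag the public catalog response so edge nodes cache it for up to a
//    year but can drop it selectively by key.
$response = response()->json($product)
    ->header('Surrogate-Control', 'max-age=31536000')
    ->header('Surrogate-Key', 'product_123 category_45');

// 2) When the price changes, send one purge-by-key request; edge nodes
//    worldwide evict that product within seconds.
Http::withHeaders(['Fastly-Key' => config('services.fastly.token')])
    ->post('https://api.fastly.com/service/'
        .config('services.fastly.service_id').'/purge/product_123');
```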

Optimizing Query Performance with Elasticsearch

As the application grew, some queries reached 200 lines of complex SQL to handle filters like "available for pickup under $20 at 5 PM." Even with optimization, some complex merchant requests took 20 seconds.

Square transitioned these read-heavy queries to Elasticsearch. By triggering a background job to re-index items whenever they change, they moved from 20-second MySQL queries to 200-millisecond search results. This architectural shift separates the source of truth (MySQL) from the high-performance read layer (Elasticsearch).
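A minimal version of that sync might look like the following, using the official elasticsearch-php client; the job name, index name, and field list are illustrative, not Square's actual code:

```php
use App\Models\Item;
use Elastic\Elasticsearch\ClientBuilder;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

class ReindexItem implements ShouldQueue // hypothetical job name
{
    use Dispatchable;

    public function __construct(private int $itemId) {}

    public function handle(): void
    {
        // MySQL remains the source of truth; Elasticsearch is a
        // denormalized read layer rebuilt from it on every change.
        $item = Item::findOrFail($this->itemId);

        ClientBuilder::create()->build()->index([
            'index' => 'items',
            'id'    => $item->id,
            'body'  => $item->only(['name', 'price', 'pickup_available']),
        ]);
    }
}

// Queue a re-index whenever an item is saved, e.g. from a model event:
// Item::saved(fn ($item) => ReindexItem::dispatch($item->id));
```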

Advanced Queue Patterns: Fairness and Buffering

In a massive ecosystem, one large merchant can "hog" the queue by dispatching millions of jobs, causing delays for smaller users. Square solved this by implementing a Fairness pattern using the Laravel rate limiter.

They track the execution time of jobs in milliseconds. If a specific user exceeds a threshold, their subsequent jobs are automatically routed to a lower-priority "slow queue" with its own worker pool. This ensures that a single large update doesn't block the main queue, keeping the experience snappy for everyone else.
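One way to sketch that routing decision is with Redis-backed bookkeeping of per-merchant worker time; the 60-second window, millisecond budget, and job name are illustrative, not Square's actual thresholds:

```php
use Illuminate\Support\Facades\Redis;

// At dispatch time: merchants who have recently consumed more than
// their budget of worker time get routed to the lower-priority pool.
$usedMs = (int) Redis::get("job-ms:{$merchantId}");
SyncCatalogJob::dispatch($merchantId) // hypothetical job
    ->onQueue($usedMs > 60_000 ? 'slow' : 'default');

// Inside the job's handle(): record how long the work actually took,
// with an expiring key to give a rolling window.
$start = hrtime(true);
$this->sync();
$elapsedMs = (int) ((hrtime(true) - $start) / 1_000_000);
Redis::incrby("job-ms:{$this->merchantId}", $elapsedMs);
Redis::expire("job-ms:{$this->merchantId}", 60);
```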

Additionally, for third-party APIs with rate limits, Square uses Buffering. Instead of hitting an external API 1,000 times, a worker bundles jobs together and sends them as a single batch once a time or count threshold is reached.
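A buffering sketch using a Redis list as the outbox (the endpoint, thresholds, and key names are made up for illustration):

```php
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Redis;

// Producers append events instead of calling the partner API directly.
Redis::rpush('partner:outbox', json_encode($event));

// A scheduled worker drains the buffer once a count or time threshold
// is hit, turning up to 1,000 calls into one batch request.
$events = [];
while (count($events) < 1000 && ($raw = Redis::lpop('partner:outbox'))) {
    $events[] = json_decode($raw, true);
}

if ($events !== []) {
    Http::post('https://partner.example.com/api/bulk', ['events' => $events]);
}
```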
