Jesper Söderlund put together an excellent list of four general scalability patterns and four subpatterns in his post Scalability patterns and an interesting story:
- Load distribution – Spread the system load across multiple processing units:
  - Load balancing / load sharing – Spreading the load across many components with equal properties for handling the request
  - Partitioning – Spreading the load across many components by routing each individual request to the component that owns the specific data it needs:
    - Vertical partitioning – Spreading the load across the functional boundaries of a problem space, with separate functions handled by different processing units
    - Horizontal partitioning – Spreading a single type of data element across many instances according to some partitioning key, e.g. hashing the player id and applying a modulus operation. Quite often referred to as sharding (see the first sketch after this list)
- Queuing and batch – Achieve efficiencies of scale by processing batches of data, usually because the overhead of an operation is amortized across multiple requests (see the batching sketch after this list)
- Relaxing of data constraints – Many different techniques and trade-offs with regard to the immediacy of processing, storing, and accessing data fall under this strategy
- Parallelization – Work on the same task in parallel on multiple processing units (see the final sketch after this list)
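
The horizontal partitioning item above mentions hashing the player id and taking a modulus. A minimal sketch of that routing, with hypothetical shard names, might look like this:

```python
import hashlib

SHARDS = ["players-db-0", "players-db-1", "players-db-2", "players-db-3"]

def shard_for(player_id: str) -> str:
    """Map a player id to the shard instance that owns its data."""
    # Use a stable hash so the same player always lands on the same shard
    # (Python's built-in hash() is randomized per process, so avoid it here).
    digest = hashlib.sha1(player_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("player-42"))  # always routes to the same one of the four shards
```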
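
Queuing and batch works because an operation's fixed overhead is paid once per batch rather than once per request. A sketch using Python's standard queue module, with a hypothetical bulk `write_rows()` call standing in for the expensive operation:

```python
import queue

BATCH_SIZE = 100                      # arbitrary batch size for illustration
work_queue: queue.Queue = queue.Queue()

def write_rows(rows: list) -> None:
    # Hypothetical bulk operation; its fixed overhead (connection setup,
    # round trip, fsync, ...) is amortized across every row in the batch.
    print(f"flushed {len(rows)} rows in one call")

def consume_in_batches() -> None:
    batch = []
    while True:
        item = work_queue.get()
        if item is None:              # sentinel value signals shutdown
            break
        batch.append(item)
        if len(batch) >= BATCH_SIZE:
            write_rows(batch)
            batch = []
    if batch:                         # flush any remainder before exiting
        write_rows(batch)
```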
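
Finally, parallelization splits the same task across multiple processing units. A sketch using a Python worker pool, where the workload and pool size are arbitrary placeholders:

```python
from multiprocessing import Pool

def score(player_id: int) -> int:
    # Stand-in for CPU-bound per-item work.
    return sum(i * i for i in range(player_id % 1000))

if __name__ == "__main__":
    # Four worker processes chew through the same task in parallel.
    with Pool(processes=4) as pool:
        results = pool.map(score, range(10_000))
    print(f"computed {len(results)} scores")
```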
For more details and explanations see the original post.