Tuesday, January 05, 2010

Horizontal Partitioning 1

When you want to speed up physical reads on a SQL Server, you have two options: faster disks, or more disks. But you only get to make that choice once you've determined that physical reads are a (potential) bottleneck.

If faster disks are not enough to get your system back to an acceptable level of performance, there's still the option of adding more disks. Merely adding disks won't - in itself - speed anything up. To realize performance gains you need to restructure your data so that reads are spread over those disks. SQL Server allows you to partition tables, and also indexes, and to place the partitions on filegroups that live on different disks. But as usual, the devil resides in the details.

Partition functions can only take one parameter. This means that the partition in which each row resides is determined by the value of just one column in that row.
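To make the one-parameter rule concrete, here is a minimal Python sketch of how a range partition function maps a single column value to a partition number. It only models the RANGE LEFT / RANGE RIGHT semantics; it is not SQL Server code, and the names partition_number, boundaries, order_id and order_year are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def partition_number(value, boundaries, range_type="LEFT"):
    """Return the 1-based partition number for one column value.

    boundaries must be sorted ascending; n boundaries give n + 1 partitions.
    """
    if range_type == "LEFT":
        # RANGE LEFT: a boundary value belongs to the partition on its left.
        return bisect_left(boundaries, value) + 1
    if range_type == "RIGHT":
        # RANGE RIGHT: a boundary value belongs to the partition on its right.
        return bisect_right(boundaries, value) + 1
    raise ValueError("range_type must be 'LEFT' or 'RIGHT'")


# Hypothetical rows: (order_id, order_year).
# Only order_year is passed to the function; order_id plays no part.
boundaries = [2008, 2009]
rows = [(1, 2007), (2, 2008), (3, 2009), (4, 2010)]

placement = {
    order_id: partition_number(order_year, boundaries, "RIGHT")
    for order_id, order_year in rows
}
print(placement)
```

Two boundary values give three partitions, and rows with the same order_year always land in the same partition, whatever their other columns contain.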

If your original table had a clustered index, you'll probably want to keep it. However, this has a big consequence: you will need to make the partition function congruent with the clustered index. If the clustered index is unique and you leave the partition column out of it, SQL Server will complain: "Partition columns for a unique index must be a subset of the index key". It gets worse if you want a composite clustered index - you should be aware that in some cases SQL Server appears to store data internally sorted first by the partition column, and then by any other columns in the composite clustered index. If your original table was sorted by Col_1,Col_2 and you choose Col_2 as your partition column, then your table may be sorted internally by Col_2,Col_1. Actually, it's not this straightforward: I need some time to figure this out; it will be the subject of a later post.
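The difference between the two sort orders mentioned above is easy to see on a few rows. This Python sketch only compares the orderings Col_1,Col_2 and Col_2,Col_1 on made-up data; it makes no claim about what SQL Server actually does internally.

```python
# Hypothetical rows as (col_1, col_2) tuples.
rows = [(2, "B"), (1, "B"), (2, "A"), (1, "A")]

# Order implied by a clustered index on (Col_1, Col_2).
by_col1_col2 = sorted(rows, key=lambda r: (r[0], r[1]))

# Order if the partition column Col_2 leads, followed by Col_1.
by_col2_col1 = sorted(rows, key=lambda r: (r[1], r[0]))

print(by_col1_col2)
print(by_col2_col1)
```

In the first ordering, rows with the same Col_1 sit next to each other; in the second they are split up by Col_2, which matters for any query that relies on a range scan over Col_1.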
