Data compression in a SQL Server database

SQL Server 2008 Enterprise includes the option to compress data in a SQL database. The data compression feature in the Microsoft® SQL Server®2008 database software can help reduce the size of the database as well as improve the performance of I/O intensive workloads. However, extra CPU resources are required on the database server to compress and decompress the data, while data is exchanged with the application. SQL Server 2008 provides great flexibility in how data compression is used. Row and page compression can be configured at the table, index, indexed view, or partition level.

SQL Server 2008 provides two levels of data compression – row compression and page compression.

Row compression helps store data more efficiently in a row by storing fixed-length data types in variable-length storage format. A compressed row uses 4 bits per compressed column to store the length of the data in the column. NULL and 0 values across all data types take no additional space other than these 4 bits.

Page compression is a superset of row compression. In addition to storing data efficiently inside a row, page compression optimizes storage of multiple rows in a page, by minimizing the data redundancy. Page compression uses prefix compression and dictionary compression. Prefix compression looks for common patterns in the beginning of the column values on a given column across all rows on each page. Dictionary compression looks for exact value matches across all columns and rows on each page. Both dictionary and prefix are type-agnostic and see every column value as a bag of bytes.

Page compression saves disk space if you data has duplicate values.