Explain Codes LogoExplain Codes Logo

How much disk-space is needed to store a NULL value using postgresql DB?

sql
database-administration
performance-optimization
data-management
Anton ShumikhinbyAnton Shumikhin·Nov 19, 2024
TLDR

In PostgreSQL, a NULL value itself consumes zero bytes of disk space in a column. This efficiency is achieved through PostgreSQL's bitmap vector to represent NULL values, where each bit corresponds to a nullable column in the table. Hence, multiple NULL columns add up to just a few bytes (8 nullable columns fit within 1 byte). However, it's worth mentioning that table and column overhead can add to space usage, independently of the NULL values themselves.

CREATE TABLE example (id SERIAL, info TEXT); --Our sample table INSERT INTO example (info) VALUES (NULL); -- Storing NULL, no extra space consumed. Yay!

The Internals of NULL

In PostgreSQL, the importance of inner workings is often underscored. When we talk about the storage of NULL values, it's crucial mentioning the HeapTupleHeader. This agent is responsible for holding metadata about rows and has a fixed size of 23 bytes, whether the row has NULL values or not.

Correspondingly, the actual data storage of a row starts at a multiple of MAXALIGN. Typically, MAXALIGN is 8 on 64-bit systems - don't confuse this with the latest dance move!

In terms of NULL value storage, PostgreSQL is quite economical. For tables with up to 8 columns, no additional space contributions are needed for the null bitmap; it's snugly included in the HeapTupleHeader. However, as you add nullable columns in multiples of 8, an extra MAXALIGN bytes are allocated— not for the NULLs themselves, but to map them.

Are you still with us? Let's illustrate this with some SQL:

CREATE TABLE space_test ( id SERIAL, info TEXT, details TEXT, more_info TEXT ); -- Still within the limit of the initial null bitmap. Nice!

Even if info, details, and more_info are all set to NULL, the storage space remains consistent. That's PostgreSQL being frugal!

Delving into Anomalies

If you're thinking that a table with more nullable columns will be significantly larger because of those NULLs, think again! PostgreSQL, smart as it is, uses a minimalistic approach to manage the null bitmap alongside the usual row headers.

However, anomalies in associated behavior are more typically the result of database bloat (that's not a cool, new AI term) but factors such as dead tuples left over from previous data manipulation operations.

Data Realignment and Performance

Following the bookshelf metaphor, consider the alignment of books on the shelf. PostgreSQL, being performance-conscious, aligns its data at MAXALIGN boundaries. This MAXALIGN (alignment mask) is done for efficient CPU memory access. This data alignment, in essence, is independent of NULLs but critical for disk storage and also makes your database run like a gazelle!

Handling Wide Tables

As a database administrator, it's essential to understand how column number—not just data—affects the storage. When tables grow wider than 8 columns, PostgreSQL grabs more space for each set of 8 additional columns to manage the null bitmap.

Essentially, PostgreSQL turns into a space-conscious squirrel, hoarding additional space for null bitmaps in anticipation of more nullable column data. Bottom line mission: never underestimate the storage implications on large tables when you're in the designing phase.

-- Designing table for efficient storage: task tougher than assembling IKEA furniture CREATE TABLE wide_table ( -- Max of eight columns before PostgreSQL starts hoarding additional space for null bitmap. col1 INT, col2 INT, -- Additional columns - think before you add. You've been warned! -- ... );

Monitoring disk Space Consumption

Welcome to the last stop on this rollercoaster ride! Here's something you'll love - practical ways to monitor and maintain your precious database.

To understand the disk-space consumption, you can use the pg_column_size function to measure both the apparent and actual size of rows, including the space taken by NULL values. Remember, maintenance is key, babes!

Regular data grooming, like the VACUUM operation, is the spit and polish needed to keep your database decluttered from dead tuples, ensuring efficient PostgreSQL functionality, and a happier you!