Explain Codes LogoExplain Codes Logo

Unioning two tables with different number of columns

sql
union
data-types
sql-variants
Alex KataevbyAlex Kataev·Nov 16, 2024
TLDR

To union two tables with differing columns, adjust their structure for compatibility. Introduce NULLs in the place of missing fields. Here's an example:

SELECT column1, column2, NULL AS column3 FROM table1 UNION ALL SELECT column1, NULL AS column2, column3 FROM table2;

This positions the columns correctly, upholding data integrity for a flawless union.

Ensuring Compatibility in Data Types

To execute a union on tables with dissimilar columns, the data types of the corresponding columns must be uniform. This ensures you avoid errors like the notorious ORA-01790. Convert any incompatible data types explicitly like this:

SELECT CAST(column1 AS VARCHAR(100)), column2 FROM table1 -- I don't always cast, but when I do, I make sure it's varchar UNION ALL SELECT column1, CAST(column2 AS VARCHAR(100)) FROM table2; -- CASTing spells to convert data types. Wizardry and SQL, who knew?

The casting operation here prevents data type clashes, ensuring a smooth union operation.

Tackling Duplicates using UNION ALL and UNION

Determine whether to retain or eliminate duplicates:

  • Use UNION to effectively remove duplicates
  • Utilize UNION ALL to preserve all records, duplicates included

Interestingly, UNION ALL has a speed advantage as it saves time by bypassing duplicate elimination.

Using Meta-Data and Aliasing Fields

When undertaking a union on different columns, assigning aliases can help align data and prevent ordering issues. Additionally, you could add meta-data indicating the source of the records:

SELECT 'table1' AS source, column1, column2, NULL AS column3 FROM table1 -- table1: "I'm the source. You're welcome." UNION ALL SELECT 'table2' AS source, column1, NULL AS column2, column3 FROM table2; -- table2: "And I'm the second source, no autographs please."

The addition of the source column offers easy traceability of each record back to its table of origin.

Exploiting the Powers of SQL Variants for Unions

Different SQL dialects and extensions offer novel methods for handling unions with mismatched columns:

  • SAS SQL: OUTER UNION CORR operation takes care of automatic null-filling for unmatched columns.

  • KQL (Kusto Query Language) and DuckDB: They support outer union operations by matching column names, not just their positions.

  • U-SQL: OUTER UNION BY NAME ON (*) comes to the rescue by filling missing columns with default values.

  • PostgreSQL: A NATURAL FULL JOIN behaves like a union operation while smartly filling in nulls for absent columns.

Advanced Techniques and Guarding Against Pitfalls

  • PostgreSQL users can toy with FULL JOIN and COALESCE to emulate a union operation and fill those pesky missing fields:

    SELECT COALESCE(A.column1, B.column1) AS column1, A.column2, B.column3 FROM table1 A FULL OUTER JOIN table2 B ON A.column1 = B.column1; -- Full outer join. It's like joining a club but with more SQL and less fun
  • Always verify that your columns in a union query have uniform data types to dodge runtime errors.

  • While aliasing columns, be reasonable to maintain readability and consistency across your SQL script.