Chad Musick’s Post

View profile for Chad Musick, graphic

Chief Data Officer at Bellroy | CDMP Master | PhD Mathematics | Data strategy and implementation

Let's talk about CROSS JOIN. What it is. What it most definitely is not. It is useful. It's not a conditional many-to-many join. The main thing -- the dangerous thing, usually -- about cross joins is that they form the Cartesian product of two tables. That is, every row in the left-hand table is matched to every row in the right-hand table. The number of rows in the outcome is the product of the number of rows in the two tables. CROSS JOIN is syntactic sugar. You can accomplish this by using a different join type. These are the same: SELECT a.*, b.* FROM a FULL OUTER JOIN b ON TRUE; SELECT a.*, b.* FROM a CROSS JOIN b; I've seen cross joins proposed as a method of "finding all pairs of football matches played" or "finding all pairs of cities with flights between them". For little examples like this, the output looks correct, because you're wanting a dense outcome set. But imagine two tables with 1 million (1,000,000) rows each. The product of these will have 1 trillion (1,000,000,000,000) rows, right around 125 rows for each person alive today. If you want non-dense many-to-many matches, use an INNER JOIN with your many-to-many bridging table. If your rows have arrays of matches (instead of a bridging table), use a LATERAL JOIN first to expand these rows. (LATERAL JOIN is written as an UNNEST in some SQL variants.) So what's a CROSS JOIN good for? (1) Widening rows with additional information. (2) Providing an exhaustive list of widened rows. One of the tables should be tiny, and contain columns relevant to every row of your other table. Consider a 1-row table. All it contains is a pair of timestamps, and you want to apply these timestamps to every row of your main table. CROSS JOIN is great for this. Your number of rows is unchanged, and you get your extra columns. Or consider a case where you've moved from having 1 warehouse of products to having 3, and you want to add a warehouse location to each row of your product catalog. A CROSS JOIN of your product catalog and your (very short) list of warehouses will do this immediately.

To view or add a comment, sign in

Explore topics