Can I provide a default for a left outer join?

Question

Suppose I have tables a (with column a1) and b (with columns b1 and b2) and I perform a left outer join

SELECT *
FROM a LEFT OUTER JOIN b
ON a.a1 = b.b1

Then b1 and b2 will be NULL where a value of a1 has no matching value of b1.

Can I provide a default value for b2, instead of NULL? Note that COALESCE won't work here, because I don't want the default value to override potential NULLs in b2 where there is a value of b1 matching a1.

That is, with a and b as

CREATE TABLE a (a1)
  AS VALUES (1),
            (2),
            (3) ;

CREATE TABLE b (b1,b2)
  AS VALUES (1, 10),
            (3, null) ;


a1     b1 | b2
---    --------
 1      1 | 10
 2      3 | NULL
 3

and a default for b2 of, say, 100, I want to get the result

a1 | b1   | b2
---------------
1  |  1   | 10
2  | NULL | 100
3  |  3   | NULL

In this simple case I could do it "by hand" by looking at whether b1 is NULL in the output. Is that the best option in general, or is there a more standard and neater way?

score 47 · Accepted Answer · edited Jan 15 '14 at 22:38

47

SELECT a.a1,b.b1,  
    CASE WHEN b.b1 is NULL THEN 5 ELSE b.b2 END AS b2  
FROM a LEFT OUTER JOIN b  
ON a.a1 = b.b1

edited Jan 15 '14 at 22:38

ypercubeᵀᴹ

97,895
13
214
305

answered Jan 15 '14 at 17:29

Mordechai

684
6
6

I've added a tag to indicate I would accept a Postgres specific answer. Still, standard SQL would be preferred if possible. – Tom Ellis Jan 15 '14 at 19:46
@Kin: as stated in my question I know that "I could do it "by hand" by looking at whether b1 is NULL in the output. Is that the best option in general, or is there a more standard and neater way?" – Tom Ellis Jan 15 '14 at 19:47
4

since you want to distinguish between NULLs that occur because of a JOIN and those that are "naturally" present, it is inevitable that you'd have to examine b1. If that's what you meant by "I could do it "by hand"", than yes, that is the only way. – Mordechai Jan 15 '14 at 19:54
@MorDeror: OK, I suppose I was thinking there may be a syntax such as "LEFT OUTER JOIN ... ON ... DEFAULT b2 = ...". – Tom Ellis Jan 15 '14 at 19:59
You could rewrite as a UNION of an INNER join and an anti-join but I don't see a reason to complicate the code more. – ypercubeᵀᴹ Jan 15 '14 at 23:04
OK, if this is indeed the usual method, then thank you, I accept! – Tom Ellis Jan 17 '14 at 18:17
3

WIthout explanation, I'm tempted to downvote this. This answer does not live up to the quality of DBA.SE. – Evan Carroll Mar 20 '17 at 22:41
The answer is very clearly "no". This answer was specifically called out as the one OP was seeking an alternative to. Their question is useful in cases where the default value would be used in several places, and the case/coalesce/ifnull statements would become cumbersome and unclear. There is no way to specify a 'default value for left joins' replacing the 'null', hence the only correct answer is "no", and this is incorrect. – Chris.Caldwell Jan 13 '21 at 14:26
@Chris.Caldwell The answer, as provided is a clear "yes" to me. With the exception of using 5 as the default, rather than 100, output matches what the initial question asked. The only way it is not a "yes" in my mind is that the "default" value are not immediately visible for further JOIN conditions or WHERE conditionals, but this was not part of the posited initial question -- in terms of the output generated, it is as the OP requested and avoids the possibility of filling in NULL when there is a row in b that matches to row a but that has a NULL in the target column of b. – mpag Jun 02 '23 at 16:29

Rob · Answer 2 · 2021-03-23T14:32:18.603

22

I find COALESCE to be very useful in that case. It will return the first non NULL value from a list:

SELECT
 a.a1,
 b.b1,
 COALESCE (b.b2, 100) AS b2
FROM a
LEFT OUTER JOIN b
  ON (a.a1 = b.b1);

After @ypercube correctly identified an issue where my answer did not match the question. here is his correction:

SELECT 
  a.a1, 
  b.b1, 
  COALESCE(b.b2, d.b2) AS b2   
FROM a LEFT JOIN b ON a.a1 = b.b1 
LEFT JOIN (SELECT 100 AS b2) AS d ON b.b1 IS NULL;

edited Mar 23 '21 at 14:32

answered Sep 20 '18 at 13:11

Rob

321
2
4

5

This is the best answer so far. – Luca Dec 24 '19 at 11:57
@ypercubeᵀᴹ Really? What part? The point was to illustrate the COALESCE function in contrast to the CASE WHEN syntax, based on the first query presented. and the accepted answer.. – Rob Mar 23 '21 at 12:19
Yeah.. re-reading the question i see your point.. but like I said above I was trying to illustrate the COALESCE function more than as a full answer. I will correct. Thank you for pointing that out. – Rob Mar 23 '21 at 13:53
yep that works. – Rob Mar 23 '21 at 14:21
@ypercubeᵀᴹ You did not have to ;). constructive comments are always a good thing. it's a team effort. (even if sometimes a bit embarrassing LOL) – Rob Mar 24 '21 at 12:09
1

Yeah I agree. Since you incorporated them into the answer though, it was just noise. It's really hard getting info from the comments of a question or answer that has 20+ of them. – ypercubeᵀᴹ Mar 24 '21 at 13:59

score 3 · Answer 3 · edited Jun 02 '20 at 14:53

The original answer to this question went unexplained, so let's give this another shot.

Using a `CASE` expression

Using this method we exploit that we have another value in a different column that IS NOT NULL in this case b.b1 if that value is null then we know the join failed.

SELECT
  a.a1,
  b.b1,  
  CASE WHEN b.b1 is NULL THEN 100 ELSE b.b2 END AS b2  
FROM a
LEFT OUTER JOIN b  
  ON (a.a1 = b.b1);

This will totally work, and generate the exact thing you want.

Using a sub-SELECT

Don't use this method, it's build-up idea. Keep reading.

If we do not have any NOT NULL columns that we can exploit like that, we need something to create a column that can function that way for us...

SELECT
  a.a1,
  b.b1,  
  CASE WHEN b.cond IS NULL THEN 100 ELSE b.b2 END AS b2  
FROM a
LEFT OUTER JOIN (
  SELECT true AS cond, b.*
  FROM b
) AS b
  ON (a.a1 = b.b1);

Using a row comparison

Even easier though then forcing a false value for which we can compare, is to compare the row. In PostgreSQL, the row has a value by the name of the table. For instance, SELECT foo FROM foo returns a row of type foo (which is a row type), from table foo. Here we test to see if that ROW is null. This will work so long as every column IS NOT NULL. And, if every column IS NULL in your table, then you're just trolling.

SELECT
  a.a1,
  b.b1,  
  CASE WHEN b IS NULL THEN 100 ELSE b.b2 END AS b2  
FROM a
LEFT OUTER JOIN b
  ON (a.a1 = b.b1);

Column b1 used in CASE solution does not need to be non-nullable. The construction works in either cases. — ypercubeᵀᴹ, Mar 20 '17 at 23:19

score 2 · Answer 4 · edited Mar 23 '21 at 12:36

2

As an alternative to coalescing column values one may want to use a default row on the non matching rows

Using tsql dialect based on the comment: I've added a tag to indicate I would accept a Postgres specific answer. Still, standard SQL would be preferred if possible.

Test data

SELECT *
INTO #a
FROM (VALUES (1)
           , (2)
           , (3)) AS x(a1);
SELECT *
INTO #b
FROM (VALUES (1, 10)
           , (3, NULL)
     ) AS x(b1, b2);

Basic query

SELECT *
FROM #a JOIN #b ON #a.a1 = #b.b1 -- all matches
UNION
SELECT #a.a1, NULL, 100 -- default row
FROM #a LEFT JOIN #b ON #a.a1 = #b.b1 WHERE b1 IS NULL -- where a1 <> b1;

Basic result

|a1|b1|b2 |
-----------
| 1| 1| 10|
| 2|  |100|
| 3| 3|   |

You could use this approach for a more set based approach, treating defaults as replacements for the difference in the two sets.

Here's a more advanced example where even/odd ids of a have different default b rows.

Advanced query

INSERT INTO #a
SELECT *
FROM (VALUES (5)) AS x(a1);
SELECT *
FROM #a
     JOIN #b ON #a.a1 = #b.b1 -- all matches
UNION
SELECT a1, NULL, val
FROM #a
     LEFT JOIN #b ON #a.a1 = #b.b1
     JOIN (
    SELECT *
    FROM (VALUES (0, 100) -- even default
               , (1, 99) -- odd default
         ) AS x(rem, val)
  ) t ON a1 % 2 = t.rem
WHERE b1 IS NULL -- where a1 <> b1;

Advanced result

|a1|b1|b2 |
-----------
| 1| 1| 10|
| 2|  |100|
| 3| 3|   |
| 5|  | 99|

edited Mar 23 '21 at 12:36

John K. N.

17,649
12
51
110

answered Mar 23 '21 at 08:23

CervEd

221
1
2
8

1

Nice but # is used only in SQL Server for temp tables. Not in Postgres. – ypercubeᵀᴹ Mar 23 '21 at 09:27
@ypercubeᵀᴹ question isn't postgres specific but I added a note that the answer uses TSQL – CervEd Mar 23 '21 at 11:25
@JohnK.N. the author tagged the question as postgresql because the first answer was in postgres dialect, there's nothing specificly postgres about the question. I would change the syntax of the temp tables but I can't be bothered since it has nothing to do with the answer itself – CervEd Mar 23 '21 at 12:30
Seen in a comment on an answer: "I've added a tag to indicate I would accept a Postgres specific answer. Still, standard SQL would be preferred if possible" so I guess the community could go with your answer. :-) – John K. N. Mar 23 '21 at 12:34
1

@JohnK.N. the accepted answer is standard SQL. This one as well (but I'm not sure if the fancy # are allowed in table names ;) – ypercubeᵀᴹ Mar 23 '21 at 13:41
@ypercubeᵀᴹ Probably not allowed/defined in standard SQL. – John K. N. Mar 23 '21 at 13:47
I welcome any edit to standard SQL, the join itself is AFAIK standard SQL and that's really the answer. How the tables are defined don't seem relevant. I had to change the original create table statements because they didn't work in sql server and then they became tsql temp tables – CervEd Mar 23 '21 at 16:49

score -1 · Answer 5 · answered Jan 13 '21 at 14:31

OP has asked for an alternative to using the typical case / coalesce / ifnull method for replacing null values by specifying a different 'default value' for left joins instead of null. This is a valid request as it may be cumbersome if the field is used repeatedly or within complex calculations within the query or you want to differentiate between the 'row doesnt exist in joined table' signal and 'row exists but that field is null' result. The correct answer here is "no, that feature does not exist".