I have data that is stored in S3 and I am querying it using SQL via Redshift Spectrum. One of the columns is stored as a list in the S3 file like so:
Table_ABC
| column_a | items_list |
|---|---|
| a. | [item_1, item_2] |
| b. | [item_3, item_4] |
Redshift has no unnest function, but I have picked out of our codebase the syntax below that works to unnest the list.
select
column_a,
unnested_list_items
from table_abc as abc
left join abc.items_list as unnested_list_items on true
Running this results in:
| column_a | unnested_list_items |
|---|---|
| a. | item_1. |
| a. | item_2. |
| b. | item_3. |
| b. | item_4. |
No one seems to understand why this works and I can't find any documentation on this and why it works. The closest I can find is the documentation below on Athena syntax, but the unnest seems to be implicit and afaik it's not being processed by Athena.
https://docs.aws.amazon.com/athena/latest/ug/flattening-arrays.html
Is anyone able to explain how this syntax is processed by the query engine or point to any documentation on this?
I am using Redshift 1.0.37680. Thanks!
from table_abc as abc, abc.items_list as unnested_list_items– ypercubeᵀᴹ May 06 '22 at 17:50hmm, tried your syntax and it didn't seem to process. The query engine doesn't seem to be cross joining as the items only appear column_a values they are associated with. It seems that the left join of the column is somehow an implicit unnest function. Very puzzled by this.
– peegee May 06 '22 at 18:04FROMclause as a table so it unnests it. – ypercubeᵀᴹ May 06 '22 at 18:16from table_abc as abc, unnest(abc.items_list) as unnested_list_itemsorfrom table_abc as abc left join unnest(abc.items_list) as unnested_list_items on true– ypercubeᵀᴹ May 06 '22 at 18:25Btw, I tried the comma separated from syntax like in your second comment again and it does work. The issue I had before was that I wasn't explicitly referencing the table
– peegee May 09 '22 at 09:49abcin the select statement when selecting columns that weren't the items list. Socolumn_aneeds to be referenced likeabc.column_a