I am trying to join two tables in SQL (Snowflake) that have hierarchical data. Table A contains rows with all possible levels of the hierarchy in each row, whereas Table B contains individual rows for each level of the hierarchy. Additionally, some rows in Table A may contain a higher level of data than Table B. In this case, the goal is to pick the highest matching level from Table B. A challenge is that the level needed in Table A isn't constant as the matching level in Table B varies.
Below are examples of each table:
Table A:
Row Level 1 Level 2 Level 3
1 Animal Mammal Dog
2 Animal Mammal Cat
3 Animal Reptile Lizard
4 Animal Reptile Snake
5 Tree Oak Live Oak
6 Tree Elm Cedar Elm
Table B:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Reptile Lizard L
3 Animal Mammal M
4 Animal Reptile R
5 Animal A
6 Tree Oak Live Oak LO
7 Tree Oak O
8 Tree T
Desired Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat M
3 Animal Reptile Lizard L
4 Animal Reptile Snake R
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm T
In the above results table, rows 1, 3, and 5 matched on Level 3. Rows 2 and 4 matched on Level 2. Row 6 matched on Level 1. Below is the code I've tried that produced the closest results:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) IN (a.lvl_1, a.lvl_2, a.lvl_3)
The problem is that the resulting table shows all rows with matches at any level. For example:
Current Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Dog M
3 Animal Mammal Dog A
4 Animal Mammal Cat M
5 Animal Mammal Cat A
...
If I check for equality at a specific level, the results are correct for that level, but NULL for the other levels:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) = a.lvl_3
Equality Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat null
3 Animal Reptile Lizard L
4 Animal Reptile Snake null
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm null
How can I join Table B with Table A to return the single Category of the highest level match? Any help is greatly appreciated. Thank you.
I am trying to join two tables in SQL (Snowflake) that have hierarchical data. Table A contains rows with all possible levels of the hierarchy in each row, whereas Table B contains individual rows for each level of the hierarchy. Additionally, some rows in Table A may contain a higher level of data than Table B. In this case, the goal is to pick the highest matching level from Table B. A challenge is that the level needed in Table A isn't constant as the matching level in Table B varies.
Below are examples of each table:
Table A:
Row Level 1 Level 2 Level 3
1 Animal Mammal Dog
2 Animal Mammal Cat
3 Animal Reptile Lizard
4 Animal Reptile Snake
5 Tree Oak Live Oak
6 Tree Elm Cedar Elm
Table B:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Reptile Lizard L
3 Animal Mammal M
4 Animal Reptile R
5 Animal A
6 Tree Oak Live Oak LO
7 Tree Oak O
8 Tree T
Desired Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat M
3 Animal Reptile Lizard L
4 Animal Reptile Snake R
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm T
In the above results table, rows 1, 3, and 5 matched on Level 3. Rows 2 and 4 matched on Level 2. Row 6 matched on Level 1. Below is the code I've tried that produced the closest results:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) IN (a.lvl_1, a.lvl_2, a.lvl_3)
The problem is that the resulting table shows all rows with matches at any level. For example:
Current Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Dog M
3 Animal Mammal Dog A
4 Animal Mammal Cat M
5 Animal Mammal Cat A
...
If I check for equality at a specific level, the results are correct for that level, but NULL for the other levels:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) = a.lvl_3
Equality Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat null
3 Animal Reptile Lizard L
4 Animal Reptile Snake null
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm null
How can I join Table B with Table A to return the single Category of the highest level match? Any help is greatly appreciated. Thank you.
Finding categories using joins would be appropriate to perform in the following procedure:
On the other hand, in this situation it also seems possible to apply the coalesce() function to possible categories without using joins:
select
*,
coalesce(
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 = a.level_2 and b.level_3 = a.level_3),
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 = a.level_2 and b.level_3 is null),
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 is null and b.level_3 is null)
) category
from table_a a