sql - Unexpected number of rows returned in case when query - Stack Overflow

admin2025-04-22  0

I'm simply trying to profile a table, using a basic case when statement, to count the number of customers who fall into a band, and further split by another column.

My SQL Server query is returning multiple rows of the same definition however. I would expect 4 rows for the following 2 bands bands, but instead I'm getting multiple rows for all the risk band definitions.

Example of returning multiple rows for A1a and A1b, when would expect 4 rows for this - count of id, risk band (20 variables) and indicator (Y/N):

(No column name) SRIRiskBands HighlyLikelyIndicator
31423 A1b N
4702 A1b Y
29427 A1b N
4788 A1b Y
27854 A1b N
4543 A1b Y
4349 A1b Y
26029 A1b N
4291 A1a Y
25459 A1a N
24352 A1a N
4100 A1a Y
3875 A1a Y
22689 A1a N
4130 A1a Y
22454 A1a N
20619 A1a N

I'm simply trying to profile a table, using a basic case when statement, to count the number of customers who fall into a band, and further split by another column.

My SQL Server query is returning multiple rows of the same definition however. I would expect 4 rows for the following 2 bands bands, but instead I'm getting multiple rows for all the risk band definitions.

Example of returning multiple rows for A1a and A1b, when would expect 4 rows for this - count of id, risk band (20 variables) and indicator (Y/N):

(No column name) SRIRiskBands HighlyLikelyIndicator
31423 A1b N
4702 A1b Y
29427 A1b N
4788 A1b Y
27854 A1b N
4543 A1b Y
4349 A1b Y
26029 A1b N
4291 A1a Y
25459 A1a N
24352 A1a N
4100 A1a Y
3875 A1a Y
22689 A1a N
4130 A1a Y
22454 A1a N
20619 A1a N

Code used:

select count(distinct involvedPartyId),
case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
    END AS 'SRIRiskBand', HighlyLikelyIndicator
from MarketingAnalytics.dbo.Ri_PA_Personal
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by MachineLearningScores, HighlyLikelyIndicator
order by MachineLearningScores asc

Any help would be greatly appreciated.

Share Improve this question edited Jan 30 at 12:58 GuidoG 12.2k6 gold badges55 silver badges97 bronze badges asked Jan 29 at 16:37 J O'DonnellJ O'Donnell 13 bronze badges 5
  • Looks like you are trying to set batchid = subquery but the sub query has a group by so will return more than one batchID. If that is intended change the batchid = to batchid in() – Brad Commented Jan 29 at 16:41
  • No function in SELECT affects the number of result rows. The problem is in the WHERE or GROUP BY clauses – Panagiotis Kanavos Commented Jan 29 at 17:15
  • You're grouping by MachineLearningScores, which can have multiple values within each band. That means you'll get a different row for every such value. If you included that column in the output you'd see the rows aren't repeated. The CASE generates the same band for multiple unique values – Panagiotis Kanavos Commented Jan 29 at 17:20
  • We need to see the sample data that you are using and the results that you are expecting. – Dale K Commented Jan 29 at 19:25
  • You already got good answers. Additional remark: the CASE returns the first match -> can be simplified to case when MachineLearningScores >= 934.50 THEN 'A1a' when MachineLearningScores < 934.50 THEN 'A1b' when MachineLearningScores < 920.50 THEN 'A1c' when MachineLearningScores < 907.50 THEN 'A1d' etc. This also avoids cut&paste/modify errors resulting in gaps. – dnoeth Commented Jan 30 at 15:34
Add a comment  | 

2 Answers 2

Reset to default 0

This result happens because the "group by" clause is using the original field without treating it. You can solve this by using the "group by" clause by copying the "case when" clause or by using a subselect to group

Solution 1

select count(distinct involvedPartyId),
case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
    END AS 'SRIRiskBand', HighlyLikelyIndicator
from MarketingAnalytics.dbo.Ri_PA_Personal
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
    END AS 'SRIRiskBand', HighlyLikelyIndicator
order by case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
    END AS 'SRIRiskBand' asc

Solution 2

select count(involvedPartyId),
    SRIRiskBand,
    HighlyLikelyIndicator
from
(
    select involvedPartyId,
    case when MachineLearningScores >= 934.50 THEN 'A1a'
    when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
    when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
    when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
    when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
    when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
    when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
    when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
    when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
    when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
    when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
    when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
    when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
    when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
    when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
    when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
    when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
    when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
    when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
    when MachineLearningScores < 649.49 THEN 'F1'
    ELSE 'No MachineLearningScores'
        END AS 'SRIRiskBand', HighlyLikelyIndicator
    from MarketingAnalytics.dbo.Ri_PA_Personal
    where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal
) AS temp
group by SRIRiskBand,
    HighlyLikelyIndicator

Something like :

WITH 
vals AS 
(SELECT * 
 FROM   (VALUES (934.50, 'A1a'),
                (920.50, 'A1b'),
                (907.50, 'A1c'),
                (895.50, 'A1d'),
                (882.50, 'A1e'),
                (869.50, 'A1f'),
                (859.50, 'A1g'),
                (848.50, 'A1h'),
                (839.50, 'A1i'),
                (830.50, 'A2'),
                (809.50, 'A3'),
                (796.50, 'A4'),
                (789.50, 'A5'),
                (767.50, 'B1'),
                (752.50, 'B2'),
                (733.50, 'B3'),
                (724.50, 'C1'),
                (685.50, 'D1'),
                (649.50, 'E1'),
                (649.49, 'F1')
        ) AS VR (val,    ref ) 
)
SELECT count(distinct involvedPartyId) AS CNT, 
       CASE 
          WHEN LEAD(val) OVER(ORDER BY val) IS NULL THEN 'F1' 
          WHEN LAG(val) OVER(ORDER BY val) IS NULL THEN 'A1a' 
          ELSE ref
       END AS SRIRiskBandLAG
FROM   MarketingAnalytics.dbo.Ri_PA_Personal AS RPP
       LEFT OUTER JOIN vals AS v 
          ON MachineLearningScores < v.val 
             AND >= MachineLearningScores >= LAG(val) OVER(ORDER BY val)
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by val, ref, HighlyLikelyIndicator
order by val asc
转载请注明原文地址:http://conceptsofalgorithm.com/Algorithm/1745289922a294621.html

最新回复(0)