In our project, we are trying to combine multiple tables into a single table. The table has ukey
and effective_date
are PKs with not null constraint.
The structure works for below data as effective dates are different:
Ukey effective date combination:
ukey | effective_date | settle_date | price |
---|---|---|---|
u123 | 3/3/2025 | 3/8/2025 | 1.21215 |
u123 | 3/4/2025 | 3/9/2025 | 1.23 |
In our project, we are trying to combine multiple tables into a single table. The table has ukey
and effective_date
are PKs with not null constraint.
The structure works for below data as effective dates are different:
Ukey effective date combination:
ukey | effective_date | settle_date | price |
---|---|---|---|
u123 | 3/3/2025 | 3/8/2025 | 1.21215 |
u123 | 3/4/2025 | 3/9/2025 | 1.23 |
The requirement is for same ukey
and effective_dates
, there can be multiple settle_dates
. So if I wanted to make it work for same ukey
and effective_date
, I will have to add settle_date
as PK and make it non null. However there can be certain records where settle_date
is null so adding settle_date
with the data shown below won't work:
Ukey effective date settle_date combination:
ukey | effective_date | settle_date | price |
---|---|---|---|
u123 | 3/3/2025 | 3/8/2025 | 1.21215 |
u123 | 3/3/2025 | 3/9/2025 | 1.23 |
u456 | 3/3/2025 |
Two solutions I have thought are:
settle_date
eg 12/31/2999 making settle_date
as PK and non null. And wherever its applicable during data load, my table will have actual settle_date
.GENERATED BY DEFAULT AS IDENTITY
and use it as PK rather than settle_date
. However there would be lot of INSERTS if I don't handle updates properly in the code.Any other solution to this issue?
A UNIQUE
constraint is almost as potent as a PRIMARY KEY
, with the crucial difference that it allows null values, which are considered to be distinct values as per the SQL standard - by default. Since Postgres 15 this leeway can be cut by adding the NULLS NOT DISTINCT
clause. "Certain records where settle_date
is null" does not clarify to me whether multiple null values for the same (ukey, effective_date)
should be allowed or not.
ALTER TABLE tbl ADD CONSTRAINT tbl_u_e_s_uniq UNIQUE (ukey, effective_date, settle_date);
Or:
ALTER TABLE tbl ADD CONSTRAINT tbl_u_e_s_uniq UNIQUE NULLS NOT DISTINCT (ukey, effective_date, settle_date);
fiddle
See:
Depending on undisclosed data types and predominant queries, the order of columns in the constraint may be optimized. (But the set of columns is required to enforce your requirement.)
I would probably add a surrogate PK (serial
or IDENTITY
column) additionally. See:
A third option would be to degrade your constraint to a "simple" UNIQUE CONSTRAINT
.
Just like with a primary key, you'll get:
create unique index on t(ukey, effective_date, coalesce(settle_date, '1900-01-01')); -- Personal opinion: choosing a past date is less risky than a future one; but to evacuate all risks, add a ", settle_date is null" to the index.
(see fiddle for demo)
(your question is in fact universal: see nearly same question on Reddit or for SQL Server)
ukey
andeffective_date
, if they all have nosettle_date
? If so, are there any other features that distinguish them? – Bergi Commented Mar 3 at 14:26