I have 2 tables like below:
A
AId, | BId | Date1 | Date2 |
---|---|---|---|
1 | B1 | 1 Jan 2021 | 31 Dec 2021 |
1 | B1 | 1 Jan 2022 | 31 Dec 2022 |
1 | B1 | 1 Jan 2023 | 31 Dec 2030 |
2 | B2 | 1 Jan 2021 | 31 Dec 2030 |
3 | B3 | 1 Jan 2024 | 31 Dec 2030 |
B
BId | BVal |
---|---|
B1 | B1 |
B2 | B2 |
B3 | B3 |
I am joining these two tables A and B based on BId column. One input of Date type is coming in the query and we want to filter the data from table A based on that parameter and need the output of below data.
Input passed to the query (input_date): 1 Jan 2022
AId | BId | Date1 | Date2 |
---|---|---|---|
1 | B1 | 1 Jan 2022 | 31 Dec 2022 (There are multiple records of this but it should only come once) |
2 | B2 | 1 Jan 2021 | 31 Dec 2030 |
3 | B3 | 1 Jan 2024 | 31 Dec 2030 (This is starting in future still it should come in output) |
I am using below query currently but its pulling duplicate records.
select AId, BId, Date1, Date2
from A,B
where A.BId = B.BId
and input_date <= A.Date2; -- We cant add check of Date1 as we want the record with AId '3' from table A.
But this is pulling all rows from A for the record with id '1'. We just want 1 row as mentioned.
One option I can think of is to add distinct or group by but wanted to check if there is any better alternate to achieve this requirement. I am thinking DISTINCT or GROUP BY would have a performance impact as the number of rows increase.
Can this be handled in a better way?
PS: These are just sample tables Actual tables are bigger with large data.
Edit: Condition is Inputdate<= date2. Datebase: Oracle
Answers
You can achieve the desired result by using the ROW_NUMBER()
window function to partition the data by AId
and order it by Date1
. Then, you can select only the rows where the ROW_NUMBER()
is 1, indicating the earliest Date1
for each AId
. Here's how you can modify your query:
WITH A_B AS (
SELECT A.AId, A.BId, A.Date1, A.Date2,
ROW_NUMBER() OVER (PARTITION BY A.AId ORDER BY A.Date1) AS rn
FROM A
INNER JOIN B ON A.BId = B.BId
WHERE input_date <= A.Date2
)
SELECT AId, BId, Date1, Date2
FROM A_B
WHERE rn = 1;
This query will first join tables A and B, filter the rows based on the input_date
, and then assign a row number to each row within each AId
partition, ordered by Date1
. Finally, it selects only the rows where the row number is 1, ensuring that only the earliest Date1
for each AId
is included in the result set.