Performance improvement for SQL for a specific case

I have 2 tables like below:

A

AId,	BId	Date1	Date2
1	B1	1 Jan 2021	31 Dec 2021
1	B1	1 Jan 2022	31 Dec 2022
1	B1	1 Jan 2023	31 Dec 2030
2	B2	1 Jan 2021	31 Dec 2030
3	B3	1 Jan 2024	31 Dec 2030

B

BId	BVal
B1	B1
B2	B2
B3	B3

I am joining these two tables A and B based on BId column. One input of Date type is coming in the query and we want to filter the data from table A based on that parameter and need the output of below data.

Input passed to the query (input_date): 1 Jan 2022

AId	BId	Date1	Date2
1	B1	1 Jan 2022	31 Dec 2022 (There are multiple records of this but it should only come once)
2	B2	1 Jan 2021	31 Dec 2030
3	B3	1 Jan 2024	31 Dec 2030 (This is starting in future still it should come in output)

I am using below query currently but its pulling duplicate records.

select AId, BId, Date1, Date2
from A,B
where A.BId = B.BId
and input_date <= A.Date2; -- We cant add check of Date1 as we want the record with AId '3' from table A.

But this is pulling all rows from A for the record with id '1'. We just want 1 row as mentioned.

One option I can think of is to add distinct or group by but wanted to check if there is any better alternate to achieve this requirement. I am thinking DISTINCT or GROUP BY would have a performance impact as the number of rows increase.

Can this be handled in a better way?

PS: These are just sample tables Actual tables are bigger with large data.

Edit: Condition is Inputdate<= date2. Datebase: Oracle

Answers

You can achieve the desired result by using the ROW_NUMBER() window function to partition the data by AId and order it by Date1. Then, you can select only the rows where the ROW_NUMBER() is 1, indicating the earliest Date1 for each AId. Here's how you can modify your query:

WITH A_B AS (
    SELECT A.AId, A.BId, A.Date1, A.Date2,
           ROW_NUMBER() OVER (PARTITION BY A.AId ORDER BY A.Date1) AS rn
    FROM A
    INNER JOIN B ON A.BId = B.BId
    WHERE input_date <= A.Date2
)
SELECT AId, BId, Date1, Date2
FROM A_B
WHERE rn = 1;

This query will first join tables A and B, filter the rows based on the input_date, and then assign a row number to each row within each AId partition, ordered by Date1. Finally, it selects only the rows where the row number is 1, ensuring that only the earliest Date1 for each AId is included in the result set.

Performance improvement for SQL for a specific case

A

B

Hot

Latest