How do I stream data from an on-premises Oracle database to ADLS Gen2?
I have a requirement to stream XML BLOBs from an on-premises Oracle database into an ADLS Gen2 Blob container. I was able to set up a VM, install the self-hosted Integration Runtime and JRE, register the IR using its keys, and test the connectivity between the on-premises Oracle database and ADLS Gen2 using an Azure Data Factory "Copy" activity.
I have two questions:
- Can Oracle triggers (like AFTER INSERT) initiate an external event?
- Does Azure have any service that listens to this table and keeps the stream open until the data has been flushed out?
Could anyone please help me achieve this?
Source is an Oracle table with 5 columns
SURROGATE_ID | NATURAL_KEY1 | NATURAL_KEY2 | XML_COL | LAST_UPDATE_TIMESTAMP |
---|---|---|---|---|
1 | AAA | XYZ | xml BLOB | 2024-04-08 12:03:34 |
2 | AAB | XYX | xml BLOB | 2024-04-08 12:03:39 |
3 | AAC | XYZ | xml BLOB | 2024-04-08 12:05:27 |
I need each XML_COL value to be saved into the ADLS Gen2 Blob container as a separate XML file, named like below:
1_AAA_XYZ_20240408120334.xml
2_AAB_XYX_20240408120339.xml
3_AAC_XYZ_20240408120527.xml
Answers
To stream XML BLOBs from an on-premises Oracle database into an Azure Data Lake Storage (ADLS) Gen2 Blob container, triggered by Oracle events (like AFTER INSERT), with each XML saved as a separate file in ADLS, you can follow these steps:
- Oracle Triggers: Yes, Oracle triggers like AFTER INSERT can be used to initiate an external event. You can create an AFTER INSERT trigger on your Oracle table that calls an external procedure or service whenever a new row is inserted, for example by making an HTTP call with UTL_HTTP or by enqueuing a message with Oracle Advanced Queuing. Keep in mind that a trigger fires inside the inserting transaction, before it commits, so a direct external call can announce a row that is later rolled back.
- Azure Service for Listening to Oracle Table Changes: Azure does not have a built-in service specifically for listening to changes in an Oracle table directly. However, you can implement this functionality using Azure Functions or Azure Logic Apps:
  - Azure Functions: Create an Azure Function with a trigger that listens to a webhook or queue. Inside the function, connect to your Oracle database and query the table for new records. Whenever a new record is found, process the XML BLOB and save it to the ADLS Gen2 Blob container. Because the database is on-premises, the function needs a network path to it.
  - Azure Logic Apps: Create a Logic App that runs on a schedule and periodically polls your Oracle database for new records. Once a new record is found, use an action to process the XML BLOB and save it to the ADLS Gen2 Blob container.
- XML File Naming Convention: Generate the XML file names from the values in your Oracle table columns. Combine SURROGATE_ID, NATURAL_KEY1, NATURAL_KEY2, and LAST_UPDATE_TIMESTAMP (formatted as yyyymmddhhmmss), separated by underscores, to create a unique name for each file, for example 1_AAA_XYZ_20240408120334.xml.
- Integration with ADLS Gen2 Blob Container: Use Azure Data Factory to copy the XML BLOBs from your Oracle database to the ADLS Gen2 Blob container. You can create a pipeline in Azure Data Factory that executes the Oracle query to fetch the XML BLOBs, for example filtered on LAST_UPDATE_TIMESTAMP so that only rows added since the previous run are picked up, and then writes them to the ADLS Gen2 Blob container using a Copy activity.
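The naming convention from the question can be sketched as a small helper. This is only an illustration: `build_blob_name` is a hypothetical function name, and it assumes LAST_UPDATE_TIMESTAMP arrives as a Python `datetime` and that the key columns contain no characters that are invalid in a blob name.

```python
from datetime import datetime


def build_blob_name(surrogate_id: int, natural_key1: str, natural_key2: str,
                    last_update: datetime) -> str:
    """Build '<id>_<key1>_<key2>_<yyyymmddhhmmss>.xml' for one source row."""
    # Compact the timestamp to 14 digits, as in the file names in the question.
    stamp = last_update.strftime("%Y%m%d%H%M%S")
    return f"{surrogate_id}_{natural_key1}_{natural_key2}_{stamp}.xml"


# First sample row from the question's table.
print(build_blob_name(1, "AAA", "XYZ", datetime(2024, 4, 8, 12, 3, 34)))
# prints 1_AAA_XYZ_20240408120334.xml
```

If the same expression is needed inside the Oracle query or a Data Factory expression instead, the same concatenation applies; only the date formatting function differs.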
By following these steps, you can implement a solution that streams XML BLOBs from your on-premises Oracle database into the ADLS Gen2 Blob container, triggered by Oracle events, and saves each XML as a separate file with a unique name.
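As a rough sketch of the polling variant, the loop below exports every row newer than a stored watermark and returns the new watermark. It is not a definitive implementation: `Row`, `export_new_rows`, `fetch_since`, and `write_file` are hypothetical names, and the database and storage calls are passed in as plain callables so the logic can be run without Oracle or Azure. In a real Azure Function, `fetch_since` would presumably wrap a query on LAST_UPDATE_TIMESTAMP and `write_file` would wrap an upload to the ADLS Gen2 container.

```python
from datetime import datetime
from typing import Callable, Iterable, NamedTuple


class Row(NamedTuple):
    """One source row; field names mirror the table in the question."""
    surrogate_id: int
    natural_key1: str
    natural_key2: str
    xml_col: bytes
    last_update: datetime


def export_new_rows(
    fetch_since: Callable[[datetime], Iterable[Row]],
    write_file: Callable[[str, bytes], None],
    watermark: datetime,
) -> datetime:
    """Write one file per row newer than `watermark`; return the new watermark."""
    newest = watermark
    for row in fetch_since(watermark):
        stamp = row.last_update.strftime("%Y%m%d%H%M%S")
        name = f"{row.surrogate_id}_{row.natural_key1}_{row.natural_key2}_{stamp}.xml"
        write_file(name, row.xml_col)
        newest = max(newest, row.last_update)
    return newest


# In-memory stand-ins for the Oracle table and the ADLS container (assumptions).
TABLE = [
    Row(1, "AAA", "XYZ", b"<a/>", datetime(2024, 4, 8, 12, 3, 34)),
    Row(2, "AAB", "XYX", b"<b/>", datetime(2024, 4, 8, 12, 3, 39)),
    Row(3, "AAC", "XYZ", b"<c/>", datetime(2024, 4, 8, 12, 5, 27)),
]


def fetch_since(ts: datetime) -> list:
    # Stand-in for: SELECT ... WHERE LAST_UPDATE_TIMESTAMP > :ts
    return [r for r in TABLE if r.last_update > ts]


written = {}
# Row 1 was exported in an earlier run, so its timestamp is the starting watermark.
mark = export_new_rows(fetch_since, written.__setitem__, datetime(2024, 4, 8, 12, 3, 34))
print(sorted(written))  # the files for rows 2 and 3
print(mark)             # 2024-04-08 12:05:27
```

The strict `>` comparison can skip a row that commits later with a timestamp equal to or older than the watermark, so the watermark needs to be persisted between runs and, if that risk matters, combined with SURROGATE_ID or an overlap window.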