The primary purpose of this solution is to make the work done in cost effective manner. The goal can be achieved simply by using Stored Procedure
activity in pipeline, but Stored Procedure
activity is not supported in Synapse Serverless SQL Pool. So, if we use Dedicated SQL Pool, then it will be a costly approach.
The scope is to create views of files that are stored in ADLS Gen2. But catch here is, the views should be created in such a way that the date in file should be dynamic. The goal is to create an solution that can automatically create view in Synapse Serverless SQL Pool, whenever new files arrives in ADLS Gen2.
- Azure Synapse
Inside the Data Lake, I have a folder in a container that basically contains the files pushed by external source every day. However, I wanted to process only the latest added file in that folder. For this:-
-
Create another Dataset that will going to point to the files in that folder.
-
Use
Get Metadata
activity, to get the metadata of that folder. -
Then, use
ForEach
activity to iterate over all files in that folder. -
In that
ForEach
activity, again addGet Metadata
activity for files. -
Then add
If Condition
activity to check the latest file. I have used this dynamic expression because, my filename is like that<filename_yyyy-mm-dd>
and I checking the latest file based on filename. -
Then add
Set Variable
activity inTrue activities
ofIf Condition
. To set the latest filename and maxtime variable.
-
Create a Linked Service to Azure Synapse Analytics.
- But, there’s one problem however: it expects
Dedicated SQL pools
and doesn’t display the databases ofSynapse Serverless SQL Pool
in the list. - As a work around, you can type your Workspace SQL endpoint manually.
- But, there’s one problem however: it expects
-
Then in pipeline add
Script
activity to execute the CREATE VIEW command in Synapse Serverless SQL Pool.- Enter this code in the Script:
DECLARE @sql NVARCHAR(MAX); SET @sql = N'CREATE VIEW ' + QUOTENAME(@viewName) + N' AS SELECT * FROM OPENROWSET( BULK ' + QUOTENAME(@filename, '''') + N', DATA_SOURCE = ''raw'', FORMAT = ''CSV'', FIELDTERMINATOR ='','', ROWTERMINATOR = ''\n'', PARSER_VERSION = ''2.0'' ) AS [r]'; EXEC sp_executesql @sql;
NOTE: - Some points to be noted in this code is, SQL Server does not support the use of variables or parameters for the view name directly in a CREATE VIEW statement like commands, this is because these names must be a constant. However, it can be achieve by using dynamic SQL (construct the SQL command as a string and then execute it).
- Enter this Dynamic Expression in the
viewName
parameter:
@substring(variables('filename'), 0, indexOf(variables('filename'), '.'))
- Enter this Dynamic Expression in the
fileName
parameter:
@concat('dynamic_view_files/', variables('filename'))