SSIS MULTIFLATFILE Connection Manager
Until recently I had no idea SSIS allowed opening multiple connections to (flat) files, but due to the performance and parallelism implication of such a connection, I thought its definitely worth blogging about.
SSIS is a great ETL tool, but after a few years of experience in producing these packages, I came to the belief that memory management and parallelism isn’t current easily handled within SSIS, not to say that you don’t have the options available, but the default (out-of-the-box) setup tends not to be behave well with very large datasets while managing system resource effectively, and you need to tinker with your package to make sure you get the balance right between resource utilization and processing speed. Also, debugging is horrendous, better than SSRS’s debugging options, but still very frustrating.
Today we will look at how to create and use a MULTIFLATFILE connection type in SSIS.
MULTIFLATFILE connection can be created by right-clicking on “Connection Managers” and selecting the custom “New Connection” option (as shown in the image below).
You can now select the “MULTIFLATFILE” connection from the table (shown below)
You can also see in the table above a few other options that do not exist as part of the default connection type options when creating a new SSIS connection.
Configuring a MUTLIFLATFILE connection type is similar to configuring a normal SSIS Flat File Connection (configuration screen shown below). Although instead of the connection consuming 1 file at a time, it will consume all files that match the configured “File names” property. You can use wild cards (e.g. *) or | (pipes) to delimit file names in the “File names” property, although wild cards is only supported through the Expression menu, and is not allowed in the configuration menu below (annoyingly, it was allowed in SSIS 2005). For more information on the properties you have available you can checkout MSDN’s article on the General Page of the Multiple Flat File Connection Editor windows
And bob is your uncle! now you can consume multiple files from the same Flat File source (within your Data Flow) without having to loop over every file individually. Image below demonstrate that.
Do file structure have to be identical for MULTIFLATFILE connection manager to work?
Yeap
How to capture the file name when multiple flat file connection is used?
I think there is a custom property called “FileNameColumnName”, which gives u back the full file name and path.
When using that property it fetches the file pathname which is given in flatfile connection manager. For eg. Flat file connection path will be “c:\proceed\*.txt” means it captures c:\proceed alone.