File IO Process

The File IO Process is the most versatile of the plugin processes because it isn't restricted to a single action (such as checking data quality). How you modify the incoming or outgoing data is entirely up to you.

  • Need to modify the data before it's ever touched by the Transforms? Use File IO.

  • Need to modify the data, maps, options, etc. after Transforms has successfully loaded the data into a table? Use the Transform Process.

  • Need to generate data quality reporting? Use Data Quality.

Use Cases

File IO is perfect when:

  • The incoming data is in a format that the built-in file read methods can't recognize.

  • The file is plaintext, a flat file, XML, or other format that requires a pre-transform to generate an appropriate table.

  • The outgoing data must be modified in a unique way.

Creating a plugin project

If you want to build a standalone plugin library (a DLL project), follow these steps first; the plugin code will go in that project. Otherwise, skip this section and write the code directly within your own project.

  1. Create a new DLL project, and for the time being, set the framework to net6.0.

  2. Install the latest version of Perigee using Install-Package perigee, or use the NuGet Package Manager.

  3. Open the .csproj file by double-clicking the DLL project in your code editor. You should see project XML similar to the example below.

  4. The two changes you need to make are:

    • Add the <EnableDynamicLoading>true</EnableDynamicLoading> to the PropertyGroup tag

    • For each PackageReference, add <Private>false</Private> and <ExcludeAssets>runtime</ExcludeAssets>

<Project Sdk="Microsoft.NET.Sdk">

	<PropertyGroup>
		<EnableDynamicLoading>true</EnableDynamicLoading>
		<TargetFramework>net6.0</TargetFramework>
		<ImplicitUsings>enable</ImplicitUsings>
		<Nullable>enable</Nullable>
	</PropertyGroup>

	<ItemGroup>
		<PackageReference Include="perigee" Version="24.6.1.1">
			<Private>false</Private>
			<ExcludeAssets>runtime</ExcludeAssets>
		</PackageReference>
	</ItemGroup>

</Project>

That's it! You've created a new DLL project that, when built, produces a plugin.dll that Transforms is able to hot reload and run dynamically at runtime.

Authoring the plugin

The plugin can contain many file IO processes. Each process is defined by a class that implements IFileIOProcess and carries the [FileIOProcess] attribute. Here's what a new process for FlatFileSplitter looks like:

[FileIOProcess(true, "Sample: FlatFileSplitter")]
public class FlatFileSplitter : IFileIOProcess
{
    public void ProcessData(FileIOProcessData data)
    {
        //Process!
    }
}

Attribute

The [attribute] tells the system two important things. In the order shown above, they are:

  1. Active? - Should the plugin loader use this plugin? Set this to false while the plugin is in development or otherwise unavailable.

  2. Name - What name is this plugin given? Although it may not be shown anywhere immediately when running locally, the name is used for debugging and shown in certain log messages.

Other optional attribute values you can supply are:

  • AutoRun (false|true) - If this is true, the IO process starts automatically, meaning it doesn't require an explicit reference to trigger it. These processes are less common, but they let you run a step every time a file is processed and optionally modify that file.

  • IsPreTransform (false|true) - This is typically true, meaning this process is run before the transformation occurs. If you're writing a process to modify the transformed results, then set this to false.

  • SortOrder (int) - When multiple IO steps are defined and active, they run in ascending SortOrder, so lower values run first.
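To make the interplay of these values concrete, here is a minimal, self-contained sketch. It does not use the Perigee library: the FileIOProcessAttribute class below is a stand-in written purely for illustration, and it assumes the optional values are settable named properties with the defaults shown (AutoRun false, IsPreTransform true, SortOrder 0). The real attribute may be declared differently, and the Decrypt, PostAudit, and InDevelopment classes are hypothetical.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Select the pre-transform steps as described above:
// active only, ordered by ascending SortOrder.
var steps = Assembly.GetExecutingAssembly().GetTypes()
    .Select(t => t.GetCustomAttribute<FileIOProcessAttribute>())
    .Where(a => a is { Active: true, IsPreTransform: true })
    .OrderBy(a => a!.SortOrder)
    .Select(a => a!.Name)
    .ToList();

Console.WriteLine(string.Join(" -> ", steps));
// prints: Sample: Decrypt -> Sample: FlatFileSplitter

// STAND-IN for illustration only; the real attribute ships with Perigee
// and its shape and defaults are assumptions here.
[AttributeUsage(AttributeTargets.Class)]
class FileIOProcessAttribute : Attribute
{
    public FileIOProcessAttribute(bool active, string name) { Active = active; Name = name; }
    public bool Active { get; }
    public string Name { get; }
    public bool AutoRun { get; set; } = false;        // assumed default
    public bool IsPreTransform { get; set; } = true;  // assumed default
    public int SortOrder { get; set; } = 0;           // assumed default
}

// Hypothetical processes used only to exercise the ordering logic.
[FileIOProcess(true, "Sample: Decrypt", SortOrder = 1)]
class Decrypt { }

[FileIOProcess(true, "Sample: FlatFileSplitter", SortOrder = 2)]
class FlatFileSplitter { }

// Runs after the transform, so it is excluded from the pre-transform list.
[FileIOProcess(true, "Sample: PostAudit", IsPreTransform = false, AutoRun = true)]
class PostAudit { }

// Inactive, so it would be ignored even though its SortOrder is lowest.
[FileIOProcess(false, "Sample: InDevelopment", SortOrder = 0)]
class InDevelopment { }
```

In this sketch the inactive and post-transform steps drop out, and the remaining two run in SortOrder sequence.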

Interface

The IFileIOProcess interface gives the method all of the required data it needs to process the file.

Here's a quick example of the powerful toolset provided by this interface. It splits a fixed-width flat file with two 30-character columns into a data table that the transformer can read.

[FileIOProcess(true, "Sample: FlatFileSplitter")]
public class FlatFileSplitter : IFileIOProcess
{
    public void ProcessData(FileIOProcessData data)
    {
        if (!data.IsTextMime) return;

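        //Convert a fixed-width flat file (two 30-character columns) into a table.
        //Note: Substring(30, 30) assumes every line is at least 60 characters long.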
        DynamicDataTable DDT = new();
        data.BytesToString().Split(new char[] { '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries).ForEach((uRow, r) =>
            {
                DDT.AddRowValues((uint)uRow, r.Substring(0, 30).Trim(), r.Substring(30, 30).Trim());
            });

        //Write
        data.SetAsDataTable(DDT.FinishDataLoad().ToDataTable());
    }
}
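The sample above slices each line with Substring, which throws if a line is shorter than the combined column widths. As a rough sketch of how the slicing could be made tolerant of short lines, the helper below uses only the .NET base class library. SplitFixedWidth is a hypothetical name and is not part of Perigee; the sample data is made up for illustration.

```csharp
using System;

// Hypothetical helper (not part of Perigee): slices one fixed-width line
// into trimmed columns using the given column widths.
static string[] SplitFixedWidth(string line, params int[] widths)
{
    var fields = new string[widths.Length];
    int offset = 0;
    for (int i = 0; i < widths.Length; i++)
    {
        // Clamp start and length so a short line yields a partial or
        // empty field instead of an ArgumentOutOfRangeException.
        int start = Math.Min(offset, line.Length);
        int length = Math.Min(widths[i], line.Length - start);
        fields[i] = line.Substring(start, length).Trim();
        offset += widths[i];
    }
    return fields;
}

// Two sample rows; the second is shorter than the full 60 characters.
string text = "Alice".PadRight(30) + "Engineering".PadRight(30) + "\r\n"
            + "Bob".PadRight(30) + "Ops";

foreach (var line in text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
{
    var cols = SplitFixedWidth(line, 30, 30);
    Console.WriteLine($"{cols[0]}|{cols[1]}");
}
// prints:
// Alice|Engineering
// Bob|Ops
```

Inside a process, the values returned by a helper like this could be passed to AddRowValues in place of the two Substring calls.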

SDK

To see all of the available methods, properties, and helpers, check out the SDK page:

FileIOProcessData

Running FIOPS Manually (SDK)

Which processes run, and when, is up to you. Here's a quick snippet that runs all of the defined modules in the current assembly.

//Assign FIOData
var FIOData = new FileIOProcessData()
{
    FileData = File.ReadAllBytes("file.csv"),
    FileMime = TransformUtil.ToMimeType(Path.GetExtension("file.csv")),
    FileName = "file",
    SignificantDigits = 5
};

//Iterate modules and run in order
foreach (var fi in FileIOProcess.GetModules().OrderBy(f => f.Attribute.SortOrder))
    fi.RunForInstance(fi.Instance, FIOData);

//FIOData convert to table (or read direct table if it exists)
var dt = FIOData.TableFromFile();

Installation in Client App

If you created a plugin.dll project: compile the project and drop the .dll into the Plugins/IO folder.

If you wrote the process in the same project you're running, the plugin loader automatically scans the assembly and the plugin is available for use.
