Transform Process
The Transform Process sits between the other two processes: it runs after the process that handles the raw source and before the data quality process. The key takeaway is that the data from the source has already been loaded into a DataTable. This means you can modify known source column names, change the mapping specification, calculate or concatenate fields, etc.
Need to modify the data before Transforms ever touches it? Use the process that runs before this one.
Need to modify the data, maps, options, etc. after Transforms has successfully loaded the data into a table? Use the Transform Process.
Need to generate data quality reporting? Use the data quality process.
Transform Process is perfect when:
The incoming data is already loaded and you'd like to modify known source column names.
There may be unique logic to change the mapping specification before running the transform.
You need to add or remove data, concatenate fields, or calculate values before the transform.
If you would like to create a plugin library (DLL project), follow these steps first; the process code will go in that project. Otherwise, skip these steps and create the code directly within your project.
Create a new DLL project and, for the time being, set the framework to net6.0.
Install the latest version of Perigee using install-package perigee, or use the NuGet Package Manager.
Open the project file (.csproj) by double clicking on the DLL project in your code editor. You should see the XML for the project.
The two changes you need to make are:
Add <EnableDynamicLoading>true</EnableDynamicLoading> to the PropertyGroup tag.
For the PackageReferences, add <Private>false</Private> and <ExcludeAssets>runtime</ExcludeAssets>.
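As a rough sketch, a project file with both changes applied might look like the following. The overall layout is the standard SDK-style project format; the floating `Version="*"` is only a placeholder assumption, so keep whatever version NuGet actually installed for you.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net6.0</TargetFramework>
    <!-- Change 1: build the library so it can be loaded dynamically as a plugin -->
    <EnableDynamicLoading>true</EnableDynamicLoading>
  </PropertyGroup>

  <ItemGroup>
    <!-- Version is a placeholder assumption; keep the version NuGet installed -->
    <PackageReference Include="perigee" Version="*">
      <!-- Change 2: do not ship the package's runtime assets with the plugin -->
      <Private>false</Private>
      <ExcludeAssets>runtime</ExcludeAssets>
    </PackageReference>
  </ItemGroup>

</Project>
```

If the project has more than one PackageReference, the same two child elements would go on each of them.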
That's it! You've created a new DLL project that, when built, will produce a plugin.dll that Transforms is able to hot reload and run dynamically at runtime.
The plugin can contain many Transformation processes. Each process is defined by a method and an attribute. The example process on this page is called NewlineRemover.
The [attribute] tells the system a few important things. In the order they are passed, they are:
Active? - Should the plugin loader use this plugin, is it active? Or is this in development or unavailable.
SortOrder (int) - When multiple steps are defined and activated, which order (ascending) are they run in?
Name - What name is this plugin given? Although it may not be shown anywhere immediately when running locally, the name is used for debugging and shown in certain log messages.
Partition Keys - This is a very important field to fill out. It specifies which files (partitions) the process runs for, so you can restrict it to certain types of files. You may provide multiple keys in a comma separated list like so: "yardi, finance, FinanceFileA"
Leave it blank ("") if the process should always run.
IsPreTransform (false|true) - This is typically true, meaning this process is run before the transformation occurs. If you're writing a process to modify the transformed results, then set this to false.
The ITransformationProcessTable interface gives the method all of the data it needs to process the file.
Here's a quick example of the toolset provided by this interface. The NewlineRemover process checks every string column and removes any newline characters from it. If any are found, it adds a ProcessExecution type line item to the transformation report.
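The interface's own members are documented on the SDK page and are not reproduced here. As a rough illustration of the newline-stripping logic alone, the sketch below works on a plain System.Data.DataTable. NewlineRemoverSketch and RemoveNewlines are hypothetical names, not part of Perigee, and the returned count merely stands in for whatever a real process would write to the transformation report.

```csharp
using System;
using System.Data;

// Hypothetical helper, not a Perigee type: strips CR/LF characters from
// every string column of a DataTable using only System.Data.
public static class NewlineRemoverSketch
{
    // Returns the number of cells that were changed.
    public static int RemoveNewlines(DataTable table)
    {
        int changed = 0;

        foreach (DataColumn column in table.Columns)
        {
            // Only string columns can contain newline characters.
            if (column.DataType != typeof(string)) continue;

            foreach (DataRow row in table.Rows)
            {
                // DBNull and other non-string values fail this pattern and are skipped.
                if (row[column] is string value &&
                    (value.Contains('\n') || value.Contains('\r')))
                {
                    row[column] = value.Replace("\r", "").Replace("\n", "");
                    changed++;
                }
            }
        }

        return changed;
    }
}
```

In a real process, a non-zero count would be the point at which a ProcessExecution line item is added to the report.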
To see all of the available methods, properties, and helpers, check out the SDK page.
If you're trying to run transform processes manually, make sure to restrict them to the proper pre or post transform type. Get the map and the data, then iterate over the modules.
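The selection rules described on this page (active only, matching pre or post transform type, matching partition key, ascending sort order) can be sketched without any Perigee types. Everything below is an assumption made for illustration: ProcessInfo is a hypothetical record mirroring the attribute values, and case-insensitive key matching is a guess rather than documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the attribute values; not a Perigee type.
public record ProcessInfo(
    bool Active, int SortOrder, string Name, string PartitionKeys, bool IsPreTransform);

public static class ProcessSelectionSketch
{
    // Picks the processes that should run for one file, in execution order.
    public static IEnumerable<ProcessInfo> Select(
        IEnumerable<ProcessInfo> all, string runKey, bool preTransform)
    {
        return all
            .Where(p => p.Active)                          // skip inactive processes
            .Where(p => p.IsPreTransform == preTransform)  // pre or post, never both
            .Where(p => MatchesKey(p.PartitionKeys, runKey))
            .OrderBy(p => p.SortOrder);                    // ascending sort order
    }

    private static bool MatchesKey(string partitionKeys, string runKey)
    {
        // A blank key list means the process can always run.
        if (string.IsNullOrWhiteSpace(partitionKeys)) return true;

        // Assumption: keys are compared case-insensitively after trimming.
        return partitionKeys
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Contains(runKey, StringComparer.OrdinalIgnoreCase);
    }
}
```

Calling `Select(all, "finance", preTransform: true)` would yield the active pre-transform processes that are either unkeyed or keyed to finance, ordered by SortOrder.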
If you created a plugin.dll project: compile the project and drop the .dll into the Plugins/Transform folder.
If you wrote the process in the same project as the one you're running, the plugin loader will automatically scan the assembly and make the plugin available for use.
A partition key can be tied to a specific map, in which case it can be selected automatically when running that map.
It can also be a generic key (like yardi, custom, finance, etc.), and you specify during the transform process which keys you'd like to run. See the section on running with partition keys for more info.