# Transform Process

The Transform Process sits between the other two processes. It runs **AFTER** the [File IO](https://docs.perigee.software/transform-sdk/authoring-plugins/file-io-process) processes and **BEFORE** the [Data Quality](https://docs.perigee.software/transform-sdk/authoring-plugins/data-quality) processes. The key takeaway is that the source data has already been loaded into a **`DataTable`** by the time this process runs. This means you can modify known source column names, change the mapping specification, calculate or concatenate fields, and so on.

{% hint style="success" %}

* Need to modify the data before Transforms ever touches it? Use [File IO](https://docs.perigee.software/transform-sdk/authoring-plugins/file-io-process).
* Need to modify the data, maps, options, etc. after Transforms has successfully loaded the data into a table? Use [this Transform Process](#authoring-the-plugin).
* Need to generate data quality reporting? Use [Data Quality](https://docs.perigee.software/transform-sdk/authoring-plugins/data-quality).
  {% endhint %}

## Use Cases

Transform Process is perfect when:

* The incoming data is already loaded and you'd like to modify known source column names.
* You have custom logic that changes the mapping specification before the transform runs.
* You need to add or remove data, concatenate fields, or calculate something before the transform.
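
As a rough illustration of the use cases above, here is a minimal sketch that works on a plain `System.Data.DataTable`. It does not use the Transform SDK at all: the table, the column names (`fname`, `lname`, `FullName`), and the `Rename` helper are made up for the example, and inside a real process you would apply the same operations to the table that Transforms has already loaded.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for a table that has already been loaded from a source file.
var table = new DataTable("Contacts");
table.Columns.Add("fname", typeof(string));
table.Columns.Add("lname", typeof(string));
table.Rows.Add("Ada", "Lovelace");
table.Rows.Add("Alan", "Turing");
table.Rows.Add("", "");

// Hypothetical helper: rename a column only if it exists.
void Rename(DataTable t, string from, string to)
{
    var col = t.Columns[from];
    if (col != null) col.ColumnName = to;
}

// 1. Modify known source column names.
Rename(table, "fname", "FirstName");
Rename(table, "lname", "LastName");

// 2. Remove data: drop rows with no first name (iterate backwards while removing).
for (int i = table.Rows.Count - 1; i >= 0; i--)
{
    if (string.IsNullOrWhiteSpace(table.Rows[i]["FirstName"] as string))
        table.Rows.RemoveAt(i);
}

// 3. Concatenate fields into a new calculated column.
table.Columns.Add("FullName", typeof(string));
foreach (DataRow row in table.Rows)
{
    row["FullName"] = (string)row["FirstName"] + " " + (string)row["LastName"];
}

Console.WriteLine(table.Rows[0]["FullName"]); // prints "Ada Lovelace"
```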

## Creating a plugin project

{% hint style="success" %}
If you would like to actually create a plugin library (`dll` project), follow these steps first and we'll put our code here. Otherwise, [skip this step](#authoring-the-plugin), and create the code directly within your project.
{% endhint %}

1. Create a new DLL project, and for the time being, set the framework to **`net6.0`**.
2. Install the latest version of Perigee using `install-package perigee`, or use the NuGet Package Manager.
3. Open the `.csproj` file by double-clicking the DLL project in your code editor. You should see XML similar to the project file shown below.
4. The two changes you need to make are:
   * Add **`<EnableDynamicLoading>true</EnableDynamicLoading>`** to the `PropertyGroup` tag.
   * For the `PackageReference` entries, add **`<Private>false</Private>`** and **`<ExcludeAssets>runtime</ExcludeAssets>`**.

```xml
<Project Sdk="Microsoft.NET.Sdk">

	<PropertyGroup>
		<EnableDynamicLoading>true</EnableDynamicLoading>
		<TargetFramework>net6.0</TargetFramework>
		<ImplicitUsings>enable</ImplicitUsings>
		<Nullable>enable</Nullable>
	</PropertyGroup>

	<ItemGroup>
		<PackageReference Include="perigee" Version="24.6.1.1">
			<Private>false</Private>
			<ExcludeAssets>runtime</ExcludeAssets>
		</PackageReference>
	</ItemGroup>

</Project>
```

That's it! You've created a new DLL project that, when built, produces a `plugin.dll` that Transforms is able to hot reload and run dynamically at runtime.

## Authoring the plugin

The plugin can contain many Transformation processes. Each process is defined by a class with a processing method, and an attribute. Here's what a new process for **`NewlineRemover`** looks like:

```csharp
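//Remove all newlines from string columns. They aren't allowed anywhere
//Note: the first argument (Active) is false here, so set it to true when the process should be loaded and run
[TransformationProcess(false, -99, "Whitespace Remover", "cleanup", IsPreTransform = true)]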
public class TR_NewlineRemover : ITransformationProcessTable
{
    public void ProcessTable(TransformDataContext data)
    {
        //Process
    }
}
```

### Attribute

The <mark style="color:orange;">**`[attribute]`**</mark> tells the system a few important things. In the order shown above, they are:

1. <mark style="color:purple;">**Active?**</mark> - Should the plugin loader use this plugin? Set it to `false` while the plugin is in development or otherwise unavailable.
2. <mark style="color:purple;">**SortOrder (int)**</mark> - When multiple steps are defined and activated, in which order (ascending) do they run?
3. <mark style="color:purple;">**Name**</mark> - What name is this plugin given? Although it may not be shown anywhere immediately when running locally, the name is used for debugging and shown in certain log messages.
4. <mark style="color:red;">**Partition Keys**</mark> - This is a very important field to fill out. It specifies which files (partitions) this process runs for, so you can restrict a process to certain types of files. You may provide multiple keys in a comma-separated list like so: <mark style="color:green;">`"yardi, finance, FinanceFileA"`</mark>. A key list is one of the following:
   * Blank, <mark style="color:red;">**""**</mark>, which means the process can always run.
   * The [DataTableName (TransformGroup)](https://docs.perigee.software/transforms/the-mapping-document#transformgroup), which can be selected automatically when running that specific map.
   * A generic key (like `yardi`, `custom`, `finance`, etc.), which you can select during the transform process by specifying the keys you'd like to run. See the [MapTo](https://docs.perigee.software/transform-sdk/mapto) section for more info on running with partition keys.
5. <mark style="color:purple;">**IsPreTransform (false|true)**</mark> - This is typically `true`, meaning this process is run before the transformation occurs. If you're writing a process to modify the transformed results, then set this to `false`.
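
To make the interplay of **Active**, **SortOrder**, and **Partition Keys** concrete, here is a minimal sketch of the selection rules as described above. It is not Perigee's loader code: the tuple list, the `Matches` helper, and the case-insensitive key comparison are all assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for registered processes: (Active, SortOrder, Name, Partition Keys).
var registered = new[]
{
    (Active: true,  Order: 10,  Name: "Finance rollup",     Keys: "yardi, finance, FinanceFileA"),
    (Active: true,  Order: -99, Name: "Whitespace Remover", Keys: "cleanup"),
    (Active: false, Order: 0,   Name: "In development",     Keys: ""),
    (Active: true,  Order: 5,   Name: "Always on",          Keys: ""),
};

// Assumed rule: a blank key list always matches; otherwise any single key must be requested.
bool Matches(string partitionKeys, ISet<string> requestedKeys)
{
    var keys = partitionKeys.Split(',',
        StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
    return keys.Length == 0 || keys.Any(k => requestedKeys.Contains(k));
}

// Keys requested for this run, e.g. the map's DataTableName plus a generic key.
// Case-insensitive matching is an assumption of this sketch.
var requested = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "FinanceFileA", "cleanup" };

// Keep active, matching processes and run them in ascending sort order.
var toRun = registered
    .Where(p => p.Active && Matches(p.Keys, requested))
    .OrderBy(p => p.Order)
    .Select(p => p.Name)
    .ToList();

Console.WriteLine(string.Join(" -> ", toRun));
// prints "Whitespace Remover -> Always on -> Finance rollup"
```

The inactive process is skipped even though its blank key list would otherwise always match, and the negative sort order puts the cleanup step first.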

### Interface

The <mark style="color:orange;">**`ITransformationProcessTable`**</mark> interface gives the method all of the required data it needs to process the file.

Here's a quick example of the powerful toolset provided by this interface. This checks every string column and removes any newline characters from it. If any are found, it will add a `ProcessExecution` type line item to the transformation report.

```csharp
//Remove all newlines from string columns. They aren't allowed anywhere
[TransformationProcess(false, -99, "Whitespace Remover", "cleanup", IsPreTransform = true)]
public class TR_NewlineRemover : ITransformationProcessTable
{
    public void ProcessTable(TransformDataContext data)
    {
        var strCols = data.ColumnsOfType(typeof(string));

        data.ProcessRows(() => strCols.Any(), null, (row, indx) =>
            data.EachColumn<string>(row, strCols, (name, str, n) =>
            {
                if (str.IndexOfAny(new char[] { '\r', '\n' }) != -1)
                {
                    row[name] = str.Replace("\n", "").Replace("\r", "");
                    data.report.TransformationLines.Add(new TransformationReport.TransformationLine()
                    {
                        Column = name,
                        Message = "Removed newlines from source value",
                        RowIndex = indx,
                        Severity = TransformationReport.TransformationItemSeverity.Warning,
                        Type = TransformationReport.TransformationItemType.ProcessExecution,
                        SourceValue = str,
                        DataObjectID = data.map.DataObjectID
                    });
                }
            }, true));

    }
}
```
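
If you want to try the cleanup logic without the SDK (in a unit test, for example), the same newline check can be sketched against a plain `System.Data.DataTable`. This is a stand-alone approximation, not the plugin itself: the sample table, the `Notes` column, and the `cleaned` counter are invented for the illustration, and no transformation report is written.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical sample data, including a null value and a non-string column.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Notes", typeof(string));
table.Rows.Add(1, "line one\r\nline two");
table.Rows.Add(2, "clean");
table.Rows.Add(3, DBNull.Value);

// Only string columns are candidates for cleanup.
var strCols = table.Columns.Cast<DataColumn>()
    .Where(c => c.DataType == typeof(string))
    .ToList();

int cleaned = 0;
foreach (DataRow row in table.Rows)
{
    foreach (var col in strCols)
    {
        // The pattern match skips DBNull values, which are not strings.
        if (row[col] is string s && s.IndexOfAny(new[] { '\r', '\n' }) != -1)
        {
            row[col] = s.Replace("\n", "").Replace("\r", "");
            cleaned++;
        }
    }
}

Console.WriteLine(cleaned + " value(s) cleaned"); // prints "1 value(s) cleaned"
```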

## SDK

To see all of the available methods, properties, and helpers, check out the SDK page:

{% content-ref url="../sdk-reference/transformdatacontext" %}
[transformdatacontext](https://docs.perigee.software/transform-sdk/sdk-reference/transformdatacontext)
{% endcontent-ref %}

## Running Transform Process Manually (SDK)

If you're running transform processes manually, make sure to restrict them to the proper pre- or post-transform type. Get the map and the data, then iterate over the modules like so:

```csharp
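// Source data to process (populate this with your loaded rows)
var sourceData = new DataTable();
// Take the first map found in the mapping document
var mapSpec = Transformer.GetMapFromFile("map.xlsx", out var mrpt).FirstOrDefault();
var report = new TransformationReport() { Name = "Custom report" };

// Select the active post-transform table processes, in ascending sort order.
// To run the pre-transform processes instead, filter on IsPreTransform == true.
foreach (var trProcess in TransformationProcess.GetModules()
    .Where(f => f.transformAttribute.Active && f.transformAttribute.IsPreTransform == false && f.TableType != null)
    .OrderBy(f => f.transformAttribute.Order)) {

    // Build a context for this process and run it against the table
    var tdc = new TransformDataContext(null, sourceData, mapSpec, report, false);
    tdc.attribute = trProcess.transformAttribute;
    trProcess.RunTableInstance(trProcess.TableInstance, tdc);
}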
```

## Installation in Client App

If you created a `plugin.dll` project, compile the project and drop the `.dll` into the <mark style="color:purple;">**`Plugins/Transform`**</mark> folder.

If you wrote the process in the same project that you're running, the plugin loader automatically scans the assembly, and the plugin is available for use.
