🥳 MapTo

Let's take a look at the .MapTo method (the primary transform method) and understand all of the available options. You can customize this transformation down to every single column or cell that gets read.

MapTo works by starting from a DataTable, sending the data through a map, and producing reports and target results. Every option here assists in that process and allows you to customize exactly what happens at every single cell.

Example 1 - Just a map

It is possible to execute a transform with only a map and source data. If that is all you need, it takes a single line of code:

var result = new DataTable().MapTo(
    Transformer.GetMapFromFile("map.xlsx", out var mreport).FirstOrDefault());

Example 2 - Maps and a lookup

If your map contains lookups and you need to reference them, include them in the list of parameters.

var result = new DataTable().MapTo(
    Transformer.GetMapFromFile("map.xlsx", out var mreport).FirstOrDefault(),
    
    Transformer.ToGroupHash(Transformer.GetLookupFromFile("lookups.csv", out var lreport)));

The ToGroupHash function takes care of producing a valid hashed lookup table for the transformer to work on. Before this step, the raw LookupMap objects are easy to read and modify.
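As a rough illustration of what a grouped, hashed lookup gives you, the sketch below builds one from raw rows using only standard .NET collections. It is not the library's ToGroupHash implementation: the (Group, Key, Value) row shape and all sample values are assumptions made for this example. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical raw lookup rows: (lookup group, source key, target value).
var rawLookups = new List<(string Group, string Key, string Value)>
{
    ("Country", "US", "United States"),
    ("Country", "DE", "Germany"),
    ("Status",  "A",  "Active"),
};

// Raw rows are easy to edit before hashing, e.g. adding an entry.
rawLookups.Add(("Status", "I", "Inactive"));

// Group by lookup name, then hash each group's keys for constant-time access.
Dictionary<string, Dictionary<string, string>> grouped = rawLookups
    .GroupBy(r => r.Group)
    .ToDictionary(
        g => g.Key,
        g => g.ToDictionary(r => r.Key, r => r.Value));

// Resolve a value: pick the group, then the key inside it.
string country = grouped["Country"]["DE"];
string status = grouped["Status"].TryGetValue("X", out var found) ? found : "(no match)";

Console.WriteLine(country); // Germany
Console.WriteLine(status);  // (no match)
```

The point of the two-level dictionary is that each lookup group is resolved independently, so the same source key can map to different values in different groups.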

Example 3 - Adding conversion functions

Transforms lets you customize the conversion process in many different ways. Here are two of them that are "in place" and don't involve writing Transform Processes.

This will convert colID 1 from the mapping specification to an uppercase string whenever possible.

//Define custom transforms: 
//  - Transform colID 1 ToUpper
var customConversions = new Dictionary<int, Func<object, object, DataRow, MappingColumn, long, object>> {
        { 1, (value, defaultValue, row, map, index) => value?.ToString()?.ToUpper() ?? defaultValue }
    };
    
var result = new DataTable().MapTo(
    Transformer.GetMapFromFile("map.xlsx", out var mreport).FirstOrDefault(),
    Transformer.ToGroupHash(Transformer.GetLookupFromFile("lookups.csv", out var lreport)), 
    
    CustomConversions: customConversions);

Custom Conversion functions allow you to modify the incoming data however you like. Internally these functions are used to convert the data when no registered converter is assigned for the given colID of the map. Values passed into the function are:

  • The value being converted

  • The default value of the column (if one exists)

  • The DataRow being loaded

  • The MappingColumn from the mapping specification that is currently being processed

  • The row index, as a long
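To see how those five arguments behave, here is a small self-contained sketch that calls a conversion function directly, outside of MapTo. MappingColumn is not available in a standalone snippet, so a plain object stands in for it (an assumption for this example only), and the table and values are made up. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;

// Same shape as the dictionary value above, with `object` standing in
// for MappingColumn (hypothetical substitution for this sketch).
Func<object, object, DataRow, object, long, object> toUpper =
    (value, defaultValue, row, map, index) =>
        value?.ToString()?.ToUpper() ?? defaultValue;

// A throwaway row so the DataRow argument has something real to point at.
var table = new DataTable("Customers");
table.Columns.Add("Name", typeof(string));
DataRow sampleRow = table.NewRow();
sampleRow["Name"] = "bob";

// A present value is upper-cased.
object upper = toUpper(sampleRow["Name"], "UNKNOWN", sampleRow, null, 0L);

// A null value falls through to the column default.
object fallback = toUpper(null, "UNKNOWN", sampleRow, null, 1L);

Console.WriteLine(upper);    // BOB
Console.WriteLine(fallback); // UNKNOWN
```

One thing to watch: an empty DataRow cell holds DBNull.Value rather than null, and DBNull.Value.ToString() returns an empty string, so this particular lambda would return "" instead of the default for such cells unless you check for DBNull explicitly.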

Example 4 - OnBeforeTableAdd

Maybe you need to statically set a value or calculate something before the row proceeds to the plugin processes. OnBeforeTableAdd lets you modify the DataRow one last time before it is added to the resulting table.

//Define custom transforms: 
//  - Transform colID 1 ToUpper
var customConversions = new Dictionary<int, Func<object, object, DataRow, MappingColumn, long, object>> {
        { 1, (value, defaultValue, row, map, index) => value?.ToString()?.ToUpper() ?? defaultValue }
    };
    
var result = new DataTable().MapTo(
    Transformer.GetMapFromFile("map.xlsx", out var mreport).FirstOrDefault(),
    Transformer.ToGroupHash(Transformer.GetLookupFromFile("lookups.csv", out var lreport)), 
    CustomConversions: customConversions,
    
    OnBeforeTableAdd: row => { row["Status"] = "New"; });
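
The hook can be pictured as a callback that runs on each finished row just before it joins the result table. The sketch below imitates that with a plain Action<DataRow>; the delegate type, the loop, and the column names are assumptions for illustration, not the engine's internals. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;

// Hypothetical result table with a column the source data never fills.
var table = new DataTable("Orders");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Status", typeof(string));

// The same lambda body used in the MapTo call above.
Action<DataRow> onBeforeTableAdd = r => { r["Status"] = "New"; };

foreach (var id in new[] { 1, 2 })
{
    DataRow newRow = table.NewRow();
    newRow["Id"] = id;          // stands in for the mapped/converted values

    onBeforeTableAdd(newRow);   // last chance to change the row
    table.Rows.Add(newRow);     // the row is now part of the result
}

Console.WriteLine(table.Rows.Count);        // 2
Console.WriteLine(table.Rows[1]["Status"]); // New
```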

Example 5 - Plugins

One of the most important and powerful features of the transform engine is the ability to run plugins. Both DataQuality and Transform plugins are baked right into the transform process, and you can specify which plugins, as well as which partitions of those plugins, to run.

The AutoAddMapToPartitionKeys option tells the transformer to automatically assign the DataTable (Transform Group) name to the list of assigned partitions. This is useful when the partition key contains specific map transform groups.

Partition Keys - These specify which files (partitions) the data quality and transform processes run under. This ties directly to the partition keys described in the Authoring Plugins section: partitions are assigned at the class and attribute level of the defined checks, and they must match what is assigned here.

If a process is authored with a blank partition key, including a blank key here will also run that process.

  • Blank "" values are typically used for "global" processes.

  • Friendly key names, like "finance", may be used to run certain processes only when handling financial files.

  • The DataTable (Transform Group) name can be put in the comma separated list of partitions when authoring plugins. This is how AutoAddMapToPartitionKeys is able to automatically add the partition keys to the registered keys and run additional processes.
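
The matching rules above can be sketched with standard collections. This is only a model of the behavior as described on this page: the process names are hypothetical, and whether the engine trims keys or compares them case-sensitively is not confirmed here. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical processes and the partition keys they were authored with
// (comma-separated, as described above).
var processes = new Dictionary<string, string>
{
    ["TrimWhitespace"]  = "",          // blank = "global"
    ["ValidateLedger"]  = "finance",   // friendly key
    ["CustomerCleanup"] = "Customers", // a DataTable (Transform Group) name
};

string tableName = "Customers";

// Returns the names of the processes that would run for the given keys.
List<string> SelectProcesses(IEnumerable<string> partitionKeys, bool autoAddMapToPartitionKeys)
{
    var keys = new HashSet<string>(partitionKeys);
    if (autoAddMapToPartitionKeys)
        keys.Add(tableName); // the table name becomes one more key

    return processes
        .Where(p => p.Value.Split(',').Any(k => keys.Contains(k.Trim())))
        .Select(p => p.Key)
        .OrderBy(name => name, StringComparer.Ordinal)
        .ToList();
}

var requested = new List<string> { "general", "" };
var withAutoAdd = SelectProcesses(requested, true);
var withoutAutoAdd = SelectProcesses(requested, false);

Console.WriteLine(string.Join(", ", withAutoAdd));    // CustomerCleanup, TrimWhitespace
Console.WriteLine(string.Join(", ", withoutAutoAdd)); // TrimWhitespace
```

In this model the blank key selects the global process in both runs, while the table-specific process is picked up only when the table name is added to the requested keys.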

//Define custom transforms: 
//  - Transform colID 1 ToUpper
var customConversions = new Dictionary<int, Func<object, object, DataRow, MappingColumn, long, object>> {
        { 1, (value, defaultValue, row, map, index) => value?.ToString()?.ToUpper() ?? defaultValue }
    };
    
var result = new DataTable().MapTo(
    Transformer.GetMapFromFile("map.xlsx", out var mreport).FirstOrDefault(),
    Transformer.ToGroupHash(Transformer.GetLookupFromFile("lookups.csv", out var lreport)), 
    CustomConversions: customConversions,
    OnBeforeTableAdd: row => { row["Status"] = "New"; },

    DataQualityModules: DataQuality.GetModules(),
    DataQualityPartitionKeys: new List<string>() { "general", "" },
    TransformProcessModules: TransformationProcess.GetModules(),
    TransformProcessPartitionKeys: new List<string>() { "general", "" },
    AutoAddMapToPartitionKeys: true);

What about File IO Plugins?

To run File IO Processes, see the plugin authoring page, as they run before any of the MapTo processes start. You'll need to decide manually which File IO Processes to run, as they can drastically change the data being processed.
