🦾The Mapping Document
Mapping is an integral part of any data transformation process. It allows you to specify a target format with data restrictions like length, precision, requirements, and more.
The mapping document also allows you to specify certain rules on how that target field is created. Is it simply a Copy from the source? Is it Defaulted to a value? Do you Map it to another lookup value?
Let's take a look at the mapping document and how easy it is to get started.
Let's ask Transforms to generate a map of a student for us.
This generation method lets us take an idea or concept for a map as a starting point and have AI generate the map for us. This is one of many ways to create a map when you don't already have one. 👍
Here's the map, unmodified, from the site:
Using the student data that we have, we need to make some tweaks to this map:
Let's add the Major column, as it's missing
We'd like to null out (or make empty) the DateOfBirth, as our target doesn't need that information.
To do that (a scripted sketch of these edits follows the steps below):
Copy the LastName or FirstName row, as it's close to the data type and information we need.
Change the TargetFieldName to Major.
Change the Ordinal to 6.
Change the TransformType of DateOfBirth to EMPTY.
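In practice these are just row edits in the mapping document itself. As a rough illustration only, here's how the same tweaks could be scripted with pandas, assuming the map was exported as `student_map.csv` (the file name is an assumption; TargetFieldName, Ordinal, and TransformType are the column names used on this page):

```python
import pandas as pd

# Load the generated mapping document (file name is an assumption).
map_df = pd.read_csv("student_map.csv")

# Copy the LastName row as a template for the new Major column.
major_row = map_df[map_df["TargetFieldName"] == "LastName"].iloc[0].copy()
major_row["TargetFieldName"] = "Major"   # point the copy at the new target field
major_row["Ordinal"] = 6                 # place it at position 6
map_df = pd.concat([map_df, major_row.to_frame().T], ignore_index=True)

# Null out DateOfBirth by switching its TransformType to EMPTY.
map_df.loc[map_df["TargetFieldName"] == "DateOfBirth", "TransformType"] = "EMPTY"

map_df.to_csv("student_map.csv", index=False)
```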
And just like that, we're ready to produce a valid Student file! Let's use Transforms again, but this time let's take our map AND our sample data from above and let Transforms auto-assign the source fields to the destination fields.
Head over to Prepare -> AutoMap. Simply drop in your data file and your map, then re-download the mapping document.
And just like that, we've got ourselves a fully ready-to-use map! In this case, all of the source fields perfectly matched the target fields, so it would be easy for us to just copy those over. However, in many live scenarios this won't be the case. The AutoMap process uses fuzzy logic to intelligently assign source columns from your input data, trying to match them to the TargetFieldName values in the map.
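Transforms handles this matching for you, but conceptually it's similar to fuzzy string matching between the input headers and the map's TargetFieldName values. Here's a minimal illustration using Python's standard library (not the actual AutoMap algorithm; the sample column names are made up):

```python
import difflib

# Headers from a sample student file vs. target fields from the map.
source_columns = ["first_name", "last_name", "DOB", "student_major"]
target_fields = ["FirstName", "LastName", "DateOfBirth", "Major"]

def normalize(name):
    """Lower-case and strip separators so 'first_name' and 'FirstName' compare cleanly."""
    return name.lower().replace("_", "").replace(" ", "")

def best_source_for(target, candidates):
    """Pick the source column whose normalized name is most similar to the target field."""
    scored = [
        (difflib.SequenceMatcher(None, normalize(target), normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, column = max(scored)
    return column if score >= 0.4 else None  # threshold is arbitrary for this sketch

for target in target_fields:
    print(f"{target} <- {best_source_for(target, source_columns)}")
```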
This is the ETL name or "Transform Group Name". It's a unique specifier that tells the system the name of this map.
This is the column name of the source data (the data that is to be processed by Transforms). It should identically match the input data column name.
This is the name of the column in the final transformed file. This should match what your database, target system, or specification has for the name.
This tells the map at what position (left to right, 0-n) this column should appear.
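Put together, a couple of rows of the mapping document might look like the sketch below. Only TargetFieldName, Ordinal, TransformType, and SourceField are names taken from this page; the other keys and values are illustrative placeholders, so use whatever your generated map actually contains.

```python
# A hypothetical slice of a student mapping document, represented as rows.
student_map_rows = [
    {"TransformGroupName": "Student", "SourceField": "first_name",
     "TargetFieldName": "FirstName", "Ordinal": 0, "TransformType": "COPY"},
    {"TransformGroupName": "Student", "SourceField": "last_name",
     "TargetFieldName": "LastName", "Ordinal": 1, "TransformType": "COPY"},
]
```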
The following types can be used in the TransformType column:
TransformType | Description |
---|---|
COPY | This copies the data from the assigned source field. It's still held to the length and precision requirements defined. |
EMPTY | An empty value; no operation is taken on this field. |
DEFAULT | This assigns the default value defined in the map (described below) to every row. |
MAP | A special type that, when a lookup file is supplied, uses the MapKey (described below) to translate source values into target values. |
SEQ | A sequence of numbers, starting from 1. Because ordering is done post-process, you can supply a list of TargetFields in the SourceField column to sort by (see below). |
RANK | A dense rank sequence of numbers, starting from 1. Because ordering is done post-process, you can supply a list of TargetFields in the SourceField column to sort by (see below). |
GUID | Every row will have a newly generated GUID in it. |
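To make the simpler types concrete, here's a rough sketch of what each one does to a single output cell (illustrative only, not the engine's actual implementation):

```python
import uuid

def transform_cell(transform_type, source_value=None, default_value=None):
    """Illustrative-only behavior of the row-level TransformTypes for a single cell."""
    if transform_type == "COPY":
        return source_value        # copy straight from the assigned source field
    if transform_type == "EMPTY":
        return ""                  # no operation; the cell is left empty
    if transform_type == "DEFAULT":
        return default_value       # use the default value defined in the map
    if transform_type == "GUID":
        return str(uuid.uuid4())   # a newly generated GUID for every row
    raise ValueError(f"{transform_type} is handled elsewhere (MAP, SEQ, RANK)")

print(transform_cell("COPY", source_value="Smith"))    # -> Smith
print(transform_cell("DEFAULT", default_value="N/A"))  # -> N/A
```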
The SEQ and RANK types allow you to sort the file while supplying the ordering. This is a post-process step, so for these types the SourceField column of the map holds a list of TargetFields to sort by.
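As a rough sketch of that behavior (the sort key and data below are made up): the output is first ordered by the listed TargetFields, then SEQ numbers the rows 1..n while RANK gives rows with equal sort keys the same dense rank.

```python
# Hypothetical output rows, ordered by the TargetFields listed in the
# map's SourceField column for the SEQ/RANK field (here: LastName).
rows = [
    {"LastName": "Smith"},
    {"LastName": "Adams"},
    {"LastName": "Smith"},
]

ordered = sorted(rows, key=lambda r: r["LastName"])

# SEQ: a plain 1..n sequence after sorting.
for seq, row in enumerate(ordered, start=1):
    row["Seq"] = seq

# RANK: a dense rank -- equal sort keys share the same number.
rank, previous_key = 0, object()
for row in ordered:
    if row["LastName"] != previous_key:
        rank += 1
        previous_key = row["LastName"]
    row["Rank"] = rank

print(ordered)
# [{'LastName': 'Adams', 'Seq': 1, 'Rank': 1},
#  {'LastName': 'Smith', 'Seq': 2, 'Rank': 2},
#  {'LastName': 'Smith', 'Seq': 3, 'Rank': 2}]
```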
This key is required when the TransformType is set to MAP. It tells the system which subset of lookup values to pass your source data through. The example lookups file in the table below shows exactly what this MapKey is tied to; a small sketch of how the subset is applied follows the table:
MapKey | SourceField | TargetField |
---|---|---|
StateCode | AZ | Arizona |
StateCode | NC | North Carolina |
PropertyCode | A12 | AB001200 |
PropertyCode | Z19 | AB001900 |
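As a hedged sketch (not the engine itself), a MAP transform with MapKey = StateCode would only consider the StateCode rows of that lookup file when translating a source value:

```python
import csv
from io import StringIO

# The example lookups file from the table above.
lookups_csv = """MapKey,SourceField,TargetField
StateCode,AZ,Arizona
StateCode,NC,North Carolina
PropertyCode,A12,AB001200
PropertyCode,Z19,AB001900
"""

def build_lookup(lookup_rows, map_key):
    """Keep only the rows for the given MapKey and index them by SourceField."""
    return {
        row["SourceField"]: row["TargetField"]
        for row in lookup_rows
        if row["MapKey"] == map_key
    }

rows = list(csv.DictReader(StringIO(lookups_csv)))
state_lookup = build_lookup(rows, "StateCode")
print(state_lookup.get("NC"))   # -> North Carolina
print(state_lookup.get("A12"))  # -> None (belongs to the PropertyCode subset)
```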
The default value is used when the TransformType is set to DEFAULT. No transformation is applied to this field; the value is simply "copied" over.
This is the datatype used for this field. It uses MS-SQL data types to define the input types.
This is the max length for strings, and precision for decimals.
This only applies when a data type with scale is used, like a decimal. This is the scale; for example, a decimal(18,2) has a length (precision) of 18 and a scale of 2.
This field is only set to true when we specify it's required for a specialized data quality check or transformation process. Behind the scenes, it creates a copy of that input column and allows the transformation engine to access the original, unmodified cell data during the process.
Most of the time this field should be ignored and left set to "no".
This field indicates whether the field is active and should be used. If set to false, the field will be ignored.
Transformations occur whether fields are missing or not. What this changes, however, is how the transformation report indicates issues and their severity: missing "required" columns generate a higher-severity line in the transformation report than "non-required" columns.
When loading into a system like Yardi Voyager, you'll need the CSV Topline above the header rows. This column indicates what that line should be if the CSV topline is written.
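For illustration only (the actual topline content comes from your target system's spec, and the values below are hypothetical), writing such a file just means emitting the topline before the header row:

```python
import csv

# Hypothetical topline, header, and data; the real values come from your map and target spec.
topline = ["ETL_Student", "", ""]
header = ["FirstName", "LastName", "Major"]
rows = [["Ada", "Lovelace", "Mathematics"]]

with open("students_export.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(topline)   # CSV topline written above the header row
    writer.writerow(header)
    writer.writerows(rows)
```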
If the file has a specific name to be assigned on export, you can supply that here. The output file will be given this name if it exists.
This is a unique ID that specifically identifies the map. It can be used during the auto-identification process when it is unclear what type of input file was given to the system.