🦾The Mapping Document
Mapping is an integral part of any data transformation process. It allows you to specify a target format with data restrictions like length, precision, requirements, and more.
The mapping document also allows you to specify certain rules on how that target field is created. Is it simply a Copy from the source? Is it Defaulted to a value? Do you Map it to another lookup value?
Let's take a look at the mapping document and how easy it easy to get started
A Demo Student Map
Resources
Starting with a new map
Let's ask Transforms to generate us a map of a student
Here's the map, unmodified, from the site:
Using the student data that we have, we need to make some tweaks to this map:
Let's add the Major column, as it's missing
We'd like to null out (or make empty) the DateOfBirth, as our target doesn't need that information.
To do that:
Copy the LastName or FirstName row, as it's close to the data type and information we need.
Change the TargetFieldName to Major.
Change the Ordinal to 6.
Change the TransformType of DateOfBirth to EMPTY.
And just like that, we're ready produce a valid Student file! Let's use Transforms again, but this time, let's take our map AND our sample data from above and let Transforms auto assign the source fields to the destination fields.
Head over to Prepare -> AutoMap. Simply drop your data file, and your map, and re-download the mapping document.
And just like that, we've got ourselves a fully ready to use map! In this case, all of the source fields perfectly matched the target fields. So it would be easy for us to just copy those over. However, in many live scenarios this won't be the case. The AutoMap process uses fuzzy logic to intelligently assign source columns from your input data to try to match the TargetFieldName from the map.
Specification
TransformGroup
This is the ETL name or "Transform Group Name". It's a unique specifier to tell the system what the name of this map is.
SourceFieldName
This is the column name of the source data (the data that is to be processed by Transforms). It should identically match the input data column name.
TargetFieldName
This is the name of the column in the final transformed file. This should match what your database, target system, or specification has for the name.
Ordinal
This tells the map at what location (left to right, 0-n) should this column be.
TransformType
The following types can be used in the TransformType
column
The sequence and rank types allow you to sort the file while supplying the ordering.
This step is a post process step, so the SourceField
column of the map is a list of TargetFields
.
MapKey
This key is required when the TransformType
is set to MAP. This tells the system which subset of lookup values to pass your source data through.
Taking a look at the example lookups file in the table below will show what exactly this MapKey
is tied to:
DefaultValue
The default value is used when the TransformType
is set to DEFAULT. There is no transformation applied to this field, the value is simple "copied" over.
DBDataType
This is the datatype used for this field. It uses MS-SQL data types to define the input types.
DBDataSize
This is the max length for strings, and precision for decimals.
DBDataScale
This only apples when a scale data type is used, like a decimal. This is the scale.
IsDevField
This field is only assigned to true when we specify it's required for a specialized data quality check, or transformtion process. Behind the scenes, it creates a copy of that input column and allows the transformation engine to access the original, unmodified cell data during the process.
Most of time this field should be ignored and kept to "no"
Active
This field indicates that it is active to be used. If set to false, the field will be ignored
IsRequired
Transformations occur regardless of missing fields or not. What this changes however is how the transformation report indicates issues, and their severity. Missing "required" columns generate a higher severity line in the transformation report than "non required" columns.
Header
When loading into a system like Yardi Voyager, you'll need the CSV Topline above the header rows. This column indicates what that line should be if the CSV topline is written.
ExportFileName
If the file has a specific name to be assigned on export, you can supply that here. The output file will be given this name if it exists.
DataObjectID
This is a unique ID that should specifically identify the map. It can be used during the auto identification process when it is unclear what type of input file was given to the system.
Last updated