Page cover image

📰Excel Quick Load

From Data To Analyst In a Few Clicks

Data Load

Need to take your Excel data and load it into your database to get the analysts started?

Instead of taking weeks to develop a new data pipeline, create structure, and work with the business to load your new client data in, just drop the file in Perigee. We will take care of everything from finding and getting the data loaded, to creating a table in only a few seconds. You’ll be a hero!

The Data

Here's a typical messed up Excel file where data is shifted and there are unwanted headers.

Excel window showing misaligned data

The Code

using Microsoft.Extensions.Logging;
using Perigee;
using Perigee.Database.MSSQL;
using Perigee.FileFormats.Excel;
using System.Data;

// Visit https://docs.perigee.software to learn more
// Visit https://perigee.software to purchase a license!
PerigeeApplication.ApplicationNoInit("Quick Data Loads", (c) => {

    //Register
    c.RegisterConnectionString("MSSQL", "main");

    //Read
    using MemoryStream ms = new MemoryStream(File.ReadAllBytes("SampleExcel.xlsx"));
    var dtex = ExcelReader.ReadDataTable(ms, "UserData", out var ddt, true, false, tablename: "clientData", schema: "dbo");

    //Print
    var stats = ddt.GetStatistics();
    c.GetLogger<Program>().LogInformation("Loaded {r} rows, header is on row {n}. Column names: {@c}. Fill Rate %: {Fill:N2}", dtex.Rows.Count, ddt.HeaderRow + 1, dtex.Columns.OfType<DataColumn>().Select(f => f.ColumnName).ToArray(), (stats.FillRate?.Average(f => f.Value) ?? 0.0m) * 100.0m);
    foreach (var col in stats?.FillRate?.Reverse() ?? Array.Empty<KeyValuePair<uint, decimal>>())
    {
        c.GetLogger<Program>().LogInformation("{Column} fill rate %: {fr}", stats!.ColumnNames[col.Key], col.Value * 100.0m);
    }

    //Load, create or drop.
    MSSQLProcessUtilities.ProcessToSQL(dtex, "MSSQL", true, new SQLTableColumnDefaults() { StringType = SqlDbType.NVarChar }, DropTableIfNotExists: true);
    
});

After running the application, you'll see the log showing how many rows it found, where the header was since it wasn't on the top row, and the column's detected. It even prints out the fill rates per column!

Results, showing statistics

Taking a peak at our database, we see the data loaded into our new table dbo.clientData:

Which came directly from the Excel file linked below

The associated files:

The appsettings.json file:

{
  "ConnectionStrings": {
    "main": "data source=host; initial catalog=test; User Id=sa; Password=abc"
  },
  "AppSettings": {
  
  },
  "Perigee": { "HideConsole": false },
  "Serilog": {
    "MinimumLevel": "Debug",
    "WriteTo": [
      { "Name": "Console" }
    ]
  }
}

The SampleExcel file we loaded:

Last updated