🈁Bit Serializer

Perigee's proprietary serialization tool, Bit, is a high-performance, high-compression module for serializing and deserializing data. It is comparable to Avro or Protobuf in function but is built on its own algorithm and written entirely in C#. Several internal Perigee modules depend on it for scenarios that demand exceptional performance and minimal disk usage.

Key Features of Bit include:

  • Serialization and deserialization of classes, structs, lists, datatables, interfaced types, and properties, all without special "tagging" or pre-work.

  • Support for reserialization and "patching" to update data without full rewrites.

    • If the type can be serialized, you can easily "deep clone" your object tree.

  • Capability for partial deserialization, useful for adapting to revised data maps or focusing on specific properties.

  • Built-in compression to minimize data size.

  • Facilities for map reading and generation, enabling dynamic data structuring.

  • Tools to generate C# classes and schemas directly from data maps.

  • Very high performance serialization.

  • Attributes for serialization events.

  • Utilities for reference type patching.

How Bit Works

Bit optimizes data storage by creating a map for each property, determining the necessity of writing each field based on defaults and nullability. This selective approach, combined with the efficient compression of values, significantly reduces the size of the output byte array. In practice, this can compress lists of objects by over 95% compared to their JSON document equivalents.
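As a rough illustration of the idea, the following is a minimal C# sketch of a presence-flag encoding, where one flag byte records which fields differ from their defaults and only those fields are written. It is not Bit's actual format or algorithm, which this page does not specify; SketchPerson, PresenceMapSketch, and the field layout are hypothetical names invented for the example.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical record used only for this illustration.
public sealed class SketchPerson
{
    public string? Name { get; set; }   // default: null
    public ushort Age { get; set; }     // default: 0
    public byte Kind { get; set; }      // default: 0
}

public static class PresenceMapSketch
{
    // Writes one flag byte (one bit per field), then only the non-default fields.
    public static byte[] Encode(SketchPerson p)
    {
        byte flags = 0;
        if (p.Name != null) flags |= 1;
        if (p.Age != 0) flags |= 2;
        if (p.Kind != 0) flags |= 4;

        using var ms = new MemoryStream();
        using (var w = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            w.Write(flags);
            if ((flags & 1) != 0) w.Write(p.Name!);   // length-prefixed UTF-8 string
            if ((flags & 2) != 0) w.Write(p.Age);     // 2 bytes
            if ((flags & 4) != 0) w.Write(p.Kind);    // 1 byte
        }
        return ms.ToArray();
    }

    // Reads the flag byte, then only the fields the flags say were written.
    public static SketchPerson Decode(byte[] data)
    {
        using var r = new BinaryReader(new MemoryStream(data), Encoding.UTF8);
        byte flags = r.ReadByte();
        var p = new SketchPerson();
        if ((flags & 1) != 0) p.Name = r.ReadString();
        if ((flags & 2) != 0) p.Age = r.ReadUInt16();
        if ((flags & 4) != 0) p.Kind = r.ReadByte();
        return p;
    }
}
```

With this layout, an all-default SketchPerson encodes to a single flag byte, which is the kind of saving a per-property map makes possible before any value compression is applied.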

Why Choose Bit?

Unlike many high-compression algorithms that impose strict data input conditions and support limited data types, Bit offers more flexibility. It's fully recursive, supports a wide array of common data types (including custom classes, structs, lists, dictionaries, etc.), and handles data maps without cumbersome limitations.

It even handles interfaced types without any additional code. Just make sure the interfaced type has a parameterless constructor and all will be fine!

Maps and Resilience to Change

Bit's mapping system ensures data resilience over time. For instance, objects serialized with an initial map version (R0) can be seamlessly reserialized to a newer version (R1) as your data schema evolves. This built-in functionality simplifies the process of keeping serialized data up-to-date without the complexity often associated with such transformations.

Compression Ratios

Bit demonstrates exceptional efficiency in data storage, achieving compression ratios of over 90% in many cases, and even up to 98% when optimizing data types and attribute usage, far surpassing the compression achieved with minimized JSON.

An Example

Here's a simple example where we are serializing a single class that has a string, ushort, enum, and recursive list of related "bit people".

It's extremely easy to use and Bit handles all of the data types and mapping.

var bitPerson = new BitPerson()
{
    Name = "Bandit",
    Age = 42,
    PersonType = BitPersonType.adult,
    RelatedPeople = new List<BitPerson>()
    {
        new BitPerson() { Name = "Bingo", Age = 7, PersonType = BitPersonType.child }
    }
};

//Simple example without storing a map
byte[] ser = Bit2.Serialize(bitPerson);
BitPerson? des = Bit2.Deserialize<BitPerson>(ser);

Compression Comparison

If you take this example and compare the byte lengths of the JSON and Bit outputs, the difference is dramatic. The ratio improves further with larger documents, as Bit automatically tests for additional compression on larger objects.

// Pass the bitPerson instance (not the BitPerson type) to both serializers
decimal compSize = Bit2.SerializeHeadless(bitPerson).Length / (decimal)JsonConvert.SerializeObject(bitPerson).Length;

// 0.108 (roughly a tenth of the size!)

Documentation

Attributes

Ignore

The first attribute is Bit2Ignore. This tells the serializer to ignore this property and not serialize it. It is important to put this attribute on properties that are not necessary to store, or should be re-calculated after deserialization.

public class BitPerson
{
    public string Name { get; set; }

    [Bit2Ignore]
    public string ignoreExample { get; set; } = "";
}

Serialization Callbacks

There are also two serialization callback attributes: Bit2PreSerialize and Bit2PostDeserialize. These attributes allow you to specify a single method to be called before serialization and after deserialization, respectively.

These methods must be instance (non-static) methods; their names do not matter.

public class BitPerson
{
    public string Name { get; set; }

    [Bit2PreSerialize]
    public void PreInvoke()
    {
        //Called before serialization.
        //Null out values, or store additional information needed after deserialization
    }

    [Bit2PostDeserialize]
    public void PostDes()
    {
        //Called after deserialization. Do whatever you need here!
    }
}

PostDeserialize Example

In almost all cases, reference identity doesn't matter after deserialization. In the rare cases where it does, such as traversing and comparing a directed graph, shared references are very important. In this example, we fix that by:

  1. Implementing the post deserialize callback.

  2. Checking if our deserialized class held its comparer.

  3. If #2 is true, calling ReplaceReferenceTypes, which is a Bit2 utility method.

All of the adjacency list instances are now re-referenced. If you were to call object.ReferenceEquals(a, b) for two occurrences of the same node in this graph, it would return true!

/// <summary>
/// This allows a deserialized graph to "patch" reference types. 
/// </summary>
[Bit2PostDeserialize]
public void PatchReferenceTypes()
{
    if (Comparer == null) return;

    Bit2.ReplaceReferenceTypes<T>(Comparer,
        adjacencyList.Keys.ToList(),
        adjacencyList.Select(f => f.Value).ToArray());
}

Serialization Cache

For scenarios where you need extreme performance, you can achieve 70-250x faster compression times using the DeSerCache (many of our sample test cases took only 2-10 microseconds).

Creating a cache is easy. Simply supply an instantiated object and let the system pre-cache the object and create a BitMap to match. Then supply both to any serialization, clone, or deserialization calls you want.

var bitPerson = new BitPerson();

//Generate a cache and a map
var cd2 = Bit2.DeSerCacheFromObject(bitPerson);

//Serialize with the map and cache
var sk2 = Bit2.Serialize(bitPerson, cd2.map, 0, cd2.cache);

This is incredibly helpful if you're serializing many of the same types over and over again (like a real-time communication application or lots of data store requests).

Serialization / Deserialization

Deserialize

Deserializes a byte array into an instance of a class.

Example:

YourClass obj = Bit2.Deserialize<YourClass>(data, map);

DeserializeHeadless

Deserializes a byte array without a header into an instance of a class. Useful for data serialized without initial header bytes.

Example:

YourClass obj = Bit2.DeserializeHeadless<YourClass>(data);

SerializeHeadless

Serializes an object into a byte array without adding a header.

Example:

byte[] data = Bit2.SerializeHeadless(obj, version);

Serialize

Serializes an object into a byte array, including a header with a version.

Example:

byte[] data = Bit2.Serialize(obj, version);

Serialize with Map Bytes Output

Serializes an object into a byte array and outputs the bytes of the map used for serialization.

Example:

byte[] mapBytes;
byte[] data = Bit2.Serialize(obj, out mapBytes, version);

Reserialize

Reserializes data from one map version to another. It enables transferring data between different revisions of a map, handling changes like new fields, removed fields, or updated data types.

Example:

byte[] newData = Bit2.Reserialize<YourClass>(data, oldMap, newMap);

Reserialize with Modification Callback

Similar to Reserialize, but includes a BeforeReserialization callback allowing modification of the object just before reserialization.

Example:

byte[] newData = Bit2.Reserialize<YourClass>(data, oldMap, newMap, obj => {
    // Modify obj here
    return obj;
});

Patch

Deserializes an object, patches it using a provided callback function, and then reserializes it.

Example:

byte[] patchedData = Bit2.Patch<YourClass>(data, obj => {
    // Patch obj here
}, map);

Utilities

DeepClone

Deep clone an object by serializing and deserializing with the same mapping information.

Please note, this only works with properties that Bit2 is able to serialize. Please verify your data can be serialized properly before relying on DeepClone.

var bitPerson = new BitPerson()
{
    Name = "Bandit",
    Age = 42,
    PersonType = BitPersonType.adult,
    RelatedPeople = new List<BitPerson>()
    {
        new BitPerson() { Name = "Bingo", Age = 7, PersonType = BitPersonType.child }
    }
};

var anotherBandit = Bit2.DeepClone(bitPerson);
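To show why the caveat above matters, here is a minimal sketch of the general serialize-then-deserialize cloning technique. It deliberately swaps in System.Text.Json rather than Bit2, so it says nothing about Bit2's internals; ClonePerson and RoundTripCloneSketch are hypothetical names used only for this illustration.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical type used only for this illustration.
public sealed class ClonePerson
{
    public string Name { get; set; } = "";
    public List<ClonePerson> RelatedPeople { get; set; } = new();
}

public static class RoundTripCloneSketch
{
    // Clones by writing the object to bytes and reading a fresh instance back.
    // Only data the serializer can write and read survives the round trip.
    public static T? Clone<T>(T source)
    {
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(source);
        return JsonSerializer.Deserialize<T>(bytes);
    }
}
```

The clone is a new object tree that shares no references with the original, and anything the serializer skips is simply missing from the copy. That is the reason to verify your data serializes properly before relying on this style of cloning.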

ReplaceReferenceTypes

This allows you to supply an IEqualityComparer<T>, and either one list, an array of lists, or an initial list plus additional lists, to replace all reference types. It uses the comparer to identify equal instances and then replaces the duplicates with references to a single shared instance.

This is likely not necessary on most deserialized data. However, if it is necessary, you have a helper here 👍

Example:

Bit2.ReplaceReferenceTypes<T>(Comparer,
    adjacencyList.Keys.ToList(),
    adjacencyList.Select(f => f.Value).ToArray());
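Conceptually, comparer-driven reference replacement can be pictured with the following minimal sketch. It is an assumption about the general technique, not Bit2's implementation; Node, NodeIdComparer, and ReferencePatchSketch.Canonicalize are hypothetical names, and the sketch assumes the lists contain no null elements.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical graph node used only for this illustration.
public sealed class Node
{
    public string Id { get; set; } = "";
}

// Treats two nodes as the same logical node when their Ids match.
public sealed class NodeIdComparer : IEqualityComparer<Node>
{
    public bool Equals(Node? a, Node? b) => a?.Id == b?.Id;
    public int GetHashCode(Node n) => n.Id.GetHashCode();
}

public static class ReferencePatchSketch
{
    // Replaces every element with the first equal instance seen across all lists,
    // so that equal items end up sharing one reference.
    public static void Canonicalize<T>(IEqualityComparer<T> comparer, params List<T>[] lists)
        where T : class
    {
        var canonical = new Dictionary<T, T>(comparer);
        foreach (var list in lists)
        {
            for (int i = 0; i < list.Count; i++)
            {
                if (canonical.TryGetValue(list[i], out var first))
                    list[i] = first;                 // duplicate: point at the shared instance
                else
                    canonical[list[i]] = list[i];    // first occurrence becomes the shared instance
            }
        }
    }
}
```

Before the call, two separately deserialized nodes with the same Id are distinct objects; afterwards, both list slots hold the same instance, so object.ReferenceEquals returns true for them.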

MapToCSharpByteString

Converts a map to a C# byte array string. Useful for storing maps internally for later deserialization.

Example:

string mapString = Bit2.MapToCSharpByteString(typeof(YourClass), version);

Map

Maps a type to a BitMap and optionally serializes and returns the bytes of that map.

Example:

BitMap mapped;
byte[] mapBytes = Bit2.Map(typeof(YourClass), version, out mapped, true);

ReadHeader

Reads the BitHeader from a byte array, using a maximum of 6 bytes. It also outputs the version used during serialization if available.

Example:

ushort version;
BitHeader header = Bit2.ReadHeader(bytes, out version);

GetMap

Converts a byte array to a BitMap.

Example:

BitMap map = Bit2.GetMap(mapBytes);

GenerateSchema

Generates a schema from a map, producing full C# classes that can be used to deserialize the data.

Example:

string schema = Bit2.GenerateSchema(mapBytes);
