Estimating JSON size

I've been working on a system that heavily uses message queuing (RabbitMQ via MassTransit specifically) and occasionally the system needs to deal with large object graphs that need to be processed differently - either broken into smaller pieces of work or serialized to an external store with a pointer put into the message instead.

The first idea was to serialize all messages to a MemoryStream but unfortunately this has some limitations, specifically:

  • For smaller messages the entire stream is duplicated due to the way the MassTransit interface works
  • For larger messages we waste a lot of memory

For short-lived HTTP requests this is generally not a problem but for long-running message queue processing systems we want to be a bit more careful about GC pressure.

Which led me to two possible solutions:

LengthOnlyStream

This method is 100% accurate, taking JSON attributes etc. into consideration, and yet only requires a few bytes of memory.

Basically it involves a new Stream class that does not record what is written, merely the length.

Pros & cons

  • + 100% accurate
  • + Can work with non-JSON serialization too
  • - Still goes through the whole serialization process

Source

internal class LengthOnlyStream : Stream {
    long length;

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => length;
    public override long Position { get; set; }
    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public void Reset() => length = 0;
    public override long Seek(long offset, SeekOrigin origin) => 0;
    public override void SetLength(long value) => length = value;
    // count is the number of bytes being written; offset is into the buffer, not the stream
    public override void Write(byte[] buffer, int offset, int count) => length += count;
}

Usage

You can now get the actual serialized size with:

using var countStream = new LengthOnlyStream();
JsonSerializer.Serialize(countStream, damien, typeof(Person), options);
var size = countStream.Length;
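
In our system the point of knowing the size is to decide what to do with the message before publishing it. A minimal sketch of that gating logic - the 256 KB threshold and the ExternalizePayload helper are hypothetical, and publishEndpoint stands in for wherever you publish messages:

const long maxInlineSize = 256 * 1024; // hypothetical threshold - tune for your broker

using var countStream = new LengthOnlyStream();
JsonSerializer.Serialize(countStream, message, message.GetType(), options);

if (countStream.Length > maxInlineSize) {
    // hypothetical helper: writes the payload to external storage and
    // returns a small pointer message to publish in its place
    message = await ExternalizePayload(message);
}

await publishEndpoint.Publish(message);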

JsonEstimator

In our particular case we don't need to be 100% accurate and instead would like to minimize the amount of work done as the trade-off.

This is where JsonEstimator comes into play:

Pros & cons

  • - Not 100% accurate (ignores JSON attributes)
  • + Fast and efficient

Source

public static class JsonEstimator
{
    public static long Estimate(object obj, bool includeNulls) {
        if (obj is null) return 4;
        if (obj is Byte || obj is SByte) return 1;
        if (obj is Char) return 3;
        if (obj is Boolean b) return b ? 4 : 5;
        if (obj is Guid) return 38;
        if (obj is DateTime || obj is DateTimeOffset) return 35;
        if (obj is Int16 i16) return i16.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is Int32 i32) return i32.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is Int64 i64) return i64.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is UInt16 u16) return u16.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is UInt32 u32) return u32.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is UInt64 u64) return u64.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is String s) return s.Length + 2;
        if (obj is Decimal dec) {
            // digits to the left of the decimal point
            var left = Math.Floor(Math.Abs(dec)).ToString(CultureInfo.InvariantCulture).Length;
            // the scale byte of the decimal bits = digits to the right of the decimal point
            var right = BitConverter.GetBytes(decimal.GetBits(dec)[3])[2];
            return right == 0 ? left : left + right + 1;
        }
        if (obj is Double dou) return dou.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is Single sin) return sin.ToString(CultureInfo.InvariantCulture).Length;
        if (obj is IDictionary dict) return EstimateDictionary(dict, includeNulls);
        if (obj is IEnumerable enumerable) return EstimateEnumerable(enumerable, includeNulls);

        return EstimateObject(obj, includeNulls);
    }

    static long EstimateEnumerable(IEnumerable enumerable, bool includeNulls) {
        long size = 0;
        foreach (var item in enumerable)
            size += Estimate(item, includeNulls) + 1; // ,
        return size > 0 ? size + 1 : 2;
    }

    static readonly BindingFlags publicInstance = BindingFlags.Instance | BindingFlags.Public;

    static long EstimateDictionary(IDictionary dictionary, bool includeNulls) {
        long size = 2; // { }
        bool wasFirst = true;
        foreach (var key in dictionary.Keys) {
            var value = dictionary[key];
            if (includeNulls || value != null) {
                if (!wasFirst)
                    size++;
                else
                    wasFirst = false;
                size += Estimate(key, includeNulls) + 1 + Estimate(value, includeNulls); // :,
            }
        }
        return size;
    }

    static long EstimateObject(object obj, bool includeNulls) {
        long size = 2;
        bool wasFirst = true;
        var type = obj.GetType();
        var properties = type.GetProperties(publicInstance);
        foreach (var property in properties) {
            if (property.CanRead && property.CanWrite) {
                var value = property.GetValue(obj);
                if (includeNulls || value != null) {
                    if (!wasFirst)
                        size++;
                    else
                        wasFirst = false;
                    size += property.Name.Length + 3 + Estimate(value, includeNulls);
                }
            }
        }

        var fields = type.GetFields(publicInstance);
        foreach (var field in fields) {
            var value = field.GetValue(obj);
            if (includeNulls || value != null) {
                if (!wasFirst)
                    size++;
                else
                    wasFirst = false;
                size += field.Name.Length + 3 + Estimate(value, includeNulls);
            }
        }
        return size;
    }
}

Usage

var size = JsonEstimator.Estimate(damien, true);

Wrap-up

I also had the chance to play with Benchmark.NET and to look at various optimizations of the estimator (using a Stack<T> instead of recursion, reducing foreach allocations, and a simpler switch that did not involve pattern matching) but none of them yielded positive results on my test set - I likely need a bigger set of objects to see the real benefits.
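
For reference, a minimal sketch of the kind of comparison being run, using BenchmarkDotNet-style attributes (the Person instance and its property values are hypothetical):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Text.Json;

[MemoryDiagnoser]
public class SizeBenchmarks {
    readonly Person damien = new Person { Name = "Damien" }; // hypothetical test object
    readonly JsonSerializerOptions options = new JsonSerializerOptions();

    [Benchmark(Baseline = true)]
    public long ViaLengthOnlyStream() {
        using var countStream = new LengthOnlyStream();
        JsonSerializer.Serialize(countStream, damien, typeof(Person), options);
        return countStream.Length;
    }

    [Benchmark]
    public long ViaJsonEstimator() => JsonEstimator.Estimate(damien, true);
}

// BenchmarkRunner.Run<SizeBenchmarks>();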

Regards,

[)amien

Using variable web fonts for speed

Web fonts are now ubiquitous across the web to the point where most of the big players even have their own typefaces, and the web looks a lot better for it.

Unfortunately the problem remains that either the browser has to wait to draw anything while it fetches the font, or it renders without the font and then re-renders the page once the font is available. Neither is a great solution and Google's PageSpeed will hit you for either, so what is an enterprising web developer to do?

I was tasked with exactly this problem while redesigning my wife's web site - MKG Marketing Inc. Being brought in right at the start, while design and technologies were still on the table, gave me enough control that we might be able to mitigate this quite well.

What we need to do is reduce the amount of time it takes to get this font on the screen and then decide whether we want to go with the block or swap approach.

Variable fonts are a boon

The first opportunity here was to go with a good variable font.

Variable fonts - unlike traditional static fonts - have a number of axes which you can think of as sliders in which to push and pull the design.

The most obvious one is how light or bold a font is - rather than just, say, a light (200), a normal (400) and a bold (700), the browser can make a single font file perform all these roles and anything in between. Another might be how narrow or wide a font is - eliminating the need for separate condensed or expanded fonts - another might be how slanted or italic the design is, or even whether a font is a serif or a sans-serif... with plenty of small serifs in between.

This means instead of having to load 3 or more web fonts we need just one which will decrease our page load time.

Getting a variable font

Google Fonts helpfully has an option to "Show only variable fonts", however it is important to note that while Google can show you them, it is not capable of serving the actual variable font to your site! Instead it will use the variable font to make you a set of static fonts to serve up, which kind of defeats a lot of the point of using them.

Anyway, feel free to browse their site for a variable font that looks great on your site. Even use their CSS temporarily to see what it would look like. Once you're happy it's time to head into the font information to see who created the font and where you can get the actual variable font file from.

For example, the rather nice Readex Pro says in its about section to head over to their GitHub repository to contribute.

Unlike most open source code projects, binaries for fonts often get checked in as users don't typically have the (sometimes commercial, such as Glyphs or FontLab) tool-chain to build them. This is good for us as inside the /fonts/variable/ folder we see two .ttf files. You will note that one filename includes [wght] which means it contains the weight axis. The second in this case is [HEXP,wght] where HEXP is for "hyper-expansion". Choose the one that has just the axes you need.

You can also check the download link from Google Fonts - that often has a full variable font they are using. It will probably contain much more than you want. Alternatively, some fonts such as Recursive even have an online configurator that lets you choose what options you want and provides a ready-to-go fully-subset and compressed woff2 (so you can skip straight to serving the font!)

Preparing the font

Now you could just load that .ttf file up somewhere but there are still some optimizations we can do. We want to shrink this file down as much as possible and there are two parts to that.

Firstly we want to strip out everything we don't need, also known as subsetting. In the case of this font, for example, if I were using it on damieng.com I could strip out the Arabic characters as I do not write anything in Arabic (browsers will fall back to an installed font that does should they need to, such as when somebody enters something into your contact form).

Subsetting the font

The open source fontTools comes to the rescue here. I installed it in WSL but Linux and other operating systems should be the same:

pip install fonttools brotli zopfli

The subset command has a number of options to choose what to keep. Unfortunately there is no easy language option but you can specify unicode ranges with --unicodes=, text files full of characters with --text-file=, or simply provide a list of characters with --text=. If you go the unicode range route then Unicodepedia has some good coverage of what you'll need.

For this site we just needed Latin and the Latin-1 supplement, so:

fonttools subset ReadexPro[wght].ttf --unicodes=U+0020-007F,U+00A0-00FF --flavor=woff2

Which produces a ReadexPro[wght].subset.woff2 weighing in at just 16KB (down from 188KB).

Serving the font

Now at this point you could just upload the woff2 file to your CDN and reference it from your CSS.

Your CSS would look like this:

@font-face {
  font-family: "ReadexPro";
  src: url(/fonts/readexpro[wght].subset.woff2) format("woff2");
  font-weight: 100 900;
}

The important bit here is font-weight: 100 900 which lets the browser know that this is a variable font capable of handling all the font weights from 100 to 900. You can do this with other properties too:

Axis     CSS property   Named example        Numeric example
[slnt]   font-style     normal oblique       normal oblique 30deg 50deg
[wdth]   font-stretch   condensed expanded   50% 200%
[wght]   font-weight    thin heavy           100 900

Do not be tempted to instead create a @font-face per weight all linking to the same url - browsers are often not smart enough to de-duplicate the requests and you're back to effectively waiting for multiple resources to load.

Now this might be fast enough - you could measure and test it - but this font is only 16KB so there's one more trick we can do.

With the url in the CSS the browser must first request the HTML, then the CSS, then finally the font. If you're smart you'll have them all on the same host so at least there are no extra DNS lookups slowing things down. (Also note that browsers no longer reuse local copies of popular files from well-known CDNs, in order to provide better security, so it really is worth hosting your own again - especially if you have a CDN such as Cloudflare or CloudFront in front of your site.)

But we can do better!

Base64 the font into the CSS!

By Base64 encoding the font you can drop it directly into the CSS, thereby removing a whole other HTTP request/wait! Simply encode your woff2 file using either an online tool like Base64Encode.org (click "click here to select a file", choose your .woff2, click Encode and download the encoded text file) or a command-line tool such as base64. I used WSL again with:

base64 ReadexPro[wght].subset.woff2 -w0 | clip.exe

Which stuffs the base64 encoded version into the clipboard ready to be pasted.

Now change your @font-face replacing the url with a data section, like this:

@font-face {
  font-family: "ReadexPro";
  font-display: block;
  font-weight: 100 900;
  src: url(data:application/x-font-woff;charset=utf-8;base64,d09GMgABAAAAAEK0ABQAAAAAgZAAAEJFAAEz
  [truncated for this blog post]AQ==) format("woff2");
}

And hey presto! You can now happily slip that font-display: block in as well, as the font is always going to be available at the same time as the CSS :)

Hope that helps!

Regards

Damien

Migrating from OpenTracing.NET to OpenTelemetry.NET

Background

OpenTracing is an interesting project that allows for various trace sources to be correlated together to provide a timeline of activity that can span services, for reporting in a central system. It was competing with OpenCensus but the two have now merged to form OpenTelemetry.

I was recently brought in as a consultant to help migrate an existing system that used OpenTracing in .NET that recorded trace data into Jaeger so that they might migrate to the latest OpenTelemetry libraries. I thought it would be useful to document what I learnt as the migration process is not particularly clear.

The first thing to note is that OpenTracing and OpenTelemetry are both multi-platform systems and so support many languages and services. While this is great from a standardization perspective, it does mean that a lot of the information you find doesn't necessarily relate to the library you are using.

In this particular case we are using the .NET/C# library, so moving from OpenTracing API for .NET to OpenTelemetry .NET, and while there are plenty of signs that OpenTracing is deprecated and that all efforts are now in OpenTelemetry it's worth noting that at the time of writing OpenTelemetry .NET isn't quite done - many of the non-core libraries are in beta or release-candidate status.

How OpenTracing is used

The OpenTracing standard basically works through the ITracer interface. You can register it as a singleton via DI and let it flow through to where it is needed, or access it via the GlobalTracer static.

This ITracer has an IScopeManager, with an IScope being basically an active thread and the ISpan being a unit of work which can move between threads.

Typical usage looks a little something like this:

class Runner {
  private ITracer tracer;

  public Runner(ITracer tracer) {
    this.tracer = tracer;
  }

  public void Run(string command) {
      using (var scope = tracer.BuildSpan("Run").StartActive()) {
        // ...
      }
  }
}

Scopes are capable of having sub-scopes, tags and events within them to provide further detail. The moment StartActive is called the clock starts and the moment the using goes out of scope the clock stops, providing the level of detail you need.
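
For example, tags, a log event and a nested scope hang off the span like this (the tag and event names are just illustrative):

public void Run(string command) {
    using (var scope = tracer.BuildSpan("Run").StartActive()) {
      scope.Span.SetTag("command", command);

      using (var child = tracer.BuildSpan("Parse").StartActive()) {
        // nested work shows up as a sub-span in Jaeger
      }

      scope.Span.Log("Run completed");
    }
}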

How OpenTelemetry changes things

Now OpenTelemetry changes things a little, breaking Tracing, Metrics and Logging into separate concerns.

While OpenTelemetry uses much of the same terminology as OpenTracing, when it comes to the .NET client they decided to take a different approach and rather than implement Span again they use .NET's built-in ActivitySource (and Activity) as the replacement, so there's less to learn.

So we create an ActivitySource in the class and then use it much as we would have used ITracer.

class Runner {
  static readonly AssemblyName assemblyName = typeof(Runner).Assembly.GetName();
  static readonly ActivitySource activitySource = new ActivitySource(assemblyName.Name, assemblyName.Version.ToString());

  public void Run(string command) {
      using (var activity = activitySource.StartActivity("Run")) {
        // ...
      }
  }
}

This has the advantage of not needing to pass around an ITracer, and ActivitySource is optimized to immediately return null if no tracing is enabled (which the using keyword handles just fine).
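
The equivalent tagging and events with Activity look like this - the null-conditional operator covers the case where nothing is listening (tag and event names are just illustrative):

public void Run(string command) {
    using (var activity = activitySource.StartActivity("Run")) {
      // activity is null when no listener/exporter is registered
      activity?.SetTag("command", command);

      // ActivityEvent is the Activity equivalent of an OpenTracing span log
      activity?.AddEvent(new ActivityEvent("Run completed"));
    }
}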

If, however, you are using multiple systems and want to stick to the OpenTelemetry terminology of Tracer and Span instead of ActivitySource and Activity then check out the OpenTelemetry.API shim (not to be confused with the OpenTracing shim covered below).

There is a full comparison available of how OpenTelemetry maps to the .NET Activity API available too.

WARNING using and C# 8+

Some code analysis/refactoring tools may recommend changing the using clause from the traditional using (var x...) to the brace-less using var x... of C# 8.

Do not do this with BuildSpan or StartActivity unless you have only one of them and you're happy for the timing to consider the end when the method exits.

This is because removing the braces removes the defined end of the span or activity. This means that they will continue on beyond their original intended scope and carry on their timings until the end of the method.
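
A quick sketch of the difference (DoWork and DoOtherWork are placeholders):

public void Run(string command) {
    // braced form: the "Run" activity ends when the block closes
    using (var activity = activitySource.StartActivity("Run")) {
        DoWork();
    }

    // brace-less form: the "Tail" activity is only disposed when the method
    // returns, so it also times everything that happens after this line
    using var tail = activitySource.StartActivity("Tail");
    DoOtherWork();
}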

How to migrate

It is possible to migrate in steps by switching your application over from OpenTracing to OpenTelemetry while still temporarily supporting ITracer until you have fully moved, although the code to set this up correctly is easy to get wrong (and if you do you will see duplicate spans in your viewer).

In ASP.NET Core you want to register OpenTelemetry tracing via AddOpenTelemetryTracing with all your necessary instrumentation. For example:

services.AddOpenTelemetryTracing(builder => {
    builder
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyServiceName"))
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSqlClientInstrumentation(o => {
            o.SetDbStatementForText = true;
            o.RecordException = true;
        })
        .AddJaegerExporter(o => {
            o.AgentHost = openTracingConfiguration?.Host ?? "localhost";
            o.AgentPort = openTracingConfiguration?.Port ?? 6831;
        });
});

Backwards compatible support for ITracer & GlobalTracer

If you still need to support ITracer/GlobalTracer in the meantime you can use OpenTelemetry.Shims.OpenTracing to forward old requests on. Simply add this block of code after the AddOpenTelemetryTracing registration covered above:

services.AddSingleton<ITracer>(serviceProvider => {
    var traceProvider = serviceProvider.GetRequiredService<TracerProvider>();
    var tracer = new TracerShim(traceProvider.GetTracer(applicationName), Propagators.DefaultTextMapPropagator);
    GlobalTracer.RegisterIfAbsent(tracer);
    return tracer;
});

Note that other combinations of trying to get this working for us resulted in duplicate spans. It is quite possible that duplicate trace providers can cause this.

Jaeger

With OpenTracing, Jaeger required the use of its own C# Client for Jaeger; however this library is now being deprecated in favor of OpenTelemetry.Exporter.Jaeger.

There are some limitations with this - specifically, right now the Jaeger support is for tracing only and uses the UDP collector only, which means no authentication and limited message sizes.

Jaeger looked at directly supporting the OpenTelemetry collector system but that experiment has been discontinued and is now planned for Jaeger 2.

In the meantime, if these limitations are a problem it's possible the Zipkin exporter might be a better choice given that Jaeger also supports that.

Logs on a Span

OpenTracing helpfully correlated logs from ILogger to the appropriate span so they could appear alongside what they related to in, for example, Jaeger.

OpenTelemetry wants to keep logs separate and instead relies on .NET's ActivityEvent for this purpose, but if you already have lots of ILogger usage then the OpenTelemetry.Preview NuGet package has your back and will re-dispatch ILogger entries as ActivityEvents, ensuring they end up in the right place as they did with OpenTracing.

services.AddLogging(o => {
  o.AddOpenTelemetry(t => {
    t.AttachLogsToActivityEvent();
  });
});

Dealing with pre-release NuGet packages

At time of writing a number of the OpenTelemetry libraries are pre-release. If you are using .NET Analyzers they will prevent a successful build because of this. If you are absolutely sure you want to proceed then add NU5104 to the <NoWarn> section of your .csproj, e.g.

<NoWarn>NU5104</NoWarn>

Additional tracing

OpenTracing has a popular Contrib package that provides support for a variety of sources. Much of this is replaced by additional OpenTelemetry .NET packages, specifically the following packages which will need to be added to your project and the necessary extension method called:

Technology              OpenTelemetry package                          Extension method
ASP.NET Core            Instrumentation.AspNetCore                     AddAspNetCoreInstrumentation
Entity Framework Core   Contrib.Instrumentation.EntityFrameworkCore    AddEntityFrameworkCoreInstrumentation
HttpHandler             Instrumentation.Http                           AddHttpClientInstrumentation
Microsoft.SqlClient     Instrumentation.SqlClient                      AddSqlClientInstrumentation
System.SqlClient        Instrumentation.SqlClient                      AddSqlClientInstrumentation
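
For example, once the relevant NuGet package is referenced its extension method slots into the AddOpenTelemetryTracing builder shown earlier - here adding Entity Framework Core alongside the existing instrumentation (a trimmed sketch):

services.AddOpenTelemetryTracing(builder => {
    builder
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyServiceName"))
        .AddAspNetCoreInstrumentation()
        .AddEntityFrameworkCoreInstrumentation() // from Contrib.Instrumentation.EntityFrameworkCore
        .AddJaegerExporter();
});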

There are also packages for many additional sources that were not natively or contrib-supported under OpenTracing, including Elasticsearch, AWS, MassTransit, MySqlData, Wcf, Grpc, StackExchangeRedis etc.

Hope that helps!

[)amien

Developing a great SDK: Guidelines & Principles

A good SDK builds on the fundamentals of good software engineering but SDKs have additional requirements to consider.

Why is an SDK different?

When developing software as a team a level of familiarity is reached between the team members. Concepts, approaches, technologies, and terminology are shaped by the company and the goals are typically aligned.

Even as new members join the team a number of 1-on-1 avenues exist to onboard such as pairing, mentoring, strategic choices of what to work on etc.

Software to be consumed by external developers is not only missing this shared context but each developer will have a unique context of their own. For example: What you think of as an authorization request as a domain expert is unlikely to match what a user thinks authorization means to their app.

The backgrounds of developers can be diverse with varying abilities and requirements each shaped by their experiences with other software and the industries they've worked in.

Onboarding potential customers, developers, or clients with 1-on-1 support simply doesn't scale and the smallest bump in the road can lead them down a different path and away from your service.

Goals

A guiding principle for developing software is to be user-focused throughout.

This is especially important when developing an SDK and yet is more easily overlooked as it is created by a developer for a developer.

It is important to remember that you are not your audience.

You have in-depth knowledge of the how and why that the user is unlikely to have. Indeed they may not care or even want to learn - they have work of their own to be doing delivering the unique functionality of their solution. If they had the interest, time or experience to deal with the intricacies your library is supposed to take care of they wouldn't need it.

Some more specific goals to follow are:

Reduce the steps

Every step is another opportunity for the developer to get confused, distracted or disenfranchised.

Success should involve as few steps as possible.

The primary technique for achieving this is to utilize defaults and convention liberally. Default to what works for the majority of people while still being secure and open to customization.

In the case where multiple steps are required consider combining those steps into a use-case specific flow.

Success can be delivered in parts.

If a user can try a default configuration and see that part working it provides encouragement and incentive to keep going on to the next requirement they have. Each success builds upon the previous to keep the user on-track and invested in this solution.

Simplify concepts and terminology

Terminology is essential to a deep understanding of any field however it can become a massive barrier to adoption for those less versed in the topic.

It is important to use phrases, concepts and terminology your audience will understand rather than specific abstract or generic terms defined in underlying RFCs or APIs. This should be reflected in the primary class names, functions, methods, and throughout the documentation.

When it is necessary to expose less-common functionality you should strive to progressively reveal the necessary detail. In cases where this exposure provides facilities close to the underlying implementation then it becomes advantageous to revert back to the terminology used there for advanced operations.

Guide API discovery

Many platforms and languages have facilities that can be utilized to help guide API discovery primarily through autocompletion mechanisms such as Visual Studio's IntelliSense.

Common functionality should flow out from the objects or functionality the developer has at that point. If your API has provided a connection to your service you should not then expect them to go and discover all new objects and namespaces.

Many popular pieces of software like to provide the developer with a "context" object. This is an object that exists only for the current request (in web server applications) or the current instance of the app (in client applications) and provides access to the current user and the various operations available - authorizing, configuring, performing API requests etc. The act of obtaining this context in, for example, an identity SDK could be logging in.
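
A hypothetical identity SDK sketch of that idea - every name here is invented for illustration:

// logging in is the act that yields the context
var session = await IdentityClient.LoginAsync(clientId, redirectUri);

// everything else hangs off the context the developer already holds
var profile = await session.GetUserProfileAsync();
var orders = await session.Api.GetAsync("/orders");
await session.LogoutAsync();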

Namespaces can be used to push more advanced functionality away from newer developers allowing them to concentrate on primary use-cases which can be combined into a single common root or default namespace.

The same principle applies to methods and properties especially in strongly-typed languages with rich IDE support. A fluent-style API for optional configuration can not only guide you through the available options but can also prevent you from making incompatible choices right at compile time where it is safe to provide detailed messages in context to the line of code.
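
A hypothetical fluent configuration sketch (again, all names invented) - options that don't apply to the chosen flow simply never appear on the builder, so incompatible choices fail at compile time:

var client = IdentityClient.Configure()
    .WithDomain("example.identity.com")
    .WithClientId("abc123")
    .UseDeviceCodeFlow()                          // returns a builder that only exposes
    .WithPollingInterval(TimeSpan.FromSeconds(5)) // device-flow options - no login window here
    .Build();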

Look native

Developers often specialize in just one or two platforms at any one time and become intimately familiar with the design and flavor of those platforms.

SDKs should strive to feel like native citizens of that ecosystem, adopting the best practices, naming conventions, calling patterns and integrations that the user expects. Cross-platform solutions that look the same across platforms are only of interest to other cross-platform developers.

When using your API developers should be able to anticipate how it will function, how error handling, logging, and configuration will work based on their experience with the platform. Take time to understand both what is popular on that platform and which direction things are moving when making choices.

Feeling native further reduces the barrier to entry and can replace it with a feeling of "It just works!".

Resist the temptation to make your SDKs work the same way across platforms when it goes against the grain of that platform. It might make things easier for you and your team but the pain will be pushed onto consumers of your SDK and waste their resources every time your API behaves in a way that is unintuitive to people familiar with the platform.

Inline documentation

Online documentation provides a great place for both advanced topics that require multiple interactions as well as letting new developers see what is involved before they switch into their IDE or download an SDK.

However, when using the code itself the SDK should put concise documentation about classes, functions and parameters at their fingertips where possible. Many IDEs provide the ability to display code annotations e.g. XML Documentation Comments in C# and JSDoc.

This should be leveraged to keep developers engaged once they start using the SDK. Switching to a browser to read documentation presents them with tabs of other things needing attention or other solutions that don't involve using your SDK.

Strategies

Consume an API you wish you had

It can be incredibly advantageous to start by writing the code a user might expect to write to perform the operation in their application.

You want the code to be aligned with the goals:

  • Concise — the minimum number of steps
  • Understandable — only well-known terminology
  • Discoverable — if the editor allows it
  • Familiar — it feels like a native SDK

Start with a small fragment that exercises a specific scenario to both prove it and provide a real-world-like snippet for the documentation.

Better yet adopt or develop a small reference application and show the SDK working as a whole. Such an app can also be published itself as a great reference for programmers looking for concrete examples or best practices and form the basis of starters, examples and tutorials.

These applications also have further long-term value in that they can be used to:

  • See if and how the SDK breaks applications when changes are introduced
  • Form ideas about how deprecations are handled
  • Prove (or disprove) how a proposed SDK change improves the experience

Iterative approach

Traditional up-front design requires you trade off how much research you do before you start designing the system. Undoubtedly no matter how much research you do it will never be enough. Either there are use cases that were missed, small details that negatively impact the design, or implementation constraints that go against the design in awkward ways.

It is important to approach writing a library by implementing pieces one at a time, continually refining and refactoring as you go along, to ensure that you end up with a design that fits both the problem domain and the unseen constraints that either the technology or the intricacies of the domain impose - constraints that would no doubt have been missed in an up-front design.

Sample applications and references can really help shape the good design as you go by reflecting how changes to the SDK affect your applications.

Breaking clients is to be avoided where possible so a design should be refined as much as possible before initial release given both the current constraints and future direction. If a piece of the SDK is not well baked strongly consider keeping back the unbaked portions so that the primary developers do not take a dependency on it yet.

Other avenues are available to help bake the design and functionality of new components such as forums, private groups, internal teams or other team members both on this SDK or on others. Design shortcomings are much easier to spot when you are distanced from its creation.

Local git branches are a vitally important safety net to aggressive refactoring - commit after each good step.

Note about unit tests

Unit tests are very important for production quality software, ongoing maintenance and even late stage refactoring however developing unit tests too early in the development cycle can work against aggressive refactoring and redesign.

It is difficult to move functionality, fields and methods around between classes and methods when there are a multitude of unit tests expecting them there especially given that the unit tests at these early phases tend to be nothing more than checking the most basic of functionality.

You should also pay attention to how much influence the unit tests themselves are exerting on the design of the SDK components and whether this makes those components simpler for consumers or pushes more requirements onto them about how to make many small testable pieces work together as a single simple component to solve traditional use-cases.

Layered design

Some of these goals can be difficult to implement in a single design. For example, how do you:

  • Ensure that SDKs that appear so different are approachable by engineers at Auth0?
  • Avoid a multitude of options when there are choices that need to be made?
  • Provide the ability for advanced consumers to go beyond the basics?

A dual-layer design can work very well for client SDKs and help solve these problems.

The lower levels of the design are unit-testable building blocks that very much mirror the underlying APIs, concepts, and RFCs. Classes are focused on a specific task or API but take a variety of inputs ensuring the right kind of request is sent with the right parameters and encodings. This gives a solid foundation to build on that is well understood by the team as well as across SDKs.

The high-level components are responsible for orchestrating the lower pieces into a coherent native-friendly easy-to-use experience for a majority of use cases. They form the basis of quick-starts, initial guidance, and tutorials. Other high-level components may in fact form plug-ins to existing extensions provided by the environment.

This layering also helps developers perhaps unfamiliar with the platform clearly see how the underlying familiarly-named elements are utilized in particular environments or flows.

When consumers need to go beyond the capabilities of high-level components the same underlying building blocks used by the high-level components are available to them to compose, reuse and remix as they need.

For example: A C# SDK might include a high-level component for desktop apps that automatically becomes part of the application's startup and shutdown as well as provides support for opening login windows. Another high-level component might be developed for server-to-server communication and know how to check common configuration patterns such as Azure application settings or web.config files and deal with secure secrets.

Each is tailored specifically to its use case but uses the same underlying blocks to achieve the result.
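
A hypothetical sketch of that layering (names invented) - both high-level components compose the same low-level block:

// low-level building block that mirrors the underlying API/RFC
var tokenClient = new TokenClient(domain, clientId);

// desktop high-level component: login windows, app startup/shutdown hooks
var desktopAuth = new DesktopAuthenticator(tokenClient);
await desktopAuth.LoginWithWindowAsync();

// server-to-server high-level component: reads secrets from common configuration patterns
var machineAuth = new ClientCredentialsAuthenticator(tokenClient, configuration);
var accessToken = await machineAuth.GetAccessTokenAsync();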

It is also advantageous in environments that support package management to individually package environment-dependent parts with clear labels that describe their use in that environment. This aids in both discovery and ensures the SDK does not bring along additional unneeded sub-dependencies for that environment while bringing along the core shared lower level package.