Category Archives: Uncategorized

Using TypeScript to write Cosmos DB stored procedures with async/await

Disclaimer: I am by no mean a TypeScript expert. In fact, I know very little about JS, npm, gulp, etc. So it’s entirely possible I said something really stupid in this article, or maybe I missed a much simpler way of doing things. Don’t hesitate to let me know in the comments!

Azure Cosmos DB (formerly known as Azure Document DB) is a NoSQL, multi-model, globally-distributed database hosted in Azure. If you come from relational SQL databases, it’s a very different world. Some things are great, for instance modeling data is much easier than in a relational database, and performance is excellent. Other things can be disconcerting, such as the lack of support for ACID. From the client’s perspective, there are no transactions: you can’t update multiple documents atomically. Of course, there’s a workaround: you can write stored procedures and triggers, which execute in the context of a transaction. So, when you really, really need multiple updates to be made atomically, you write a stored procedure to do the job.

The bad news

Unfortunately, on Cosmos DB, stored procedures are written… in Javascript 😢 (I know, plenty of folks love Javascript, but I don’t. Sue me!). All APIs for database operations are asynchronous (which is a good thing), but these APIs are based on callbacks, not on promises, so even though ECMAScript 2017 is supported, you can’t use async/await with them. This fact is enough to turn any non-trivial task (i.e. code that involves branches such as ifs or loops) into a nightmare, at least for a C# developer like me… I typically spend a full day to write and debug a stored procedure that should have taken less than an hour with async/await.

Promise-based wrapper

Of course, I wouldn’t be writing this post if there wasn’t a way to make things better. Cosmos DB Product Manager Andrew Liu was kind enough to show me how to write a wrapper around the callback-based API to enable the use of promises and async/await. Basically, it’s just a few functions that you can add to your stored procedures:

function setFoo() {
    async function main() {
        let { feed, options } = await queryDocuments("SELECT * from c");
        for (let doc of feed) {
   = "bar";
            await replaceDocument(doc);

    main().catch(err => getContext().abort(err));

function queryDocuments(sqlQuery, options) {
    return new Promise((resolve, reject) => {
        let isAccepted = __.queryDocuments(__.getSelfLink(), sqlQuery, options, (err, feed, opts) => {
            if (err) reject(err);
            else resolve({ feed, options: opts });
        if (!isAccepted) reject(new Error(429, "queryDocuments was not accepted."));

function replaceDocument(doc, options) {
    return new Promise((resolve, reject) => {
        let isAccepted = __.replaceDocument(doc._self, doc, (err, result, opts) => {
            if (err) reject(err);
            else resolve({ result, options: opts });
        if (!isAccepted) reject(new Error(429, "replaceDocument was not accepted."));

// and so on for other APIs...

Note that the stored procedure’s entry point (setFoo in this example) cannot be async (if it returns a promise, Cosmos DB won’t wait for it to complete), so you need to write another async function (main), call it from the stored procedure’s entry point, and catch the error that could be thrown. Note the use of getContext().abort(err), which aborts and rolls back the current transaction; without this, the exception would be swallowed.

I’m not going to show the equivalent code using the callback-based API here, because honestly, it makes my head hurt just thinking about it. But trust me on this: it’s not pretty, and much harder to understand.

Using TypeScript

The code shown above is pretty straightforward, once you have the wrapper functions. However, there are at least two issues with it:

  • This is still Javascript, which is weakly typed, so it’s easy to make mistakes that won’t be caught until runtime.
  • Cosmos DB stored procedures and triggers must consist of a single self-contained file; no import or require allowed. Which means you can’t share the wrapper functions across multiple stored procedures, you have to include them in each stored procedure. This is annoying…

First, let’s see how we can write our stored procedure in TypeScript and reduce the boilerplate code.

Let’s start by installing TypeScript. Create a package.json file with the npm init command (it will prompt you for a few details, you can leave everything empty), and run the npm install typescript command. We’ll also need the TypeScript definitions of the Cosmos DB server-side APIs. For this, we’ll install a npm package named @types/documentdb-server which contains the definitions: npm install @types/documentdb-server.

We also need a tsconfig.json file:

    "exclude": [
    "compilerOptions": {
        "target": "es2017",
        "strict": true,

Now, let’s create a few helpers to use in our stored procedures. I put them all in a CosmosServerScriptHelpers folder. The most important piece is the AsyncCosmosContext class, which is basically a strongly-typed, promise-based wrapper for the __ object. It implements the following interface:

export interface IAsyncCosmosContext {

    readonly request: IRequest;
    readonly response: IResponse;

    // Basic query and CRUD methods
    queryDocuments(sqlQuery: any, options?: IFeedOptions): Promise<IFeedResult>;
    readDocument(link: string, options?: IReadOptions): Promise<any>;
    createDocument(doc: any, options?: ICreateOptions): Promise<any>;
    replaceDocument(doc: any, options?: IReplaceOptions): Promise<any>;
    deleteDocument(doc: any, options?: IDeleteOptions): Promise<any>;

    // Helper methods
    readDocumentById(id: string, options?: IReadOptions): Promise<any>;
    readDocumentByIdIfExists(id: string, options?: IReadOptions): Promise<any>;
    deleteDocumentById(id: string, options?: IDeleteOptions): Promise<any>
    queryFirstDocument(sqlQuery: any, options?: IFeedOptions): Promise<any>;
    createOrReplaceDocument(doc: any, options?: ICreateOrReplaceOptions): Promise<any>;

I’m not showing the whole code in this article because it would be too long, but you can see the implementation and auxiliary types in the GitHub repo here:

So, how can we use this? Let’s look at our previous example again, and see how we can rewrite it in TypeScript using our wrappers:

import {IAsyncCosmosContext} from "CosmosServerScriptHelpers/IAsyncCosmosContext";
import {AsyncCosmosContext} from "CosmosServerScriptHelpers/AsyncCosmosContext";

function setFoo() {
    async function main(context: IAsyncCosmosContext) {
        let { feed, options } = await context.queryDocuments("SELECT * from c");
        for (let doc of feed) {
   = "bar";
            await replaceDocument(doc);

    main(new AsyncCosmosContext()).catch(err => getContext().abort(err));

It looks remarkably similar to the previous version, with just the following changes:

  • We no longer have the wrapper functions in the same file, instead we just import them via the AsyncCosmosContext class.
  • We pass an instance of AsyncCosmosContext to the main function.

This looks pretty good already, but what’s bugging me is having to explicitly create the context and do the .catch(...). So let’s create another helper to encapsulate this:

import {IAsyncCosmosContext} from "./IAsyncCosmosContext";
import {AsyncCosmosContext} from "./AsyncCosmosContext";

export class AsyncHelper {
     * Executes the specified async function and returns its result as the response body of the stored procedure.
     * @param func The async function to execute, which returns an object.
    public static executeAndReturn(func: (context: IAsyncCosmosContext) => Promise<any>) {
        this.executeCore(func, true);

     * Executes the specified async function, but doesn't write anything to the response body of the stored procedure.
     * @param func The async function to execute, which returns nothing.
    public static execute(func: (context: IAsyncCosmosContext) => Promise<void>) {
        this.executeCore(func, false);

    private static executeCore(func: (context: IAsyncCosmosContext) => Promise<any>, setBody: boolean) {
        func(new AsyncCosmosContext())
            .then(result => {
                if (setBody) {
            .catch(err => {
                // @ts-ignore

Using this helper, our stored procedure now looks like this:

import {AsyncHelper} from "CosmosServerScriptHelpers/AsyncHelper";

function setFoo() 
    AsyncHelper.execute(async context => {
        let result = await context.queryDocuments("SELECT * from c");
        for (let doc of result.feed) {
   = "bar";
            await context.replaceDocument(doc);

This reduces the boilerplate code to a minimum. I’m pretty happy with it, so let’s leave it alone.

Generate the actual JS stored procedure files

OK, now comes the tricky part… We have a bunch of TypeScript files that import each other. But Cosmos DB wants a single, self-contained JavaScript file, with the first function as the entry point of the stored procedure. By default, compiling the TypeScript files to JavaScript will just generate one JS file for each TS file. The --outFile compiler option outputs everything to a single file, but it doesn’t really work for us, because it still emits some module related code that won’t work in Cosmos DB. What we need, for each stored procedure, is a file that only contains:

  • the stored procedure function itself
  • all the helper code, without any import or require.

Since it doesn’t seem possible to get the desired result using just the TypeScript compiler, the solution I found was to use a Gulp pipeline to concatenate the output files and remove the extraneous exports and imports. Here’s my gulpfile.js:

const gulp = require("gulp");
const ts = require("gulp-typescript");
const path = require("path");
const flatmap = require("gulp-flatmap");
const replace = require('gulp-replace');
const concat = require('gulp-concat');

gulp.task("build-cosmos-server-scripts", function() {
    const sharedScripts = "CosmosServerScriptHelpers/*.ts";
    const tsServerSideScripts = "StoredProcedures/**/*.ts";

    return gulp.src(tsServerSideScripts)
        .pipe(flatmap((stream, file) =>
            let outFile = path.join(path.dirname(file.relative), path.basename(file.relative, ".ts") + ".js");
            let tsProject = ts.createProject("tsconfig.json");
            return stream
                .pipe(replace(/^\s*import .+;\s*$/gm, ""))
                .pipe(replace(/^\s*export .+;\s*$/gm, ""))
                .pipe(replace(/^\s*export /gm, ""))

gulp.task("default", gulp.series("build-cosmos-server-scripts"));

Note that this script requires a few additional npm packages: gulp, gulp-concat, gulp-replace, gulp-flatmap, and gulp-typescript.

Now you can just run gulp and it will produce the appropriate JS file for each TS stored procedure.

To be honest, this solution feels a bit hacky, but it’s the best I’ve been able to come up with. If you know of a better approach, please let me know!

Wrapping up

The out-of-the-box experience for writing Cosmos DB server-side code is not great (to put it mildly), but with just a bit of work, it can be made much better. You can have strong-typing thanks to TypeScript and the type definitions, and you can use async/await to make the code simpler. Note that this approach is also valid for triggers.

Hopefully, a future Cosmos DB update will introduce a proper promise-based API, and maybe even TypeScript support. In the meantime, feel free to use the solution in this post!

The full code for this article is here:

Hosting an ASP.NET Core 2 application on a Raspberry Pi

As you probably know, .NET Core runs on many platforms: Windows, macOS, and many UNIX/Linux variants, whether on x86/x64 architectures or on ARM. This enables a wide range of interesting scenarios… For instance, is a very small machine like a Raspberry Pi, which its low performance ARM processor and small amount of RAM (1 GB on my RPi 2 Model B), enough to host an ASP.NET Core web app? Yes it is! At least as long as you don’t expect it to handle a very heavy load. So let’s see in practice how to deploy an expose an ASP.NET Core web app on a Raspberry Pi.

Creating the app

Let’s start from a basic ASP.NET Core 2.0 MVC app template:

dotnet new mvc

You don’t even need to open the project for now, just compile it as is and publish it for the Raspberry Pi:

dotnet publish -c Release -r linux-arm


We’re going to use a Raspberry Pi running Raspbian, the official Linux distro for Raspberry Pi, which is based on Debian. To run a .NET Core 2.0 app, you’ll need version Jessie or higher (I used Raspbian Stretch Lite). Update: as Tomasz mentioned in the comments, you also need a Raspberry Pi 2 or more recent, with an ARMv7 processor; The first RPi has an ARMv6 processor and cannot run .NET Core.

Even though the app is self-contained and doesn’t require .NET Core to be installed on the RPi, you will still need a few low-level dependencies; they are listed here. You can install them using apt-get:

sudo apt-get update
sudo apt-get install curl libunwind8 gettext apt-transport-https

Deploy and run the application

Copy all files from the bin\Release\netcoreapp2.0\linux-arm\publish directory to the Raspberry Pi, and make the binary executable (replace MyWebApp with the name of your app):

chmod 755 ./MyWebApp

Run the app:


If nothing went wrong, the app should start listening on port 5000. But since it listens only on localhost, it’s only accessible from the Raspberry Pi itself…

Exposing the app on the network

There are several ways to fix that. The easiest is to set the ASPNETCORE_URLS environment variable to a value like http://*:5000/, in order to listen on all addresses. But if you intend to expose the app on the Internet, it might not be a good idea: the Kestrel server used by ASP.NET Core isn’t designed to be exposed directly to the outside world, and isn’t well protected against attacks. It is strongly recommended to put it behind a reverse proxy, such as nginx. Let’s see how to do that.

First, you need to install nginx if it’s not already there, using this command:

sudo apt-get install nginx

And start it like this:

sudo service nginx start

Now you need to configure it so that requests arriving to port 80 are passed to your app on port 5000. To do that, open the /etc/nginx/sites-available/default file in your favorite editor (I use vim because my RPi has no graphical environment). The default configuration defines only one server, listening on port 80. Under this server, look for the section starting with location /: this is the configuration for the root path on this server. Replace it with the following configuration:

location / {
        proxy_pass http://localhost:5000/;
        proxy_http_version 1.1;
        proxy_set_header Connection keep-alive;

Be careful to include the final slash in the destination URL.

This configuration is intentionnally minimal, we’ll expand it a bit later.

Once you’re done editing the file, tell nginx to reload its configuration:

sudo nginx -s reload

From your PC, try to access the app on the Raspberry Pi by entering its IP address in your browser. If you did everything right, you should see the familiar home page from the ASP.NET Core app template!

Note that you’ll need to be patient: the first time the home page is loaded, its Razor view is compiled, which can take a while on the RPi’s low-end hardware. ASP.NET Core 2.0 doesn’t support precompilation of Razor views for self-contained apps; this is fixed in 2.1, which is currently in preview. So for now you have 3 options:

  • be patient and endure the delay on first page load
  • migrate to ASP.NET Core 2.1 preview, as explained here
  • make a non self-contained deployment, which requires .NET Core to be installed on the RPi

For this article, I chose the first options to keep things simple.

Proxy headers

At this point, we could just leave the app alone and call it a day. However, if your app is going to evolve into something more useful, there are a few things that aren’t going to work correctly in the current state. The problem is that the app isn’t aware that it’s behind a reverse proxy; as far as it knows, it’s only listening to requests on localhost on port 5000. Which means it cannot know:

  • the actual client IP (requests seem to come from localhost)
  • the protocol scheme used by the client
  • the actual host name specified by the client

For the app to know these things, it has to be told by the reverse proxy. Let’s change the nginx configuration so that it adds a few headers to incoming requests. These headers are not standard, but they’re widely used by proxy servers.

    proxy_set_header X-Forwarded-For    $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Host   $http_host;
    proxy_set_header X-Forwarded-Proto  http;

X-Forwarded-For contains the client IP address, and optionally the addresses of proxies along the way. X-Forwarded-Host contains the host name initially specified by the client, and X-Forwarded-Proto contains the original protocol scheme (hard-coded to http here since HTTPS is not configured).

(Don’t forget to reload the nginx configuration)

We also need to change the ASP.NET Core app so that it takes these headers into account. This can be done easily using the ForwardedHeaders middleware; add this code at the start of the Startup.Configure method:

app.UseForwardedHeaders(new ForwardedHeadersOptions
    ForwardedHeaders = ForwardedHeaders.All

In case you’re wondering what a middleware is, this article might help!

This middleware will read the X-Forwarded-* headers from incoming requests, and use them to modify:

  • the Host and Scheme of the request
  • the Connection.RemoteIpAddress, which contains the client IP.

This way, the app will behave as if the request was received directly from the client.

Expose the app on a specific path

Our app is now accessible at the URL http://<ip-address>/, i.e. at the root of the server. But if we want to host several applications on the Raspberry Pi, it’s going to be a problem… We could put each app on a different port, but it’s not very convenient. It would be better to have each app on its own path, e.g. with URLs like http://<ip-address>/MyWebApp/.

It’s pretty easy to do with nginx. Edit the nginx configuration again, and replace location / with location /MyWebApp/ (note the final slash, it’s important). Reload the configuration, and try to access the app at its new URL… The home page loads, but the CSS and JS scripts don’t: error 404. In addition, links to other pages are now incorrect, and point to http://<ip-address>/Something instead of http://<ip-address>/MyWebApp/Something. What’s going on?

The reason is quite simple: the app isn’t aware that it’s not served from the root of the server, and generates all its links as if it were… To fix this, we can ask nginx to pass yet another header to our app:

proxy_set_header X-Forwarded-Path   /MyWebApp;

Note that this X-Forwarded-Path header is even less standard than the other ones, since I just made it up… So of course, there’s no built-in ASP.NET Core middleware that can handle it, and we’ll need to do it ourselves. Fortunately it’s pretty easy: we just need to use the path specified in the header as the path base. In the Startup.Configure method, add this after the UseForwardHeaders statement:

// Patch path base with forwarded path
app.Use(async (context, next) =>
    var forwardedPath = context.Request.Headers["X-Forwarded-Path"].FirstOrDefault();
    if (!string.IsNullOrEmpty(forwardedPath))
        context.Request.PathBase = forwardedPath;

    await next();

Redeploy and restart the app, reload the nginx configuration, and try again: now it works!

Run the app as a service

If we want our app to be always running, restarting it manually every time it crashes or when the Raspberry Pi reboots isn’t going to be sustainable… What we want is to run it as a service, so that it starts when the system starts, and is automatically restarted if it stops working. To do this, we’ll take advantage of systemd, which manages services in most Linux distros, including Raspbian.

To create a systemd service, create a MyWebApp.service file in the /lib/systemd/system/ directory, with the following content:

Description=My ASP.NET Core Web App



(replace the name and paths to match your app of course)

Enable the service like this:

sudo systemctl enable MyWebApp

And start it like this (new services aren’t started automatically):

sudo systemctl start MyWebApp

And that’s it, your app is now monitored by systemd, which will take care of starting or restarting it as needed.


As you can see, running an ASP.NET Core 2.0 app on a Raspberry Pi is not only possible, but reasonably easy too; you just need a bit of fiddling with headers and reverse proxy settings. You won’t host the next Facebook or StackOverflow on your RPi, but it’s fine for small utility applications. Just give free rein to your imagination!

Writing a GitHub Webhook as an Azure Function

I recently experimented with Azure Functions and GitHub apps, and I wanted to share what I learned.

A bit of background

As you may already know, I’m one of the maintainers of the FakeItEasy mocking library. As is common in open-source projects, we use a workflow based on feature branches and pull requests. When a change is requested in a PR during code review, we usually make the change as a fixup commit, because it makes it easier to review, and because we like to keep a clean history. When the changes are approved, the author squashes the fixup commits before the PR is merged. Unfortunately, I’m a little absent minded, and when I review a PR, I often forget to wait for the author to squash their commits before I merge… This causes the fixup commits to appear in the main dev branch, which is ugly.

Which leads me to the point of this post: I wanted to make a bot that could prevent a PR from being merged if it had commits that needed to be squashed (i.e. commits whose messages start with fixup! or squash!). And while I was at it, I thought I might as well make it usable by everyone, so I made it a GitHub app: DontMergeMeYet.

GitHub apps

Now, you might be wondering, what on Earth is a GitHub app? It’s simply a third-party application that is granted access to a GitHub repository using its own identity; what it can do with the repo depends on which permissions were granted. A GitHub app can also receive webhook notifications when events occur in the repo (e.g. a comment is posted, a pull request is opened, etc.).

A GitHub app could, for instance, react when a pull request is opened or updated, examine the PR details, and add a commit status to indicate whether the PR is ready to merge or not (this WIP app does this, but doesn’t take fixup commits into account).

As you can see, it’s a pretty good fit for what I’m trying to do!

In order to create a GitHub app, you need to go to the GitHub apps page, and click New GitHub app. You then fill in at least the name, homepage, and webhook URL, give the app the necessary permissions, and subscribe to the webhook events you need. In my case, I only needed read-only access to pull requests, read-write access to commit statuses, and to receive pull request events.

At this point, we don’t yet have an URL for the webhook, so enter any valid URL; we’ll change it later after we actually implemented the app.

Azure Functions

I hadn’t paid much attention to Azure Functions before, because I didn’t really see the point. So I started to implement my webhook as a full-blown ASP.NET Core app, but then I realized several things:

  • My app only had a single HTTP endpoint
  • It was fully stateless and didn’t need a database
  • If I wanted the webhook to always respond quickly, the Azure App Service had to be "always on"; that option isn’t available in free plans, and I didn’t want to pay a fortune for a better service plan.

I looked around and realized that Azure Functions had a "consumption plan", with a generous amount (1 million per month) of free requests before I had to pay anything, and functions using this plan are "always on". Since I had a single endpoint and no persistent state, an Azure Function seemed to be the best fit for my requirements.

Interestingly, Azure Functions can be triggered, among other things, by GitHub webhooks. This is very convenient as it takes care of validating the payload signature.

So, Azure Functions turn out to be a perfect match for implementing my webhook. Let’s look at how to create one.

Creating an Azure Function triggered by a GitHub webhook

It’s possible to write Azure functions in JavaScript, C# (csx) or F# directly in the portal, but I wanted the comfort of the IDE, so I used Visual Studio. To write an Azure Function in VS, follow the instructions on this page. When you create the project, a dialog appears to let you choose some options:

New function dialog

  • version of the Azure Functions runtime: v1 targets the full .NET Framework, v2 targets .NET Core. I picked v1, because I had trouble with the dependencies in .NET Core.
  • Trigger: GitHub webhooks don’t appear here, so just pick "HTTP Trigger", we’ll make the necessary changes in the code.
  • Storage account: pick the storage emulator; when you publish the function, a real Azure storage account will be set instead
  • Access rights: it doesn’t matter what you pick, we’ll override it in the code.

The project template creates a class named Function1 with a Run method that looks like this:

public static class Function1
    public static async Task<HttpResponseMessage> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)]HttpRequestMessage req, TraceWriter log)

Rename the class to something that makes more sense, e.g. GitHubWebHook, and don’t forget to change the name in the FunctionName attribute as well.

Now we need to tell the Azure Functions runtime that this function is triggered by a GitHub webhook. To do this, change the method signature to look like this:

    public static async Task<HttpResponseMessage> Run(
        [HttpTrigger("POST", WebHookType = "github")] HttpRequestMessage req,
        TraceWriter log)

GitHub webhooks always use the HTTP POST method; the WebHookType property is set to "github" to indicate that it’s a GitHub webhook.

Note that it doesn’t really matter what we respond to the webhook request; GitHub doesn’t do anything with the response. I chose to return a 204 (No content) response, but you can return a 200 or anything else, it doesn’t matter.

Publishing the Azure Function

To publish your function, just right click on the Function App project, and click Publish. This will show a wizard that will let you create a new Function App resource on your Azure subscription, or select an existing one. Not much to explain here, it’s pretty straightforward; just follow the wizard!

When the function is published, you need to tell GitHub how to invoke it. Open the Azure portal in your browser, navigate to your new Function App, and select the GitHubWebHook function. This will show the content of the (generated) function.json file. Above the code view, you will see two links: Get function URL, and Get GitHub secret:

Azure Function URL and secret

You need to copy the URL to the Webhook URL field in the GitHub app settings, and copy the secret to the Webhook secret field. This secret is used to calculate a signature for webhook payloads, so that the Azure Function can ensure the payloads really come from GitHub. As I mentioned earlier, this verification is done automatically when you use a GitHub HTTP trigger.

And that’s it, your webhook is online! Now you can go install the GitHub app into one of your repositories, and your webhook will start receiving events for this repo.

Points of interest

I won’t describe the whole implementation of my webhook in this post, because it would be too long and most of it isn’t that interesting, but I will just highlight a few points of interest. You can find the complete code on GitHub.

Parsing the payload

Rather than reinventing the wheel, we can leverage the Octokit .NET library. Octokit is a library made by GitHub to consume the GitHub REST API. It contains classes representing the entities used in the API, including webhook payloads, so we can just deserialize the request content as a PullRequestEventPayload. However, if we just try to do this with JSON.NET, this isn’t going to work: Octokit doesn’t use JSON.NET, so the classes aren’t decorated with JSON.NET attributes to map the C# property names to the JSON property names. Instead, we need to use the JSON serializer that is included in Octokit, called SimpleJsonSerializer:

private static async Task<PullRequestEventPayload> DeserializePayloadAsync(HttpContent content)
    string json = await content.ReadAsStringAsync();
    var serializer = new SimpleJsonSerializer();
    return serializer.Deserialize<PullRequestEventPayload>(json);

There’s also another issue: the PullRequestEventPayload from Octokit is missing the Installation property, which we’re going to need later to authenticate with the GitHub API. An easy workaround is to make a new class that inherits PullRequestEventPayload and add the new property:

public class PullRequestPayload : PullRequestEventPayload
    public Installation Installation { get; set; }

public class Installation
    public int Id { get; set; }

And we’ll just use PullRequestPayload instead of PullRequestEventPayload.

Authenticating with the GitHub API

We’re going to need to call the GitHub REST API for two things:

  • to get the list of commits in the pull request
  • to update the commit status

In order to access the API, we’re going to need credentials… but which credentials? We could just generate a personal access token and use that, but then we would access the API as a "real" GitHub user, and we would only be able to access our own repositories (for writing, at least).

As I mentioned earlier, GitHub apps have their own identity. What I didn’t say is that when authenticated as themselves, there isn’t much they’re allowed to do: they can only get management information about themselves, and get a token to authenticate as an installation. An installation is, roughly, an instance of the application that is installed on one or more repo. When someone installs your app on their repo, it creates an installation. Once you get a token for an installation, you can access all the APIs allowed by the app’s permissions on the repos it’s installed on.

However, there are a few hoops to jump through to get this token… This page describes the process in detail.

The first step is to generate a JSON Web Token (JWT) for the app. This token has to contain the following claims:

  • iat: the timestamp at which the token was issued
  • exp: the timestamp at which the token expires
  • iss: the issuer, which is actually the app ID (found in the GitHub app settings page)

This JWT needs to be signed with the RS256 algorithm (RSA signature with SHA256); in order to sign it, you need a private key, which must be generated from the GitHub app settings page. You can download the private key in PEM format, and store it somewhere your app can access it. Unfortunately, the .NET APIs to generate and sign a JWT don’t handle the PEM format, they need an RSAParameters object… But Stackoverflow is our friend, and this answer contains the code we need to convert a PEM private key to an RSAParameters object. I just kept the part I needed, and manually reformatted the PEM private key to remove the header, footer, and newlines, so that it could easily be stored in the settings as a single line of text.

Once you have the private key as an RSAParameters object, you can generate a JWT like this:

public string GetTokenForApplication()
    var key = new RsaSecurityKey(_settings.RsaParameters);
    var creds = new SigningCredentials(key, SecurityAlgorithms.RsaSha256);
    var now = DateTime.UtcNow;
    var token = new JwtSecurityToken(claims: new[]
            new Claim("iat", now.ToUnixTimeStamp().ToString(), ClaimValueTypes.Integer),
            new Claim("exp", now.AddMinutes(10).ToUnixTimeStamp().ToString(), ClaimValueTypes.Integer),
            new Claim("iss", _settings.AppId)
        signingCredentials: creds);

    var jwt = new JwtSecurityTokenHandler().WriteToken(token);
    return jwt;

A few notes about this code:

  • It requires the following NuGet packages:
    • Microsoft.IdentityModel.Tokens 5.2.1
    • System.IdentityModel.Tokens.Jwt 5.2.1
  • ToUnixTimeStamp is an extension method that converts a DateTime to a UNIX timestamp; you can find it here
  • As per the GitHub documentation, the token lifetime cannot exceed 10 minutes

Once you have the JWT, you can get an installation access token by calling the "new installation token" API endpoint. You can authenticate to this endpoint by using the generated JWT as a Bearer token

public async Task<string> GetTokenForInstallationAsync(int installationId)
    var appToken = GetTokenForApplication();
    using (var client = new HttpClient())
        string url = $"{installationId}/access_tokens";
        var request = new HttpRequestMessage(HttpMethod.Post, url)
            Headers =
                Authorization = new AuthenticationHeaderValue("Bearer", appToken),
                UserAgent =
                Accept =
        using (var response = await client.SendAsync(request))
            var json = await response.Content.ReadAsStringAsync();
            var obj = JObject.Parse(json);
            return obj["token"]?.Value<string>();

OK, almost there. Now we just need to use the installation token to call the GitHub API. This can be done easily with Octokit:

private IGitHubClient CreateGitHubClient(string installationToken)
    var userAgent = new ProductHeaderValue("DontMergeMeYet");
    return new GitHubClient(userAgent)
        Credentials = new Credentials(installationToken)

And that’s it, you can now call the GitHub API as an installation of your app.

Note: the code above isn’t exactly what you’ll find in the repo; I simplified it a little for the sake of clarity.

Testing locally using ngrok

When creating your Azure Function, it’s useful to be able to debug on your local machine. However, how will GitHub be able to call your function if it doesn’t have a publicly accessible URL? The answer is a tool called ngrok. Ngrok can create a temporary host name that forwards all traffic to a port on your local machine. To use it, create an account (it’s free) and download the command line tool. Once logged in to the ngrok website, a page will give you the command to save an authentication token on your machine. Just execute this command:

ngrok authtoken 1beErG2VTJJ0azL3r2SBn_2iz8johqNv612vaXa3Rkm

Start your Azure Function in debug from Visual Studio; the console will show you the local URL of the function, something like http://localhost:7071/api/GitHubWebHook. Note the port, and in a new console, start ngrok like this:

ngrok http 7071 --host-header rewrite

This will create a new hostname and start forwarding traffic to the 7071 port on your machine. The --host-header rewrite argument causes ngrok to change the Host HTTP header to localhost, rather than the temporary hostname; Azure Functions don’t work correctly without this.

You can see the temporary hostname in the command output:

ngrok by @inconshreveable                                                                                                                                                                                         (Ctrl+C to quit)

Session Status                online
Account                       Thomas Levesque (Plan: Free)
Version                       2.2.8
Region                        United States (us)
Web Interface       
Forwarding           -> localhost:7071
Forwarding           -> localhost:7071

Connections                   ttl     opn     rt1     rt5     p50     p90
                              0       0       0.00    0.00    0.00    0.00

Finally, go to the GitHub app settings, and change the webook URL to (i.e. the temporary domain with the same path as the local URL).

Now you’re all set. GitHub will send the webhook payloads to ngrok, which will forward them to your app running locally.

Note that unless you have a paid plan for ngrok, the temporary subdomain changes every time you start the tool, which is annoying. So it’s better to keep it running for the whole development session, otherwise you will need to change the GitHub app settings again.


Hopefully you learned a few things from this article. With Azure Functions, it’s almost trivial to implement a GitHub webhook (the only tricky part is the authentication to call the GitHub API, but not all webhooks need it). It’s much lighter than a full-blown web app, and much simpler to write: you don’t have to care about MVC, routing, services, etc. And if it wasn’t enough, the pricing model for Azure Functions make it a very cheap option for hosting a webhook!

Understanding the ASP.NET Core middleware pipeline


The ASP.NET Core architecture features a system of middleware, which are pieces of code that handle requests and responses. Middleware are chained to each other to form a pipeline. Incoming requests are passed through the pipeline, where each middleware has a chance to do something with them before passing them to the next middleware. Outgoing responses are also passed through the pipeline, in reverse order. If this sounds very abstract, the following schema from the official ASP.NET Core documentation should help you understand:

Middleware pipeline

Middleware can do all sort of things, such as handling authentication, errors, static files, etc… MVC in ASP.NET Core is also implemented as a middleware.

Configuring the pipeline

You typically configure the ASP.NET pipeline in the Configure method of your Startup class, by calling Use* methods on the IApplicationBuilder. Here’s an example straight from the docs:

public void Configure(IApplicationBuilder app)

Each Use* method adds a middleware to the pipeline. The order in which they’re added determines the order in which requests will traverse them. So an incoming request will first traverse the exception handler middleware, then the static files middleware, then the authentication middleware, and will eventually be handled by the MVC middleware.

The Use* methods in this example are actually just "shortcuts" to make it easier to build the pipeline. Behind the scenes, they all end up using (directly or indirectly) these low-level primitives: Use and Run. Both add a middleware to the pipeline, the difference is that Run adds a terminal middleware, i.e. a middleware that is the last in the pipeline.

A basic pipeline with no branches

Let’s look at a simple example, using only the Use and Run primitives:

public void Configure(IApplicationBuilder app)
    // Middleware A
    app.Use(async (context, next) =>
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");

    // Middleware B
    app.Use(async (context, next) =>
        Console.WriteLine("B (before)");
        await next();
        Console.WriteLine("B (after)");

    // Middleware C (terminal)
    app.Run(async context =>
        await context.Response.WriteAsync("Hello world");

Here, each middleware is defined inline as an anonymous method; they could also be defined as full-blown classes, but for this example I picked the more concise option. Non-terminal middleware take two arguments: the HttpContext and a delegate to call the next middleware. Terminal middleware only take the HttpContext. Here we have two middleware A and B that just log to the console, and a terminal middleware C which writes the response. Here’s the console output when we send a request to our app:

A (before)
B (before)
B (after)
A (after)

We can see that each middleware was traversed in the order in which it was added, then traversed again in reverse order. The pipeline can be represented like this:

Basic pipeline

Short-circuiting middleware

A middleware doesn’t necessarily have to call the next middleware. For instance, if the static files middleware can handle a request, it doesn’t need to pass it down to the rest of the pipeline, it can respond immediately. This behavior is called short-circuiting the pipeline.

In the previous example, if we comment out the call to next() in middleware B, we get the following output:

A (before)
B (before)
B (after)
A (after)

As you can see, middleware C is never invoked. The pipeline now looks like this:

Short-circuited pipeline

Branching the pipeline

In the previous examples, there was only one "branch" in the pipeline: the middleware coming after A was always B, and the middleware coming after B was always C. But it doesn’t have to be that way. You might want a given request to be processed by a completely different pipeline, based on the path or anything else.

There are two types of branches: branches that rejoin the main pipeline, and branches that don’t.

Making a non-rejoining branch

This can be done using the Map or MapWhen method. Map lets you specify a branch based on the request path. MapWhen gives you more control: you can specify a predicate on the HttpContext to decide whether to branch or not. Let’s look at a simple example using Map:

public void Configure(IApplicationBuilder app)
    app.Use(async (context, next) =>
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");

        new PathString("/foo"),
        a => a.Use(async (context, next) =>
            Console.WriteLine("B (before)");
            await next();
            Console.WriteLine("B (after)");

    app.Run(async context =>
        await context.Response.WriteAsync("Hello world");

The first argument for Map is a PathString representing the path prefix of the request. The second argument is a delegate that configures the branch’s pipeline (the a parameter represents the IApplicationBuilder for the branch). The branch defined by the delegate will process the request if its path starts with the specified path prefix.

For a request that doesn’t start with /foo, this code produces the following output:

A (before)
A (after)

Middleware B is not invoked, since it’s in the branch and the request doesn’t match the prefix for the branch. But for a request that does start with /foo, we get the following output:

A (before)
B (before)
B (after)
A (after)

Note that this request returns a 404 (Not found) response: this is because the B middleware calls next(), but there’s no next middleware, so it falls back to returning a 404 response. To solve this, we could use Run instead of Use, or just not call next().

The pipeline defined by this code can be represented as follows:

Non-rejoining branch

(I omited the response arrows for clarity)

As you can see, the branch with middleware B doesn’t rejoin the main pipeline, so middleware C isn’t called.

Making a rejoining branch

You can make a branch that rejoins the main pipeline by using the UseWhen method. This method accepts a predicate on the HttpContext to decide whether to branch or not. The branch will rejoin the main pipeline where it left it. Here’s an example similar to the previous one, but with a rejoining branch:

public void Configure(IApplicationBuilder app)
    app.Use(async (context, next) =>
        Console.WriteLine("A (before)");
        await next();
        Console.WriteLine("A (after)");

        context => context.Request.Path.StartsWithSegments(new PathString("/foo")),
        a => a.Use(async (context, next) =>
            Console.WriteLine("B (before)");
            await next();
            Console.WriteLine("B (after)");

    app.Run(async context =>
        await context.Response.WriteAsync("Hello world");

For a request that doesn’t start with /foo, this code produces the same output as the previous example:

A (before)
A (after)

Again, middleware B is not invoked, since it’s in the branch and the request doesn’t match the predicate for the branch. But for a request that does start with /foo, we get the following output:

A (before)
B (before)
B (after)
A (after)

We can see that the request passes trough the branch (middleware B), then goes back to the main pipeline, ending with middleware C. This pipeline can be represented like this:

Rejoining branch

Note that there is no Use method that accepts a PathString to specify the path prefix. I’m not sure why it’s not included, but it would be easy to write, using UseWhen:

public static IApplicationBuilder Use(this IApplicationBuilder builder, PathString pathMatch, Action<IApplicationBuilder> configuration)
    return builder.UseWhen(
        context => context.Request.Path.StartsWithSegments(pathMatch),


As you can see, the idea behind the middleware pipeline is quite simple, but it’s very powerful. Most of the features baked in ASP.NET Core (authentication, static files, caching, MVC, etc) are implemented as middleware. And of course, it’s easy to write your own!

Cleanup Git history to remove unwanted files

I recently had to work with a Git repository whose modifications needed to be ported to another repo. Unfortunately, the repo had been created without a .gitignore file, so a lot of useless files (bin/obj/packages directories…) had been commited. This made the history hard to follow, because each commit had hundreds of modified files.

Fortunately, it’s rather easy with Git to cleanup a branch, by recreating the same commits without the files that shouldn’t have been there in the first place. Let’s see step by step how this can be achieved.

A word of caution

The operation we need to perform here involves rewriting the history of the branch, so a warning is in order: never rewrite the history of a branch that is published and shared with other people. If someone else bases works on the existing branch and the branch has its history rewritten, it will become much harder to integrate their commits into the rewritten branch.

In my case, it didn’t matter, because I didn’t need to publish the rewritten branch (I just wanted to examine it locally). But don’t do this on a branch your colleagues are currently working on, unless you want them to hate you 😉.

Create a working branch

Since we’re going to make pretty drastic and possibly risky changes on the repo, we’d better be cautious. The easiest way to avoid causing damage to the original branch is, of course, to work on a separate branch. So, assuming the branch we want to cleanup is master, let’s create a master2 working branch:

git checkout -b master2 master

Identify the files to remove

Before we start to cleanup, we need to identify what needs to be cleaned up. In a typical .NET project, it’s usually the contents of the bin and obj directories (wherever they’re located) and the packages directory (usually at the root of the solution). Which gives us the following patterns to remove:

  • **/bin/**
  • **/obj/**
  • packages/**

Cleanup the branch: the git filter-branch command

The Git command that will let us remove unwanted files is named filter-branch. It’s described in the Pro Git book Pro Git as the "nuclear option", because it’s very powerful and possibly destructive, so it should be used with great caution.

This command works by taking each commit in the branch, applying a filter to it, and recommiting it with the changes caused by the filter. There are several kinds of filter, for instance:

  • --msg-filter : a filter that can rewrite commit messages
  • --tree-filter : a filter that applies to the working tree (causes each commit to be checked out, so it can take a while on a large repo)
  • --index-filter : a filter that applies to the index (doesn’t require checking out each commit, hence faster)

In our current scenario, --index-filter is perfectly adequate, since we only need to filter files based on their path. The filter-branch command with this kind of filter can be used as follows:

git filter-branch --index-filter '<command>'

<command> is a bash command that will be executed for each commit on the branch. In our case, it will be a call to git rm to remove unwanted files from the index:

git filter-branch --index-filter 'git rm --cached --ignore-unmatch **/bin/** **/obj/** packages/**'

The --cached parameter means we’re working on the index rather than on the working tree; --ignore-unmatch tells Git to ignore patterns that don’t match any file. By default, the command only applies to the current branch.

If the branch has a long history, this command can take a while to run, so be patient… Once it’s done, you should have a branch with the same commits as the original branch, but without the unwanted files.

More complex cases

In the example above, there were only 3 file patterns to remove, so the command was short enough to be written inline. But if there are many patterns, of if the logic to remove files is more complex, it doesn’t really work out… In this case, you can just write a (bash) script that contains the necessary commands, and use this script as the command passed to git filter-branch --index-filter.

Cleanup from a specific commit

In the previous example, we applied filter-branch to the whole branch. But it’s also possible to apply it only from a given commit, by specifying a revision range:

git filter-branch --index-filter '<command>' <ref>..HEAD

Here <ref> is a commit reference (SHA1, branch or tag). Note that the end of the range has to be HEAD, i.e. the tip of the current branch: you can’t rewrite the beginning or middle of a branch without touching the following commits, since each commit’s SHA1 depends on the previous commit.

Better timeout handling with HttpClient

The problem

If you often use HttpClient to call REST APIs or to transfer files, you may have been annoyed by the way this class handles request timeout. There are two major issues with timeout handling in HttpClient:

  • The timeout is defined at the HttpClient level and applies to all requests made with this HttpClient; it would be more convenient to be able to specify a timeout individually for each request.
  • The exception thrown when the timeout is elapsed doesn’t let you determine the cause of the error. When a timeout occurs, you’d expect to get a TimeoutException, right? Well, surprise, it throws a TaskCanceledException! So, there’s no way to tell from the exception if the request was actually canceled, or if a timeout occurred.

Fortunately, thanks to HttpClient‘s flexibility, it’s quite easy to make up for this design flaw.

So we’re going to implement a workaround for these two issues. Let’s recap what we want:

  • the ability to specify timeout on a per-request basis
  • to receive a TimeoutException rather than a TaskCanceledException when a timeout occurs.

Specifying the timeout on a per-request basis

Let’s see how we can associate a timeout value to a request. The HttpRequestMessage class has a Properties property, which is a dictionary in which we can put whatever we need. We’re going to use this to store the timeout for a request, and to make things easier, we’ll create extension methods to access the value in a strongly-typed fashion:

public static class HttpRequestExtensions
    private static string TimeoutPropertyKey = "RequestTimeout";

    public static void SetTimeout(
        this HttpRequestMessage request,
        TimeSpan? timeout)
        if (request == null)
            throw new ArgumentNullException(nameof(request));

        request.Properties[TimeoutPropertyKey] = timeout;

    public static TimeSpan? GetTimeout(this HttpRequestMessage request)
        if (request == null)
            throw new ArgumentNullException(nameof(request));

        if (request.Properties.TryGetValue(
                out var value)
            && value is TimeSpan timeout)
            return timeout;
        return null;

Nothing fancy here, the timeout is an optional value of type TimeSpan. We can now associate a timeout value with a request, but of course, at this point there’s no code that makes use of the value…

HTTP handler

The HttpClient uses a pipeline architecture: each request is sent through a chain of handlers (of type HttpMessageHandler), and the response is passed back through these handlers in reverse order. This article explains this in greater detail if you want to know more. We’re going to insert our own handler into the pipeline, which will be in charge of handling timeouts.

Our handler is going to inherit DelegatingHandler, a type of handler designed to be chained to another handler. To implement a handler, we need to override the SendAsync method. A minimal implementation would look like this:

class TimeoutHandler : DelegatingHandler
    protected async override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
        return await base.SendAsync(request, cancellationToken);

The call to base.SendAsync just passes the request to the next handler. Which means that at this point, our handler does absolutely nothing useful, but we’re going to augment it gradually.

Taking into account the timeout for a request

First, let’s add a DefaultTimeout property to our handler; it will be used for requests that don’t have their timeout explicitly set:

public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(100);

The default value of 100 seconds is the same as that of HttpClient.Timeout.

To actually implement the timeout, we’re going to get the timeout value for the request (or DefaultTimeout if none is defined), create a CancellationToken that will be canceled after the timeout duration, and pass this CancellationToken to the next handler: this way, the request will be canceled after the timout is elapsed (this is actually what HttpClient does internally, except that it uses the same timeout for all requests).

To create a CancellationToken whose cancellation we can control, we need a CancellationTokenSource, which we’re going to create based on the request’s timeout:

private CancellationTokenSource GetCancellationTokenSource(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
    var timeout = request.GetTimeout() ?? DefaultTimeout;
    if (timeout == Timeout.InfiniteTimeSpan)
        // No need to create a CTS if there's no timeout
        return null;
        var cts = CancellationTokenSource
        return cts;

Two points of interest here:

  • If the request’s timeout is infinite, we don’t create a CancellationTokenSource; it would never be canceled, so we save a useless allocation.
  • If not, we create a CancellationTokenSource that will be canceled after the timeout is elapsed (CancelAfter). Note that this CTS is linked to the CancellationToken we receive as a parameter in SendAsync: this way, it will be canceled either when the timeout expires, or when the CancellationToken parameter will itself be canceled. You can get more details on linked cancellation tokens in this article.

Finally, let’s change the SendAsync method to use the CancellationTokenSource we created:

protected async override Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
    using (var cts = GetCancellationTokenSource(request, cancellationToken))
        return await base.SendAsync(
            cts?.Token ?? cancellationToken);

We get the CTS and pass its token to base.SendAsync. Note that we use cts?.Token, because GetCancellationTokenSource can return null; if that happens, we use the cancellationToken parameter directly.

At this point, we have a handler that lets us specify a different timeout for each request. But we still get a TaskCanceledException when a timeout occurs… Well, this is going to be easy to fix!

Throwing the correct exception

All we need to do is catch the TaskCanceledException (or rather its base class, OperationCanceledException), and check if the cancellationToken parameter is canceled: if it is, the cancellation was caused by the caller, so we let it bubble up normally; if not, this means the cancellation was caused by the timeout, so we throw a TimeoutException. Here’s the final SendAsync method:

protected async override Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request,
    CancellationToken cancellationToken)
    using (var cts = GetCancellationTokenSource(request, cancellationToken))
            return await base.SendAsync(
                cts?.Token ?? cancellationToken);
            when (!cancellationToken.IsCancellationRequested)
            throw new TimeoutException();

Note that we use an exception filter : this way we don’t actually catch the OperationException when we want to let it propagate, and we avoid unnecessarily unwinding the stack.

Our handler is done, now let’s see how to use it.

Using the handler

When creating an HttpClient, it’s possible to specify the first handler of the pipeline. If none is specified, an HttpClientHandler is used; this handler sends requests directly to the network. To use our new TimeoutHandler, we’re going to create it, attach an HttpClientHandler as its next handler, and pass it to the HttpClient:

var handler = new TimeoutHandler
    InnerHandler = new HttpClientHandler()

using (var client = new HttpClient(handler))
    client.Timeout = Timeout.InfiniteTimeSpan;

Note that we need to disable the HttpClient‘s timeout by setting it to an infinite value, otherwise the default behavior will interfere with our handler.

Now let’s try to send a request with a timeout of 5 seconds to a server that takes to long to respond:

var request = new HttpRequestMessage(HttpMethod.Get, "http://foo/");
var response = await client.SendAsync(request);

If the server doesn’t respond within 5 seconds, we get a TimeoutException instead of a TaskCanceledException, so things seem to be working as expected.

Let’s now check that cancellation still works correctly. To do this, we pass a CancellationToken that will be cancelled after 2 seconds (i.e. before the timeout expires):

var request = new HttpRequestMessage(HttpMethod.Get, "http://foo/");
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
var response = await client.SendAsync(request, cts.Token);

This time, we receive a TaskCanceledException, as expected.

By implementing our own HTTP handler, we were able to solve the initial problem and have a smarter timeout handling.

The full code for this article is available here.

Transform T4 templates as part of the build, and pass variables from the project

T4 (Text Template Transformation Toolkit) is a great tool to generate code at design time; you can, for instance, create POCO classes from database tables, generate repetitive code, etc. In Visual Studio, T4 files (.tt extension) are associated with the TextTemplatingFileGenerator custom tool, which transforms the template to generate an output file every time you save the template. But sometimes it’s not enough, and you want to ensure that the template’s output is regenerated before build. It’s pretty easy to set this up, but there are a few gotchas to be aware of.

Transforming templates at build time

If your project is a classic csproj or vbproj (i.e. not a .NET Core SDK-style project), things are actually quite simple and well documented on this page.

Unload your project, and open it in the editor. Add the following PropertyGroup near the beginning of the file:

    <!-- 15.0 is for VS2017, adjust if necessary -->
    <VisualStudioVersion Condition="'$(VisualStudioVersion)' == ''">15.0</VisualStudioVersion>
    <VSToolsPath Condition="'$(VSToolsPath)' == ''">$(MSBuildExtensionsPath32)\Microsoft\VisualStudio\v$(VisualStudioVersion)</VSToolsPath>
    <!-- This is what will cause the templates to be transformed when the project is built (default is false) -->
    <!-- Set to true to force overwriting of read-only output files, e.g. if they're not checked out (default is false) -->
    <!-- Set to false to transform files even if the output appears to be up-to-date (default is true)  -->

And add the following Import at the end, after the import of Microsoft.CSharp.targets or Microsoft.VisualBasic.targets:

<Import Project="$(VSToolsPath)\TextTemplating\Microsoft.TextTemplating.targets" />

Reload your project, and you’re done. Building the project should now transform the templates and regenerate their output.

SDK-style projects

If you’re using the new project format that comes with the .NET Core SDK (sometimes informally called "SDK-style project"), the approach described above will need a small change to work. This is because the default targets file (Sdk.targets in the .NET Core SDK) is now imported implicitly at the very end of the project, so you can’t import the text templating targets after the default targets. This causes the BuildDependsOn variable, which is modified by the T4 targets, to be overwritten, so the TransformAll target doesn’t run before the Build target.

Fortunately, there’s a workaround: you can import the default targets file explicitly, and import the text templating targets after that:

<Import Project="Sdk.targets" Sdk="Microsoft.NET.Sdk" />
<Import Project="$(VSToolsPath)\TextTemplating\Microsoft.TextTemplating.targets" />

Note that it will cause a MSBuild warning in the build output (MSB4011) because Sdk.targets is imported twice; you can safely ignore this warning.

Passing MSBuild variables to templates

At some point, the code generation logic might become too complex to remain entirely in the T4 template file. You might want to extract some of it into a helper assembly, and reference this assembly from the template, like this:

<#@ assembly name="../CodeGenHelper/bin/Release/net462/CodeGenHelper.dll" #>

Of course, specifying the path like this isn’t very very convenient… For instance, if you’re currently in Debug configuration, the Release version of CodeGenHelper.dll might be out of date. Fortunately, Visual Studio’s TextTemplatingFileGenerator custom tool recognizes MSBuild variables from the project, so you can do this instead:

<#@ assembly name="$(SolutionDir)/CodeGenHelper/bin/$(Configuration)/net462/CodeGenHelper.dll" #>

The $(SolutionDir) and $(Configuration) variables will be expanded to their actual values. If you save the template, the template will be transformed using the CodeGenHelper.dll assembly. Nice!

However, there’s a catch… if you configured the project to transform templates on build as described above, the build will now fail, with an error like this:

System.IO.FileNotFoundException: Could not find a part of the path ‘C:\Path\To\The\Project\$(SolutionDir)\CodeGenHelper\bin\$(Configuration)\net462\CodeGenHelper.dll’.

Notice the $(SolutionDir) and $(Configuration) variables in the path? They were not expanded! This is because the MSBuild target that transforms the templates and the TextTemplatingFileGenerator custom tool don’t use the same text transformation engine. And unfortunately, the one used by MSBuild doesn’t recognize MSBuild properties out of the box… Ironic, isn’t it?

All is not lost, though. All you have to do is explicitly specify the variables you want to pass as T4 parameters. Edit your project file again, and create a new ItemGroup with the following items:

    <T4ParameterValues Include="SolutionDir">
    <T4ParameterValues Include="Configuration">

The Include attribute is the name of the parameter as it will be passed to the text transformation engine. The Value element is, well, the value. And the Visible element prevents the T4ParameterValues item from appearing under the project in the solution explorer.

With this change, the build should now successfully transform the templates again.

So, just keep in mind that the TextTemplatingFileGenerator custom tool and the MSBuild text transformation target have different mechanisms for passing variables:

  • TextTemplatingFileGenerator supports only MSBuild variables from the project
  • MSBuild supports only T4ParameterValues

So if you use variables in your template and you want to be able to transform it when you save the template in Visual Studio and when you build the project, the variables have to be defined both as MSBuild variables and as T4ParameterValues.

Common MSBuild properties and items with Directory.Build.props

To be honest, I never really liked MSBuild until recently. The project files generated by Visual Studio were a mess, most of their content was redundant, you had to unload the projects to edit them, it was poorly documented… But with the advent of .NET Core and the new "SDK-style" projects, it’s become much, much better.

MSBuild 15 introduced a pretty cool feature: implicit imports (I don’t know if it’s the official name, but I’ll use it anyway). Basically, you can create a file named Directory.Build.props anywhere in your repo, and it will be automatically imported by any project under the directory containing this file. This makes it very easy to share common properties and items across projects. This feature is described in details in this documentation page.

For instance, if you want to share some metadata across multiple projects, just write a Directory.Build.props file in the parent directory of your projects:


    <Authors>John Doe</Authors>


You can also do more interesting things like enabling and configuring StyleCop for all your projects:


    <!-- Common ruleset shared by all projects -->

    <!-- Add reference to StyleCop analyzers to all projects  -->
    <PackageReference Include="StyleCop.Analyzers" Version="1.0.2" />
    <!-- Common StyleCop configuration -->
    <AdditionalFiles Include="$(MSBuildThisFileDirectory)stylecop.json" />


Note that the $(MSBuildThisFileDirectory) variable refers to the directory containing the current MSBuild file. Another useful variable is $(MSBuildProjectDirectory), which refers to the directory containing the project being built.

MSBuild looks for the Directory.Build.props file starting from the project directory and going up until it finds a matching file, then it stops looking. In some cases you might want to define some properties for all projects in your repo, and add some more properties in a subdirectory. To do this, the "inner" Directory.Build.props file will need to explicitly import the "outer" one:

  • (rootDir)/

  <!-- Properties common to all projects -->
  <!-- ... -->
  • (rootDir)/tests/

  <!-- Import parent -->
  <Import Project="../Directory.Build.props" />

  <!-- Properties common to all test projects -->
  <!-- ... -->

The documentation mentions another approach, using the GetPathOfFileAbove function, but it didn’t seem to work when I tried… Anyway, I think using a relative path is easier to get right.

Using implicit imports brings the following benefits:

  • smaller project files, since common properties and items can be factored to common properties files.
  • single point of truth: if all projects reference the same package, the version to reference is defined in a single place; no more inconsistencies!

It also has a drawback: Visual Studio doesn’t care about where a property or item comes from, so if you change a property or a package reference from the IDE (using the project properties pages or NuGet Package Manager), it will be changed in the project file itself, rather than the Directory.Build.props file. The way I see it, it’s not a major issue, because I got into the habit of editing the projects manually rather than using the IDE features, but it might be annoying for some people.

If you want a real-world example of this technique in action, have a look at the FakeItEasy repository, where we use multiple Directory.Build.props files to keep the project files nice and clean.

Note that you can also create a Directory.Build.targets file, following the same principles, to define common build targets.

Linq performance improvements in .NET Core

By now, you’re probably aware that Microsoft released an open-source and cross-platform version of the .NET platform: .NET Core. This means you can now build and run .NET apps on Linux or macOS. This is pretty cool in itself, but it doesn’t end there: .NET Core also brings a lot of improvements to the Base Class Library.

For instance, Linq has been made faster in .NET Core. I made a little benchmark to compare the performance of some common Linq methods, and the results are quite impressive:

The full code for the benchmark can be found here. As with all microbenchmarks, it has to be taken with a grain of salt, but it gives an idea of the improvements.

Some lines in this table are quite surprising. How can Select run 5000 times almost instantly? First, we have to keep in mind that most Linq operators are lazy: they don’t actually do anything until you enumerate the result, so doing something like array.Select(i => i * i) executes in constant time (it just returns a lazy sequence, without consuming the items in array). This is why I included a call to Count() in my benchmark, to make sure the result is enumerated.

Despite this, it runs 5000 times in 413µs… This is possible due to an optimization in the .NET Core implementation of Select and Count. A useful property of Select is that it produces a sequence with the same number of items as the source sequence. In .NET Core, Select takes advantage of this. If the source is an ICollection<T> or an array, it returns a custom enumerable object that keeps track of the number of items. Count can then just retrieve this value and return it, which produces a result in constant time. The full .NET Framework implementation, on the other hand, naively enumerates the sequence produced by Select, which takes much longer.

It’s interesting to note that in this situation, .NET Core will not execute the projection specified in Select, so it’s a breaking change compared to the desktop framework for code that was relying on side effects of this projection. This has been identified as an issue which has already been fixed on the master branch, so the next release of .NET Core will execute the projection on each item.

OrderBy followed by Count() also runs almost instantly… did Microsoft invent a O(1) sorting algorithm? Unfortunately, no… The explanation is the same as for Select: since OrderBy preserves the item count, the information is recorded so that it can be used by Count, and there is no need to actually sort the input sequence.

OK, so these cases were pretty obvious improvements (which will be rolled back anyway, as mentioned above). What about the SelectAndToArray case? In this test, I call ToArray() on the result of Select, to make sure that the projection is actually performed on each item of the source sequence: no cheating this time. Still, the .NET Core version is 68% faster than the full .NET Framework version. The reason has to do with allocations: since the .NET Core implementation knows how many items are in the result of Select, it can directly allocate an array of the correct size. In the .NET Framework, this information is not available, so it starts with a small array, copies items into it until it’s full, then allocates a larger array, copies the previous array into it, copies the next items from the sequence until the array is full, and so on. This causes a lot of allocations and copies, hence the degraded performance. A few years ago, I suggested an optimized version of ToList and ToArray, where you had to specify the size. The .NET Core implementation basically does the same thing, except that you don’t have to pass the size manually, since it’s passed along the Linq method chain.

Where and WhereAndToArray are both about 8% faster on .NET Core 1.1. Looking at the code (.NET 4.6.2, .NET Core), I can’t see any obvious difference that could explain the better performance, so I suspect it’s mostly due to improvements in the runtime. In this case, ToArray doesn’t know the length of the input sequence, since there is no way to predict how many items Where will yield, so it can’t use the same optimization as with Select and has to build the array the slow way.

We already discussed OrderBy + Count(), which wasn’t a fair comparison since the .NET Core implementation didn’t actually sort the sequence. The OrderByAndToArray case is more interesting, because the sort can’t be skipped. And in this case, the .NET Core implementation is slightly slower than the .NET 4.6.2 one. I’m not sure why this is; again, the implementation is very similar, although there has been a bit of refactoring in .NET Core.

So, on the whole, Linq seems generally faster in .NET Core than in .NET 4.6.2, which is very good news. Of course, I only benchmarked a limited numbers of scenarios, but it shows the .NET Core team is working hard to optimize everything they can.

Easy text parsing in C# with Sprache

A few days ago, I discovered a little gem: Sprache. The name means "language" in German. It’s a very elegant and easy to use library to create text parsers, using parser combinators, which are a very common technique in functional programming. The theorical concept may seem a bit scary, but as you’ll see in a minute, Sprache makes it very simple.

Text parsing

Parsing text is a common task, but it can be tedious and error-prone. There are plenty of ways to do it:

  • manual parsing based on Split, IndexOf, Substring etc.
  • regular expressions
  • hand-built parser that scans the string for tokens
  • full blown parser generated with ANTLR or a similar tool
  • and probably many others…

None of these options is very appealing. For simple cases, splitting the string or using a regex can be enough, but it doesn’t scale to more complex grammars. Building a real parser by hand for non-trivial grammars is, well, non-trivial. ANTLR requires Java, a bit of knowledge, and it relies on code generation, which complicates the build process.

Fortunately, Sprache offers a very nice alternative. It provides many predefined parsers and combinators that you can use to define a grammar. Let’s walk through an example: parsing the challenge in the WWW-Authenticate header of an HTTP response (I recently had to write a parser by hand for this recently, and I wish I had known Sprache then).

The grammar

The WWW-Authenticate header is sent by an HTTP server as part of a 401 (Unauthorized) response to indicate how you should authenticate:

# Basic challenge
WWW-Authenticate: Basic realm="FooCorp"

# OAuth 2.0 challenge after sending an expired token
WWW-Authenticate: Bearer realm="FooCorp", error=invalid_token, error_description="The access token has expired"

What we want to parse is the "challenge", i.e. the value of the header. So, we have an authentication scheme (Basic, Bearer), followed by one or more parameters (name-value pairs). This looks simple enough, we could probably just split by ',' then by '=' to get the values… but the double quotes complicate things, since quoted strings could contain the ',' or '=' characters. Also, the double quotes are optional if the parameter value is a single token, so we can’t rely on the fact they will (or won’t) be there. If we want to parse this reliably, we’re going to have to look at the specs.

The WWW-Authenticate header is described in detail in RFC-2617. The grammar looks like this, in what the RFC calls "augmented Backus-Naur Form" (see RFC 2616 §2.1):

# from RFC-2617 (HTTP Basic and Digest authentication)

challenge      = auth-scheme 1*SP 1#auth-param
auth-scheme    = token
auth-param     = token "=" ( token | quoted-string )

# from RFC2616 (HTTP/1.1)

token          = 1*<any CHAR except CTLs or separators>
separators     = "(" | ")" | "<" | ">" | "@"
               | "," | ";" | ":" | "\" | <">
               | "/" | "[" | "]" | "?" | "="
               | "{" | "}" | SP | HT
quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext         = <any TEXT except <">>
quoted-pair    = "\" CHAR

So, we have a few grammar rules, let’s see how we can encode them in C# code with Sprache, and use them to parse a challenge.

Parsing tokens

Let’s start with the most simple parts of the grammar: tokens. A token is declared as one or more of any characters that are not control chars or separators.

We’ll define our rules in a Grammar class. Let’s start by defining some character classes:

static class Grammar
    private static readonly Parser<char> SeparatorChar =
        Parse.Chars("()<>@,;:\\\"/[]?={} \t");

    private static readonly Parser<char> ControlChar =
        Parse.Char(Char.IsControl, "Control character");

  • Each rule is declared as a Parser<T>; since these rules match single characters, they are of type Parser<char>.
  • The Parse class from Sprache exposes parser primitives and combinators.
  • Parse.Chars matches any character from the specified string, we use it to specify the list of separator characters.
  • The overload of Parse.Char that we use here takes a predicate that will be called to check if the character matches, and a description of the character class. Here we just use System.Char.IsControl as the predicate to match control characters.

Now, let’s define a TokenChar character class to match characters that can be part of a token. As per the RFC, this can be any character not in the previous classes:

    private static readonly Parser<char> TokenChar =
  • Parse.AnyChar, as the name implies, matches any character.
  • Except specifies exceptions to the rule.

Finally, a token is a sequence of one or more of these characters:

    private static readonly Parser<string> Token =
  • A token is a string, so the rule for a token is of type Parser<string>.
  • AtLeastOnce() means one or more repetitions, and since TokenChar is a Parser<char>, it returns a Parser<IEnumerable<char>>.
  • Text() combines the sequence of characters into a string, returning a Parser<string>.

We’re now able to parse a token. But it’s just a small step, and we still have a lot to do…

Parsing quoted strings

The grammar defines a quoted string as a sequence of:

  • an opening double quote
  • any number of either
    • a "qdtext", which is any character except a double quote
    • a "quoted pair", which is any character preceded by a backslash (this is used to escape double quotes inside a string)
  • a closing double quote

Let’s write the rules for "qdtext" and "quoted pair":

    private static readonly Parser<char> DoubleQuote = Parse.Char('"');
    private static readonly Parser<char> Backslash = Parse.Char('\\');

    private static readonly Parser<char> QdText =

    private static readonly Parser<char> QuotedPair =
        from _ in Backslash
        from c in Parse.AnyChar
        select c;

The QdText rule doesn’t require much explanation, but QuotedPair is more interesting… As you can see, it looks like a Linq query: this is Sprache’s way of specifying a sequence. This particular query means: match a backslash (named _ because we ignore it) followed by any character named c, and return just c (quoted pairs are not escape sequences in the same sense as in C, Java or C#, so "\n" isn’t interpreted as "new line" but just as "n").

We can now write the rule for a quoted string:

    private static readonly Parser<string> QuotedString =
        from open in DoubleQuote
        from text in QuotedPair.Or(QdText).Many().Text()
        from close in DoubleQuote
        select text;
  • the Or method indicates a choice between two parsers. QuotedPair.Or(QdText) will try to match a quoted pair, and if that fails, it will try to match a QdText instead.
  • Many() indicates any number of repetition
  • Text() combines the characters into a string

We now have all the basic building blocks, so we can move on to higher level rules.

Parsing challenge parameters

A challenge is made of an auth scheme followed by one or more parameters. The auth scheme is trivial (it’s just a token), so let’s start by parsing the parameters.

Although there isn’t a named rule for it in the grammar, let’s define a rule for parameter values. The value can be either a token or a quoted string:

    private static readonly Parser<string> ParameterValue =

Since a parameter is a composite element (name and value), let’s define a class to represent it:

class Parameter
    public Parameter(string name, string value)
        Name = name;
        Value = value;
    public string Name { get; }
    public string Value { get; }

The T in Parser<T> isn’t restricted to characters and strings, it can be any type. So the rule for parsing parameters will be of type Parser<Parameter>:

    private static readonly Parser<char> EqualSign = Parse.Char('=');

    private static readonly Parser<Parameter> Parameter =
        from name in Token
        from _ in EqualSign
        from value in ParameterValue
        select new Parameter(name, value);

Here we match a token (the parameter name), followed by the '=' sign, followed by a parameter value, and we combine the name and value into a Parameter instance.

Now let’s parse a sequence of one or more parameters. Parameters are separated by commas (','), with optional leading and trailing whitespace (look for "#rule" in RFC 2616 §2.1). The grammar for lists allows several commas without items in between, e.g. "item1 ,, item2,item3, ,item4", so the rule for the delimiter can be written like this:

    private static readonly Parser<char> Comma = Parse.Char(',');

    private static readonly Parser<char> ListDelimiter =
        from leading in Parse.WhiteSpace.Many()
        from c in Comma
        from trailing in Parse.WhiteSpace.Or(Comma).Many()
        select c;

We just match the first comma, the rest can be any number of commas or whitespace characters. We return the comma because we have to return something, but we won’t actually use it.

We could now match the sequence of parameters like this:

    private static readonly Parser<Parameter[]> Parameters =
        from first in Parameter.Once()
        from others in (
            from _ in ListDelimiter
            from p in Parameter
            select p).Many()
        select first.Concat(others).ToArray();

But it’s not very straightforward… fortunately Sprache provides an easier option with the DelimitedBy method:

    private static readonly Parser<Parameter[]> Parameters =
        from p in Parameter.DelimitedBy(ListDelimiter)
        select p.ToArray();

Parsing the challenge

We’re almost done. We now have everything we need to parse the whole challenge. Let’s define a class to represent it first:

class Challenge
    public Challenge(string scheme, Parameter[] parameters)
        Scheme = scheme;
        Parameters = parameters;
    public string Scheme { get; }
    public Parameter[] Parameters { get; }

And finally we can write the top-level rule:

    public static readonly Parser<Challenge> Challenge =
        from scheme in Token
        from _ in Parse.WhiteSpace.AtLeastOnce()
        from parameters in Parameters
        select new Challenge(scheme, parameters);

Note that I made this rule public, unlike the others: it’s the only one we need to expose.

Using the parser

Our parser is done, now we just have to use it, which is pretty straightforward:

void ParseAndPrintChallenge(string input)
    var challenge = Grammar.Challenge.Parse(input);
    Console.WriteLine($"Scheme: {challenge.Scheme}");
    foreach (var p in challenge.Parameters)
        Console.WriteLine($"- {p.Name} = {p.Value}");

With the OAuth 2.0 challenge example from earlier, this produces the following output:

Scheme: Bearer
- realm = FooCorp
- error = invalid_token
- error_description = The access token has expired

If there’s a syntax error in the input text, the Parse will throw a ParseException with a message describing where and why the parsing failed. For instance, if I remove the space between "Bearer" and "realm", I get the following error:

Parsing failure: unexpected ‘=’; expected whitespace (Line 1, Column 12); recently consumed: earerrealm

You can find the full code for this article here.


As you can see, Sprache makes it very simple to parse complex text. The code isn’t particularly short, but it’s completely declarative; there are no loops, no conditionals, no temporary variables, no state… This makes it very easy to understand, and it can easily be compared with the actual grammar definition to check its correctness. It also provides pretty good feedback in case of error, which is hard to accomplish with a hand-built parser.