Cody Shepp

Creating a C# Library from Protocol Buffers

Published on 4/3/2017

In my last post about message queues, I suggested that data contracts - specifically Google Protocol Buffers - can be extremely useful for communicating over queues. Today I wanted to share the process we use at work to build and distribute a C# class library containing compiled protobufs using PowerShell, TFS, and NuGet. We start with raw .proto files and end up with a library distributed via NuGet that we can easily reference in multiple projects.

Protobuf Project Structure

We keep all of the .proto files we share with other teams in a dedicated repo named DataContracts. The structure of the repo looks like this:

/DataContracts
    /Src
        - DataContractA.proto
        - DataContractB.proto
        - DataContractC.proto
    /Output
        /CSharp
            /Src
            /Properties
                - AssemblyInfo.cs
            - DataContracts.csproj
            - DataContracts.nuspec
            - DataContracts.sln
            - packages.config
    /Documentation
    /Deploy
        - Compile.ps1
        - GenerateCsproj.ps1
        - ValidateNuspec.ps1

The top-level Src folder contains all of the proto files. The Output folder has a subfolder for each language we target - in this post, I'm only going to talk about the C# output. Inside the CSharp folder there's a bare-bones Visual Studio solution that includes a class library project with no source files of its own, as well as a packages.config file for NuGet dependencies.

The Deploy folder contains all of the PowerShell scripts that we use to compile .proto files into .cs files, validate the nuspec version, and modify the .csproj file.

Process Overview

Here's an overview of the process we use to create the library:

  • Validate the version number in the nuspec file
  • Compile .proto files into .cs files
  • Modify the class library project so that it includes all generated .cs files
  • Restore NuGet packages
  • Compile the solution/project
  • Package the .dlls into a NuGet package
  • Push the new package to our NuGet server

Validating Version Numbers

We use some basic criteria to validate the version numbers for our NuGet package. If the branch we're building is develop, the version must have an "-alpha" suffix. If the branch is master, the version must NOT have a suffix. This simple check ensures that prerelease code is always tagged with a prerelease version number.

The PowerShell code for accomplishing the validation is pretty straightforward.

We obtain the current version number like so:


[xml]$xmldata = Get-Content "Output\CSharp\DataContracts.nuspec"
$version = $xmldata.package.metadata.version

Once we have the version number, we check to make sure it matches the pattern we're looking for:


if($branchName -eq "master") {
    if($version.Contains("-alpha")) {
        Write-Error "master branch nuspec cannot contain -alpha version"
        Exit 1
    }
}

if($branchName -eq "develop") {
    if(!$version.Contains("-alpha")) {
        Write-Error "develop branch nuspec *must* contain -alpha version"
        Exit 1
    }
}

Write-Host "Nuspec validation was successful"

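The $branchName variable is passed as an argument to the script. In TFS you can get the short branch name (for example, master) by referencing the build variable $(Build.SourceBranchName). Note that $(Build.SourceBranch) returns the full ref, such as refs/heads/master, which would not match the comparisons above.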
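To make the rule easier to reuse and test, the same check can be wrapped in a function. This is only a minimal sketch: the function name Test-NuspecVersion and its parameter names are made up for illustration and are not part of our actual ValidateNuspec.ps1 script.

```powershell
# Hypothetical helper: returns $true when the version is valid for the branch.
function Test-NuspecVersion {
    param(
        [Parameter(Mandatory = $true)][string]$BranchName,
        [Parameter(Mandatory = $true)][string]$Version
    )

    # A version counts as prerelease when it carries the -alpha suffix.
    $isPrerelease = $Version.Contains("-alpha")

    switch ($BranchName) {
        "master"  { return -not $isPrerelease }   # release builds: no suffix allowed
        "develop" { return $isPrerelease }        # prerelease builds: suffix required
        default   { return $true }                # other branches are not checked
    }
}
```

For example, Test-NuspecVersion -BranchName "develop" -Version "1.2.0-alpha" returns $true, while the same version on master returns $false.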

Compiling Proto Files

There's nothing complicated about compiling protobufs into C# classes. We execute protoc.exe and specify --proto_path=Src and --csharp_out=Output\CSharp\Src to set the input and output directories, respectively. The protobuf compiler generates the .cs files and drops them in the Output\CSharp\Src folder. We put them in this separate Src folder so that it's easy to find them all later in the build process.
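Here is a rough sketch of what that compile step could look like in PowerShell. The function names Get-ProtocArguments and Invoke-ProtoCompile are hypothetical, and the sketch assumes protoc.exe is on the PATH, the script runs from the repo root, and the output folder already exists.

```powershell
# Hypothetical helper: builds the argument list for protoc.
function Get-ProtocArguments {
    param(
        [string]$ProtoDir = "Src",
        [string]$OutDir = "Output\CSharp\Src"
    )

    # Keep each file path under $ProtoDir so protoc can match it
    # against the --proto_path root.
    $protoFiles = Get-ChildItem -Path $ProtoDir -Filter "*.proto" |
        ForEach-Object { Join-Path $ProtoDir $_.Name }

    return @("--proto_path=$ProtoDir", "--csharp_out=$OutDir") + @($protoFiles)
}

# Hypothetical wrapper: runs protoc and fails the build step on error.
function Invoke-ProtoCompile {
    $protocArgs = Get-ProtocArguments
    # An array passed to a native command is expanded into separate arguments.
    & protoc.exe $protocArgs
    if ($LASTEXITCODE -ne 0) {
        Write-Error "protoc failed with exit code $LASTEXITCODE"
        Exit 1
    }
}
```

Splitting the argument building from the invocation is just a convenience here: it lets you inspect the exact command line before protoc runs.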

Modifying the .csproj File

Next, we add the generated .cs files to our empty class library project. We do this by manipulating the .csproj file to add each generated file to the list of files that need to be compiled.

Even the most basic .csproj file is quite verbose. There's some boilerplate at the very top, followed by several PropertyGroup elements that define compilation targets. Below that we have several ItemGroup elements. The first ItemGroup contains references - both to NuGet packages and core assemblies. The second ItemGroup contains the files that need to be compiled - this is the element we need to modify. Finally, the third ItemGroup contains files that don't need to be compiled (such as config files).

To add our generated .cs files to that second ItemGroup, we use a PowerShell script.

First we read in the .csproj file as XML and find the second ItemGroup element:


[xml]$xmldata = Get-Content "Output\CSharp\DataContracts.csproj"
$itemGroup = $xmldata.Project.ItemGroup[1]

Next we loop through the .cs files in our "Output\CSharp\Src" folder and add them as child elements to the ItemGroup.


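# Only pick up the generated C# files
$files = Get-ChildItem "Output\CSharp\Src" -Filter "*.cs"

foreach($file in $files) {
    # Build the project-relative path, e.g. Src\DataContractA.cs
    $filePath = Join-Path "Src" $file.Name
    $newChild = $xmldata.CreateElement("Compile")
    $newChild.SetAttribute("Include", $filePath)
    # AppendChild returns the node; discard it to keep the build log clean
    $itemGroup.AppendChild($newChild) | Out-Null
}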

Now we save it back out to the file:


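# Save() resolves relative paths against the process working directory, so pass a full path
$xmldata.Save((Resolve-Path "Output\CSharp\DataContracts.csproj").Path)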

Now we do some funky stuff to clean up the XML that PowerShell generated. Because we created the "Compile" elements without a namespace, while the rest of the project file lives in the MSBuild namespace, each new element was saved with an extra xmlns="" attribute. We need to remove this attribute from each element or Visual Studio will complain. To do that, we read the file contents back in as text and then do a simple string replace.


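# Read the saved file back as text and strip the empty xmlns attribute from each line
$fileData = Get-Content "Output\CSharp\DataContracts.csproj"
$newFileData = $fileData -replace ' xmlns=""', ''
# Set-Content with an explicit encoding avoids Out-File's UTF-16 default in Windows PowerShell
Set-Content -Path "Output\CSharp\DataContracts.csproj" -Value $newFileData -Encoding UTF8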

The part of the .csproj file that we modified now looks something like this:


<ItemGroup>
    <Compile Include="Properties\AssemblyInfo.cs" />
    <Compile Include="Src\DataContractA.cs" />
    <Compile Include="Src\DataContractB.cs" />
    <Compile Include="Src\DataContractC.cs" />
</ItemGroup>
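As an aside, the string-replace cleanup can probably be avoided altogether by creating each element in the same XML namespace as its parent. The sketch below is an alternative to what we actually run, not a description of it; the helper name Add-CompileItem is made up for illustration.

```powershell
# Hypothetical helper: appends a <Compile> element in the parent's namespace.
function Add-CompileItem {
    param(
        [Parameter(Mandatory = $true)][System.Xml.XmlElement]$ItemGroup,
        [Parameter(Mandatory = $true)][string]$Include
    )

    $doc = $ItemGroup.OwnerDocument

    # Passing the parent's namespace URI to CreateElement keeps the
    # serializer from emitting xmlns="" on the new element.
    $newChild = $doc.CreateElement("Compile", $ItemGroup.NamespaceURI)
    $newChild.SetAttribute("Include", $Include)

    # AppendChild returns the node; discard it so nothing leaks to the output.
    $ItemGroup.AppendChild($newChild) | Out-Null
}
```

Inside the loop over the generated files, you would call something like Add-CompileItem -ItemGroup $itemGroup -Include $filePath and then save the document as before.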

Building and Packaging

After we modify the .csproj file, we restore NuGet packages. In our DataContracts project, we only have one referenced package - the Google Protocol Buffers package.

Now we compile, package, and distribute our data contracts. There are built-in steps for all of these actions in TFS, so I won't cover them in detail here.

Hopefully this gives you an idea of how we use Protocol Buffers as data contracts. If you have any questions, tweet me @cas002.