Type Driven Development Creates Better Code

Leveraging C#’s Type System for Robust Code Design

Type driven development was first coined as a term in the title of a book about Idris’ powerful and unique type system. In recent years, however, it has come to refer more broadly to the practice of leaning on the compiler and type system as much as possible, creating more accurate and expressive types that better encapsulate invariants and logic. The type system in C# is not as powerful as that of Rust, which offers features such as built-in union types, nor that of Idris, where types can be passed into functions and new types returned. It is still very capable, however, and developers often do not lean on it enough.

This blog post aims to share some tips on better-utilizing C#’s type system and bringing it to the forefront of your development experience so that you can use it to eliminate certain classes of errors and problems at compile-time. First, let’s start with a primer and try to reframe how we think of types in general!

What Are Types?

A type constrains a variable to a certain set of possible values. For example, in C#, a variable of type int can hold, at any time, a single integral value within the range of -2,147,483,648 to 2,147,483,647. Compare this to a bool, which can only be one of two values: true or false. A big part of type driven development is ensuring that, when defining variables, you use the type that most accurately constrains the set of possible states that the variable can exist in.
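To make this concrete, here is a minimal sketch of the same value modelled with a loose type and with a tighter one. The names OrderStatus, DescribeLoose and Describe are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical example: an order status stored as a string could be any text,
// while the enum narrows it to the three named states we actually support.
public enum OrderStatus
{
    Pending,
    Shipped,
    Delivered
}

public static class OrderStatusExamples
{
    // Accepts any string, including typos such as "shiped", so the method
    // has to defend against values that should never exist.
    public static string DescribeLoose(string status) => status switch
    {
        "Pending" => "Waiting to be shipped",
        "Shipped" => "On its way",
        "Delivered" => "Arrived",
        _ => throw new ArgumentException($"Unknown status '{status}'", nameof(status))
    };

    // Accepts only an OrderStatus, so a typo in a status name is a compile error.
    public static string Describe(OrderStatus status) => status switch
    {
        OrderStatus.Pending => "Waiting to be shipped",
        OrderStatus.Shipped => "On its way",
        OrderStatus.Delivered => "Arrived",
        // Enums are backed by integers, so a cast such as (OrderStatus)42 still compiles.
        _ => throw new ArgumentOutOfRangeException(nameof(status))
    };
}
```

The enum does not remove every invalid state, as the final arm shows, but it shrinks the set of values a caller can pass by accident from every possible string down to three names.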

Product Types

When thinking of a type as a certain set of all possible states that the variable can exist in, then you can start to think of classes and interfaces differently. For example, consider the following record:

public record Person(string Name, int Age);

We can consider a Person to be a product of a string and an int, which makes it a product type. This is analogous to the Cartesian product in set theory: the total number of possible states is the product of the number of possible states of its parts. The number of possible states that a Person can be in is A × B, where A is the number of all possible states of string, and B is the number of all possible states of int.
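The counting is easier to see with parts that have only a handful of states. The sketch below uses a hypothetical DrinkOrder record and Size enum (neither is part of the Person example) and assumes .NET 5 or later for Enum.GetValues&lt;T&gt;().

```csharp
using System;
using System.Linq;

public enum Size { Small, Medium, Large }

// A product of bool (2 states) and Size (3 states).
public record DrinkOrder(bool WithIce, Size Size);

public static class ProductTypeExample
{
    // Enumerates every value DrinkOrder can take by pairing each bool
    // with each Size: 2 x 3 = 6 combinations.
    public static DrinkOrder[] AllStates() =>
        (from withIce in new[] { false, true }
         from size in Enum.GetValues<Size>()
         select new DrinkOrder(withIce, size)).ToArray();
}
```

Adding a fourth Size would raise the count to eight without touching DrinkOrder, which is the multiplication described above.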

Sum Types

Less prevalent within C# are sum types, which are types that can hold a value that is one of several different types. In many other languages, this would be a union type. The closest thing we have to replicate this within C# is an interface:

public interface IShape {};

public record Circle(int Radius) : IShape;

public record Rectangle(int Width, int Height) : IShape;

We can think of sum types as analogous to the union of sets in set theory: the total number of possible states is the sum of the number of possible states of each of its parts. The number of possible states that IShape can be in is A + B, where A is the number of possible states of Circle and B is the number of possible states of Rectangle, assuming they are its only implementations. This is why, for instance, we can and should switch on an instance of an interface, to ensure we handle all possible instances that it might be at any given time:

IShape shape = new Circle(32);
string shapeType = shape switch
{
    Circle c => $"Circle with radius {c.Radius}",
    Rectangle r => $"Rectangle with dimensions {r.Width}x{r.Height}",
    _ => "Unknown shape"
};
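Because anyone can implement IShape, the switch above needs its discard arm. One way to keep the cases together is an abstract record with a private constructor and nested cases. The following is a sketch of that idea only: Shape and ShapeMath are hypothetical names, and the pattern discourages new cases elsewhere rather than strictly guaranteeing there are none.

```csharp
using System;

// Hypothetical closed-style variant of IShape: the private constructor is
// intended to keep every case declared inside Shape itself.
public abstract record Shape
{
    private Shape() { }

    public sealed record Circle(int Radius) : Shape;
    public sealed record Rectangle(int Width, int Height) : Shape;
}

public static class ShapeMath
{
    public static double Area(Shape shape) => shape switch
    {
        Shape.Circle c => Math.PI * c.Radius * c.Radius,
        Shape.Rectangle r => r.Width * r.Height,
        // The compiler does not know the hierarchy is closed, so a fallback arm is still needed.
        _ => throw new InvalidOperationException($"Unhandled shape: {shape}")
    };
}
```

Here the fallback arm throws instead of returning a placeholder, so a case added later without updating Area fails loudly rather than silently producing a wrong answer.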

Invalid States Should Be Unrepresentable

Okay, that was a lot to unpack, so let’s take our newfound perception of types and apply it to day-to-day programming! If we think of types as the range of states that a variable can exist in, then we can model our domain objects with types that more accurately represent the states they should be able to exist in. For example, let’s take the following record:

public record User(string Id, string Username, string Email);

The User type is a product type of three strings; however, this is not an accurate representation of a user. Let’s focus on the Email field: a string is not the right type for it, as not every possible string is a valid email. Rather, an email is represented by a certain subset of strings, which for our purposes means those containing at least one @ symbol. To represent that as a type, we could do the following:

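public readonly record struct Email
{
    public readonly string Value;

    public Email(string email)
    {
        // Reject any string that cannot be an email by our simplified rule.
        if (!email.Contains('@'))
        {
            throw new ArgumentException("An email must contain an '@' symbol.", nameof(email));
        }

        Value = email;
    }
}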

Now, functions that deal with the Email type can be sure that the email will always be formatted correctly, and furthermore, a User cannot be constructed without a valid Email.

public record User(string Id, string Username, Email Email);

This gives us a great deal of safety at compile time, and certainty that operations relying on the @ symbol within an Email can be performed safely, as an Email created through this constructor always contains one.

public string GenerateUsernameFromEmail(Email email)
{
    return email.Value.Split('@')[0];
}

This concept combats primitive obsession, the anti-pattern of reaching for primitives instead of smaller, focused types. Such focused types are prevalent within the .NET standard library: IPAddress, Uri, Guid and so forth.
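Primitive obsession also bites when two values share the same primitive type and can be swapped by accident. The sketch below wraps a Guid in two distinct types; UserId, OrderId and OrderLookup are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical wrappers: each one holds a single Guid but is a distinct type.
public readonly record struct UserId(Guid Value);
public readonly record struct OrderId(Guid Value);

public static class OrderLookup
{
    // With two Guid parameters the arguments could be swapped silently;
    // with distinct types, swapping them no longer compiles.
    public static string Describe(UserId userId, OrderId orderId) =>
        $"Order {orderId.Value} for user {userId.Value}";
}
```

Calling OrderLookup.Describe(orderId, userId) is rejected by the compiler, whereas the equivalent mistake with two raw Guid parameters would only surface at runtime, if at all.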

The Result Type

The value type Email in the previous section is a good start; however, it can be improved. Currently, the constructor throws an exception if a caller attempts to create an Email without an @ in it. Exceptions should be reserved for truly exceptional cases, the ones that we have no way to handle, and a user attempting to sign up with a bad email is not that exceptional.

Furthermore, the presence of an exception within a function makes the function signature tell a lie, as it gives no warning that the function could explode in the caller’s face. Instead, we can lean on the type system and sum types to express this intent, forcing any consumers of the function to handle either possible state. We can do this with a Result type, which is a sum type representing both the error and success states of an operation:

using System.Diagnostics.CodeAnalysis;

public readonly record struct Result<TValue>
{
    public TValue? Value { get; }
    public string? Error { get; }

    // Tells the compiler which member is non-null for each value of IsSuccess.
    [MemberNotNullWhen(true, nameof(Value))]
    [MemberNotNullWhen(false, nameof(Error))]
    public bool IsSuccess { get; }

    private Result(string error)
    {
        Value = default;
        Error = error;
        IsSuccess = false;
    }

    private Result(TValue value)
    {
        Value = value;
        Error = null;
        IsSuccess = true;
    }

    public static Result<TValue> Success(TValue value) =>
        new Result<TValue>(value);

    public static Result<TValue> Failure(string error) =>
        new Result<TValue>(error);
}

public readonly record struct Email
{
    public readonly string Value;

    private Email(string email)
    {
        Value = email;
    }

    public static Result<Email> Create(string email)
    {
        if (!email.Contains('@'))
        {
            return Result<Email>.Failure("Bad email!");
        }

        var validEmail = new Email(email);
        return Result<Email>.Success(validEmail);
    }
}

With these changes, we now force the instantiation of the Email type to occur via a static factory method that returns a Result type, meaning any call site that attempts to create an Email has to expressly handle both the possible error and success states.

As a bonus tip, we can use the MemberNotNullWhen attribute to assist the compiler with nullability analysis: checking IsSuccess tells the compiler, at compile time, which of Value or Error is not null and can be consumed safely.
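The following sketch shows what that looks like at a call site. It carries a condensed copy of Result so that it compiles on its own (a single private constructor replaces the two in the original), and ParseName and Greet are hypothetical helpers that exist only for the example. It assumes nullable reference types are enabled.

```csharp
#nullable enable
using System.Diagnostics.CodeAnalysis;

// Condensed copy of the Result type so this snippet compiles on its own.
public readonly record struct Result<TValue>
{
    public TValue? Value { get; }
    public string? Error { get; }

    [MemberNotNullWhen(true, nameof(Value))]
    [MemberNotNullWhen(false, nameof(Error))]
    public bool IsSuccess { get; }

    private Result(TValue? value, string? error, bool isSuccess)
    {
        Value = value;
        Error = error;
        IsSuccess = isSuccess;
    }

    public static Result<TValue> Success(TValue value) => new(value, null, true);
    public static Result<TValue> Failure(string error) => new(default, error, false);
}

public static class ResultUsage
{
    // Hypothetical parser used only to produce a Result for the example.
    public static Result<string> ParseName(string input) =>
        string.IsNullOrWhiteSpace(input)
            ? Result<string>.Failure("Name is empty")
            : Result<string>.Success(input.Trim());

    public static string Greet(string input)
    {
        var result = ParseName(input);
        if (!result.IsSuccess)
        {
            // Error is known to be non-null here, so no null check or '!' is needed.
            return "Failed: " + result.Error.Trim();
        }

        // Value is known to be non-null here for the same reason.
        return "Hello, " + result.Value.ToUpperInvariant();
    }
}
```

Without the two attributes, both member accesses in Greet would produce nullable warnings, and the caller would be tempted to silence them with the null-forgiving operator.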

Parse, Don’t Validate

We’ve neglected to talk about how functions, at a high level, interact with the type system thus far. Functions are first-class citizens within the C# language, and as such, can be stored in variables and consumed like any other type. At a high level, we understand that functions take some input and may potentially produce some output. This intent is expressed via a function’s signature. For example, a validation function might look like the following:

public bool Validate(object obj);

Here, it’s evident that the function takes an object and derives a true or false output from it. This is the premise of validation: we take some specific instance of a type and check whether it satisfies a set of criteria. We can do this for many reasons, but most commonly because we need to ensure that data is valid before we perform operations on it. Does this ring any bells? Suppose a user has supplied an email address to us in a registration flow. We could validate it as soon as it comes in, but then we run into some issues:

  • Every time we want to work with emails, say, within another function down the track, we need to perform the same validation.
  • We have no compile-time guarantee that the validation has occurred.
  • The validation mechanism must be known to all users and can be forgotten in spots within the code base.

We can avoid this by parsing instead of validating. Parsing is the process of taking some input and deriving a more structured output from it. A common example is how a programming language’s parser works: it takes a string of code and breaks it down into meaningful symbols and constructs for the compiler to consume. At a high level, this change is a simple adjustment of the function signature above:

public Result<Email> Parse(string email);

By parsing instead of validating, and doing so as close to the original point of retrieving the data as possible, we can make the decision to easily return any errors to the user in a failure case if need be (this almost makes the Host layer a pseudo-anti-corruption layer for the domain layer). In the successful case, we now have a rich data type that functions can rely on instead of a string, removing the need for any further revalidation on any email fields as they will now be compile-time valid Email instances.
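The difference between the two styles can be sketched side by side. To keep the snippet short and self-contained, the Email below is a simplified stand-in whose Parse returns a nullable Email rather than a Result, and Registration with its methods is hypothetical. Because Email is a struct, default(Email) still exists with a null Value; the sketch ignores that case.

```csharp
#nullable enable
using System;

public readonly record struct Email
{
    public string Value { get; }

    private Email(string value) => Value = value;

    // Simplified stand-in for Email.Create: returns null instead of a
    // Result to keep the snippet short.
    public static Email? Parse(string input) =>
        input.Contains('@') ? new Email(input) : (Email?)null;
}

public static class Registration
{
    // Validation style: the bool is discarded after the check, and nothing stops
    // a caller from passing an unchecked string to SendWelcomeUnsafe.
    public static bool IsValidEmail(string input) => input.Contains('@');

    public static string SendWelcomeUnsafe(string email) =>
        $"Welcome, {email.Split('@')[0]}!";

    // Parsing style: callers must go through Email.Parse to get a populated
    // Email, so the check cannot be forgotten on the way here.
    public static string SendWelcome(Email email) =>
        $"Welcome, {email.Value.Split('@')[0]}!";
}
```

SendWelcomeUnsafe compiles happily when handed a string that was never checked, while SendWelcome cannot be called with a string at all, which is the compile-time guarantee the validation approach lacks.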

Tying It Together

Let’s tie it all together with a slightly fictitious practical example of a function that attempts to create a user within a database. Assume that the user signs up on the website, and somewhere down the line the following function is called:

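// An instance method rather than a static one, as it uses the _repository field.
public async Task<Result<User>> CreateUser(CreateUserRequest request)
{
    // Parse the raw string into an Email before doing anything else.
    var email = Email.Create(request.Email);
    if (!email.IsSuccess)
    {
        return Result<User>.Failure(email.Error);
    }

    // User.Id is declared as a string, so the Guid is converted here.
    var id = Guid.NewGuid().ToString();
    var user = new User(id, request.Username, email.Value);

    await _repository.SaveAsync(user);

    return Result<User>.Success(user);
}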

The above code exemplifies a lot of the concepts we’ve spoken about within the article, and although it is a trite example, its goal is to showcase some of the benefits of leaning on the type system without being too verbose:

  • Parsing Over Validating: The Email.Create method parses the string to a more correct type, meaning that any function receiving an Email instance can be certain it’s dealing with valid data, eliminating the need for repetitive validation checks throughout your codebase.
  • Explicit Error Handling: By returning a Result<User>, we’re making failure a first-class citizen in our API. This forces consumers of our method to explicitly handle both success and failure cases at compile time, leading to more robust error handling. Remember, exceptions are for exceptional cases – expected failure states should be part of your type system!
  • A Focus on Separation of Concerns: The Email type encapsulates all the logic for what constitutes a valid email, keeping our CreateUser method clean and focused on its primary responsibility.
  • Errors as Values: Instead of throwing exceptions for invalid emails, we’re returning errors as values within our Result type. This makes the potential for failure explicit in our method signature, and allows us to easily propagate and aggregate errors up the call stack, allowing consumers to handle it as necessary, such as exposing it to the users as a BadRequest or sending an alert out to Slack or Teams.

Remember, these aren’t hard and fast rules: They’re a set of learnings to be pragmatically applied where they can be most beneficial!

Conclusion

By leveraging C#’s type system more effectively, we can create more robust, self-documenting, and error-resistant code. Throughout this post, we’ve explored several key techniques that allow us to push more of our error handling and business logic into the type system itself, catching potential issues at compile time rather than runtime. By doing so, we not only make our code safer but also more expressive and easier to reason about.

The type system is a powerful tool at our disposal, so let’s better understand it so that we can use it to its full potential.

