Jake Donham > Technical Difficulties > Reconstructing TypeScript, part 3

Reconstructing TypeScript, part 3: operators and singleton types

2021-10-06

This post is part of a series about implementing type checking for a TypeScript-like language. In part 2 we added functions and function applications to the core bidirectional type checker from part 1. In this part we'll add operator expressions like:

x + y
x && y
x || y
!x
typeof x
x === y
x !== y

so we'll be able to write more interesting programs. We'll also add a new kind of type: singleton types (also known as literal types) that contain a single value.

These two new features are connected: when the inputs to an operator have singleton types, the type checker can compute the output value at type checking time, and synthesize a singleton output type. This gives the type checker more information to work with, so it can successfully synthesize a type for more programs.

What's a singleton type?

In part 0 I wrote that a type

describes attributes shared by [a collection of] values: what operations are supported; and, for some operations, what result they return.

We say that a type contains a value (or that a value satisfies a type) when the value has all the attributes that are described by the type.

A singleton type is a type that contains a single value. For example: 7 is the type that contains the value 7, true is the type that contains the value true, and so on. You might object—there is not just one value 7, my program is littered with 7s! OK, fair, let's try again: a singleton type contains all the values that are === to a particular value. (JavaScript has a profusion of equalities so we should be specific which one we mean.)

The value of a singleton type has some underlying base type (number for 7, boolean for true, and so on), so we call that the base type of the singleton type. A singleton type supports all the same operations as its base type; but it adds some information about what results are returned by the operations for its particular value. For example: in an expression x === 7, if x has type number we don't know whether the expression returns true or false; but if x has type 7 we know that it returns true.

What about a compound type like

{ foo: 7, bar: true }

? This type contains all the values that have a foo property with type 7 and a bar property with type true, like:

{ foo: 7, bar: true }
{ foo: 7, bar: true, baz: 'hello' }

It's tempting to read it as the type of values equal to { foo: 7, bar: true }, but in TypeScript there are only singleton types of primitive values.

The TypeScript docs call these literal types; they're called singleton types in some other languages.

What are singleton types good for?

By themselves, singleton types aren't very useful. But with unions, we can use them to define enumerations:

type color = 'red' | 'green' | 'blue'

or variants:

type tree =
  { type: 'leaf', value: number } |
  { type: 'node', left: tree, right: tree }

And with intersection types (more on this in part 5), we can use singleton types to describe overloaded functions that return different types based on argument values, like

const getElementsByTagName:
  ((tag: 'div') => HTMLDivElement[]) &
  ((tag: 'p') => HTMLParagraphElement[]) &
  ...

Representing singleton types

To represent singleton types we add an arm to the Type union (see types.ts) containing the base type (as above, we permit only primitive types) and value:

type Type = ... | Singleton;

type Singleton = {
  type: 'Singleton';
  base: Boolean | Number | String;
  value: unknown;
}

When base is Boolean, value is always a boolean, and so on. (We could encode this invariant in the type, but it turns out not to work well with TypeScript's narrowing—give it a try and see what goes wrong; more on this later.)

The type of base is an example of the flexibility of unions: we want to restrict Type to just its Boolean, Number, and String arms, so we simply write the union of those arms. In most languages this is much more painful: in most object-oriented languages we'd need to bake it into the class hierarchy; in most languages with variants we'd need a separate variant—either way, it's painful enough that we'd most likely give base the type Type instead, and check the restriction at run time (losing some development-time checking).

As usual we add a constructor function (see constructors.ts):

function singleton(value: boolean | number | string): Singleton {
  switch (typeof value) {
    case 'boolean': return { type: 'Singleton', base: boolean, value };
    case 'number': return { type: 'Singleton', base: number, value };
    case 'string': return { type: 'Singleton', base: string, value };
  }
}

and a validator function (see validators.ts):

function isSingleton(type: Type): type is Singleton {
  return type.type === 'Singleton';
}

and cases for singleton types in toString.ts and ofTSType.ts.

Synthesizing types from primitive values

Now when synthesizing types from primitive expressions (BooleanLiteral, NumericLiteral, StringLiteral), we return a singleton type instead of the underlying base type (see synth.ts):

function synthBoolean(env: Env, ast: AST.BooleanLiteral): Type {
  return Type.singleton(ast.value);
}

function synthNumber(env: Env, ast: AST.NumericLiteral): Type {
  return Type.singleton(ast.value);
}

function synthString(env: Env, ast: AST.StringLiteral): Type {
  return Type.singleton(ast.value);
}

In actual TypeScript, singleton types are synthesized for immutable values but not mutable values. This makes sense—a mutable value of singleton type is effectively immutable, since you can only mutate it to another value that's === to the original. For example:

const x = 7  // x has type 7
let y = true // y has type boolean

const obj1 = { foo: 7, bar: true }
// obj1 has type { foo: number, bar: boolean }

const obj2 = { foo: 7, bar: true } as const
// obj2 has type { readonly foo: 7, readonly bar: true }

In our language we have only immutable values, so we always synthesize singleton types.

Synthesizing types from operator expressions

We add cases to synth for different kinds of operator expression (see synth.ts):

function synth(env: Env, ast: AST.Expression): Type {
  ...
  case 'BinaryExpression':  return synthBinary(env, ast);
  case 'LogicalExpression': return synthLogical(env, ast);
  case 'UnaryExpression':   return synthUnary(env, ast);
  ...
}

Binary operators

For binary operators (BinaryExpression) we:

synthesize types from the left and right subexpressions;
when both have singleton type, apply the operator and return a singleton type;
otherwise return the appropriate base type,

Here's the code:

function synthBinary(env: Env, ast: AST.BinaryExpression): Type {
  if (!AST.isExpression(ast.left)) bug(`unimplemented ${ast.left.type}`)
  const left = synth(env, ast.left);
  const right = synth(env, ast.right);
  switch (ast.operator) {
    case '===':
      if (Type.isSingleton(left) && Type.isSingleton(right))
        return Type.singleton(left.value === right.value);
      else
        return Type.boolean;

    case '!==':
      if (Type.isSingleton(left) && Type.isSingleton(right))
        return Type.singleton(left.value !== right.value);
      else
        return Type.boolean;

    case '+':
      if (Type.isSubtype(left, Type.number) && Type.isSubtype(right, Type.number)) {
        if (Type.isSingleton(left) && Type.isSingleton(right)) {
          if (typeof left.value !== 'number' || typeof right.value !== 'number')
            bug('unexpected value');
          return Type.singleton(left.value + right.value);
        } else {
          return Type.number;
        }
      } else {
        err('+ expects numbers', ast);
      }

    default: bug(`unimplemented ${ast.operator}`);
  }
}

In the case for +, we check subtyping rather than equality with number, so we can add a singleton number to a number or vice versa. (See below about subtyping singleton types.)

Actual TypeScript does not synthesize singleton types for these operators. I have some ideas about why not, but let's go with it for now and see where it leads.

Logical operators

We'll need some helper functions that determine whether values of a type are known at type checking time to be truthy or falsy. For singletons we can check the specific value; otherwise we know that objects and functions are always truthy, null is falsy, and for numbers, strings, and booleans we don't know.

function isTruthy(type: Type) {
  switch (type.type) {
    case 'Object':    return true;
    case 'Function':  return true;
    case 'Singleton': return type.value;
    default:          return false;
  }
}

function isFalsy(type: Type) {
  switch (type.type) {
    case 'Null':      return true;
    case 'Singleton': return !type.value;
    default:          return false;
  }
}

Now for logical operators (LogicalExpression) we:

synthesize types from the left and right subexpressions;
when we know at type checking time which subexpression is returned, then return its type (recall that in JavaScript logical operators are short-circuit and return the value of the last-evaluated subexpression);
otherwise return boolean (this is not correct but we can't return the correct type until we add unions in part 4).

Here's the code:

function synthLogical(env: Env, ast: AST.LogicalExpression): Type {
  const left = synth(env, ast.left);
  const right = synth(env, ast.right);
  switch (ast.operator) {
    case '&&':
      if (Type.isFalsy(left))       return left;
      else if (Type.isTruthy(left)) return right;
      else                          return Type.boolean; // left | right

    case '||':
      if (Type.isTruthy(left))      return left;
      else if (Type.isFalsy(left))  return right;
      else                          return Type.boolean; // left | right

    default: bug(`unimplemented ${ast.operator}`);
  }
}

Actual TypeScript does synthesize singleton types for these operators—for example:

const a = 7
const b = 9
const x = (a && b) // x has type 9

Unary operators

For unary operators (UnaryExpression) we

synthesize a type from the argument subexpression;
for ! return a singleton if we know whether the argument is truthy or falsy; otherwise return boolean
for typeof return the appropriate singleton given the argument type.

Here's the code:

function typeofType(type: Type): string {
  switch (type.type) {
    case 'Singleton': return typeofType(type.base);
    case 'Boolean': return 'boolean';
    case 'Function': return 'function';
    case 'Null': return 'object';
    case 'Number': return 'number';
    case 'Object': return 'object';
    case 'String': return 'string';
  }
}

function synthUnary(env: Env, ast: AST.UnaryExpression): Type {
  const argument = synth(env, ast.argument);
  switch (ast.operator) {
    case '!':
      if (Type.isTruthy(argument))     return Type.singleton(false);
      else if (Type.isFalsy(argument)) return Type.singleton(true);
      else                             return Type.boolean;

    case 'typeof':
      return Type.singleton(typeofType(argument));

    default: bug(`unimplemented ${ast.operator}`);
  }
}

Actual TypeScript synthesizes singleton types for ! but not for typeof. I don't know why not! But again let's go with it.

Subtyping singleton types

A singleton type is a subtype of another singleton type only when they have the same value.

A singleton type is always a subtype of its base type (as above, it supports all the same operations as its base type), so it's a subtype of another type when its base type is a subtype of the other type. So for example 7 is a subtype of number. (We'll be able to construct less-trivial examples once we add unions.)

Here's the code (see isSubtype.ts):

function isSubtype(a: Type, b: Type): boolean {
  ...
  if (Type.isSingleton(a)) {
    if (Type.isSingleton(b)) return a.value === b.value;
    else                     return isSubtype(a.base, b);
  }
  ...
}

Checking expressions against singleton types

Recall from part 1 that to check an expression against a type, we break down the expression and type, then check their corresponding parts. But since we have singleton types for primitive values only, there's nothing to break down—we just fall back to the default: synthesize a type from the expression, then check that the synthesized type is a subtype of the expected type. So there is no new code to write.

Operator expressions are always synthesized; we can't in general deduce input types from output types. For example, to check x + y against 16, we know that x and y must have singleton number types that sum to 16, but we don't know how to split 16 between them. So we fall back to synthesizing and checking subtyping. (In part 6 we'll see that narrowing involves deducing constraints on input types from output types.)

Try it!

You can try out the type checker below. Click on an example button or type an expression in the top box. In the bottom box you'll see a trace of the type checker execution, ending in a synthesized type (or an error). The trace is a tree of function calls; click on a function call to expand the tree under that call, or mouse over a call to highlight the matching return value.

The plan

For the full code of part 3 see https://github.com/jaked/reconstructing-typescript/tree/part3. To view the changes between part 2 and part 3 see https://github.com/jaked/reconstructing-typescript/compare/part2...part3.

Next time we'll add union types to represent type alternatives like string | boolean.

Thanks to Hazem Alhalabi for helpful feedback on a draft of this post.

Please email me with comments, criticisms, or corrections.