Jake Donham > Technical Difficulties > Reconstructing TypeScript, part 2

Reconstructing TypeScript, part 2: functions and function calls

2021-09-27

This post is part of a series about implementing type checking for a TypeScript-like language. In the last post we walked through the code for type checking a small fragment of the language, with only primitive literals, object expressions, and member expressions.

In this part we'll add function definitions like

(x: number, y: number) => ({ x: x, y: y })

and function calls like

f(7, 9)

Since function definitions bind variables that can be used in the body of the function, we'll need to handle variable expressions like

x
y

and also keep track of the types of variables. Let's start there:

Environments

An environment is a table mapping variable names to types. A binding is a single name-to-type entry in the environment. When type checking a function definition, we add a binding for each argument to the environment; when we encounter a variable in the function body, we look up its type in the environment; and we drop the new bindings after type checking the function, since they're no longer in scope.

A straightforward way to implement this is with "functional update": to add a binding to an environment, copy the old environment and add the new binding. The old environment is not affected, so when we're finished type checking the function we discard the new environment and go back to the using the old one.

Here's the environment type (see env.ts):

type Env = {
  get(name: string): Type | undefined;
  set(name: string, type: Type): Env;
  entries(): IterableIterator<[string, Type]>;
}

We can get the type of a variable (it might be undefined if the variable is unbound) or set the type of a variable (returning a new environment). For debugging we can iterate over the entries (bindings) in an environment.

Here's the environment constructor:

function Env(map: Map<string, Type> | { [name: string]: Type }): Env {
  if (map instanceof Map) {
    return {
      get: (name: string) => map.get(name),
      set: (name: string, type: Type) =>
        Env(new Map([...map, [name, type]]))
      entries: () => map.entries()
    }
  } else {
    return Env(new Map(Object.entries(map)));
  }
}

Bindings are stored in an underlying Map; to get the type of a variable, just look it up in the Map; to set the type of a variable, copy the Map and add the new binding. Like the Type.object constructor from part 1, Env can also take an object mapping names to types.

Finally we have an Env module with an empty environment; and we export these all under the same name:

module Env {
  export const empty = Env({});
}

export default Env;

so we can refer to them uniformly as Env:

import Env from './env';

const env1: Env = Env.empty;
const env2: Env = Env({ foo: Type.string });

(This relies on TypeScript's declaration merging—honestly I don't understand it very well, but it does what I want here.)

Representing function types

To represent function types we add an arm to the Type union from part 1 (see types.ts):

type Type = ... | Function;
 
type Function = {
  type: 'Function';
  args: Type[];
  ret: Type;
}

For example, a function type

(x: number, y: number) => number

is represented as

{
  type: 'Function',
  args: [ Type.number, Type.number ],
  ret: Type.number
}

(We don't store the argument names; they aren't needed during type checking.)

We also add a constructor function (see constructors.ts):

function functionType(args: Types.Type[], ret: Types.Type): Types.Function {
  return { type: 'Function', args, ret };
}

and a validator function (see validators.ts):

export function isFunction(type: Types.Type): type is Types.Function {
  return type.type === 'Function';
}

and cases for function types in toString.ts and ofTSType.ts.

Synthesizing types from expressions

Now we can add cases to synthesize types (see part 1) from variable, function, and call expressions. First we need to update synth to take the current environment as an argument, and pass it down to all its helper functions and recursive calls; then add cases to dispatch to new helper functions (see synth.ts):

function synth(env: Env, ast: AST.Expression): Type {
  ...
  case 'Identifier':              return synthIdentifier(env, ast);
  case 'ArrowFunctionExpression': return synthFunction(env, ast);
  case 'CallExpression':          return synthCall(env, ast);
  ...
}

Variable expressions

To synthesize a type from a variable expression (Identifier), we look it up in the environment:

function synthIdentifier(env: Env, ast: AST.Identifier): Type {
  const type = env.get(ast.name);
  if (!type) err(`unbound identifier '${ast.name}'`, ast);
  return type;
}

or fail if the variable isn't bound.

Function expressions

To synthesize a type from a function expression (ArrowFunctionExpression) like

(x: number, y: number) => ({ x: x, y: y })

we:

add bindings to the environment for each function argument and its type;
synthesize the type of the function body using the new environment; and
return a function type from the argument types to the return type.

Here's the code:

function synthFunction(env: Env, ast: AST.ArrowFunctionExpression): Type {
  if (!AST.isExpression(ast.body)) bug(`unimplemented ${ast.body.type}`)
  const bindings = ast.params.map(param => {
    if (!AST.isIdentifier(param)) bug(`unimplemented ${param.type}`);
    if (!param.typeAnnotation) err(`type required for '${param.name}'`, param);
    if (!AST.isTSTypeAnnotation(param.typeAnnotation)) bug(`unimplemented ${param.type}`);
    return {
      name: param.name,
      type: Type.ofTSType(param.typeAnnotation.typeAnnotation),
    };
  });
  const args = bindings.map(({ type }) => type);
  const bodyEnv =
    bindings.reduce(
      (env, { name, type }) => env.set(name, type),
      env
    );
  const ret = synth(bodyEnv, ast.body);
  return Type.functionType(args, ret);
}

We require that function arguments have type annotations so we can bind them in the environment, and convert the type annotations to our Type representation with ofTSType. As usual we exclude some syntax that we don't handle.

Call expressions

To synthesize a type from a call expression (CallExpression) like

f(7, 9)

we:

synthesize the type of the left-hand side (the callee);
if it's not a function type, fail;
check (see part 1) each argument expression against the corresponding argument type; and
return the callee's return type.

Here's the code:

function synthCall(env: Env, ast: AST.CallExpression): Type {
  if (!AST.isExpression(ast.callee)) bug(`unimplemented ${ast.callee.type}`);
  const callee = synth(env, ast.callee);
  if (!Type.isFunction(callee)) err(`call expects function`, ast.callee);
  if (callee.args.length !== ast.arguments.length)
    err(`expected ${callee.args.length} args, got ${ast.arguments.length} args`, ast);
  callee.args.forEach((type, i) => {
    const arg = ast.arguments[i];
    if (!AST.isExpression(arg)) bug(`unimplemented ${arg.type}`)
    check(env, arg, type);
  });
  return callee.ret;
}

It would be safe to allow a call to have fewer arguments than expected by the callee's type when the missing arguments have types that are subtypes of undefined; I left this out for simplicity. To get this behavior in actual TypeScript you need to explicitly mark the argument as optional (with a trailing ?). It would also be safe to allow calls with extra arguments, but it isn't useful, and might be a bug, so better to flag an error.

Subtyping function types

When is a function type A1 => A2 a subtype of B1 => B2?

Recall from part 1 that we can think of subtyping as an adversarial game: I pass a function of type A1 => A2 to an opponent who may perform any operations supported by type B1 => B2 on it; if my opponent can't perform any unsupported operations on the value, then A1 => A2 is a subtype of B1 => B2.

My opponent may pass the function an argument of type B1, and the function (of type A1 => A2) may perform any operations on the argument that are supported by A1. So B1 must be a subtype of A1, since it must support any operation supported by A1. After the function returns a value (of type A2), my opponent may perform any operations supported by B2. So A2 must be a subtype of B2.

We say that function types are covariant in their return type, because the subtyping goes the same way: if A1 => A2 is a subtype of B1 => B2 then A2 is a subtype of B2. And we say that function types are contravariant in their argument types, because the subtyping goes the opposite way: if A1 => A2 is a subtype of B1 => B2 then B1 is a subtype of A1.

So we add a case for function types to isSubtype as follows (see isSubtype.ts):

function isSubtype(a: Type, b: Type): boolean {
  ...

  if (Type.isFunction(a) && Type.isFunction(b)) {
    return a.args.length === b.args.length &&
      a.args.every((a, i) => isSubtype(b.args[i], a)) &&
      isSubtype(a.ret, b.ret);
  }
  ...
}

It would be safe for a function type (a: A) => C to be a subtype of (a: A, b: B) => C, because the extra argument can be ignored. And it would be safe for a function type (a: A, b: B | undefined) => C to be a subtype of (a: A) => C, because the missing argument can be treated as undefined. I left these cases out for simplicity; actual TypeScript supports them (but again requires an explicit ? marker in the second case).

Checking function expressions against function types

Recall from part 1 that to check an expression against a type, we break down the expression and type and check their corresponding parts. To check a function expression against a function type we:

match up the function expression arguments against the function type arguments;
add each name-to-type binding to the environment; and
check the function body against the return type using the new environment.

Here's the code (see check.ts)

function check(env: Env, ast: AST.Expression, type: Type) {
  ...
  if (AST.isArrowFunctionExpression(ast) && Type.isFunction(type))
    return checkFunction(env, ast, type);
  ...
}

function checkFunction(env: Env, ast: AST.ArrowFunctionExpression, type: Type.Function) {
  if (!AST.isExpression(ast.body)) bug(`unimplemented ${ast.body.type}`)
  if (type.args.length != ast.params.length)
    err(`expected ${type.args.length} args, got ${ast.params.length} args`, ast);
  const bindings = ast.params.map((param, i) => {
    if (!AST.isIdentifier(param)) bug(`unimplemented ${param.type}`);
    return { name: param.name, type: type.args[i] };
  });
  const bodyEnv =
    bindings.reduce(
      (env, { name, type }) => env.set(name, type),
      env
    );
  check(bodyEnv, ast.body, type.ret);
}

Again we could handle some cases here where the argument lists have different lengths; I've left them out for simplicity.

Note that we don't require type annotations on the function arguments. I mentioned in part 0 that a benefit of bidirectional type checking over just synthesis with subtyping is that it reduces necessary type annotations—when checking a function expression against a function type, we already know the argument types, so we can omit them in the expression. This is especially convenient with higher-order functions, like

[1, 2, 3].map(x => x + 1)

However, if you do include argument type annotations we just ignore them; it would be better to check (as actual TypeScript does) that each argument type annotation is a subtype of the expected argument type.

Example?

I tried to write up an example like I did for part 1, but with environments in the mix it got too complicated and tedious—instead I added an interactive call tree to the demo widget so you can work through examples yourself, see below.

Try it!

You can try out the type checker below. In the top box, click on an example button or type an expression (remember that the supported expressions are primitive literals, object expressions, member expressions, variables, functions, function calls, and as ascriptions). In the bottom box you'll see a trace of the type checker execution, ending in a synthesized type (or an error). The trace is a tree of function calls; click on a function call to expand the tree under that call, or mouse over a call to highlight the matching return value.

Notice how the environment changes when we synth a function expression or check a function expression against a function type. Also notice how type checking switches from synth to check when we synth a call expression.

The plan

For the full code of part 2 see https://github.com/jaked/reconstructing-typescript/tree/part2. To view the changes between part 1 and part 2 see https://github.com/jaked/reconstructing-typescript/compare/part1...part2.

Next time we'll add singleton types and some operators so we can write more interesting programs.

Please email me with comments, criticisms, or corrections.