Jake Donham > Technical Difficulties > Reconstructing TypeScript, part 6

Reconstructing TypeScript, part 6: narrowing

2021-11-11

This post is part of a series about implementing type checking for a TypeScript-like language. See part 0 for background, and parts 1, 2, 3, 4, 5 for the implementation so far.

In this part we'll add conditional expressions like

// x: { type: 'a', a: boolean } | { type: 'b', b: string }
x.type === 'a' ? x.a : x.b

and implement narrowing: the type checker will use information from the test to deduce a more-specific type for x in the true and false branches of the conditional expression, so it can check that the property accesses are safe.

What's narrowing?

In a conditional expression like

// x: { type: 'a', a: boolean } | { type: 'b', b: string }
x.type === 'a' ? x.a : x.b

the true branch x.a is executed only when the test x.type === 'a' is true, and the false branch x.b is executed only when the test is false. So while type checking the true branch we can assume that the test is true, then use information deduced from that assumption to give more-specific (narrower) types to variables in the environment; and similarly for the false branch.

While type checking the true branch x.a we assume that x.type === 'a' is true; from that assumption and the current environment we deduce

x.type satisfies the type 'a'
x satisfies the type { type: 'a' }
we know that x satisfies { type: 'a', a: boolean } | { type: 'b', b: string } from the current environment
but { type: 'a' } and { type: 'b' } conflict, so x cannot satisfy the second arm of the union
so x must satisfy { type: 'a', a: boolean }

Using this narrower type for x we see that x.a is safe.

Narrowing as logical implication

It's useful to think of the type environment as a logical statement that asserts facts about the types of variables. For example, this environment

x: { type: 'a', a: boolean } | { type: 'b', b: string }
y: string
z: number

corresponds to this logical statement

x: { type: 'a', a: boolean } | { type: 'b', b: string } AND
y: string AND
z: number

As we saw in part 5, we can also think of intersection and union types as logical AND and OR; so all of these logical statements mean the same thing

x: 'a' | 'b' AND x: 'a'
x: ('a' | 'b') & 'a'
x: 'a'

We can read a conditional test as a logical statement; then to assume it true (when we type check the true branch of a conditional), we conjoin it to the environment

x: { type: 'a', a: boolean } | { type: 'b', b: string } AND
y: string AND
z: number AND
x.type === 'a'

then deduce that this statement implies

x: { type: 'a', a: boolean }

However, type environments hold bindings of variables to types, so we can't represent the statement above directly; but we can work around the limitations:

We can't directly represent a value equality, but we can represent an equivalent singleton type:

x.type: 'a' // represents x.type === 'a'

We can't directly represent the type of a member expression, but we can represent an equivalent variable binding:

x: { type: 'a' } // represents x.type: 'a'

We can't directly represent a conjunction of statements about the same variable (since type environments hold only a single type for each variable), but we can represent an equivalent intersection type:

x: ({ type: 'a', a: boolean } | { type: 'b', b: string }) & { type: 'a' }
// represents
// x: { type: 'a', a: boolean } | { type: 'b', b: string } AND
// x: { type: 'a' }

In the code we'll stick with the existing type environment representation; but keep in mind the view of the environment as a logical statement.

Representing negation

To type check the false branch of the example, we need to represent the logical statement

NOT (x.type === 'a')

There's no way to shoehorn this into the existing implementation—we need another kind of type: negation types, written !t. As usual we extend the Type union (see types.ts):

type Type = ... | Not;
  

type Not = {
  type: 'Not';
  base: Type;
}

add a constructor (see constructors.ts) and validator (see validators.ts):

function not(base: Type): Not {
  return { type: 'Not', base };
}

function isNot(type: Type): type is Not {
  return type.type === 'Not';
}

and a case for Not in toString.ts. Actual TypeScript doesn't support negation types, and the Babel parser doesn't parse any syntax for them, so we don't add a case to ofTSType.ts.

Now, for the false branch x.b, we assume x.type === 'a' is false and deduce

x.type satisfies !'a'
x satisfies { type: !'a' }
x satisfies { type: 'a', a: boolean } | { type: 'b', b: string }
but { type: !'a' } and { type: 'a' } conflict, so x cannot satisfy the first arm of the union
so x must satisfy { type: 'b', b: string }

so x.b is safe.

Negation types could be useful in ordinary programming, but we won't support them in the whole type checker; to keep things simple we'll use them only in narrowing.

Implementing narrowing

Narrowing involves several related functions—it's a lot of code!

function narrow(env: Env, ast: AST.Expression, assume: boolean): Env

When we type check a branch of a conditional expression, we call narrow with the current environment, the test expression, and a flag saying whether the test is assumed to be true or false. It returns an updated environment incorporating the information deduced from the test and assumption.

function narrowPath(env: Env, ast: AST.Expression, type: Type): Env

When we deduce that a subexpression of the test must satisfy a type, we call narrowPath to incorporate it into the environment. Here a path is an expression that references a variable, like x.type; as above, we need to reason backwards until we reach a variable so we can update the environment.

function narrowType(x: Type, y: Type): Type

When we reach a variable, we call narrowType with the current type of the variable from the environment and the deduced type, then update the variable's type with the result. This corresponds to a logical AND, so narrowType is a kind of intersection—more on this below.

narrow

Here's the top-level narrow function (see narrow.ts):

function narrow(env: Env, ast: AST.Expression, assume: boolean): Env {
  switch (ast.type) {
    case 'UnaryExpression':
      return narrowUnary(env, ast, assume);

    case 'LogicalExpression':
      return narrowLogical(env, ast, assume);

    case 'BinaryExpression':
      return narrowBinary(env, ast, assume);

    default:
      return narrowPath(env, ast, assume ? Type.truthy : Type.falsy);
  }
}

For unary, logical, and binary expressions we dispatch to AST-specific functions to break them down. Otherwise the expression is a path, so we call narrowPath to narrow the path to a type that satisfies truthy or falsy (see truthiness.ts).

narrowUnary

Assuming an expression is true is the same as assuming its negation is false (and vice versa); so for ! we narrow the argument with the assumption flipped. Since typeof always returns a truthy string, we deduce nothing (but see the narrowPath case for typeof below, when it appears inside an equality test). Here's the code (see narrow.ts):

function narrowUnary(env: Env, ast: AST.UnaryExpression, assume: boolean) {
  switch (ast.operator) {
    case '!':
      return narrow(env, ast.argument, !assume);

    case 'typeof':
      return env;

    default: bug(`unexpected ${ast.operator}`);
  }
}

narrowLogical

When an &&-expression is assumed true, both sides must be true, so we narrow assuming both sides are true. When an &&-expression is assumed false, one side must be false: if we know the left side is true then we narrow assuming the right side is false; if we know the right side is true then we narrow assuming the left side is false; otherwise we deduce nothing. (The case for ||-expressions follows similar reasoning.) Here's the code (see narrow.ts):

function narrowLogical(env: Env, ast: AST.LogicalExpression, assume: boolean) {
  switch (ast.operator) {
    case '&&':
      if (assume) {
        env = narrow(env, ast.left, true);
        return narrow(env, ast.right, true);
      } else {
        if (Type.isTruthy(synth(env, ast.left)))
          return narrow(env, ast.right, false);
        else if (Type.isTruthy(synth(env, ast.right)))
          return narrow(env, ast.left, false);
        else
          return env;
      }

    case '||':
      if (!assume) {
        env = narrow(env, ast.left, false);
        return narrow(env, ast.right, false);
      } else {
        if (Type.isFalsy(synth(env, ast.left)))
          return narrow(env, ast.right, true);
        else if (Type.isFalsy(synth(env, ast.right)))
          return narrow(env, ast.left, true);
        else
          return env;
      }

    default: bug(`unexpected AST ${ast.operator}`);
  }
}

When we don't know that one side is true, we still know that one side must be false! We could express this by the logical statement

(NOT ast.left) OR (NOT ast.right)

but we have no way to represent this statement using type environments. We have union and negation types, but they describe individual variables; ast.left and ast.right could mention different variables or be arbitrarily complicated logical statements.

Actual TypeScript doesn't do much better here; it seems to deduce the disjunction only for tests involving the same variable:

const z = 1 as 1 | 2 | 3

if (z !== 1 && z !== 2) {
  z // z has type 3
} else if (z !== 1) {
  z // z has type 2
  // because ((NOT (z !== 1)) OR (NOT (z !== 2))) AND (z !== 1)
  // implies NOT (z !== 2)
}

This doesn't work for two different variables, or for two properties of the same object.

narrowBinary

When an ===-expression is assumed true (or a !==-expression assumed false), we can deduce the logical statement

ast.left === ast.right

We can't represent this directly with types and environments, but we can represent the statement that each side has the same type as the other side (since the value must satisfy both types); so we narrow each side to the type of the other.

When a !==-expression is assumed true (or an ===-expression assumed false), we can deduce the logical statement

ast.left !== ast.right

When one side has singleton type we can narrow the other side to the negation of the singleton type; in other cases we have no way to represent the statement.

Here's the code (see narrow.ts):

function narrowBinary(env: Env, ast: AST.BinaryExpression, assume: boolean) {
  if (!AST.isExpression(ast.left)) bug(`unimplemented ${ast.left.type}`);
  const left = synth(env, ast.left);
  const right = synth(env, ast.right);

  if (ast.operator === '===' && assume || ast.operator === '!==' && !assume) {
    env = narrowPath(env, ast.left, right);
    return narrowPath(env, ast.right, left);

  } else if (ast.operator === '!==' && assume || ast.operator === '===' && !assume) {
    if (Type.isSingleton(right))
      env = narrowPath(env, ast.left, Type.not(right));
    if (Type.isSingleton(left))
      env = narrowPath(env, ast.right, Type.not(left));
    return env;

  } else return env;
}

narrowPath

Here's the top-level narrowPath function (see narrow.ts):

function narrowPath(env: Env, ast: AST.Expression, type: Type): Env {
  switch (ast.type) {
    case 'Identifier':
      return narrowPathIdentifier(env, ast, type);

    case 'MemberExpression':
      return narrowPathMember(env, ast, type);

    case 'UnaryExpression':
      return narrowPathUnary(env, ast, type);

    default: return env;
  }
}

For ASTs representing paths we dispatch to AST-specific functions; otherwise we deduce nothing.

narrowPathIdentifier

For an identifier, we look up the current type in the environment, narrow it to the deduced type with narrowType, then update the environment (see narrow.ts):

function narrowPathIdentifier(env: Env, ast: AST.Identifier, type: Type) {
  const ident = env.get(ast.name);
  if (!ident) bug('expected bound identifer');
  return env.set(ast.name, narrowType(ident, type));
}

narrowPathMember

As above, we can't directly represent a logical statement giving a type to a member expression like

x.type: 'a'

so we turn it into the equivalent

x: { type: 'a' }

Here's the code (see narrow.ts):

function narrowPathMember(env: Env, ast: AST.MemberExpression, type: Type) {
  if (ast.computed) bug(`unimplemented computed`);
  if (!AST.isIdentifier(ast.property)) bug(`unexpected ${ast.property.type}`);
  return narrowPath(
    env,
    ast.object,
    Type.object({ [ast.property.name]: type })
  );
}

narrowPathUnary

For a logical statement like

typeof x === 'boolean'

we give the left side the equivalent singleton type (via narrowBinary)

typeof x: 'boolean'

then deduce information about x from the type of typeof x. Here's the code (see narrow.ts):

function narrowPathUnary(env: Env, ast: AST.UnaryExpression, type: Type) {
  switch (ast.operator) {
    case '!':
      return env;

    case 'typeof':
      if (Type.isSingleton(type)) {
        switch (type.value) {
          case 'boolean':
            return narrowPath(env, ast.argument, Type.boolean);
          case 'number':
            return narrowPath(env, ast.argument, Type.number);
          case 'string':
            return narrowPath(env, ast.argument, Type.string);
          case 'object':
            return narrowPath(env, ast.argument, Type.object({}));
          default: return env;
        }
      } else if (Type.isNot(type) && Type.isSingleton(type.base)) {
        switch (type.base.value) {
          case 'boolean':
            return narrowPath(env, ast.argument, Type.not(Type.boolean));
          case 'number':
            return narrowPath(env, ast.argument, Type.not(Type.number));
          case 'string':
            return narrowPath(env, ast.argument, Type.not(Type.string));
          case 'object':
            return narrowPath(env, ast.argument, Type.not(Type.object({})));
          default: return env;
        }
      }
      else return env;

    default: bug(`unexpected ${ast.operator}`);
  }
}

If the typeof a path satisfies the negation of a singleton type tag, we deduce that the path must satisfy the negation of the corresponding type. All object types are subtypes of {}; but we have no way to represent a type that includes all functions, so for 'function' we deduce nothing.

narrowType

As above, narrowType(x, y) corresponds to a logical statement that some variable has type x AND it has type y; so it's a kind of intersection of x and y. However, we don't implement it with Type.intersection. There are two reasons for this:

narrowType handles Not-types; we don't want to hair up the rest of the type checker with Not-types, so we have a specialized function.
Type.intersection doesn't normalize inside object types (see part 5), but our strategy in narrowPathMember produces intersections of object types that we'd like to normalize to produce more readable types.

Here's the code (see narrow.ts):

function narrowType(x: Type, y: Type): Type {
  if (Type.isNever(x) || Type.isNever(y)) return Type.never;
  if (Type.isUnknown(x)) return widenNots(y);
  if (Type.isUnknown(y)) return x;

  if (Type.isUnion(x))
    return Type.union(...x.types.map(a => narrowType(a, y)));
  if (Type.isUnion(y))
    return Type.union(...y.types.map(b => narrowType(x, b)));

  if (Type.isIntersection(x))
    return Type.intersection(...x.types.map(a => narrowType(a, y)));
  if (Type.isIntersection(y))
    return Type.intersection(...y.types.map(b => narrowType(x, b)));

  if (Type.isNot(y)) {
    if (Type.isSubtype(x, y.base)) {
      return Type.never;
    } else if (Type.isBoolean(x) && Type.isSingleton(y.base) && Type.isBoolean(y.base.base)) {
      return Type.singleton(!y.base.value);
    } else {
      return x;
    }
  }

  if (Type.isSingleton(x) && Type.isSingleton(y))
    return (x.value === y.value) ? x : Type.never;
  if (Type.isSingleton(x))
    return (x.base.type === y.type) ? x : Type.never;
  if (Type.isSingleton(y))
    return (y.base.type === x.type) ? y : Type.never;

  if (Type.isObject(x) && Type.isObject(y)) {
    const properties =
      x.properties.map(({ name, type: xType }) => {
          const yType = Type.propType(y, name);
          const type = yType ? narrowType(xType, yType) : xType;
          return { name, type };
          // if there are  fields in `y` that are not in `x`, ignore them
        },
        { }
      );
    if (properties.some(({ type }) => Type.isNever(type)))
      return Type.never;
    else
      return Type.object(properties);
  }

  return Type.intersection(x, y);
}

narrowType is similar to overlaps (see part 5); but instead of returning a flag it returns a type representing the overlap between its arguments (or never if they don't overlap).

narrowType accepts Not-types in its second argument, but does not return Not-types; so we never put Not-types in the environment, and the rest of the type checker doesn't need to deal with them.

Normally unknown & t === t for any type t, but here t might contain Not-types, so in this case we call widenNots (see narrow.ts) to remove them.

If the second argument is !t and the first argument is a subtype of t, then there's no overlap between them, so we return never. Along with the union rule, this makes types like 0|1 & !1 reduce to 0; but since boolean is not explicitly a union, we special-case it so boolean & !true reduces to false (and vice versa).

Using narrowing in type checking

OK! Now we know how to narrow type environments to incorporate information deduced from tests—let's see how to apply it in type checking:

Synthesizing types from conditional expressions

A conditional expression parses into a ConditionalExpression AST node. To synthesize a type from it, we

synthesize a type from the the test expression
synthesize a type from the true branch (also known as the consequent), using an environment narrowed with the assumption that the test is true
synthesize a type from the false branch (also known as the alternate), using an environment narrowed with the assumption that the test is false
return the union of the true and false branch types

Here's the code (see synth.ts):

function synth(env: Env, ast: AST.Expression): Type {
  ...
  case 'ConditionalExpression': return synthConditional(env, ast);
  ...
}

function synthConditional(env: Env, ast: AST.ConditionalExpression): Type {
  const test = synth(env, ast.test);
  const consequent = () => synth(narrow(env, ast.test, true), ast.consequent)
  const alternate = () => synth(narrow(env, ast.test, false), ast.alternate);
  if (Type.isTruthy(test))
    return consequent();
  else if (Type.isFalsy(test))
    return alternate();
  else
    return Type.union(consequent(), alternate());
}

When the test is true, we return the type of the consequent directly, and don't type check the alternate at all (and similarly when it's false). This is a little weird! It means you can put nonsense in an untaken branch, like

true ? 7 : 7(9)

But it's correct in the sense that the untaken branch is never executed, so its type (or whether it has type errors) can't matter. And it makes it possible to check expressions against intersection types. For example, in

(x => 0 + (typeof x === 'number' ? x : x(7))) as
  ((x: number) => number) & ((x: ((n: number) => number)) => number)

we check the function expression against the intersection by separately checking against each part (0 + before the conditional forces it to be synthesized, not checked); when checking against the first part, the environment contains x: number, so x(7) would cause a type error, but typeof x === 'number' is true, so we don't type check the false branch.

One way to think about what's happening here is that if we did type check the untaken branch, x would be narrowed to never. We could add a case to Type.map to propagate never:

function map(t: Type, fn: (t: Type) => Type)) {
  if (Type.isNever(t)) return Type.never;
  ...
}

Then we'd synthesize never from x(7) in the untaken branch, so the example would type check without error. But total nonsense like 7(9) would cause an error. This might be a good idea?

Actual TypeScript flags an unreachable code error for untaken branches, but still type checks both branches.

Checking conditional expressions against types

To check a conditional expression against an expected type we

synthesize a type from the the test expression
check the true branch against the expected type, using an environment narrowed with the assumption that the test is true
check the false branch against the expected type, using an environment narrowed with the assumption that the test is false

Here's the code (see check.ts):

function check(env: Env, ast: AST.Expression, type: Type): void {
  ...
  if (AST.isConditionalExpression(ast))
    return checkConditional(env, ast, type);
  ...
}

function checkConditional(env: Env, ast: AST.ConditionalExpression, type: Type): void {
  const test = synth(env, ast.test);
  const consequent = () => check(narrow(env, ast.test, true), ast.consequent, type)
  const alternate = () => check(narrow(env, ast.test, false), ast.alternate, type)
  if (Type.isTruthy(test)) {
    consequent();
  } else if (Type.isFalsy(test)) {
    alternate();
  } else {
    consequent();
    alternate();
  }
}

Narrowing in logical operators

Because logical operators are short-circuit, we can also narrow the environment when type checking the right side. For an &&-expression, the right side is executed only when the left side is true, so we narrow assuming the left side is true (and similarly for an ||-expression).

We can finally give a precise result type for logical expressions when we don't know whether the left side is true or false: it's the union of the types of the sides, but the left side is narrowed to falsy values for && (when the left side is not falsy the right side is returned) and truthy values for ||.

Here's the code (see synth.ts):

function synthLogical(env: Env, ast: AST.LogicalExpression): Type {
  const left = synth(env, ast.left);
  const right = () => synth(narrow(env, ast.left, ast.operator === '&&'), ast.right);
  switch (ast.operator) {
    case '&&':
      if (Type.isFalsy(left))
        return left;
      else if (Type.isTruthy(left))
        return right();
      else
        return Type.union(narrowType(left, Type.falsy), right());

    case '||':
      if (Type.isTruthy(left))
        return left;
      else if (Type.isFalsy(left))
        return right();
      else
        return Type.union(narrowType(left, Type.truthy), right());

    default: bug(`unimplemented ${ast.operator}`);
  }
}

Aside: narrowing as logical implication, revisited

I wrote the narrowing code by following my nose; later I read a research paper (Logical Types for Untyped Languages) that gave me a different perspective on what the code does, and alerted me to some deficiencies:

As above, there are some logical statements that can't be represented faithfully (such as the disjunction produced by assuming an &&-expression false).

Also, our implementation does redundant work: for example, in

a && b && c

b is typechecked in an environment narrowed with a, and c is typechecked in an environment narrowed with a && b, which narrows a again. The problem is that narrowing information is not returned from synth, so must be reconstructed. In the paper, synth returns a type and also a pair of logical statements representing what can be deduced from the expression if it is assumed true or false.

It would be interesting to rewrite the type checker to use explicit logical statements to avoid these issues.

Also, I wonder if representing union and intersection types as logical statements might address the problems of incomplete subtyping (see part 4 and part 5). Subtyping can be seen as a kind of logical implication; maybe it can be implemented like a SAT solver? If you know something about this please get in touch!

Try it!

You can try out the type checker below. Click on an example button or type an expression in the top box. In the bottom box you'll see a trace of the type checker execution, ending in a synthesized type (or an error). The trace is a tree of function calls; click on a function call to expand the tree under that call, or mouse over a call to highlight the matching return value.

I love it when a plan comes together

For the full code of part 6 see https://github.com/jaked/reconstructing-typescript/tree/part6. To view the changes between part 5 and part 6 see https://github.com/jaked/reconstructing-typescript/compare/part5...part6.

Here's the whole series:

There are other interesting aspects of TypeScript that I might cover in the future, like

generic types and type parameter inference
recursive types
type-level computation

but for now this is the last part of Reconstructing TypeScript. Thanks for reading!

Please email me with comments, criticisms, or corrections.