TypeScript Contribution Diary: Allowing Code in Constructors Before `super()` (Technical Overview)
March 07, 2022
This contribution diary post is much longer than normal because its subject matter is deeper. It also assumes you've read through previous entries and/or are already familiar with how JavaScript compilers and type checkers work. If that's not the case, no worries! Read through a previous entry such as TypeScript Contribution Diary: Improved Syntax Error for Enum Member Colons and Andrew Branch's Debugging the TypeScript Codebase.
My previous TypeScript Contribution Diary posts were structured as stories explaining the timeline of how those changes made it in. This entry's pull request had 159 comments over three years, far too many for that format. I'll instead give a high-level overview of the backing issue's context, the pull request's strategy, and general code changes.
Project Scope
There ended up being two areas of source code I had to change:
- Updating the Type Checker: Adjusting TypeScript's type errors to be more lenient
- Updating Transformers: Adjusting output JavaScript for more varieties of constructors
I'll give a high-level overview for each. I'd strongly recommend referring back to the pull request in your local editor to understand the flow of code.
Let's dig in!
Updating the Type Checker
Most use cases for including non-`this`, non-`super` code in the constructor of a derived class are fairly small. The ones I'd seen in the wild were generally about logging and/or creating a temporary variable to be passed as an argument to the `super()` call. I also didn't want to spend a great deal of time handling complicated logical cases.
Thus, I thought it'd be best to tweak TypeScript's type system logic without overhauling it. Instead of requiring the `super()` call be the first expression in the constructor, I would make two requirements:
- It would need to be a root-level expression: meaning it couldn't be contained in a block such as an `if` or `for`
- Runtime uses of `this` and `super` keywords would not be allowed before that root-level expression
You can see the changes in the pull request's `src/compiler/checker.ts` file view. These next two blog post sections will give a high-level overview of them.
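As a rough illustration of those two requirements, here is a small derived class that runs code before `super()`. The class and variable names are made up for this example, and it assumes a TypeScript version that includes this change (4.6 or later):

```typescript
class Base {
  constructor(readonly label: string) {}
}

class Derived extends Base {
  // An initialized property is one of the things that triggers the super() placement rules
  count = 0;

  constructor(name: string) {
    // Allowed: a temporary variable that doesn't reference `this` or `super`
    const label = `derived:${name}`;

    // Required: super() is a root-level statement of the constructor body
    super(label);

    // `this` is only used after the super() call
    this.count = label.length;
  }
}

const derived = new Derived("a");
console.log(derived.label, derived.count); // derived:a 9
```

Moving `this.count = ...` above the `super(label)` line, or wrapping `super(label)` in an `if`, is the kind of code the two requirements are meant to reject.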
Checking for a Root Level `super()`
TypeScript's type checker already found the first `super()` call in a constructor using a call to an existing `findFirstSuperCall` function:
const superCall = findFirstSuperCall(node.body!);
That function returns the first node that matches `isSuperCall`, skipping any function boundary and recursively searching through all other child nodes:
function findFirstSuperCall(node: Node): SuperCall | undefined {
  return isSuperCall(node)
    ? node
    : isFunctionLike(node)
    ? undefined
    : forEachChild(node, findFirstSuperCall);
}
I fortunately didn't need to change `findFirstSuperCall` for my changes.
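To make the "skip function boundaries, recurse into everything else" behavior concrete, here is a minimal sketch over a toy tree. `ToyNode`, `toy`, and `findFirstSuperCallInToy` are hypothetical stand-ins invented for this example, not TypeScript compiler APIs:

```typescript
// Hypothetical, simplified stand-in for TypeScript's AST nodes
interface ToyNode {
  kind: "Block" | "IfStatement" | "ExpressionStatement" | "SuperCall" | "FunctionExpression";
  children: ToyNode[];
}

const toy = (kind: ToyNode["kind"], ...children: ToyNode[]): ToyNode => ({ kind, children });

function findFirstSuperCallInToy(node: ToyNode): ToyNode | undefined {
  if (node.kind === "SuperCall") return node;
  // Stop at function boundaries: nothing inside a nested function is searched
  if (node.kind === "FunctionExpression") return undefined;
  // Otherwise search children in order and return the first match
  for (const child of node.children) {
    const found = findFirstSuperCallInToy(child);
    if (found) return found;
  }
  return undefined;
}

const superInsideIf = toy("SuperCall");
const body = toy(
  "Block",
  // Hidden behind a function boundary, so it is skipped
  toy("ExpressionStatement", toy("FunctionExpression", toy("SuperCall"))),
  // Found even though it is nested inside an `if` block
  toy("IfStatement", toy("Block", toy("ExpressionStatement", superInsideIf))),
);

console.log(findFirstSuperCallInToy(body) === superInsideIf); // true
```

Note that the search happily finds a `super()` nested in an `if`, which is why a separate root-level check is still needed.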
I used the existing `superCall` variable for a check to make sure it was root level with a new `superCallIsRootLevelInConstructor` function:
if (!superCallIsRootLevelInConstructor(superCall, node.body!)) {
  error(
    superCall,
    Diagnostics.A_super_call_must_be_a_root_level_statement_within_a_constructor_of_a_derived_class_that_contains_initialized_properties_parameter_properties_or_private_identifiers
  );
}
`superCallIsRootLevelInConstructor` checks whether a `super()` call expression's parent expression statement is in the body of a constructor:
function superCallIsRootLevelInConstructor(superCall: Node, body: Block) {
  const superCallParent = walkUpParenthesizedExpressions(superCall.parent);
  return (
    isExpressionStatement(superCallParent) &&
    superCallParent.parent === body
  );
}
To recap TypeScript's AST behavior around call statements:
- Block: Area containing lines of code, commonly surrounded by `{}`
- Statement: Line of code, commonly a child of a block
  - Examples include expression statements, for statements, and if statements
- Expression Statement: Statement that contains an expression, such as a call expression, as its child
- Call Expression: A call to a function
I find it easier to remember the distinction by recalling that statements may optionally have a semicolon. In codebases that include semicolons, expression statements contain a child such as a binary expression or call expression plus one character for a semicolon:
super();
|------| <- expression statement
|-----| <- call expression
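The parent-walking idea behind the root-level check can be sketched on a toy tree. Everything below (`LinkedNode`, `linked`, `isRootLevelSuperCall`) is a hypothetical simplification written for this post, not the compiler's own types:

```typescript
// Hypothetical, simplified stand-in for AST nodes that know their parent
interface LinkedNode {
  kind: string;
  parent?: LinkedNode;
  children: LinkedNode[];
}

// Builds a node and points each child back at it
function linked(kind: string, ...children: LinkedNode[]): LinkedNode {
  const node: LinkedNode = { kind, children };
  for (const child of children) child.parent = node;
  return node;
}

function isRootLevelSuperCall(superCall: LinkedNode, body: LinkedNode): boolean {
  // Skip wrappers such as `(super())`
  let parent = superCall.parent;
  while (parent !== undefined && parent.kind === "ParenthesizedExpression") {
    parent = parent.parent;
  }
  // The call must sit in an expression statement directly inside the constructor body
  return parent !== undefined && parent.kind === "ExpressionStatement" && parent.parent === body;
}

// constructor() { (super()); }
const rootSuper = linked("CallExpression");
const rootBody = linked("Block", linked("ExpressionStatement", linked("ParenthesizedExpression", rootSuper)));

// constructor() { if (condition) { super(); } }
const nestedSuper = linked("CallExpression");
const nestedBody = linked("Block", linked("IfStatement", linked("Block", linked("ExpressionStatement", nestedSuper))));

console.log(isRootLevelSuperCall(rootSuper, rootBody)); // true
console.log(isRootLevelSuperCall(nestedSuper, nestedBody)); // false
```

In the second case the expression statement's parent is the inner `if` block rather than the constructor body, so the check fails.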
Checking Constructor Statement Order
Next up was making sure nothing in the constructor accessed `super` or `this` before the `super()` call. I did that with a for loop over the statements in the constructor. For each statement:
- If the statement is an expression statement that contains a `super()` call, mark that we found it and break the loop
- If the statement is a "prologue directive", continue
- If the statement "immediately" references `super` or `this`, break the loop
for (const statement of node.body!.statements) {
  if (
    isExpressionStatement(statement) &&
    isSuperCall(skipOuterExpressions(statement.expression))
  ) {
    superCallStatement = statement;
    break;
  }
  if (
    !isPrologueDirective(statement) &&
    nodeImmediatelyReferencesSuperOrThis(statement)
  ) {
    break;
  }
}
After the loop, if we hadn't found the `super()` call, issue a type error with an amusingly long error message for failing to find it.
if (superCallStatement === undefined) {
  error(
    node,
    Diagnostics.A_super_call_must_be_the_first_statement_in_the_constructor_to_refer_to_super_or_this_when_a_derived_class_contains_initialized_properties_parameter_properties_or_private_identifiers
  );
}
"A super call must be the first statement in the constructor to refer to super or this when a derived class contains initialized properties parameter properties or private identifiers."
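Here is a minimal sketch of that loop's control flow, with each statement reduced to a few boolean facts. `StatementFacts` and `findAllowedSuperCall` are hypothetical names for this example; the real checker derives the same facts from AST nodes:

```typescript
// Hypothetical summary of what the checker would know about each statement
interface StatementFacts {
  text: string;
  isSuperCall?: boolean;
  isPrologueDirective?: boolean;
  immediatelyReferencesSuperOrThis?: boolean;
}

function findAllowedSuperCall(statements: StatementFacts[]): StatementFacts | undefined {
  for (const statement of statements) {
    // Found the super() call before anything touched `super` or `this`
    if (statement.isSuperCall) {
      return statement;
    }
    // Something used `super` or `this` too early: stop looking
    if (!statement.isPrologueDirective && statement.immediatelyReferencesSuperOrThis) {
      break;
    }
  }
  return undefined;
}

const accepted = findAllowedSuperCall([
  { text: '"use strict";', isPrologueDirective: true },
  { text: 'console.log("creating");' },
  { text: "super();", isSuperCall: true },
]);

const rejected = findAllowedSuperCall([
  { text: "this.log();", immediatelyReferencesSuperOrThis: true },
  { text: "super();", isSuperCall: true },
]);

console.log(accepted?.text); // super();
console.log(rejected); // undefined
```

An `undefined` result corresponds to the case where the long diagnostic above gets reported.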
Prologue Directives
I had never heard of this term before this pull request. It refers to string literals used as statements, such as `"use asm";` and `"use strict";`. They are by nature allowed to come before any code in a constructor.
In retrospect, I don't recall why I added a special case for them to the function. Ah well.
Edit 4/13/2022: The ECMAScript Spec refers to them as "Directive Prologues". Whoops.
Immediately Referencing `super` or `this`
By "immediately" I mean a node accesses `super` or `this` in code that is known to execute immediately, such as children of expressions and blocks. Another way of putting that is ignoring any code that won't be immediately executed, such as function or property declarations. There are a lot of edge cases in there! For example, a class `extends` clause immediately evaluates the base class being extended, but initial values for properties in any class aren't used at runtime until the constructor for their class is called.
class Base {}
class Derived extends Base {
  constructor() {
    // class Middle { ... } executes immediately for Inside to extend it...
    class Inside extends class Middle {
      // ...while this property is created later, per-instance
      woweeMiddle = this;
    } {
      // ...while this property is created later, per-instance
      woweeInside = this;
    }
    super();
    new Inside();
  }
}
I wrote a `nodeImmediatelyReferencesSuperOrThis` helper function that, similar to `findFirstSuperCall`, recursively checks children of a node. It stops searching when it encounters a node that creates a new class scope or delays execution of its contents, such as a function or class property.
function nodeImmediatelyReferencesSuperOrThis(node: Node): boolean {
  if (
    node.kind === SyntaxKind.SuperKeyword ||
    node.kind === SyntaxKind.ThisKeyword
  ) {
    return true;
  }
  if (isThisContainerOrFunctionBlock(node)) {
    return false;
  }
  return !!forEachChild(node, nodeImmediatelyReferencesSuperOrThis);
}
/**
 * @returns Whether the node creates a new 'this' scope for its children.
 */
export function isThisContainerOrFunctionBlock(node: Node): boolean {
  switch (node.kind) {
    // Arrow functions use the same scope, but may do
    // so in a "delayed" manner
    // For example, `const getThis = () => this` may be
    // before a super() call in a derived constructor
    case SyntaxKind.ArrowFunction:
    case SyntaxKind.FunctionDeclaration:
    case SyntaxKind.FunctionExpression:
    case SyntaxKind.PropertyDeclaration:
      return true;
    case SyntaxKind.Block:
      switch (node.parent.kind) {
        case SyntaxKind.Constructor:
        case SyntaxKind.MethodDeclaration:
        case SyntaxKind.GetAccessor:
        case SyntaxKind.SetAccessor:
          // Object properties can have computed names;
          // only method-like bodies start a new scope
          return true;
        default:
          return false;
      }
    default:
      return false;
  }
}
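The arrow function case from that comment can be tried out at runtime. This is a small sketch with made-up class names: the arrow function mentions `this` before `super()`, but only evaluates it afterwards:

```typescript
class Base {
  readonly kind = "base";
}

class Derived extends Base {
  label = "derived";
  sameInstance: boolean;

  constructor() {
    // The arrow function mentions `this`, but its body doesn't run yet
    const getThis = () => this;

    super();

    // Calling it after super() is safe: `this` is initialized by now
    this.sameInstance = getThis() === this;
  }
}

const derived = new Derived();
console.log(derived.sameInstance, derived.label); // true derived
```

Calling `getThis()` before `super()` would still fail at runtime, which is outside what this "immediate reference" check tries to catch.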
With these approximate type checker changes, the type checker allows for code before the `super()` call as long as it doesn't immediately reference `super` or `this`. The type checker was sufficiently updated for my changes. Hooray!
That leaves us with making TypeScript's code emit properly transform these new constructor variants into JavaScript.
Updating Transformers
TypeScript's code emit converts input TypeScript syntax to output JavaScript syntax by passing each input AST through a series of transformers. You can see the impacted transformers in the pull request under `src/compiler/transformers`. They're coordinated by a `getScriptTransformers` function in `src/compiler/transformer.ts`.
The transformers relevant to this pull request are, in order:
- `transformTypeScript`: Removes type system specific syntax, leaving pure glorious JavaScript.
- `transformClassFields`: Massages class fields such as class properties and parameter properties into their JavaScript equivalents.
- `transformES....`: For each language version recognized by TypeScript, a transformer of the next language version's name transforms it.
  - These start at ESNext, then decrease sequentially from the newest known language version down to the configured output target language version.
  - For example, if the configured output language version is `"es2019"`, then as of TypeScript 4.6 the transformers to be run would be: `transformESNext`, `transformES2021`, and `transformES2020`.
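A simplified model of that selection rule, for illustration only: the version list and the `esTransformersFor` helper below are assumptions written for this post, not code read from `getScriptTransformers`:

```typescript
// Assumed list of language versions from oldest to newest (illustrative, not read from the compiler)
const versionsOldestFirst = [
  "es5", "es2015", "es2016", "es2017", "es2018", "es2019", "es2020", "es2021", "esnext",
] as const;
type LanguageVersion = (typeof versionsOldestFirst)[number];

function esTransformersFor(target: LanguageVersion): string[] {
  const targetIndex = versionsOldestFirst.indexOf(target);
  return versionsOldestFirst
    .filter((_, index) => index > targetIndex) // only versions newer than the target
    .reverse() // newest first
    .map((version) => `transform${version.toUpperCase().replace("NEXT", "Next")}`);
}

console.log(esTransformersFor("es2019"));
// [ 'transformESNext', 'transformES2021', 'transformES2020' ]
```

The point of the model is only the ordering: every transformer named after a version newer than the target runs, starting from the newest.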
Transformers generally recursively crawl through the nodes in the file's AST, applying transformations to specific node types as they find them. These next three blog post sections will give a high-level overview of each of the changed transformers.
transformTypeScript
`transformTypeScript` includes a `transformConstructorBody` function that turns any parameter properties into assignments within the constructor.
For example, this TypeScript class:
class HasParameterProperty {
  constructor(public property: number) {
    console.log("Hello, world!");
  }
}
…would become this JavaScript class (or the equivalent with `Object.defineProperty` if `useDefineForClassFields` is enabled):
class HasParameterProperty {
  constructor(property) {
    this.property = property;
    console.log("Hello, world!");
  }
}
`transformTypeScript` previously assumed it could add both prologue directives and the initial super call all at once when transforming a constructor with nothing between them. It did so with a function named `addPrologueDirectivesAndInitialSuperCall` that returned the index of the first statement after them.
I replaced that function with code that computed two important variables:
- `indexAfterLastPrologueStatement`: After copying any prologue statements, the index of the node just after them
- `superStatementIndex`: Index of the first found `super()` call after prologue statements, or `-1` if not found
const indexAfterLastPrologueStatement = factory.copyPrologue(
  body.statements,
  statements,
  /*ensureUseStrict*/ false,
  visitor
);
const superStatementIndex = findSuperStatementIndex(
  body.statements,
  indexAfterLastPrologueStatement
);
function findSuperStatementIndex(
  statements: NodeArray<Statement>,
  indexAfterLastPrologueStatement: number
) {
  for (
    let i = indexAfterLastPrologueStatement;
    i < statements.length;
    i += 1
  ) {
    const statement = statements[i];
    if (getSuperCallFromStatement(statement)) {
      return i;
    }
  }
  return -1;
}
Using those two variables, the code now builds the transformed constructor's body in this order:
- If `superStatementIndex` was found, first visit existing statements up to and including it
- Visit any parameter properties and map them into nodes:
  - If `superStatementIndex` was found, place those parameter properties immediately after it
  - If `superStatementIndex` wasn't found, place the parameter properties first in the constructor
- Add any remaining statements from the body, skipping the `superStatementIndex` index if it was found
// If there was a super call, visit existing statements up to and including it
if (superStatementIndex >= 0) {
  addRange(
    statements,
    visitNodes(
      body.statements,
      visitor,
      isStatement,
      indexAfterLastPrologueStatement,
      superStatementIndex + 1 - indexAfterLastPrologueStatement
    )
  );
}
// Transform parameters into property assignments. Transforms this:
//
//  constructor (public x, public y) {
//  }
//
// Into this:
//
//  constructor (x, y) {
//      this.x = x;
//      this.y = y;
//  }
//
const parameterPropertyAssignments = mapDefined(
  parametersWithPropertyAssignments,
  transformParameterWithPropertyAssignment
);
// If there is a super() call, the parameter properties go immediately after it
if (superStatementIndex >= 0) {
  addRange(statements, parameterPropertyAssignments);
}
// Since there was no super() call, parameter properties are the first statements in the constructor
else {
  statements = addRange(parameterPropertyAssignments, statements);
}
// Add remaining statements from the body, skipping the super() call if it was found
addRange(
  statements,
  visitNodes(body.statements, visitor, isStatement, superStatementIndex + 1)
);
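The resulting statement order can be sketched with plain strings instead of AST nodes. `orderConstructorBody` is a hypothetical helper written for this post. Keeping prologue directives at the very top in the no-`super()` case is a simplification of this sketch and not something taken from the pull request:

```typescript
// Statements are plain strings here; the real transformer works on AST nodes
function orderConstructorBody(
  bodyStatements: string[],
  indexAfterLastPrologueStatement: number,
  superStatementIndex: number,
  parameterPropertyAssignments: string[],
): string[] {
  const prologue = bodyStatements.slice(0, indexAfterLastPrologueStatement);

  if (superStatementIndex < 0) {
    // No super() call: assignments go first (after prologue directives in this sketch)
    return [
      ...prologue,
      ...parameterPropertyAssignments,
      ...bodyStatements.slice(indexAfterLastPrologueStatement),
    ];
  }

  return [
    ...prologue,
    // Everything up to and including the super() call
    ...bodyStatements.slice(indexAfterLastPrologueStatement, superStatementIndex + 1),
    // Parameter properties go immediately after super()
    ...parameterPropertyAssignments,
    // Whatever followed the super() call
    ...bodyStatements.slice(superStatementIndex + 1),
  ];
}

const ordered = orderConstructorBody(
  ['"use strict";', 'console.log("before");', "super();", 'console.log("after");'],
  1, // one prologue directive
  2, // super() is the third statement
  ["this.x = x;"],
);
console.log(ordered.join("\n"));
```

With these inputs, `this.x = x;` lands directly after `super();` and before `console.log("after");`.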
transformClassFields
`transformClassFields` also contains a `transformConstructorBody` function. This time it's used to turn class properties into assignments within the constructor.
For example, this TypeScript class:
class HasClassProperty {
  property = 1;
  constructor() {
    console.log("Hello, world!");
  }
}
…would become this JavaScript class (or the equivalent with `Object.defineProperty` if `useDefineForClassFields` is enabled):
class HasClassProperty {
  constructor() {
    this.property = 1;
    console.log("Hello, world!");
  }
}
This `transformConstructorBody` also inserts a "synthetic" `super(...arguments)` if the class is a derived one with a property initializer and without its own constructor.
For example, this TypeScript class:
class HasJustClassProperty extends Base {
  property = 1;
}
…needs to create its own `constructor` and `super(...arguments)` in order to hold the mapped property in its output JavaScript:
class HasJustClassProperty extends Base {
  constructor() {
    super(...arguments);
    this.property = 1;
  }
}
In order to account for code being emitted before any class properties and any constructor, the logic is roughly:
- Map any prologue directives and explicit `super()` call into the new constructor
- If there was a `super()` call, splice any statements preceding it after the prologue statements and before the `super()` call
- Later, depending on whether a `super()` call was found:
  - If it was, add parameter properties immediately after it
  - If it wasn't but a synthetic `super(...arguments)` was added, add those parameter properties just after it
  - If neither is the case, add those parameter properties to the top of the constructor
Ordering is tricky!
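The runtime order that this placement has to preserve can be observed directly. In this small sketch (the `record` helper and class names are made up for the example), the property initializer runs right after `super()` returns, after any code that preceded `super()`:

```typescript
const events: string[] = [];

// Logs an event and passes the value through, so it can be used in an initializer
function record<T>(event: string, value: T): T {
  events.push(event);
  return value;
}

class Base {
  constructor() {
    record("base constructor", undefined);
  }
}

class Derived extends Base {
  value = record("property initializer", 1);

  constructor() {
    record("before super", undefined);
    super();
    record("after super", undefined);
  }
}

new Derived();
console.log(events.join(" -> "));
// before super -> base constructor -> property initializer -> after super
```

That is why the emitted property assignments must go immediately after the `super()` call rather than at the top of the constructor.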
I also excluded parameter properties from being moved into the constructor when `useDefineForClassFields` is enabled, as those properties are then handled elsewhere. I don't remember where else they're handled but I do remember that when I didn't filter them out, they appeared twice in the output JavaScript.
I've omitted code snippets from this transformer's explanation for brevity.
transformES2015
The ES2015-to-ES5 transformer is the largest of TypeScript's transformers and contains more lines of code than all the other ECMAScript transformers combined. I suggested in #47573: Remove older emit support over time that TypeScript no longer target ECMAScript versions older than what any realistically used runtime environment needs… but until dropping pre-ES2020 happens some years in the future, ES2015 classes still need to be transformed into `function` and `prototype` equivalents in TypeScript's compiled output JavaScript.
This TypeScript class:
class HasPropertyAndLog {
  message = "world";
  constructor() {
    console.log("Hello", this.message);
  }
}
…becomes roughly this output JavaScript:
var HasPropertyAndLog = /** @class */ (function () {
  function HasPropertyAndLog() {
    this.message = "world";
    console.log("Hello", this.message);
  }
  return HasPropertyAndLog;
})();
`transformES2015`'s `transformConstructorBody` keeps track of two arrays of statement nodes:
- `prologue`: Any existing prologue directives, as well as any nodes added during transformation meant to be added just after them
- `statements`: The rest of the statements output for the function body
My change started off by adding three pieces of logic:
- Capture any previously existing prologue directives in an `existingPrologue` array
- Find the `super()` call, storing it in a `superCall` and its statement index in `superStatementIndex`
  - This is done with a new `findSuperCallAndStatementIndex` that loops through constructor body statements after those in `existingPrologue`
- Create a `postSuperStatementsStart` variable to determine where post-`super(...)` nodes are meant to be placed:
  - If a `super()` call wasn't found, place them just after `existingPrologue`
  - If a `super()` call was found, place them just after `superStatementIndex`
`transformConstructorBody` is then able to use that information to create constructor body statements:
- If the `super()` call wasn't synthesized, copy prologue statements into `prologue`
- Create a `superCallExpression` variable to store a new `super()` call, if a previous one exists:
  - If the existing `super()` is synthesized, replace it with the ES5 equivalent: `var _this = _super !== null && _super.apply(this, arguments) || this;`
  - If the existing `super()` wasn't synthesized, store the result of visiting it
- Add any default property value assignments and constructor rest parameter to the end of `prologue`
- Add any remaining statements from the constructor to `statements`
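To show what the `|| this` fallback in that ES5 equivalent is doing, here is a hand-written sketch in the same spirit. It is not actual compiler output: `Base` and `Derived` are made-up names, the base constructor is referenced directly instead of through a `_super` parameter, and `any` stands in for the untyped emitted JavaScript:

```typescript
// Hand-written ES5-style constructors; `any` stands in for untyped emitted JavaScript
function Base(this: any, label: string): void {
  this.label = label;
}

function Derived(this: any, name: string): any {
  // Code that runs before the "super call"
  const label = "derived:" + name;
  // Use the base constructor's return value if it returned an object,
  // otherwise fall back to `this`
  const _this = (Base as Function).call(this, label) || this;
  // Everything after the "super call" goes through `_this`
  _this.count = label.length;
  return _this;
}
Derived.prototype = Object.create(Base.prototype);
Derived.prototype.constructor = Derived;

const instance = new (Derived as any)("a");
console.log(instance.label, instance.count, instance instanceof Base); // derived:a 9 true
```

Because later statements use `_this` rather than `this`, the position of the call that assigns it matters, which is what the placement logic has to get right.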
The logic for where to place that `superCallExpression` node changes based on a few potential cases commented in src/compiler/transformers/es2015.ts#1056:
- Whether the constructor is in a derived class
  - If so, whether the constructor ends with a `super()` call and doesn't refer to `this`
- Whether the `super()` call, if it exists, is the first call in the constructor
I've omitted code snippets from this transformer's explanation for brevity.
I know that was a big wall of text, but if you read through the contents of `transformConstructorBody` and use its comments as reference, I think it can be reasoned through. The transformer code has to include a few extra function calls to properly massage `this` scoping and source maps from ES2015+ classes to ES5 functions here and there.
Bewildered at that high-level walkthrough? Me too! Please upvote #47573: Remove older emit support over time to make it more likely we'll no longer need to support ES5 eventually!
Final Thanks
I'd like to extend a sincere heartfelt thanks to the several developers who reviewed the pull request over the years. In order of review:
- Klaus Meinhardt: An all-around knowledgeable developer who has previously created a linter (fimbullinter/wotan) and gave helpful pointers early in the pull request, all as a fellow external contributor.
- Wesley Wigham: For giving the pull request a helpful review and its first approval back in 2020.
- Ron Buckton: For an intensely thorough set of reviews containing deep insights into the wild and wacky world of JavaScript and TypeScript classes, along with the final approval in 2022.