TypeScript Contribution Diary: Improved Errors on Invalid Variable Names
September 15, 2020
Photo of Bud Abbott and Lou Costello from their NBC Radio program.
File copied from Wikipedia under the Creative Commons Attribution 2.0 Generic license. [source]
Problem Statement
Reading through error messages emitted by TypeScript can sometimes feel like the old āWhoās on First?ā routine, where every word has two meanings and you can never be sure exactly what any particular phrase refers to.
One example complaint is
Improve wrong usage of
case
compile errors
#19352, first filed in 2017. Without reading it, can you guess what
a reasonable error message for this line of code should be?
const case = "123";
If your guess focused on the
case
keyword not being
allowed as a variable name and was anything like
'case' is not allowed as a variable declaration
name
, then congrats, you win! If not, thatās ok, because
TypeScript <4.1 emits three instances of
not-so-informative error message:
const case = 123;
// ~~~~ Variable declaration expected.
// ~ Variable declaration expected.
// ~~~ Variable declaration expected.
š¤ Not the most useful error message in the world.
This post describes what I did to improve that error to be more understandable. Letās dig in!
But Why?
As with any TypeScript error message improvement
investigation, I started by searching on how the original
emitted error is achieved. I found exactly one usage of it, in
src/compiler/parser.ts
:
function parsingContextErrors(context: ParsingContext): DiagnosticMessage {
switch (context) {
// ...(a bunch of similar looking cases)
case ParsingContext.VariableDeclarations:
return Diagnostics.Variable_declaration_expected;
// ...(a bunch of similar looking cases)
I added a
console.log(new Error())
to
the
ParsingContext.VariableDeclaration
code locally and re-ran to grab the full stack of where this
code gets called on a file containing
const case = 123
:
Error
at parsingContextErrors (C:\Code\temp\node_modules\typescript\lib\tsc.js:16677:33)
at abortParsingListOrMoveToNextToken (C:\Code\temp\node_modules\typescript\lib\tsc.js:16657:38)
at parseDelimitedList (C:\Code\temp\node_modules\typescript\lib\tsc.js:16726:21)
at parseVariableDeclarationList (C:\Code\temp\node_modules\typescript\lib\tsc.js:19406:37)
at parseVariableStatement (C:\Code\temp\node_modules\typescript\lib\tsc.js:19416:36)
at parseDeclarationWorker (C:\Code\temp\node_modules\typescript\lib\tsc.js:19264:28)
at parseDeclaration (C:\Code\temp\node_modules\typescript\lib\tsc.js:19248:24)
at parseStatement (C:\Code\temp\node_modules\typescript\lib\tsc.js:19219:32)
at parseListElement (C:\Code\temp\node_modules\typescript\lib\tsc.js:16495:20)
at parseList (C:\Code\temp\node_modules\typescript\lib\tsc.js:16479:35)
Scanning through the names of the functions and their implementations, hereās generally what I reasoned was happening:
-
parseList
is a general driver for reading in some kind of list; itsparseElement
parameter is the the logic specific to the kind of list. -
parseDeclaration
usesparseDeclarationWorker
to delegate to the corresponding parser to the type of list ā in this case,SyntaxKind.ConstKeyword
. -
parseDelimitedList
continuously adds to the list until either: -
parsingContextErrors
produces the appropriate context error for the list.
In conclusion, that last step is where the actual error message is produced. Reasonable.
New Messaging
Now that I knew where the error message comes from, I needed
to decide on a new message. The
'case' is not allowed as a variable declaration
name
suggestion from above seemed like a pretty reasonable first
guess.
To verify, I checked what a few different JavaScript engines
emit for the
const case = 123
snippet:
-
Chrome:
SyntaxError: Unexpected token 'case'
-
Firefox:
SyntaxError: missing variable name
-
Internet Explorer:
The use of a keyword for an identifier is invalid
Wow. I never thought Iād see the day when the Internet Explorer error message is better than both Chromeās and Firefoxās. š
Anyway, none of these were particularly better than what I was
thinking of, so I added
'{0}' is not allowed as a variable declaration name.
as a new diagnostic to
diagnosticMessages.json
.
Changing the Parser
The first thought that came to mind was to use the new error message in all places. Would all the possible cases of invalid identifiers in variable declaration lists benefit from mentioning the offending token?
To check, I searched for
Variable declaration expected
in test cases. There were a lot. Skimming through
them, there seemed to be a few common cases of errorsā¦
Clear Improvements
Such as in
bitwiseNotOperatorInvalidOperations.ts
:
var mul = ~[1, 2, "abc"], "";
// ~~
'""' is not an allowed variable name.
would be a reasonable diagnostic.
Debatably Equivalent Messages
Such as in
expressionTypeNodeShouldError.ts
:
class C3 {
foo() {
const x: false.typeof(this.foo);
// ~~~~~~
}
}
'typeof' is not an allowed variable name.
is maybe equally confusing as
Variable declaration expected.
? Unclear.
Arguably Degraded Quality
Such as in
getJavaScriptSyntacticDiagnostics02.ts
:
var v = /[]/]/
// ~
/[]/
is a complete regular
expression; saying a declaration is expected after it is a bit
closer to the truth than complaining about a name.
Making Changes
Ok, so the nice new error message isnāt applicable in all
cases. Can we return a different error message in
parsingContextErrors
based
on the token being complained on? Roughly:
-
If the token is one of the known invalid ones such as
case
, we should give a more specific āthis isnāt a valid variable nameā error. - Otherwise, give the same general error as before.
Question: how could we tell whether the token is a banned
keyword? Is there a specific
SyntaxKind
range, or
pre-existing utility, orā¦?
Checking Invalidity Validity
I knew to start that it wouldnāt be easy to set up an
allowlist of ok
SyntaxKind
members, as that
list would huge:
BreakKeyword
,
CaseKeyword
,
ClassKeyword
, ā¦
I ran a few searches for regular expression terms like
/is.*valid.*identifier/
but
didnāt find any good utilities. Fortunately, the cases weāre
looking for are uniformly those where an English keyword (e.g.
case
) is the culprit, soā¦
maybe we could filter on tokens that start with a valid
a-Z
letter?
A little bit of searching for
/string.*token/
and
/token.*string/
found a
tokenToString
utility for
converting a token to its string. Great!
I changed the logic in
parsingContextErrors
for
specifically variable declarations to check the token string:
case ParsingContext.VariableDeclarations:
const tokenString = tokenToString(token());
return tokenString && /\w+/.test(tokenString[0])
? Diagnostics._0_is_not_allowed_as_a_variable_declaration_name
: Diagnostics.Variable_declaration_expected;
Messaging Token Names
Problem: although
parsingContextErrors
was
now returning the correct diagnostic for invalid tokens, it
wasnāt returning the
tokenString
to be used in
it.
It seemed wasteful to change
parseErrorAtCurrentToken
to
call
tokenToString(token())
again. Instead, I changed
parsingContextErrors
to
return a tuple (array) containing at least the diagnostic, and
optionally a string to add to it:
case ParsingContext.VariableDeclarations:
const tokenString = tokenToString(token());
return tokenString && /\w+/.test(tokenString[0])
? [Diagnostics._0_is_not_allowed_as_a_variable_declaration_name, tokenString]
: [Diagnostics.Variable_declaration_expected];
ā¦and
abortParsingListOrMoveToNextToken
to pass all of
parsingContextError
ās
results to
parseErrorAtCurrentToken
:
function abortParsingListOrMoveToNextToken(kind: ParsingContext) {
- parseErrorAtCurrentToken(parsingContextErrors(kind));
+ parseErrorAtCurrentToken(...parsingContextErrors(kind));
parseErrorAtCurrentToken
already took in an optional
arg0
, so this just worked.
Super!
Testing it Out
I built these changes locally and ran
built/local/tsc.js
on a
file containing
const case = 123;
. The
errors wereā¦ somewhat encouraging!
index.ts:6:7 - error TS1389: 'case' is not allowed as a variable declaration name.
6 const case = 123;
~~~~
index.ts:6:12 - error TS1134: Variable declaration expected.
6 const case = 123;
~
index.ts:6:14 - error TS1134: Variable declaration expected.
6 const case = 123;
~~~
Ideally we should really only be emitting an error for the first case and skip emitting anything for the other two. I skimmed the code areas around the parser and was discouraged from making changes pretty quickly. Making changes to the parserās behavior around list abortions would require changing how the parser calculates what is or isnāt in the list, which would result in changes to the emitted (invalid) JavaScript.
Iāve learned the hard way that changes to emitted JavaScript take much longer to get approved (ahem, still waiting on #29374 and #33337ā¦). Improving parser tolerance on invalid identifiers would have to wait for another day.
Pull Request Review
I cleaned up the code changes, updated now-failing baseline tests, and sent it out for review: Specified error diagnostic for invalid variable names #40105. It came back later that day (speedy!) with a few requested changesā¦
isKeyword
Comment: Why not just test whether the token is a keyword? i.e. use
isKeyword
Aha! So there is a utility to check whether a token is a keyword! Super.
case ParsingContext.VariableDeclarations:
- const tokenString = tokenToString(token());
- return tokenString && /\w+/.test(tokenString[0])
- ? [Diagnostics._0_is_not_allowed_as_a_variable_declaration_name, tokenString, tokenString]
+ return isKeyword(token())
+ ? [Diagnostics._0_is_not_allowed_as_a_variable_declaration_name, tokenString, tokenToString(token())]
: [Diagnostics.Variable_declaration_expected];
Avoiding Array Allocations
Comment: Nit: I kinda dislike allocating an array on each of these, but maybe itās fine. I would have just made it take the error-reporting callback.
Reasonable! Creating and spreading arrays is admittedly an odd thing to do here.
parsingContextErrors
is
only ever called by
abortParsingListOrMoveToNextToken
, so I had it directly call
parseErrorAtCurrentToken
instead of a callback.
case ParsingContext.VariableDeclarations:
return isKeyword(token())
- ? [Diagnostics._0_is_not_allowed_as_a_variable_declaration_name, tokenString, tokenToString(token())]
- : [Diagnostics.Variable_declaration_expected];
+ ? parseErrorAtCurrentToken(Diagnostics._0_is_not_allowed_as_a_variable_declaration_name, tokenToString(token()))
+ : parseErrorAtCurrentToken(Diagnostics.Variable_declaration_expected);
function abortParsingListOrMoveToNextToken(kind: ParsingContext) {
- parseErrorAtCurrentToken(...parsingContextErrors(kind));
+ parsingContextErrors(kind);
Clarifying Scope
Comment: Does this also fix #11648. If not, thatās okay, keep the change separate.
Correct, this change does not affect other forms of intolerant parser weirdness.
Wrapping up
I sent up the code changes as a commit, responded to the comments, and was happy to see the PR merged later that week. Yay! š
You can now experience these slightly improved error messages in the nightly builds of TypeScript and any subsequent version >=4.1.
Next Steps
I still want to make list parsing more tolerant of invalid identifiers, so I filed #40317. Hopefully weāll get to have a sequel to this post discussing resolving that followup!
Lastly, many thanks to @DanielRossenwaser for the quick & informative PR reviews. Much appreciated!