This RFC does not provide a single actionable goal. I created this issue to have a more structured list of ideas and pain points: over the last several months similar discussions kept happening on IRC, where someone would complain about errors and the discussion would end with "current errors are suboptimal" and "structured errors are good".
The RFC mostly serves as a prompt for a more structured discussion of error messages, and for formulating a more actionable list of requirements (potentially in another RFC).
Type mismatch errors are horrible
The current error output just dumps all possible overloads without determining their validity in the given context. When working on high-level code, seeing 40+ overloads for `add` when you just tried to append to a `let` variable is insanely annoying.
By default some mismatches are hidden, but the logic behind this is not entirely clear: one out of four overloads can be hidden in one case, while a mismatch on `+` can produce several screens of text.
A possible solution would be to introduce a score-based system for determining the validity of each overload. Possible rules might include:
Expected first argument to be `var`, but found immutable - high similarity score.
Unknown named argument, but a close name exists - possible typo. Medium score.
First type is completely different - expected object, found proc (with MCS and a missing variable this can happen quite easily) - negative similarity score.
Any other score rules - I just listed the more obvious ones.
If all overload alternatives have a very low score, enable additional heuristics for error suggestions.
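The scoring rules above could be sketched roughly as follows. This is only an illustration, not how the compiler works: the `MismatchKind` values, the `Candidate` type, the numeric weights and the `rank` proc are all hypothetical names made up for this example (the RFC only says "high", "medium" and "negative").

```nim
import std/algorithm

type
  MismatchKind = enum
    mkVarExpectedGotImmutable  # right type, wrong mutability
    mkNamedArgTypo             # unknown named argument, close name exists
    mkUnrelatedFirstType       # e.g. expected object, found proc

  Candidate = object
    signature: string
    mismatch: MismatchKind

proc similarity(kind: MismatchKind): int =
  # Hypothetical weights, chosen only to order the three rules.
  case kind
  of mkVarExpectedGotImmutable: 10
  of mkNamedArgTypo: 5
  of mkUnrelatedFirstType: -10

proc rank(cands: seq[Candidate], threshold = 0): seq[Candidate] =
  # Drop candidates at or below the threshold, best match first.
  for c in cands:
    if similarity(c.mismatch) > threshold:
      result.add c
  result = result.sortedByIt(-similarity(it.mismatch))

let cands = @[
  Candidate(signature: "add(x: var string, y: char)",
            mismatch: mkUnrelatedFirstType),
  Candidate(signature: "add(x: var seq[int], y: int)",
            mismatch: mkVarExpectedGotImmutable)]

for c in rank(cands):
  echo c.signature
```

If `rank` returns an empty list, none of the overloads is a plausible match, which is the point where the additional heuristics (typo search and so on) would kick in.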
Errors are not machine-readable
Currently only the `file(line, column) message` start can be reliably retrieved from a compilation error - everything else is constructed with string interpolation.
Providing JSON output for compilation errors would make it easier to build additional heuristics on top of compilation errors (for example hooking up type mismatches to a documentation search to provide import suggestions), simplify IDE integration, and put an end to pointless debates about changing the error formatting - if someone wants colorful arrows everywhere, they will have access to all the information needed.
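To make the idea more concrete, here is a sketch of what a structured diagnostic could look like when serialized with `std/json`. The `Diagnostic` type and every field name in it are assumptions made up for this example; the compiler does not define such a schema.

```nim
import std/json

type
  Diagnostic = object
    # Hypothetical schema, not an existing compiler format.
    file: string
    line, column: int
    severity: string
    code: string           # would tie into an error index
    message: string
    got: string
    expected: seq[string]

let diag = Diagnostic(
  file: "/usercode/in.nim", line: 5, column: 11,
  severity: "error", code: "TypeMismatch",
  message: "type mismatch",
  got: "seq[int]",
  expected: @["proc acceptsInt(a: int)"])

# `%` converts the object into a JsonNode field by field.
let node = %diag
echo node.pretty
```

A tool consuming this would not have to parse the human-readable text at all, and the human-readable format could be changed freely without breaking anyone.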
Not categorized, not explained, based on string interpolation
Having something like the Rust compiler error index would be helpful for beginners, and would make it less necessary to explain particular details of errors in the manual/tutorial (or worse - leaving users to figure out the cause and solution themselves).
Context is not shown, no smart suggestions
Context, original source code
When a compilation error happens, the message shows the generated code instead of the original source.
Current message
Hint: used config file '/playground/nim/config/nim.cfg' [Conf]
Hint: used config file '/playground/nim/config/config.nims' [Conf]
......
/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '
type
  OutType`gensym0 = typeof(
    block:
      var it: typeof(items(@[1, 2, 3]), typeOfIter)
      it + 1, typeOfProc)
block:
  let :tmp_4915652 = @[1, 2, 3]
  template s2_4915653(): untyped =
    :tmp_4915652
  var i`gensym0 = 0
  var result`gensym0 = newSeq(len(:tmp_4915652))
  for it in items(:tmp_4915652):
    result`gensym0[i`gensym0] = it + 1
    i`gensym0 += 1
  result`gensym0' is of type: seq[int]
expression: acceptsInt:
type
  OutType`gensym0 = typeof(
    block:
      var it: typeof(items(@[1, 2, 3]), typeOfIter)
      it + 1, typeOfProc)
block:
  let :tmp_4915652 = @[1, 2, 3]
  template s2_4915653(): untyped =
    :tmp_4915652
  var i`gensym0 = 0
  var result`gensym0 = newSeq(len(:tmp_4915652))
  for it in items(:tmp_4915652):
    result`gensym0[i`gensym0] = it + 1
    i`gensym0 += 1
  result`gensym0
Expected message (at least)
/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '@[1,2,3].mapIt(it + 1)'
This is a relatively mild case, but it doesn't take much to make the output completely unreadable. For example, if I had something like `add @[1,2,3].mapIt(it + 1)`, the compilation error would be 123 lines long - approximately 60 times more than the code I wrote.
NOTE: if compiled with `--verbosity:2` the error location is shown - together with 600 lines of extra code and all the auto-generated garbage. Using `--hints:off` removes most of the noise, together with the original source code. Related - https://forum.nim-lang.org/t/932
Typos, MCS misuses, beginner-unfriendly errors
Beginner-unfriendly
Method call syntax
MCS-related type mismatch: https://forum.nim-lang.org/t/7053 - a special case of an undefined routine. If there is a variable with the same name, the user needs to know how things got interpreted. The same applies when there are no parentheses around the arguments, which is also an indicator of a possible error.
Confusing procs and missing fields: a Nim procedure name can easily be taken for an argument to a function - all it takes is writing `procedure` instead of `procedure()` when passing a parameter. For example, if you write `kind notin {nnkStrLit}` and don't have a variable `kind` defined, the procedure `kind(n: NimNode): NimNodeKind` will be considered for resolution, leading to a quite misleading error message.
If you have something like `ident.name 12`, but `ident` is not defined, you will see all overloads for `name` in the world, and the compiler will tell you that you can't use `name` with arguments `ident` and `12`, because `name` is actually a proc. `.kind`, for example, causes this quite often because of the overload on NimNode.
If `proc name()` exists, it is not possible to get an adequate error message for a missing `name` variable, as the compiler will try to match all overloads with a procvar instead.
This can be fixed by scored mismatches - if none of the overloads make any sense (for example not a single one accepts this type for any parameter), enable an additional pass and check for any identifiers that might have been misspelled.
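The "check for misspelled identifiers" pass could be approximated with a plain edit-distance search over the identifiers in scope. The sketch below uses `editDistance` from `std/editdistance`; the `suggest` proc, its `maxDist` parameter and the list of known names are hypothetical and only stand in for whatever the compiler's symbol table would provide.

```nim
import std/editdistance

proc suggest(unknown: string, known: seq[string],
             maxDist = 2): seq[string] =
  # Return the known identifiers within `maxDist` edits of `unknown`.
  for name in known:
    if editDistance(unknown, name) <= maxDist:
      result.add name

# A swapped pair of letters costs two edits, so `kind` is suggested.
echo suggest("knid", @["kind", "name", "len"])
```

The distance threshold is a judgment call: too low and swapped letters are missed, too high and short identifiers match almost anything.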
Overly verbose compilation by default
By default compilation outputs all sorts of unnecessary information that is not particularly useful, especially for regular tasks - number of compiled C lines, compilation time, configuration file, verbose hints for compilation of C code and so on. It might be useful in some cases, but making compilation less verbose by default would be better, and would save each user the time spent figuring out that they need `--hints:off` and `--verbosity:0`.
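Until the defaults change, the two switches mentioned above can be set once per project instead of on every invocation. A minimal sketch, assuming a `config.nims` file placed next to the project's main module:

```nim
# config.nims - equivalent to passing --hints:off --verbosity:0
switch("hints", "off")
switch("verbosity", "0")
```

This only hides the noise for one project; it does not help a new user who has not yet learned that these switches exist.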
Assigning to missing field
file: f01_assign.nim
type Val* = object
  fld: string
file: f01_main.nim
import f01_assign
var hello: Val
hello.fld = "123"
error output
Error: attempting to call undeclared routine: 'fld='
While it is certainly possible to declare a `fld=` proc to overload field assignment, this is not commonly used and has nothing to do with the issue at hand - the field is not exported.
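For reference, the fix a better message would point to is a single character: the export marker on the field. The sketch below folds both files into one so that it can be run as is; in the two-file layout above, the `*` after `fld` is what makes the assignment compile.

```nim
# f01_assign.nim, with the field exported
type Val* = object
  fld*: string  # the trailing `*` exports the field

# f01_main.nim (would start with `import f01_assign`)
var hello: Val
hello.fld = "123"
echo hello.fld
```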
Special-casing some errors
Some errors can be special-cased, for example a missing hash()/`==` implementation, to avoid situations like this - yes, it might be obvious to some that using custom types with a hash table requires defining hash and equality functions, but not to everyone, especially beginners who come from dynamically typed languages.
Hash is the most notable example - others include (but are of course not limited to):
Use of a function with a `var` argument on an immutable object
Using `addr` on a `let` variable
`Option[T]` with a missing `get()`
If `Option[T]` does not have a particular field, but `T` itself does, suggest a possible way to fix the code ("Consider adding get()?"). In general - some kind of heuristics for searching for possible solutions and informing the user about common error types.
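As an illustration of what such a special-cased hint would lead the user to write, here is a sketch of a custom key type with its own `hash`. The `Point` type is made up for this example; `!&` and `!$` are the mixing helpers from `std/hashes`, and `==` for plain objects is already provided by the system module.

```nim
import std/[hashes, tables]

type Point = object
  x, y: int

proc hash(p: Point): Hash =
  # Mix the field hashes, then finalize the result.
  var h: Hash = 0
  h = h !& hash(p.x)
  h = h !& hash(p.y)
  result = !$h

var names = initTable[Point, string]()
names[Point(x: 1, y: 2)] = "a"
echo names[Point(x: 1, y: 2)]
```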
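The `Option[T]` case can be shown in a few lines. `Person` and its `email` field are hypothetical; `some`, `isSome` and `get` come from `std/options`.

```nim
import std/options

type Person = object
  email: string

let maybePerson = some(Person(email: "ada@example.com"))

# Accessing the field directly does not compile:
# Option[Person] itself has no `email`.
doAssert not compiles(maybePerson.email)

# Unwrapping first is the fix a "consider adding get()" hint
# would point to.
if maybePerson.isSome:
  echo maybePerson.get().email
```

A heuristic for this only needs to check whether the wrapped type has the requested field, so it looks cheap to implement compared to a general solution search.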
Fix suggestions
Providing fix suggestions:
Typo corrections
Mutability annotations
MCS-related error fixes
Side effect annotation - good idea, bad ergonomics
Compile-time side effect detection is an extremely useful feature for those who care about side-effect-free code, but currently its usability suffers from the lack of any additional information about what kind of side effect was introduced, and where it was introduced.
More informative text for some messages, additional heuristics
Array subscript error messages: Instead of 'index 1 not in 0..0' use something like
Array subscript is out of bounds - <expression> is of
type array[N, Type], but <subscript expression> is '1' -
cannot prove index access is safe
Inaccessible fields: Error: the field 'value' is not accessible. - shown when an object field is not exported. Better would be: Error: the field 'value' is not accessible - not annotated as export. Maybe rephrase it a little differently, but the main issue currently is that the reason why the field is not accessible is not clear from the message. I often confuse this with a wrong object declaration.
Unsafe construction: Cannot prove it is safe to construct ... - say which field needs to be initialized and what value should be used, instead of just bitching and leaving users to figure out what is wrong.
Consider variables defined in the same or an upper scope: some error messages are caused by incorrect use of local variables (typo, forgot to use a variable etc.). It might be helpful to show things like 'wrong type for function call, unused variable - it has the necessary type. Maybe you wanted to use it instead?'
Check for type mismatch on shadowed variables: if a variable in an upper scope matches the overload, present this information to the user.
Type alias/generic name used for construction: the error shown when a type alias or generic type name is used to construct a value.
type Alias = int
let val = Alias()
Not really helpful tbh
Function that uses generic types in arguments:
type U[T] = object
proc gen[T](arg: U): T = discard
discard gen[int]()
Results in Error: cannot instantiate: 'gen[int]'; got 1 type(s) but expected 2. This can be fixed by specifying the type parameter for `U`, e.g. `proc gen[T](arg: U[T]) = discard; gen[int](U[int]())` compiles and runs correctly.
Possible solution: before outputting errors, check whether any procedure parameter is a generic type and whether its type parameters are specified. Since Nim does not support partial generic inference, this can be decided relatively easily.
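Spelled out on separate lines, the corrected version from the paragraph above looks like this. Nothing here is new; it is the same fix, laid out so it can be compiled directly.

```nim
type U[T] = object

# `U[T]` instead of the bare `U`: the proc now has exactly one
# generic parameter, so `gen[int]` can be instantiated.
proc gen[T](arg: U[T]) = discard

gen[int](U[int]())
```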
User-defined error messages
While it is possible to create great DSLs in Nim, customized error messages are certainly lacking. There is `error()` in `std/macros`, but it could be improved.
Some pain points for writing custom DSLs (from my experience):
Malformed DSL
Type mismatch errors generated deep within macro-generated code because the user violated some assumption and hasn't defined a particular overload.
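For context, this is roughly what the existing `error()` from `std/macros` offers for the "malformed DSL" case: a message attached to the offending node. The `onlyIntLit` macro is a made-up example, not part of any library.

```nim
import std/macros

macro onlyIntLit(x: untyped): untyped =
  # Reject anything that is not an integer literal and report the
  # error at the position of the offending node.
  if x.kind != nnkIntLit:
    error("expected an integer literal, got " & $(x.kind), x)
  result = x

echo onlyIntLit(42)

# The malformed call is rejected at compile time.
doAssert not compiles(onlyIntLit("text"))
```

This covers errors the macro author anticipated. It does nothing for the second pain point, where a type mismatch surfaces deep inside generated code that the user never wrote.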
Side note
Some suggestions are not fully serious, or I don't have a particular opinion that can be expressed in a more actionable form, but I decided to list them:
Providing warnings when `fib` or `fibonacchi` is declared in code and compilation is done without `-d:release` or `-d:danger`.
Some performance-related hints. I don't have any particular ideas, but this also might be possible, for example when/if `strutils2` is introduced. But this is just thinking aloud at this point.
A large part of this RFC addresses issues that beginners might have, while more experienced users don't really need some of this noise most of the time. Introducing `--detailed-errors` or something like this, and WRITING ABOUT THIS IN CAPS IN THE TUTORIAL, might be a good idea.
For mutable/immutable - suggest using `dup` for a function?
Related
https://twitter.com/lzsthw/status/1326931878016901120 - a good list of specific bad errors. I decided not to include them in the (already long enough) list, but I recommend reading this thread.
http://al6x.com/blog/2020/nim-language - while the article itself does have some misjudgements, 'Noisy error messages' and 'Not clean output' are certainly in scope of this RFC.
https://forum.nim-lang.org/t/7053 - already mentioned. A new user not familiar with MCS is confused by the error message.
https://forum.nim-lang.org/t/7343 - already mentioned. A new user cannot figure out the cause/solution for a type mismatch error.
https://irclogs.nim-lang.org/02-01-2021.html#21:23:23 - type mismatches caused by missing imports. It is not possible to provide completely correct suggestions for which module to import, but some heuristics for things like `items` can be implemented. I'm not deep enough into the compiler, but I suppose it might be inferred somehow.
https://forum.nim-lang.org/t/932 - show the error position in the original source code.
https://forum.nim-lang.org/t/5719 - cannot evaluate at compile time. In general this would require traversing the whole implementation of the expression, but simple cases like `let` vs `const` can be diagnosed.
https://forum.nim-lang.org/t/3418 - difference between `thing(x:42)` for object construction and `thing(x=42)` for object call. "More context depended error messages could really help here." – treeform
Value has to be discarded (partially or fully related). Similarly to the `noSideEffect` annotation - a good feature, but slightly more beginner-friendly heuristics could help, especially considering this is not a thing in most other languages.
https://forum.nim-lang.org/t/1460 - better error messages for immutable variables passed as mutable.