Compiler error suggestions #323

haxscramper · 2021-01-26T19:36:29Z

Context, original source code

Current state of error-related suggestions can be described as "non-existent". While compiler has all necessary information about current state of compilation (declared identifier, variable mutability), this information is not made available to the user except for type mismatches and several other cases (that also need to be improved).

Providing basic explanation for error cause when applicable makes it much easier to figure out the solution - making variable mutable, using mitems() instead of items() and so on.

I believe that making compiler do as much thinking as possible is a good tradeoff, as it reduces amount of time spent on figuring out type mismatch errors.

This RFC might look like simple collection of issues, but I think that If we have one bad error this is a bug. ll errors are bad we have a systematic problem, that needs to be adressed in organized way.

Context, original source code

When compilation error happens, it is shows generated code

import sequtils

proc acceptsInt(a: int) = discard

acceptsInt(@[1,2,3].mapIt(it + 1))

Current message

Hint: used config file '/playground/nim/config/nim.cfg' [Conf]
Hint: used config file '/playground/nim/config/config.nims' [Conf]
......
/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '
type
  OutType`gensym0 = typeof(
    block:
      var it: typeof(items(@[1, 2, 3]), typeOfIter)
      it + 1, typeOfProc)
block:
  let :tmp_4915652 = @[1, 2, 3]
  template s2_4915653(): untyped =
    :tmp_4915652

  var i`gensym0 = 0
  var result`gensym0 = newSeq(len(:tmp_4915652))
  for it in items(:tmp_4915652):
    result`gensym0[i`gensym0] = it + 1
    i`gensym0 += 1
  result`gensym0' is of type: seq[int]

expression: acceptsInt:
  type
    OutType`gensym0 = typeof(
      block:
        var it: typeof(items(@[1, 2, 3]), typeOfIter)
        it + 1, typeOfProc)
  block:
    let :tmp_4915652 = @[1, 2, 3]
    template s2_4915653(): untyped =
      :tmp_4915652

    var i`gensym0 = 0
    var result`gensym0 = newSeq(len(:tmp_4915652))
    for it in items(:tmp_4915652):
      result`gensym0[i`gensym0] = it + 1
      i`gensym0 += 1
    result`gensym0

Expected message (at least)

/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '@[1,2,3].mapIt(it + 1)'

This is relatively mild case, but it doesn't take too much to make it completely unreadable. For example if I had something like add @[1,2,3].mapIt(it + 1) (for example), total compilation error would have 123 lines total - approximately 60x times more than code I wrote.

NOTE: if compiled with --verbosity:2 error location is show - together with 600 lines of extra code and all auto-generated garbage. Using --hints:off removes most of the noise together with original source code. Related - https://forum.nim-lang.org/t/932

Typos, MCS misuses, beginner-unfriendly errors

Beginner-unfriendly

Method call syntax
- MCS-related type mismatch: https://forum.nim-lang.org/t/7053 - special case of undefined routine - if there is a variable with the same name, user needs to know how things got interpreted. Or of there is no parenthesis around arguments, which is also indicator of possible errors.
- confusing procs and missing fields: Nim procedure names can be easily considered an argument for a function - all it takes is write procedure instead of procedure() when passing parameter to a function. For example if you write kind notin {nnkStrLit} and don't have variable kind defined procedure kind(n: NimNode): NimNodeKind will be considered for resolution, leading to quite misleading error message.
If you have something like ident.name 12, but ident is not defined, you will see all overloads for name in the world, and compiler will be telling you that you can't add use name with arguments ident and 12, because name is actually a proc. For example .kind causes this quite often, due to overload on NimNode

If proc name() exists, it is not possible to get adeqate error message for missing name variable, as compiler will try to match all overloads with procvar instead.

This can be fixed by scored mismatches - if none of the overloads make any sense (for example not a single one accepts this type for any parameter), enable additional pass and check for any identifiers that might've been misspelled.
Overly verbose compilation by default

By default compilation outputs all sorts if unnecesary information that is not particularly useful, especially for regular tasks - number of compiled C lines, compilation time, configuration file, verbose hints for compilation of C code and so on. It might be useful in some cases, but making compilation less verbose by default (and subsequently not spending each user's time on figuring out that you need -hints:off, --verbosity:0)
assigning to missing field

file: f01_assign.nim
```
type
  Val* = object
    fld: string
```
file: f01_main.nim
```
import f01_assign

var hello: Val
hello.fld = "123"
```
error output
```
Error: attempting to call undeclared routine: 'fld='
```
While it is certainly possible to declare `fld=` proc to overload field assign, this is not commonly used and has nothing to do with particular issue at hand - field is not exported.
Special-casing some errors

Some errors can be special-cased - missing hash()/`==` implementation, to avoid sutiations like this - yes, it might be obvious for someone that using custom types with hashtable require defining hash and equality functions, but not for everyone, especially beginners who come from dynamically typed languages.

Hash it the most notable example - others include, (but of course not limited to)
1. Use of function with var argument on immutable object
2. Using addr on let variable
3. Option[T] with missing get()
4. Using & without importing std/strformat - in most languages string interpolation
  is built-in, and it still confuses me from time to time (especially with lost of overloads
  for & in the scope)
  
  If Option[T] does not have particular field, but T itself does qualify, provide possible way to fixing code ("Consider adding get() ?"). In general - some kind of heuristics for searching all possible solutions and providing user with common error types.
Fix suggestions

Providing fix suggestion:
- typo corrections
- mutability annotations
- MCS-related error fixes

Side effect annotation - good idea, bad ergonomics

Compile-time side effect detection is an extremely useful feature for those who cares about side-effect-free code, but currently it's usability is affected by lack of any additional information about what kind of side effect was introduced, and where it was introduced.

More informative text for some messages, additional heuristics

Array subscript error messages: Instead of 'index 1 not in 0..0' use something like

Array subscript is out of bounds - <expression> is of
type array[N, Type], but <subscript expression> is '1' -
cannot prove index access is safe

Inaccessible fields: Error: the field 'value' is not accessible. - when object field is not exported. Better would be: Error: the field 'value' is not accessible - not annotated as export.. Maybe rephrase a little differently, but main issue currently is - the reason why field is not accessible is not clear from the message. I often confuse this with wrong object declaration.
Unsafe construction: Cannot prove it is safe to construct ... - say what kind field needs to be initialized, and what value should be used instead of just bitching and leaving users to figure out what is wrong.
Consider variables defined in same or upper scope: Some error messages are generated due to incorrect use of local variables (typo, forgot to use variable etc.). It might be helpful to show things like 'wrong type for function call, unused variable - it has necessary type. Maybe you wanted to use it instead?'

Check for type mismatch on shadowed variables: if variable in upper scope matches for overload then present this information to used.
Type alias/generic name used for construction: Error when type alias or generic type name used to construct value.
```
type Alias = int
let val = Alias()
```
Not really helpful tbh
Function that uses generic types in arguments:
```
type U[T] = object
proc gen[T](arg: U): T = discard
discard gen[int]()
```
Results in Error: cannot instantiate: 'gen[int]'; got 1 type(s) but expected 2. Can be fixed by specifying type parameter for U e.g: proc gen[T](arg: U[T]) = discard; gen[int](U[int]()) compiles and runs correctly.

Possible solution: before outputting errors check if any procedure parameter is generic type and whether it is specified. Since nim does not support partial generic inference it can be relatively easily decided.

User-defined error messages

While it is possible to create greate DSLs in nim - customized error messages are certainly lacking. There is an std/macros/error(), but it could be improved.

Some pain points for writing custom DSLs (from my experience)

Malformed DSL
Type mismatch errors generated deep within macro-generated code because user violated some assumption and haven't defined particular overload.

https://twitter.com/lzsthw/status/1326931878016901120 - good list of specific bad errors - I decided not to include them in (already long enough) list, but I recommend reading this thread.
http://al6x.com/blog/2020/nim-language - while the article itsel does have some misjudgements, but 'Noisy error messages' and 'Not clean output' are certainly in scope of this RFC.
https://forum.nim-lang.org/t/7053 - already mentioned. New user not familliar with MCS confused by error message.
https://forum.nim-lang.org/t/7343 - already mentioned. New user cannot figure out cause/solution for type mismatch error.
https://irclogs.nim-lang.org/02-01-2021.html#21:23:23 - Type mismatches caused by missing imports

It is not possible to provide completely correct suggestions for which module to import, but some heuristics for things like items can be implemented. I'm not deep enough into compiler, but I suppose it might be inferred somehow.
https://forum.nim-lang.org/t/932 - show error position in original source code
https://forum.nim-lang.org/t/5719 - cannot evaluate at compile time. In general would require traversing whole implementation of the expression, but simple cases like let vs const can be diagnosed.
https://forum.nim-lang.org/t/3418 difference between thing(x:42) for object construction and thing(x=42) for object call. "More context depended error messages could really help here." – treeform
Value has to be discarded (partially or fully related)
- https://forum.nim-lang.org/t/2078
- https://forum.nim-lang.org/t/4084
Similarly to noSideEffect annotation - good feature, but a little more beginner-friendly heuristics could help. Especially considering this is not a thing in most other languages.
https://forum.nim-lang.org/t/1460 - better error messages for immutable variables passed as mutable.

Side note

Some suggestions are not fully serious, or I don't have a particular opinion that can be expressed in more actionable form, but I decided to list them:

Providing warnings when fib or fibonacchi is declared in code, and compilation is done without -d:release or -d:danger.
Some performance-related hints. I don't have any particular ideas, but this also might be possible. For example when/if strutils2 might be introduced. But this is just thinking aloud at this point.
Large part of this RFC is adressing issues that beginner users might have, but more experienced users don't really need some of this noise most of the time. Introducing --detailed-errors or something like this, and WRITING ABOUT THIS IN CAPS IN TUTORIAL might be a good idea.
For mutable/immutable - suggest using dup for a function?

The text was updated successfully, but these errors were encountered:

haxscramper · 2021-01-26T19:36:50Z

Split into separate RFC from #322

Araq · 2021-01-27T15:15:27Z

Feel free to improve the compiler's error messages. There is not much to disagree here with. (For me the error messages are good enough but I don't use mapIt nor noSideEffect.)

94: Structured reports r=haxscramper a=haxscramper Full rework of the compiler message handling pipeline. Remove old-style message generation that was based purely on strings that were formatted in-place, and instead implement structured logging where main compiler code only instantiates objects with required information. This would allow to - decouple error presentation logic from the semantic checking and other parts of the compiler. Added bonus: user-facing error messages can be improved much faster, since changes don't have to be propagated through the test suite each time. - allow compilation reports to be printed out as a stream of S-expression or JSON lines - Added bonus: user-facing error messages can be improved much faster, since changes don't have to be propagated through the test suite each time. In addition to architectural improvements this PR also reorganizes and documents various error kinds. Whole thing has to be rewritten from the ground up, so RFCs like nim-lang/RFCs#323 nim-lang/RFCs#324 nim-lang/RFCs#325 will be implemented as a collateral Decoupling of the data and presentation also allows to properly test compilation errors, warnings and hints generated by the compiler and finally untie compiler unit tests from the hardcoded string formatting in the semantic checking phase. This PR is an orthogonal to the `nkError` project, and is only concerned with a *content* of the error report, while `nkError` is all about *presence* of errors during compilation. Co-authored-by: haxscramper <haxscramper@gmail.com>

EriKWDev · 2022-02-06T15:53:32Z

Let's take a moment and imagine what could be possible. I love nim-lang/Nim#16356 and #87. However, I want more! Suggestions, colors, more details, context ... The sky is the limit xP
Here are some mockups I made for fun. All with --hints:off. Some missalignments and perhaps not perfect formulations, but I would love more details, readability and suggestions...

Many type mismatches

file pr.nim

let stringValue = "Hmm"
stringValue.add(10)

Real vs. Mockup

Undeclared identifier, Possible typo

file: typo.nim

let myVariable = "hello"

echo myVarible

Real vs. Mockup

Assignment to unexported field

file: f01_assign.nim

type
  Val* = object
    fld: string

file: f01_main.nim

import f01_assign

var hello: Val
hello.fld = "123"

Real vs. Mockup

Type mismatch (mapit...)

file mapit.nim

import sequtils

proc acceptsInt(a: int) = discard

acceptsInt(@[1, 2, 3].mapIt(it + 1))

Mockup vs. Real

Araq added the Accepted RFC label Jan 27, 2021

haxscramper mentioned this issue Apr 11, 2021

Structured compilation errors #324

Closed

quantimnot mentioned this issue Jul 3, 2021

Extended side effect error messages nim-lang/Nim#18418

Merged

haxscramper mentioned this issue Dec 7, 2021

Structured reports nim-works/nimskull#94

Merged

EriKWDev mentioned this issue Feb 6, 2022

[RFC] type mismatch error message improvement #87

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compiler error suggestions #323

Compiler error suggestions #323

haxscramper commented Jan 26, 2021 •

edited

Loading

haxscramper commented Jan 26, 2021

Araq commented Jan 27, 2021

EriKWDev commented Feb 6, 2022 •

edited

Loading

Compiler error suggestions #323

Compiler error suggestions #323

Comments

haxscramper commented Jan 26, 2021 • edited Loading

Context, original source code

Typos, MCS misuses, beginner-unfriendly errors

Beginner-unfriendly

Side effect annotation - good idea, bad ergonomics

More informative text for some messages, additional heuristics

User-defined error messages

Related

Side note

haxscramper commented Jan 26, 2021

Araq commented Jan 27, 2021

EriKWDev commented Feb 6, 2022 • edited Loading

Many type mismatches

Real vs. Mockup

Undeclared identifier, Possible typo

Real vs. Mockup

Assignment to unexported field

Real vs. Mockup

Type mismatch (mapit...)

Mockup vs. Real

haxscramper commented Jan 26, 2021 •

edited

Loading

EriKWDev commented Feb 6, 2022 •

edited

Loading