Skip to content

Commit

Permalink
Iterate on diagnostic structure (#4321)
Browse files Browse the repository at this point in the history
Part of this is also a plan to change "ERROR:" -> "error:" in
diagnostics. I think I'm the odd one out on sentence casing. The rest is
mostly trying to figure out advice that I think we can live with.

Note I'm not immediately planning to rewrite all diagnostics (unless
maybe there's a simple regex). I'm more trying to redirect style a
little to where preferences lie, particularly for new diagnostics.
  • Loading branch information
jonmeow authored Sep 19, 2024
1 parent 2044366 commit f7304b2
Showing 1 changed file with 54 additions and 40 deletions.
94 changes: 54 additions & 40 deletions toolchain/docs/diagnostics.md
Original file line number Diff line number Diff line change
Expand Up @@ -158,71 +158,85 @@ different kinds of diagnostic.
## Diagnostic parameter types
Here are some types you might consider for the parameters to a diagnostic:
- `llvm::StringLiteral`. Note that we don't use `llvm::StringRef` to avoid
lifetime issues.
- `std::string`
- Carbon types `T` that implement `llvm::format_provider<T>` like:
- `Lex::TokenKind`
- `Lex::NumericLiteral::Radix`
- `Parse::RelativeLocation`
- integer types: `int`, `uint64_t`, `int64_t`, `size_t`
- `char`
- Other
[types supported by llvm::formatv](https://llvm.org/doxygen/FormatVariadic_8h_source.html)
Diagnostic parameters should have informative types. We rely on three different
methods for formatting arguments:
- Builtin
[llvm::formatv](https://llvm.org/doxygen/FormatVariadic_8h_source.html)
support.
- This includes `char` and integer types (`int`, `int32_t`, and so on).
- String types can be added as needed, but stringifying values using the
methods noted below is preferred.
- Use `llvm::StringLiteral` where appropriate; use `std::string` when
allocations are required.
- `llvm::StringRef` is disallowed due to lifetime issues.
- `llvm::format_provider<...>` specializations.
- This can be used when formatting the parameter doesn't require
additional context.
- For example, `Lex::TokenKind` and `Parse::RelativeLoc` provide
diagnostic formatting this way.
- `DiagnosticConverter::ConvertArg` overrides.
- This can provide additional context to a formatter.
- For example, formatting `SemIR::NameId` accesses the IR's name list.
## Diagnostic message style guide
In order to provide a consistent experience, Carbon diagnostics should be
written in the following style:
We want Carbon's diagnostics to be helpful for developers when they run into an
error, and phrased consistently across diagnostics. In addition, Carbon
diagnostics may be mixed with Clang diagnostics when compiling interoperable
code, so we are borrowing some features of Clang's
[Diagnostic Wording](https://clang.llvm.org/docs/InternalsManual.html#diagnostic-wording).
Carbon's diagnostic style aims to balance these concerns. Our style is:
- Start diagnostics with a capital letter or quoted code, and end them with a
period.
- Start diagnostics with a lower case letter or quoted code, and omit trailing
periods.
- Quoted code should be enclosed in backticks, for example:
``"`{0}` is bad."``
- Quoted code should be enclosed in backticks, for example: ``"`{0}` is bad"``
- Phrase diagnostics as bullet points rather than full sentences. Leave out
articles unless they're necessary for clarity.
- Diagnostics should describe the situation the toolchain observed and the
language rule that was violated, although either can be omitted if it's
clear from the other. For example:
- Semicolons can be used to separate sentence fragments.
- `"Redeclaration of X."` describes the situation and implies that
- Diagnostics should describe the situation the toolchain observed. The
language rule violated can be mentioned if it wouldn't otherwise be clear.
For example:
- `"redeclaration of X"` describes the situation and implies that
redeclarations are not permitted.
- ``"`self` can only be declared in an implicit parameter list."``
describes the language rule and implies that you declared `self`
somewhere else.
- ``"`self` declared in invalid context; can only be declared in implicit parameter list"``
describes the language rule.
- It's OK for a diagnostic to guess at the developer's intent and provide
a hint after explaining the situation and the rule, but not as a
substitute for that. For example,
``"Add an `as String` cast to format this integer as a string."`` is not
sufficient as an error message, but
``"Cannot add i32 to String. Add an `as String` cast to format this integer as a string."``
``"add `as String` to convert `i32` to `String`"`` is not sufficient as
an error message, but
``"cannot implicitly convert `i32` to `String`; add `as String` for explicit conversion"``
could be acceptable.
- Use "cannot" if needed, but try to use phrasing that doesn't require it.
Avoid "allowed", "legal", "permitted", "valid", and related wording. For
example:
- ``"`export` in `impl` file"`` rather than
``"`export` is only allowed in API files"``.
- ``"`extern library` specifies current library"`` rather than
`` "`extern library` cannot specify the current library"``.
- Try to structure diagnostics such that inputs can be extracted without
string parsing; prefer [typed parameters](#diagnostic-parameter-types). We
would like to keep a path for diagnostics to be an API. There can be
exceptions where this is particularly difficult.
- TODO: Should diagnostics be atemporal and non-sequential ("multiple
declarations of X", "additional declaration here"), present tense but
sequential ("redeclaration of X", "previous declaration is here"), or
temporal ("redeclaration of X", "previous declaration was here")? We could
try to sidestep difference between the latter two by avoiding verbs with
tense ("previously declared here", "Y declared here", with no is/was).
- TODO: Word choices:
- For disallowed constructs, do we say they're not permitted / not allowed
/ not valid / not legal / illegal / ill-formed / disallowed? Do we say
"X cannot be Y" or "X may not be Y" or "X must not be Y" or "X shall not
be Y"?
- TODO: Is structuring diagnostics such that inputs can be parsed without
string parsing important? that is, when is passing strings in as part of the
message templating okay?
- TODO: When do we put identifiers or expressions in diagnostics, versus
requiring notes pointing at relevant code? Is it only avoided for values, or
only allowed for types?
Expand Down

0 comments on commit f7304b2

Please sign in to comment.