The following syntaxes are supported:
- Integer literals
12345
(decimal)0x1FE
(hexadecimal)0b1010
(binary)
- Real number literals
123.456
(digits on both sides of the.
)123.456e789
(optional+
or-
after thee
)0x1.2p123
(optional+
or-
after thep
)
- Digit separators (
_
) may be used, but only in conventional locations
Note that real number literals always contain a .
with digits on both sides,
and integer literals never contain a .
.
Literals are case-sensitive. Unlike in C++, literals do not have a suffix to indicate their type.
Decimal integers are written as a non-zero decimal digit followed by zero or
more additional decimal digits, or as a single 0
.
Integers in other bases are written as a 0
followed by a base specifier
character, followed by a sequence of digits in the corresponding base. The
available base specifiers and corresponding bases are:
Base specifier | Base | Digits |
---|---|---|
b |
2 | 0 and 1 |
x |
16 | 0 ... 9 , A ... F |
The above table is case-sensitive. For example, 0b1
and 0x1A
are valid, and
0B1
, 0X1A
, and 0x1a
are invalid.
A zero at the start of a literal can never be followed by another digit: either
the literal is 0
, the 0
begins a base specifier, or the next character is a
decimal point (see below). No support is provided for octal literals, and any C
or C++ octal literal (other than 0
) is invalid in Carbon.
Real numbers are written as a decimal or hexadecimal integer followed by a
period (.
) followed by a sequence of one or more decimal or hexadecimal
digits, respectively. A digit is required on each side of the period. 0.
and
.3
are both invalid.
A real number can be followed by an exponent character, an optional +
or -
(defaulting to +
if absent), and a character sequence matching the grammar of
a decimal integer with some value N. For a decimal real number, the exponent
character is e
, and the effect is to multiply the given value by
10±N. For a hexadecimal real number, the exponent character
is p
, and the effect is to multiply the given value by
2±N. The exponent suffix is optional for both decimal and
hexadecimal real numbers.
Note that a decimal integer followed by e
is not a real number literal. For
example, 3e10
is not a valid literal.
When a real number literal is interpreted as a value of a real number type, its value is the representable real number closest to the value of the literal. In the case of a tie, the conversion to the real number type is invalid.
The decimal real number syntax allows for any decimal fraction to be expressed -- that is, any number of the form a x 10-b, where a is an integer and b is a non-negative integer. Because the decimal fractions are dense in the reals and the set of values of the real number type is assumed to be discrete, every value of the real number type can be expressed as a real number literal. However, for certain applications, directly expressing the intended real number representation may be more convenient than producing a decimal equivalent that is known to convert to the intended value. Hexadecimal real number literals are provided in order to permit values of binary floating or fixed point real number types to be expressed directly.
As described above, a real number literal that lies exactly between two
representable values for its target type is invalid. Such ties are extremely
unlikely to occur by accident: for example, when interpreting a literal as
Float64
, 1.
would need to be followed by exactly 53 decimal digits (followed
by zero or more 0
s) to land exactly half-way between two representable values,
and the probability of 1.
followed by a random 53-digit sequence resulting in
such a tie is one in 553, or about
0.000000000000000000000000000000000009%. For Float32
, it's about
0.000000000000001%, and even for a typical Float16
implementation with 10
fractional bits, it's around 0.00001%.
Ties are much easier to express as hexadecimal floating-point literals: for
example, 0x1.0000_0000_0000_08p+0
is exactly half way between 1.0
and the
smallest Float64
value greater than 1.0
, which is 0x1.0000_0000_0000_1p+0
.
Whether written in decimal or hexadecimal, a tie provides very strong evidence that the developer intended to express a precise floating-point value, and provided one bit too much precision (or one bit too little, depending on whether they expected some rounding to occur), so rejecting the literal is preferred over making an arbitrary choice between the two possible values.
If digit separators (_
) are included in literals, they must meet the
respective condition:
- For decimal integers, the digit separators shall occur every three digits
starting from the right. For example,
2_147_483_648
. - For hexadecimal integers, the digit separators shall occur every four digits
starting from the right. For example,
0x7FFF_FFFF
. - For real number literals, digit separators can appear in the decimal and
hexadecimal integer portions (prior to the period and after the optional
e
or mandatoryp
) as described in the previous bullets. For example,2_147.483648e12_345
or0x1_00CA.FEF00Dp+24
- For binary literals, digit separators can appear between any two digits. For
example,
0b1_000_101_11
.
The design provides a syntax that is deliberately close to that used both by C++ and many other languages, so it should feel familiar to developers. However, it selects a reasonably minimal subset of the syntaxes. This minimal approach provides benefits directly in line with the goal that Carbon code should be easy to read, understand, and write:
- Reduces unnecessary choices for programmers.
- Simplifies the syntax rules of the language.
- Improves consistency of written Carbon code.
That said, it still provides sufficient variations to address important use cases for the goal of not leaving room for a lower level language:
- Hexadecimal and binary integer literals.
- Scientific notation floating point literals.
- Hexadecimal (scientific) floating point literals.
- Proposal #143: Numeric literals