Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended ISO string representation if time zone is an offset (not an IANA name) #703

Closed
justingrant opened this issue Jun 24, 2020 · 66 comments · Fixed by #1081
Closed

Extended ISO string representation if time zone is an offset (not an IANA name) #703

justingrant opened this issue Jun 24, 2020 · 66 comments · Fixed by #1081
Assignees
Labels
documentation Additions to documentation non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! spec-text Specification text involved zoned-type Part of the effort for a type with timestamp+timezone

Comments

@justingrant
Copy link
Collaborator

From #700 open issues:

What extended ISO string representation should be used when the time zone is an offset, not an IANA name? Should the offset be in brackets?

Example:

Temporal.LocalDateTime.from({ absolute: '2020-03-08T09:30Z', timeZone: '+07:00' }).toString();
// => "2020-03-08T16:30+07:00[+07:00]"
// but should it be this? => "2020-03-08T16:30+07:00"

The argument to repeat the offset in brackets is to prevent LocalDateTime from parsing ISO strings that lack a time zone identifier, in order to prevent implicitly treating a timezone-less value as if it had a time zone. The whole idea behind LocalDateTime is that time zones should never be implicit (with now.localDateTime as the only exception) because implicitly assuming a time zone is the path to timezone issues and DST bugs like we get with legacy Date.

But what's the argument on the other side to avoid emitting the duplicated offset in brackets? And if we did that, then should LocalDateTime still forbid parsing of bracket-less ISO strings?

@ptomato
Copy link
Collaborator

ptomato commented Jun 25, 2020

What we currently do for the time zone part of an ISO string, is treat -04:00 as an offset time zone (no DST changes) and -04:00[America/New_York], as an IANA time zone (which may or may not have rules for DST changes). Could LocalDateTime refuse to parse, say, 2020-06-25T15:20 but accept both 2020-06-25T15:20-04:00 and 2020-06-25T15:20-04:00[America/New_York] as meaning two different things?

@justingrant
Copy link
Collaborator Author

justingrant commented Jun 26, 2020

After doing more research on this topic today, I now believe that LocalDateTime.prototype.from should not accept bracket-less ISO strings. Here's why: there are at least two major platforms (.NET and SQL Server) have awareness of offsets but not time zones. (I think @sffc mentioned this in a previous issue.) Anyway, on these platforms, offset-aware strings will likely be persisted like this: 2020-06-25T15:20-04:00.

So a reasonable user who's not familiar with DST might assume that parsing one of those persisted-by-another-platform strings is good enough to use in Temporal. But that leads to buggy code which will have a local time that's off by one hour, like this:

LocalDateTime.from('2020-06-25T15:20-04:00').plus({months: 6});

So I think it makes sense to put a roadblock in front of those users so that they at least get an exception to help them figure out that what they really need is this.

LocalDateTime.from({absolute: '2020-06-25T15:20-04:00', timeZone: 'America/New_York'}).plus({months: 6});

There will be cases where users really mean that the time zone should be -04:00', but based on what I've learned about persist-offset-but-not-time-zones platforms, I think that the "I really want -04:00'" case is rare but the "I got this from my database or web service so it's good, right?" case will be common. So I think we should mandate the redundant offset in brackets to force the user to opt in if that's what they really intend.

What do you think?

@sffc
Copy link
Collaborator

sffc commented Jun 30, 2020

@justingrant's points make sense to me.

If you have a string without an IANA time zone, you can still get an Absolute for it.

@ptomato
Copy link
Collaborator

ptomato commented Jun 30, 2020

I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

@sffc
Copy link
Collaborator

sffc commented Jun 30, 2020

I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

I assume it would accept the Z time zone, which is in the spec, but just not an offset timezone without the corresponding IANA name. @justingrant is that correct?

@justingrant
Copy link
Collaborator Author

justingrant commented Jun 30, 2020

This was very useful feedback that prompted me to look at Temporal parsing more broadly with an eye towards preventing the (unfortunately common) error of parsing a string into the "wrong" Temporal type. For example, I suspect that the code below is very likely (almost always?) a bug:

Temporal.DateTime.from("2020-06-30T18:56:32.260Z")

I think this code should throw. The string is explicitly declaring that it's an absolute string so the chance that the user really intends it to be a DateTime is questionable at best. In the (unusual, I suspect) case that users really intend this to be a UTC DateTime, then they can either append [UTC] to the string or (better, IMHO) parse it into an Absolute and then use .toDateTime('UTC'). But the basic idea is that this pattern is a likely bug, so it's worth adding an opt-in barrier to prevent accidental use.

Error messages could help developers understand how it works. If an ISO string doesn't parse, we could try to parse it using other formats and if any of them match then we could tune the error message accordingly. For example:

Temporal.DateTime.from("2020-06-30T18:56:32.260Z")
// => throws new RangeError(
// `Cannot parse \`Temporal.DateTime\` from '${s}'. For absolute date/time parsing, use \'Absolute.prototype.from()\''
// );

Temporal.Time.from("P15D")
// => throws new RangeError(
// `Cannot parse \`Temporal.Time\` from '${s}'. For duration parsing, use \'Duration.prototype.from()\''
// );

BTW, even without LocalDateTime, in the current Temporal developers will need to learn which ISO string formats correspond to which Temporal types. LocalDateTime would add to that list, but the list is already there:

  • Strings that end with "Z" or an offset string -> Absolute.from
  • Strings that look like normal date/times and don't end with anything weird -> DateTime.from
  • Strings that look like normal dates -> Date.from
  • Strings that look like normal times -> Time.from
  • TODO - entries for MonthDay and YearMonth
  • Strings that start with "P" which stands for "period" -> Duration.from
  • (new) Strings that end with a [Time_Zone] -> LocalDateTime.from
  • Types can sometimes accept the same formats that other types do, but will throw if parsing could be ambiguous or could cause unexpected results or likely bugs.
    • Absolute.from can parse strings that end with a time zone. In case of conflicts between the time zone offset and IANA time zone suffix, the prefer option can control which one wins. (I'm assuming we should make this change in Absolute)
    • DateTime.from can parse date strings (but not time strings) because times default to midnight but there's no default for dates. This method can also parse strings that end in a time zone.
    • Date.from, Time.from, YearMonth.from, and MonthDay.from can all parse the same strings that DateTime.from can.
    • DateTime.from cannot parse ISO strings that end in "Z" because treating these absolute-time strings as a DateTime is usually a bug. Instead, you should use Absolute.from to parse the string and then use Absolute.prototype.toDateTime to convert to the desired time zone, using 'UTC' for the time zone identifier if that's what's really intended.

It may be worth a page in the docs (e.g. "Parsing Temporal Objects from ISO 8601 Strings") that turns the list above into an easier-to-understand format like a table with examples. Also the page should include an explanation of our extensions for calendars and time zones. I'd volunteer to write a page like this if you think it'd be helpful.

What do you think?

I ask because I went through the same process to determine the current parsing behavior of LocalDateTime:

  1. Figure out likely error cases, especially where the non-buggy case is less common than the buggy one
  2. Throw exceptions in those cases with helpful messages
  3. Allow an alternate opt-in way to do the task without an exception, and document that way.

Anyway, with that context in mind...

@ptomato I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

A fundamental challenge we have is that ISO 8601 knows about offsets (including Z) but not time zones. If LocalDateTime is fundamentally about timezone-aware data, I don't see how LocalDateTime could accept a standard ISO 8601 string because a standard ISO string can never encode a time zone.

That said, Z strings are definitely something that users will try, so it probably deserves a clarifying error message, per discussion above. Here's what's currently used for object initializers. We could do something similar for string initializers too:

if (item instanceof Temporal.Absolute) {
  throw new TypeError('Time zone is missing. Try `{absolute, timeZone}`.');
}

@sffc I assume it would accept the Z time zone, which is in the spec, but just not an offset timezone without the corresponding IANA name. @justingrant is that correct?

Nope. Per discussion above, to avoid likely error cases my intent was that callers explicitly opted in to provide a time zone, either by using an Extended ISO string, or by explicitly specifying the time zone to force the caller to confirm that they're really intending to use UTC or any other offset-based "time zone". Like this:

Temporal.LocalDateTime.from({
  absolute: Temporal.Absolute.from('2020-06-30T18:56:32.260Z'),
  timeZone: 'UTC'
});
// OR 
Temporal.Absolute.from('2020-06-30T18:56:32.260Z').toLocalDateTime('UTC');

To me this seems similar to the case where Temporal.DateTime.from("18:56:32.260") throws. There are certainly valid use cases where developers might expect it to not throw and instead default the date to 1970-01-01, but this would be a bug most of the time so we don't allow it. LocalDateTime requires a time zone, but "Z" isn't really a "time zone"-- it's more of an offset, so I figured it made sense to treat it the same as other offsets which is that parsing isn't allowed without brackets.

@sffc
Copy link
Collaborator

sffc commented Jun 30, 2020

On Z: I see Z == UTC as being the only real time zone supported by ISO 8601, and I think Z counts as a clear enough intent. UTC has no daylight transitions and is common in computing. "Z" can be seen as different than "+00:00", which could correspond to Europe/London in the winter or Atlantic/Azores in the summer.

On explicit time zones in general: This is consistent with Temporal design principles. <rant> My hope for the last few quarters has been that we apply the same "explicit is better" mentality that we have for time zones to calendars. Time zones and calendars are equals in the data model, so why not in the API? (Main issue: #292) </rant>

@ptomato
Copy link
Collaborator

ptomato commented Jul 1, 2020

I'm not sure I agree that being explicit here is better either, so I don't feel I'm being inconsistent 😄 🤷

@niklasR
Copy link

niklasR commented Jul 11, 2020

All the points around parsing with the .from() make sense - For Absolute, stick to ISO 8601, but I can see how flexibility in parsing it is helpful for the DateTime, as long as the edge cases and variants are defined and tested.
Maybe a "strict" mode that throws if it's not exactly what's expected could be helpful for some? Though I don't know what TC39 think about that.

What I am more concerned about though, is the toString() output. The docs says that it "returns the date in ISO 8601 date format", for both DateTime and Absolute, and as far as I can see ISO 8601 does not include IANA names at all. If a timezone is specified in the toString(), the offset takes that into account, and surely we should stick with the standard?

@thojanssens
Copy link

I don't see how LocalDateTime could accept a standard ISO 8601 string

I didn't look into your LocalDateTime proposal yet, but it can convert all given ISO strings to UTC.
See in Elixir: https://hexdocs.pm/elixir/DateTime.html#from_iso8601/2

Since ISO 8601 does not include the proper time zone, the given string will be converted to UTC and its offset in seconds will be returned as part of this function. Therefore offset information must be present in the string.

What I am more concerned about though, is the toString() output. The docs says that it "returns the date in ISO 8601 date format", for both DateTime and Absolute, and as far as I can see ISO 8601 does not include IANA names at all.

We should stick to ISO for output at least. I expect the following:

Temporal.now.absolute().toString()

"2020-07-09T03:23:19.986994478Z"

Temporal.now.absolute().toString('-0700')

"2020-07-08T20:23:14.478676279-07:00"

Temporal.now.absolute().toString('America/Los_Angeles')

"2020-07-08T20:24:52.496999986-07:00"

And if we really want to output a non-ISO string (which I think we shouldn't because of my arguments in the other issues, see below)

Temporal.now.absolute().toString('America/Los_Angeles', { timeZone: true })

"2020-07-08T20:24:52.496999986-07:00[America/Los_Angeles]"

With the current API, to output an ISO string from an absolute and a time zone, we have to code:

const absolute = Temporal.now.absolute()
absolute.toString(Temporal.TimeZone.from('America/Los_Angeles').getOffsetStringFor(absolute))

It then looks that we favor the extended format (non-ISO) rather than ISO.

The issue below demonstrates why offset and time zone together in a datetime string leads to problems, to more code to handle those problems, etc. See my arguments why we shouldn't even accept any extended format:
#716 (comment)
It requires an open mind because I know we are used to see the time zone in brackets for a long time.

Another related issue: #741

@justingrant
Copy link
Collaborator Author

justingrant commented Jul 11, 2020

@niklasR @thojanssens - I copied your feedback into #741. That issue is focused on exactly the topic you're concerned about above, which is whether the time zone time zone is included in the output of Absolute.prototype.toString().

@ptomato
Copy link
Collaborator

ptomato commented Aug 3, 2020

Here are my opinions on this topic:

  • LocalDateTime only accepting the unofficial extension ISO format would be very weird (repeat of what I commented above)
  • Outputting -04:00[-04:00] would also be very weird, because AFAIK none of the other libraries that use this extended ISO format do so
  • I'm fine with outputting -04:00[Etc/UTC+4] to be explicit, as someone suggested in the meeting on Friday, but as someone else pointed out that doesn't work for -04:30.
  • I'm not sure I agree with the point about LocalDateTime.from('2020-08-03T15:28-07:00') always being a bug. If you get that string from your SQL DB, then either a time zone name is stored with it (in which case the obvious thing to do is LocalDateTime.from(isoString).with({ timeZone: timeZoneName }) because where else are you going to put that data?), or no time zone name is stored, in which case you would just have to make one up if you wanted to use LocalDateTime and it only accepted strings with a time zone name. That would be even worse in my opinion.

@sffc
Copy link
Collaborator

sffc commented Aug 4, 2020

I see this thread as circular. On the one hand, it is bug-prone for LocalDateTime to parse an offset time zone. On the other hand, if a LocalDateTime is explicitly built with an offset time zone, we need a way to serialize it in .toString(). We have to either:

  1. Accept a more complicated ISO string syntax (e.g., with the bracketed offset).
  2. Accept the fact that LocalDateTime.parse can be bug-prone.
  3. Programatically prevent LocalDateTime from ever existing with an offset datetime, e.g., by splitting TimeZone into two types, one for IANA time zones and the other for offset time zones.

in which case the obvious thing to do is LocalDateTime.from(isoString).with({ timeZone: timeZoneName })

In that case, you have two choices that are compatible with the proposal on hand: Absolute.from, or DateTime.from. Pick the one that corresponds to whether you want the offset to win or the datetime to win. Since you usually want the absolute to win, you end up doing:

Temporal.Absolute.from(isoString).toLocalDateTime(timeZoneName, "iso")

Actually, I think starting your original snippet, starting with LocalDateTime.from, is fundamentally flawed: since the offset time zone and IANA time zone are known at different points, you can't properly resolve the "should offset or datetime win?" question.

@ptomato
Copy link
Collaborator

ptomato commented Aug 4, 2020

Yes, I'm OK with accepting that LocalDateTime.from can be bug-prone here, I find that the "least worst" option. The bracketed offset seems to harm interoperability, and I would like to avoid adding more types. (Although I learned recently that Java makes a distinction between java.time.ZonedDateTime and java.time.OffsetDateTime for this reason.)

I guess new Temporal.LocalDateTime(Temporal.Absolute.from(isoString), timeZoneName) as currently described in the proof-of-concept branch would do the right thing. Maybe we need to make sure that there is a way to achieve this with Temporal.LocalDateTime.from?

@ryzokuken
Copy link
Member

Summarizing my objection: I dislike the idea of catering to what seems to me a hypothetical misunderstanding against the "correct" usage.

@justingrant
Copy link
Collaborator Author

justingrant commented Aug 9, 2020

I think the root question is this: in 2020-08-10T03:43:36+05:30, is "+05:30" an offset or a time zone?

  • An "offset" lets you convert this single DateTime instance to an Absolute, or vice versa.
  • A "time zone" lets you convert many other DateTimes or Absolutes, like the results of plus and minus.

My concern with implicitly treating an offset as a time zone is that it silently turns LocalDateTime into DateTime, with all the disadvantages of DateTime math: results will usually look correct but will break around DST transitions.

But I also see the value of being able to easily parse and emit zoneless ISO strings for use-cases like formatting where DST is a non-issue because the underlying instant never changes after it's parsed.

Here's a few ideas to try to address both concerns:

  • Should we allow LocalDateTime to parse offset-only strings, but throw if the user calls methods or property getters that require a real time zone? Throwing methods would be: plus, minus, difference, with (unless a timeZone prop is included), hoursInDay, startOfDay, and isTimeZoneOffsetTransition.
const zoneless = Temporal.LocalDateTime.from('2020-08-10T03:43:36+05:30');
zoneless.toString(); // => 2020-08-10T03:43:36+05:30
zoneless.timeZoneOffsetString; // => "+05:30"
zoneless.timeZone; // => undefined?  null?
zoneless.plus({days: 1, hours: 12}); // throws
zoneless.with({hour: 0}); // throws
zoneless.with({timeZone: 'Asia/Kolkata', hour: 0}); // OK
zoneless.hoursInDay; // throws
zoneless.isTimeZoneOffsetTransition; // throws

const zoned = zoneless.with({timeZone: 'Asia/Kolkata'}); 
zoned.plus({days: 1, hours: 12}); // OK
zoneless.toString(); // => 2020-08-10T03:43:36+05:30[Asia/Kolkata]

const offsetZoned = zoneless.with({timeZone: '+05:30'}); // explicit offset time zone
offsetZoned.plus({days: 1, hours: 12}); // OK
offsetZoned.toString();  // => 2020-08-10T03:43:36+05:30[+05:30] (or 2020-08-10T03:43:36+05:30) ? 

AND/OR

  • Should LocalDateTime.from have a defaultTimeZone option whose value could be a TimeZone instance, an offset string, an IANA string, or a special string (e.g. 'offset') to use the input's offset as the time zone if it's omitted? Default could be to throw if the string doesn't have a bracketed zone.

AND/OR

  • Regardless of how the offset-only parsing is handled, IMHO LocalDateTime should have some way to serialize an instance using only DateTime and offset (w/o bracketed time zone). This is needed for interop. It could be toString({timeZone: false}), toISOString(), etc. I like the former because we could also allow toString({offset: false}) it to omit the offset in cases where users always want the time zone to win when the value is parsed back in.

@ptomato The bracketed offset seems to harm interoperability, and I would like to avoid adding more types. (Although I learned recently that Java makes a distinction between java.time.ZonedDateTime and java.time.OffsetDateTime for this reason.)

FWIW, java.time.ZonedDateTime (the Java equivalent to Temporal.LocalDateTime) accepts bracketed offsets in parsing, but won't emit them in toString(). See https://repl.it/@JustinGrant/NeglectedGraciousAlgorithms.

import java.time.ZonedDateTime;
class Main {
  public static void main(String[] args) {
    ZonedDateTime withZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[Asia/Kolkata]");
    System.out.println("with zone: " + withZone.toString());
    // => with zone: 2017-06-16T21:25:37.258+05:30[Asia/Kolkata]
    ZonedDateTime offsetOnly = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30");
    System.out.println("offset only: " + offsetOnly.toString());
    // => offset only: 2017-06-16T21:25:37.258+05:30
    ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
    System.out.println("offset time zone: " + offsetTimeZone.toString());
    // => offset time zone: 2017-06-16T21:25:37.258+05:30
  }
}

@sffc
Copy link
Collaborator

sffc commented Aug 10, 2020

Should we allow LocalDateTime to parse offset-only strings, but throw if the user calls methods or property getters that require a real time zone?

Personally I would be okay with this. It sounds a lot like the option I offered in #292 dubbed "partial ISO" where calendar-dependent operations would throw if a calendar wasn't specified.

Note that in the previous calendar discussions, others pointed out that given the strong typing nature of Temporal, it might be cleaner to split the type into two explicit types rather than making it have different behavior depending on whether or not the time zone (or calendar) is available.

Should LocalDateTime.from have a defaultTimeZone option

Lemma A: Developers tend to know ahead of time, or can easily obtain, the syntax of the string they are parsing. For example, developers know, or can find out, whether their strings have only an offset (-06:00) or an offset plus time zone (-06:00[America/Chicago]).

Given Lemma A, I do not believe that all of the choices in defaultTimeZone are useful. Instead of passing a TimeZone instance, offset string, or IANA string, the programmer should use Temporal.Absolute.from(str).toLocalDateTime(tz, "iso") instead.

The one choice I think is useful is "offset", allowing the programmer to "opt in" to accepting the offset as a time zone.

If we go with this option, we should discuss the naming of the option and the argument.

IMHO LocalDateTime should have some way to serialize an instance using only DateTime and offset

Without additional API, one can do this by replacing the LocalDateTime's IANA TimeZone with an offset TimeZone.

FWIW, java.time.ZonedDateTime (the Java equivalent to Temporal.LocalDateTime) accepts bracketed offsets in parsing, but won't emit them in toString()

Interesting.

@ptomato
Copy link
Collaborator

ptomato commented Aug 10, 2020

Should we allow LocalDateTime to parse offset-only strings, but throw if the user calls methods or property getters that require a real time zone?

I feel strongly against this, for the same reason that I felt strongly against the "partial ISO" proposal: for all intents and purposes it's an internal "is this object broken" flag.

Should LocalDateTime.from have a defaultTimeZone option whose value could be a TimeZone instance, an offset string, an IANA string, or a special string (e.g. 'offset') to use the input's offset as the time zone if it's omitted? Default could be to throw if the string doesn't have a bracketed zone.

I'm not in favour of that default (as per last week's discussion, I'm strongly against the default being to throw on strings that can represent a valid use case) but at first glance I think the option is a good idea.

Regardless of how the offset-only parsing is handled, IMHO LocalDateTime should have some way to serialize an instance using only DateTime and offset (w/o bracketed time zone). This is needed for interop. It could be toString({timeZone: false}), toISOString(), etc. I like the former because we could also allow toString({offset: false}) it to omit the offset in cases where users always want the time zone to win when the value is parsed back in.

Not sure how I feel about adding another option for this. It seems like (`${ldt.toDateTime()}${ldt.offsetNanoseconds}`) is easy enough to do in userland if that's necessary?

Developers tend to know ahead of time, or can easily obtain, the syntax of the string they are parsing. For example, developers know, or can find out, whether their strings have only an offset (-06:00) or an offset plus time zone (-06:00[America/Chicago]).

I don't think I agree with this lemma. I think it would encourage naive attempts such as /\[[a-zA-Z]+\/[a-zA-Z]+\]/.test(str) which would match a string in the America/Vancouver time zone but would miss America/Los_Angeles and America/Indiana/Indianapolis. That said, I think I nonetheless agree with the conclusion, that offset and reject might be the only choices needed.

@sffc
Copy link
Collaborator

sffc commented Aug 10, 2020

I don't think I agree with this lemma. I think it would encourage naive attempts such as /\[[a-zA-Z]+\/[a-zA-Z]+\]/.test(str) which would match a string in the America/Vancouver time zone but would miss America/Los_Angeles and America/Indiana/Indianapolis.

What I meant is, I claim that most of the time, the developer knows while writing the code what format the strings are going to be in. More often than not, the strings come from the same source, whether that's a Postgres database, a CSV file, a JSON API, a date picker component, etc. The developer can look at the typical output from that source to see what is the correct Temporal parsing function to use.

@ptomato
Copy link
Collaborator

ptomato commented Aug 10, 2020

Ah, gotcha.

I also forgot to say in my earlier comment that if java.time.ZonedDateTime accepts the weird extra offset in brackets, that makes it less weird, and I would be more easily convinced to make Temporal's behaviour the same.

@justingrant
Copy link
Collaborator Author

This is a long thread so I'll summarize my concern: users shouldn't be able to perform math operations (or other DST-sensitive operations like .hoursInDay) without explicitly including a time zone in the string or object initializer in LocalDateTime.from.

For example, the code below should throw because it's trying to perform hybrid-duration math on an implicitly defined offset time zone. This is almost certain to be a bug.

Temporal.LocalDateTime.from('2017-06-16T21:25:37.258+05:30').plus({days: 1, hours: 12});

As long as the code above isn't allowed, I'm open to many different solutions, including:

  1. an option that allows users to opt in to implicit offset time zones (with default of 'reject' or false)
  2. require a bracketed time zone (either an IANA name or an offset) when parsing from a string
  3. Same as (2), but when parsing also accept a [] suffix to mean "use the offset as the time zone". This would be easier than callers having to parse the input string in order to pull out the offset so that it could be repeated in brackets. The [] syntax isn't accepted by Java's parser, though, so toString() would either have to omit the brackets or would put the offset in brackets.
  4. don't throw on .from, but throw on DST-sensitive methods instead. @ptomato really doesn't like this one. ;-)
  5. a Temporal.OffsetDateTime type (like Java has) which is a subset of LocalDateTime that omits math and other DST-sensitive methods. Its goal is simply an ergonomic, read-only wrapper around a (DateTime,Absolute) pair. I'm not a huge fan of this, given that it doesn't seem to add much value over what DateTime and Absolute already provide.

My preferred solution would be either (1) or (3), but I don't feel that strongly as long as the buggy math is disallowed.

IMHO the toString behavior is less consequential than the parsing behavior because there's no possibility to end up with DST-unsafe behavior regardless of how toString behaves. The worst possible outcome of toString would be breaking round-trip string serialization, which honestly doesn't seem that bad.

Depending on how we solve this problem, we may or may not want to offer the same behavior for object initializers. For example, if we choose (1) above, then I assume we'd want to offer the same option for object initializers.

@ptomato I also forgot to say in my earlier comment that if java.time.ZonedDateTime accepts the weird extra offset in brackets, that makes it less weird, and I would be more easily convinced to make Temporal's behaviour the same.

Here's another interesting tidbit: Java doesn't require the offsets to match. In other words, internally its parsing will parse the DateTime+offset string into a java.time.Instant using the provided offset, but will use the bracketed offset timezone when it comes time to serialize to string. In other words, Java's parsing has no special-casing for offset time zones-- they're treated just like any other time zone. Furthermore, there's no attempt to verify when parsing that the offset is valid for the time zone-- either for offset time zones or IANA ones. AFAIK, Java's behavior corresponds to the {offset: 'use'} option in the current implementation of LocalDateTime.from.

ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
System.out.println("offset time zone: " + offsetTimeZone.toString());
// => offset time zone: 2017-06-16T21:25:37.258+05:30
ZonedDateTime mismatchedOffsetZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+06:30]");
System.out.println("mismatched offset time zone: " + mismatchedOffsetZone.toString());
// => mismatched offset time zone: 2017-06-16T22:25:37.258+06:30
ZonedDateTime invalidOffsetForZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[America/Los_Angeles]");
System.out.println("invalid offset for zone: " + invalidOffsetForZone.toString());
// => invalid offset for zone: 2017-06-16T08:55:37.258-07:00[America/Los_Angeles]

@ptomato Not sure how I feel about adding another option for this. It seems like (`${ldt.toDateTime()}${ldt.offsetNanoseconds}`) is easy enough to do in userland if that's necessary?

I don't feel very strongly about this one, so I'm inclined to agree with you. We could always add this later if this is a source of user confusion. BTW, the actual code is a little different:

`${ldt.toDateTime()}${ldt.timeZoneOffsetString}`

@ptomato I'm not in favour of that default (as per last week's discussion, I'm strongly against the default being to throw on strings that can represent a valid use case)

Could you explain your position in more detail? From my perspective, allowing plus on a LocalDateTime that was created with an implicit offset time zone is very, very unlikely to be a "valid use case". There's precedent elsewhere in Temporal to throw in cases where ambiguity is present and we're concerned that there's no safe default, e.g. if there's a conflict between the offset and the time zone. Here, the ambiguity is whether the offset is just an offset or is an offset time zone. Why is this ambiguity case different?

Per @sffc's comments above (which I agree with), developers are likely to know the format of the strings that they're parsing, so recovering from an exception will be trivial in most cases. This is unlike, for example, offset vs. timezone conflicts which by definition will only show up after an app has been in production long enough for time zone rules to change. If the caller gets an exception the first time they call .from with a zoneless string, it doesn't seem like a big obstacle. Could you explain more about why you are concerned that it's problematic to throw by default?

BTW, I agree that some operations are safe (aka "valid use case") on a LocalDateTime with an implicit offset time zone, but that seems like an argument for a separate OffsetDateTime type or the "partial-ISO-like" solution. IMHO, throwing by default seems to be the less confusing solution relative to either of those alternatives.

@ptomato
Copy link
Collaborator

ptomato commented Aug 12, 2020

I think my disconnect with your explanation boils down to: I don't see 2020-08-12T09:40-07:00 as a string with an offset and an "implicit" offset time zone. I just see it as a string with an offset time zone. If we give it an offset time zone object when we parse it, then we are not making any unwarranted assumptions, we are just doing what it says on the tin. @pipobscure pointed out in the meeting last week that such a string does represent a valid use case, in maritime shipping (https://en.wikipedia.org/wiki/Nautical_time). If we instead assume that the user means something else when they specify such a string, then we are requiring programmers to opt in to the correct behaviour, just because it is uncommon. That's what I object to.

(At least, the above point about nautical time zones holds if the offset is whole-hour. I'd maybe be fine with throwing on 2020-08-12T09:40-07:15, if that wouldn't make things more confusing...)

@justingrant
Copy link
Collaborator Author

@ptomato I don't see 2020-08-12T09:40-07:00 as a string with an offset and an "implicit" offset time zone. I just see it as a string with an offset time zone.
If we instead assume that the user means something else when they specify such a string

I think that you're highlighting a core challenge with the bracketless syntax: it's ambiguous. Reasonable people can reasonably disagree about whether -07:00 above means an "offset" (meaning that it only applies to this LocalDateTime) or a "time zone" (meaning that it should be applied to other LocalDateTime values derived from this one, like the result of plus or with).

I don't think that either interpretation is wrong. Both have pros and cons. But I'm also confident that we'll see both interpretations among users. Many developers simply won't understand the difference. Given that both interpretations exist, my preference is for "offset" because it's easier for developers to realize that they have the "wrong" interpretation:

  • If we assume offset, then developers who assumed time zone will learn their mistake on their first call to LocalDateTime.from(if parsing thows) or their first illegal operation like plus (if parse doesn't throw). This will happen while the app is in development and will never make it to production. (Unless the source data is the rare case of a mix of mostly IANA-bracketed and a little bit of offset-zoned data.)
  • If we assume timezone, then developers who assumed offset will not see any errors even though they have a bug in their code if they do any math or call DST-sensitive methods because they haven't told Temporal about the actual time zone. Because this bug only will show up around DST transitions, it's easy to get into production.

If we did go with "offset" (meaning LocalDateTime.from requires an opt-in suffix, e.g. [-07:00] or []), the developers who would get worse ergonomics would be developers legitimately using offset time zones, e.g. for ocean shipping. This is admittedly a rare case. Parsing and serialization would each require one extra method call:

Temporal.Absolute.from(s).toLocalDateTime(s);
`${ldt.toAbsolute()}${ldt.timeZoneOffsetString}`

The non-rare case is where the source data simply doesn't have time zone information. It's just a DateTime+Offset value that was lossily stored. For example, AFAIK there's no DBMS today has a native data type that stores DateTime+offset+TimeZone, so lossy storage is the norm. These values are absolutely not safe to perform LocalDateTime math, with, etc. So we should really be pushing these use-cases to DateTime and/or Absolute instead. If the data doesn't have a real time zone, LocalDateTime adds little/no value, and can actually makes things worse via DST bugs.

@ptomato such a string does represent a valid use case, in maritime shipping (https://en.wikipedia.org/wiki/Nautical_time).

FWIW, there are specific Etc/* IANA time zones for all 24 nautical time zones. The names of these time zones (e.g. GMT+7 which is -07:00) are sign-reversed which apparently aligns with nautical usage if I'm interpreting the Wikipedia article correctly.

@sffc The developer can look at the typical output from that source to see what is the correct Temporal parsing function to use.

If we went with "time zone" then what would our docs for the 2020-08-12T09:40-07:00 format say? Something like this?

This format creates an Absolute. It can also be parsed into a DateTime, Date, YearMonth, MonthDay, or Time.

NOTE: this format can also be parsed by LocalDateTime. The offset is treated as the time zone. This is appropriate for naval communications using nautical time zones and other rare use cases. But it's not appropriate for most mainstream use cases because LocalDateTime requires a real time zone to adjust for DST. For most mainstream use cases, use Absolute or DateTime, or use LocalDateTime with a real time zone.

One of the reasons I'm pushing for "offset" is to avoid the need for that second paragraph. ;-)

@pipobscure
Copy link
Collaborator

I’ll divide my response into 2 parts:

  1. An argument that does not go toward the actual topic, but aims at the validity of your argument
  2. An argument on the merits alone.

Any argument that calls upon “the 99% case” is prima facie spurious unless actually providing data and evidence to support the statement that this is in fact “the 99% case”. The argument above fails to do so.

To quote C.Hitchens: Any argument presented without evidence can be dismissed without evidence

But things get worse: the little bit of evidence provided above:

AFAIK there's no DBMS today has a native data type that stores DateTime+offset+TimeZone, so lossy storage is the norm.

could just as well serve as evidence for the other side of the argument; the fact that no DBMS has such a type indicates that “the 99% case” is that people don’t care about doing DST correct calculations; on the contrary that is something only the 1% of people writing calendaring & scheduling software ever care about. The “actual 99% case” is the one where “correct DST” calculations are actually uninteded.

Note: I did not provide any evidence for this assertion and am fine with it being dismissed without evidence. It’s sole intent was to demonstrate that the argument above was just as spurious and to be dismissed.

The argument to the merit is slightly different. Here a few of the assumptions I am making:

  • there is no ISO/RFC standard way to currently express a datetime with a timezone
  • the bracket extension is something we invented to allow us a way to be correct for scenarios where this is currently impossible based on ISO/RFC formats
  • we do not want to require “custom format rules” to operate on tempral types

Based on these assumptions the conclusion must be that all temporal types should be operable without the bracket extension. This most definitely includes LocalDateTime.

Beyond that your use-case for LocalDateTime and what you imagine its distinguishing virtues to be are by far not the only ones. As such the statement:

... LocalDateTime adds little/no value, and can actually makes things worse via DST bugs.

seems blatantly incorrect to me; it’s simply not a use-case you operate with.

For the same reason I also object to the statement (from proposed documentation):

... But it's not appropriate for most mainstream use cases because LocalDateTime requires a real time zone to adjust for DST. For most mainstream use cases, use Absolute or DateTime, or use LocalDateTime with a real time zone.

To my mind the only correct part of that “second paragraph” is:

NOTE: this format can also be parsed by LocalDateTime. The offset is treated as the time zone.

Which is basically just saying:

We have created a non-standard bracket extension which we define to give additional IANA information.

and that’s needed no matter what because of the fact that we defined the non-standard bracket extension. So that “second paragraph” is necessary either way.

One could even argue that this “second paragraph” would be even more necessary for the case where we don’t accept pure ISO strings, except it would have to read:

ATTENTIONS!!! DANGER!!! LocalDateTime requires the use of a non-standard ISO/RFC string with an extension that we invented. It is therefore not interoperable with anything else in the world. So please pay attention and don’t ever use it.

And that in my mind would be the point where we’d have to reevaluate whether such a non-standard thing should be in Temporal at all.

Given the general usefulness of LocalDateTime however I’d very much regret going that route. As such I’ll stick to the proposal that we simply accept the correct simple syntax for strings that do not include a bracketed IANA zone but rather only an offset.


I hope this exposition makes as much sense to people reading it as it did when I was writing it in 33C weather. If not, I’m happy to discuss live.

@sffc
Copy link
Collaborator

sffc commented Aug 13, 2020

If the IANA "Etc" time zones cover the nautical time use case, can we just remove the concept of arbitrary offet time zones from the spec? People can still implement them as a custom time zone if they really want them.

@justingrant
Copy link
Collaborator Author

justingrant commented Aug 14, 2020

I admit that I'm having trouble figuring out exactly where we agree vs. disagree on this thread. Could we try to clarify? I'll list a few statements below-- let me know which you agree vs. disagree with. They're roughly in order-- if you don't agree with one, then you probably won't agree with ones below it.

Assumption 1: Offsets are different than time zones

An "offset" applies to only one single DateTime, but it has nothing to say about the offsets of other DateTime values. Knowing the offset of one DateTime does not let you know what the offset will be one day later.

On the other hand, a "time zone" can be used to calculate the offsets of other values derived from this one, e.g. via plus, minus, difference, with, startOfDay, hoursOfDay, etc.

An "offset time zone" is a time zone that always has the same constant offset. UTC and -07:00 are examples of offset time zones.

Assumption 2: Offsets act just like time zones for static date/time values

Offsets and offset time zones act the same unless mutation is involved. Time zones only matter if you change the value, so operations that don't change the value (e.g. .year or compare()) work just as well for values with offsets vs. values with time zones.

Assumption 3: Offsets that are not timezones are unsafe for DST-sensitive operations like plus

Operations that create new values, like plus, minus, with, startofDay, hoursInDay, etc., will not return correct results if an offset that's not a time zone is assumed to be a time zone. This is not specific to Temporal; it applies to any platform. For example, if arrival_time is a DATETIMEOFFSET-typed column that originally had a real time zone before being stored in SQL Server, then the query below will cause DST bugs:

SELECT DATEADD(day, 1, arrival_time) from flights

Assumption 4: many (most?) developers won't understand the subtle difference between offsets and offset time zones

Given developer confusion about time zones and DST in general, I expect that many (maybe most) developers won't understand the subtle difference between offsets and time zones, and specifically they may not understand why math with a timezone-less offset is buggy.

Assumption 5: There are two main use cases for offsets

AFAIK, there are two main use cases using offsets:
A. Reading data that originally had a time zone but was lossily stored in a relational database or any other platform (e.g. .NET) that doesn't natively store time zone info along with DateTime+offset data.
B. Ocean shipping communications or similar applications using offset time zones instead of IANA Etc zones.

Assumption 6: the "Lossy Storage" use case's offset is not a time zone

If a DateTime+offset is representing a value that originally had a time zone, but the time zone was lost in persistence, then its offset is just an offset, not a time zone. This means that it's not safe to do math or other DST-sensitive operations using that offset.

Assumption 7: The "Lossy Storage" use case is much more common than "Ocean Shipping" use case, but both are important

We don't need research to know that storing temporal data in a SQL database is much more popular than oceanic transport and other similar use cases. We can argue about whether the ratio is 20:1, 100:1, or 1000:1, but I'm not sure that the actual ratio matters much, as long as we can agree that:

  • (A) above is a vastly more popular use of offsets than (B)
  • Regardless of popularity, both use cases are important enough for Temporal to provide support for them

It's not 33C here (@pipobscure where are you vacationing?) but I have had a few lagers during Zoom calls this evening so I'll end this here before I start assuming even crazier things. ;-)

If we disagree on any of above, let's try to resolve those disagreements before moving on to debating conclusions.

@ptomato ptomato added non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! spec-text Specification text involved labels Oct 9, 2020
@justingrant
Copy link
Collaborator Author

Technically in 2.4.2 there is no default, because of 2.7.2.6. Otherwise 👍, this looks ready for implementation.

OK I updated the text which should cure the ambiguity. Now fractionalSecondDigits always depends on smallestUnit which has a new default as follows:

  • 2.4.2 If omitted, the default is the smallest nonzero unit in the Duration, or 'seconds', whichever is the smaller unit.

@justingrant
Copy link
Collaborator Author

Technically in 2.4.2 there is no default, because of 2.7.2.6. Otherwise 👍, this looks ready for implementation.

OK I updated the text which should cure the ambiguity. Now fractionalSecondDigits always depends on smallestUnit which has a new default as follows:

  • 2.4.2 If omitted, the default is the smallest nonzero unit in the Duration, or 'seconds', whichever is the smaller unit.

Actually, my revision above wasn't accurate. I replaced with this one:

  • 2.4.2 If omitted, the default behavior is that seconds and larger units will always be displayed, and if there are nonzero fractional seconds units then they will be displayed too unless overridden by fractionalSecondDigits.

@ptomato ptomato self-assigned this Oct 30, 2020
ptomato added a commit that referenced this issue Oct 30, 2020
The 'calendar' option takes values 'auto', 'always', and 'never', and
controls whether to display the calendar in toString() for all the
calendar types.

See: #703
ptomato added a commit that referenced this issue Oct 30, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Oct 30, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Oct 30, 2020
The 'calendar' option takes values 'auto', 'always', and 'never', and
controls whether to display the calendar in toString() for all the
calendar types.

See: #703
ptomato added a commit that referenced this issue Oct 30, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
I've taken the liberty of renaming this, otherwise it has a conflicting
meaning compared with the 'timeZone' option passed to Instant.toString()
(which will be added in #741.)

Making the distinction between 'timeZone' and 'timeZoneName' in this way
aligns exactly with what Intl.DateTimeFormat does.

See: #703
@ptomato
Copy link
Collaborator

ptomato commented Nov 2, 2020

I've taken the liberty of renaming timeZone to timeZoneName since otherwise the meaning conflicts with that of the timeZone option to be added in #741. This distinction between timeZone and timeZoneName is also exactly the distinction that Intl.DateTimeFormat makes, so I think it works out well.

@justingrant
Copy link
Collaborator Author

Sounds good. I edited #703 (comment) accordingly.

ptomato added a commit that referenced this issue Nov 2, 2020
The 'calendar' option takes values 'auto', 'always', and 'never', and
controls whether to display the calendar in toString() for all the
calendar types.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
I've taken the liberty of renaming this, otherwise it has a conflicting
meaning compared with the 'timeZone' option passed to Instant.toString()
(which will be added in #741.)

Making the distinction between 'timeZone' and 'timeZoneName' in this way
aligns exactly with what Intl.DateTimeFormat does.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
The 'calendar' option takes values 'auto', 'always', and 'never', and
controls whether to display the calendar in toString() for all the
calendar types.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
ptomato added a commit that referenced this issue Nov 2, 2020
I've taken the liberty of renaming this, otherwise it has a conflicting
meaning compared with the 'timeZone' option passed to Instant.toString()
(which will be added in #741.)

Making the distinction between 'timeZone' and 'timeZoneName' in this way
aligns exactly with what Intl.DateTimeFormat does.

See: #703
Ms2ger pushed a commit that referenced this issue Nov 3, 2020
The 'calendar' option takes values 'auto', 'always', and 'never', and
controls whether to display the calendar in toString() for all the
calendar types.

See: #703
Ms2ger pushed a commit that referenced this issue Nov 3, 2020
These options control whether the time zone name annotation and the time
zone offset, respectively, are printed in the output string.

See: #703
Ms2ger pushed a commit that referenced this issue Nov 3, 2020
I've taken the liberty of renaming this, otherwise it has a conflicting
meaning compared with the 'timeZone' option passed to Instant.toString()
(which will be added in #741.)

Making the distinction between 'timeZone' and 'timeZoneName' in this way
aligns exactly with what Intl.DateTimeFormat does.

See: #703
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Additions to documentation non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! spec-text Specification text involved zoned-type Part of the effort for a type with timestamp+timezone
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants