Extended ISO string representation if time zone is an offset (not an IANA name) #703
Comments
What we currently do for the time zone part of an ISO string, is treat |
After doing more research on this topic today, I now believe that So a reasonable user who's not familiar with DST might assume that parsing one of those persisted-by-another-platform strings is good enough to use in Temporal. But that leads to buggy code which will have a local time that's off by one hour, like this: LocalDateTime.from('2020-06-25T15:20-04:00').plus({months: 6}); So I think it makes sense to put a roadblock in front of those users so that they at least get an exception to help them figure out that what they really need is this. LocalDateTime.from({absolute: '2020-06-25T15:20-04:00', timeZone: 'America/New_York'}).plus({months: 6}); There will be cases where users really mean that the time zone should be What do you think? |
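The one-hour error described above can be reproduced without Temporal. This is only a minimal sketch using the built-in Date and Intl.DateTimeFormat; `offsetMinutesAt` is a hypothetical helper written for this illustration, not part of any proposal, and it assumes a runtime with IANA time zone data (for example a recent Node.js).

```javascript
// Hypothetical helper (not a Temporal API): the UTC offset, in minutes,
// that `timeZone` observes at the instant `epochMs` (whole seconds assumed).
function offsetMinutesAt(timeZone, epochMs) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric'
  }).formatToParts(new Date(epochMs));
  const f = {};
  for (const { type, value } of parts) f[type] = Number(value);
  // Reinterpret the zone's wall-clock reading as if it were UTC, then compare.
  const wallAsUtc = Date.UTC(f.year, f.month - 1, f.day, f.hour, f.minute, f.second);
  return (wallAsUtc - epochMs) / 60000;
}

const juneInstant = Date.parse('2020-06-25T15:20:00-04:00');
// Naive "plus six months": keep the wall-clock time and reuse June's -04:00 offset.
const naiveDecember = Date.parse('2020-12-25T15:20:00-04:00');
// The same wall-clock time with the offset New York observes in December.
const zonedDecember = Date.parse('2020-12-25T15:20:00-05:00');

const juneOffset = offsetMinutesAt('America/New_York', juneInstant);
const decemberOffset = offsetMinutesAt('America/New_York', naiveDecember);
const errorMinutes = (zonedDecember - naiveDecember) / 60000;

console.log({ juneOffset, decemberOffset, errorMinutes });
```

The offset parsed from the June string stops matching New York's rules once the result lands in winter, which is why the offset alone cannot stand in for the time zone when doing math.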
@justingrant's points make sense to me. If you have a string without an IANA time zone, you can still get an Absolute for it. |
I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension. |
I assume it would accept the |
This was very useful feedback that prompted me to look at Temporal parsing more broadly with an eye towards preventing the (unfortunately common) error of parsing a string into the "wrong" Temporal type. For example, I suspect that the code below is very likely (almost always?) a bug: Temporal.DateTime.from("2020-06-30T18:56:32.260Z") I think this code should throw. The string is explicitly declaring that it's an absolute string so the chance that the user really intends it to be a Error messages could help developers understand how it works. If an ISO string doesn't parse, we could try to parse it using other formats and if any of them match then we could tune the error message accordingly. For example: Temporal.DateTime.from("2020-06-30T18:56:32.260Z")
// => throws new RangeError(
// `Cannot parse \`Temporal.DateTime\` from '${s}'. For absolute date/time parsing, use 'Temporal.Absolute.from()'`
// );
Temporal.Time.from("P15D")
// => throws new RangeError(
// `Cannot parse \`Temporal.Time\` from '${s}'. For duration parsing, use 'Temporal.Duration.from()'`
// ); BTW, even without
It may be worth a page in the docs (e.g. "Parsing Temporal Objects from ISO 8601 Strings") that turns the list above into an easier-to-understand format like a table with examples. Also the page should include an explanation of our extensions for calendars and time zones. I'd volunteer to write a page like this if you think it'd be helpful. What do you think? I ask because I went through the same process to determine the current parsing behavior of
Anyway, with that context in mind...
A fundamental challenge we have is that ISO 8601 knows about offsets (including That said, Z strings are definitely something that users will try, so it probably deserves a clarifying error message, per discussion above. Here's what's currently used for object initializers. We could do something similar for string initializers too: if (item instanceof Temporal.Absolute) {
throw new TypeError('Time zone is missing. Try `{absolute, timeZone}`.');
}
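A string-based counterpart of that check might look like the sketch below. It is purely illustrative: `HINTS` and `parseErrorFor` are hypothetical names, the regular expressions cover only a small subset of ISO 8601, and the suggested method names in the hints are assumptions rather than a settled API.

```javascript
// Hypothetical patterns for "this string belongs to another Temporal type".
// They are deliberately narrow; real ISO 8601 parsing accepts far more.
const HINTS = [
  {
    label: 'absolute date/time',
    use: 'Temporal.Absolute.from()',
    // A date/time that ends in "Z" declares itself to be an absolute.
    re: /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?Z$/
  },
  {
    label: 'duration',
    use: 'Temporal.Duration.from()',
    // "P" followed by at least one date or time component.
    re: /^P(?=.)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$/
  }
];

// Build the RangeError that a failed `typeName.from(s)` could throw,
// adding a hint when the string matches one of the other known formats.
function parseErrorFor(typeName, s) {
  const match = HINTS.find((h) => h.re.test(s));
  const hint = match ? ` For ${match.label} parsing, use '${match.use}'.` : '';
  return new RangeError(`Cannot parse \`${typeName}\` from '${s}'.${hint}`);
}

console.log(parseErrorFor('Temporal.DateTime', '2020-06-30T18:56:32.260Z').message);
console.log(parseErrorFor('Temporal.Time', 'P15D').message);
console.log(parseErrorFor('Temporal.Time', 'not a date').message);
```

The fallback matching only runs on the error path, so it would not slow down successful parses.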
Nope. Per discussion above, to avoid likely error cases my intent was that callers explicitly opted in to provide a time zone, either by using an Extended ISO string, or by explicitly specifying the time zone to force the caller to confirm that they're really intending to use UTC or any other offset-based "time zone". Like this: Temporal.LocalDateTime.from({
absolute: Temporal.Absolute.from('2020-06-30T18:56:32.260Z'),
timeZone: 'UTC'
});
// OR
Temporal.Absolute.from('2020-06-30T18:56:32.260Z').toLocalDateTime('UTC'); To me this seems similar to the case where |
On Z: I see Z == UTC as being the only real time zone supported by ISO 8601, and I think Z counts as a clear enough intent. UTC has no daylight transitions and is common in computing. "Z" can be seen as different than "+00:00", which could correspond to Europe/London in the winter or Atlantic/Azores in the summer. On explicit time zones in general: This is consistent with Temporal design principles. <rant> My hope for the last few quarters has been that we apply the same "explicit is better" mentality that we have for time zones to calendars. Time zones and calendars are equals in the data model, so why not in the API? (Main issue: #292) </rant> |
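To illustrate why "+00:00" is more ambiguous than "Z", here is a small sketch that checks which zones sit at a zero offset at different times of year. It uses only the built-in Intl.DateTimeFormat; `offsetMinutesAt` is a hypothetical helper written for this example, and the sketch assumes the runtime ships IANA time zone data.

```javascript
// Hypothetical helper (not a Temporal API): UTC offset in minutes for
// `timeZone` at the instant `epochMs` (whole seconds assumed).
function offsetMinutesAt(timeZone, epochMs) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric'
  }).formatToParts(new Date(epochMs));
  const f = {};
  for (const { type, value } of parts) f[type] = Number(value);
  const wallAsUtc = Date.UTC(f.year, f.month - 1, f.day, f.hour, f.minute, f.second);
  return (wallAsUtc - epochMs) / 60000;
}

const january = Date.UTC(2020, 0, 15, 12); // mid-winter instant
const july = Date.UTC(2020, 6, 15, 12);    // mid-summer instant

const offsets = {};
for (const zone of ['UTC', 'Europe/London', 'Atlantic/Azores']) {
  offsets[zone] = {
    january: offsetMinutesAt(zone, january),
    july: offsetMinutesAt(zone, july)
  };
}

// UTC stays at 0 all year; the other two zones are at 0 for only part of the year.
console.log(offsets);
```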
I'm not sure I agree that being explicit here is better either, so I don't feel I'm being inconsistent 😄 🤷 |
All the points around parsing with the What I am more concerned about though, is the |
I didn't look into your LocalDateTime proposal yet, but it can convert all given ISO strings to UTC.
We should stick to ISO for output at least. I expect the following:
And if we really want to output a non-ISO string (which I think we shouldn't because of my arguments in the other issues, see below)
With the current API, to output an ISO string from an absolute and a time zone, we have to code:
It then looks that we favor the extended format (non-ISO) rather than ISO. The issue below demonstrates why offset and time zone together in a datetime string leads to problems, to more code to handle those problems, etc. See my arguments why we shouldn't even accept any extended format: Another related issue: #741 |
@niklasR @thojanssens - I copied your feedback into #741. That issue is focused on exactly the topic you're concerned about above, which is whether the time zone is included in the output of |
Here are my opinions on this topic:
|
I see this thread as circular. On the one hand, it is bug-prone for LocalDateTime to parse an offset time zone. On the other hand, if a LocalDateTime is explicitly built with an offset time zone, we need a way to serialize it in
In that case, you have two choices that are compatible with the proposal on hand: Absolute.from, or DateTime.from. Pick the one that corresponds to whether you want the offset to win or the datetime to win. Since you usually want the absolute to win, you end up doing: Temporal.Absolute.from(isoString).toLocalDateTime(timeZoneName, "iso") Actually, I think your original snippet, starting with LocalDateTime.from, is fundamentally flawed: since the offset time zone and IANA time zone are known at different points, you can't properly resolve the "should offset or datetime win?" question. |
Yes, I'm OK with accepting that LocalDateTime.from can be bug-prone here, I find that the "least worst" option. The bracketed offset seems to harm interoperability, and I would like to avoid adding more types. (Although I learned recently that Java makes a distinction between java.time.ZonedDateTime and java.time.OffsetDateTime for this reason.) I guess |
Summarizing my objection: I dislike the idea of catering to what seems to me a hypothetical misunderstanding against the "correct" usage. |
I think the root question is this: in
My concern with implicitly treating an offset as a time zone is that it silently turns LocalDateTime into DateTime, with all the disadvantages of DateTime math: results will usually look correct but will break around DST transitions. But I also see the value of being able to easily parse and emit zoneless ISO strings for use-cases like formatting where DST is a non-issue because the underlying instant never changes after it's parsed. Here are a few ideas to try to address both concerns:
const zoneless = Temporal.LocalDateTime.from('2020-08-10T03:43:36+05:30');
zoneless.toString(); // => 2020-08-10T03:43:36+05:30
zoneless.timeZoneOffsetString; // => "+05:30"
zoneless.timeZone; // => undefined? null?
zoneless.plus({days: 1, hours: 12}); // throws
zoneless.with({hour: 0}); // throws
zoneless.with({timeZone: 'Asia/Kolkata', hour: 0}); // OK
zoneless.hoursInDay; // throws
zoneless.isTimeZoneOffsetTransition; // throws
const zoned = zoneless.with({timeZone: 'Asia/Kolkata'});
zoned.plus({days: 1, hours: 12}); // OK
zoned.toString(); // => 2020-08-10T03:43:36+05:30[Asia/Kolkata]
const offsetZoned = zoneless.with({timeZone: '+05:30'}); // explicit offset time zone
offsetZoned.plus({days: 1, hours: 12}); // OK
offsetZoned.toString(); // => 2020-08-10T03:43:36+05:30[+05:30] (or 2020-08-10T03:43:36+05:30) ? AND/OR
AND/OR
FWIW, import java.time.ZonedDateTime;
class Main {
public static void main(String[] args) {
ZonedDateTime withZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[Asia/Kolkata]");
System.out.println("with zone: " + withZone.toString());
// => with zone: 2017-06-16T21:25:37.258+05:30[Asia/Kolkata]
ZonedDateTime offsetOnly = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30");
System.out.println("offset only: " + offsetOnly.toString());
// => offset only: 2017-06-16T21:25:37.258+05:30
ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
System.out.println("offset time zone: " + offsetTimeZone.toString());
// => offset time zone: 2017-06-16T21:25:37.258+05:30
}
} |
Personally I would be okay with this. It sounds a lot like the option I offered in #292 dubbed "partial ISO" where calendar-dependent operations would throw if a calendar wasn't specified. Note that in the previous calendar discussions, others pointed out that given the strong typing nature of Temporal, it might be cleaner to split the type into two explicit types rather than making it have different behavior depending on whether or not the time zone (or calendar) is available.
Lemma A: Developers tend to know ahead of time, or can easily obtain, the syntax of the string they are parsing. For example, developers know, or can find out, whether their strings have only an offset ( Given Lemma A, I do not believe that all of the choices in The one choice I think is useful is If we go with this option, we should discuss the naming of the option and the argument.
Without additional API, one can do this by replacing the LocalDateTime's IANA TimeZone with an offset TimeZone.
Interesting. |
I feel strongly against this, for the same reason that I felt strongly against the "partial ISO" proposal: for all intents and purposes it's an internal "is this object broken" flag.
I'm not in favour of that default (as per last week's discussion, I'm strongly against the default being to throw on strings that can represent a valid use case) but at first glance I think the option is a good idea.
Not sure how I feel about adding another option for this. It seems like
I don't think I agree with this lemma. I think it would encourage naive attempts such as |
What I meant is, I claim that most of the time, the developer knows while writing the code what format the strings are going to be in. More often than not, the strings come from the same source, whether that's a Postgres database, a CSV file, a JSON API, a date picker component, etc. The developer can look at the typical output from that source to see what is the correct Temporal parsing function to use. |
Ah, gotcha. I also forgot to say in my earlier comment that if |
This is a long thread so I'll summarize my concern: users shouldn't be able to perform math operations (or other DST-sensitive operations like For example, the code below should throw because it's trying to perform hybrid-duration math on an implicitly defined offset time zone. This is almost certain to be a bug. Temporal.LocalDateTime.from('2017-06-16T21:25:37.258+05:30').plus({days: 1, hours: 12}); As long as the code above isn't allowed, I'm open to many different solutions, including:
My preferred solution would be either (1) or (3), but I don't feel that strongly as long as the buggy math is disallowed. IMHO the Depending on how we solve this problem, we may or may not want to offer the same behavior for object initializers. For example, if we choose (1) above, then I assume we'd want to offer the same option for object initializers.
Here's another interesting tidbit: Java doesn't require the offsets to match. In other words, internally its parsing will parse the DateTime+offset string into a ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
System.out.println("offset time zone: " + offsetTimeZone.toString());
// => offset time zone: 2017-06-16T21:25:37.258+05:30
ZonedDateTime mismatchedOffsetZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+06:30]");
System.out.println("mismatched offset time zone: " + mismatchedOffsetZone.toString());
// => mismatched offset time zone: 2017-06-16T22:25:37.258+06:30
ZonedDateTime invalidOffsetForZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[America/Los_Angeles]");
System.out.println("invalid offset for zone: " + invalidOffsetForZone.toString());
// => invalid offset for zone: 2017-06-16T08:55:37.258-07:00[America/Los_Angeles]
I don't feel very strongly about this one, so I'm inclined to agree with you. We could always add this later if this is a source of user confusion. BTW, the actual code is a little different: `${ldt.toDateTime()}${ldt.timeZoneOffsetString}`
Could you explain your position in more detail? From my perspective, allowing Per @sffc's comments above (which I agree with), developers are likely to know the format of the strings that they're parsing, so recovering from an exception will be trivial in most cases. This is unlike, for example, offset vs. timezone conflicts which by definition will only show up after an app has been in production long enough for time zone rules to change. If the caller gets an exception the first time they call BTW, I agree that some operations are safe (aka "valid use case") on a LocalDateTime with an implicit offset time zone, but that seems like an argument for a separate OffsetDateTime type or the "partial-ISO-like" solution. IMHO, throwing by default seems to be the less confusing solution relative to either of those alternatives. |
I think my disconnect with your explanation boils down to: I don't see (At least, the above point about nautical time zones holds if the offset is whole-hour. I'd maybe be fine with throwing on |
I think that you're highlighting a core challenge with the bracketless syntax: it's ambiguous. Reasonable people can reasonably disagree about whether I don't think that either interpretation is wrong. Both have pros and cons. But I'm also confident that we'll see both interpretations among users. Many developers simply won't understand the difference. Given that both interpretations exist, my preference is for "offset" because it's easier for developers to realize that they have the "wrong" interpretation:
If we did go with "offset" (meaning Temporal.Absolute.from(s).toLocalDateTime(s);
`${ldt.toAbsolute()}${ldt.timeZoneOffsetString}` The non-rare case is where the source data simply doesn't have time zone information. It's just a DateTime+Offset value that was lossily stored. For example, AFAIK no DBMS today has a native data type that stores DateTime+offset+TimeZone, so lossy storage is the norm. These values are absolutely not safe for LocalDateTime math, with, etc. So we should really be pushing these use-cases to DateTime and/or Absolute instead. If the data doesn't have a real time zone, LocalDateTime adds little/no value, and can actually make things worse via DST bugs.
FWIW, there are specific
If we went with "time zone" then what would our docs for the
One of the reasons I'm pushing for "offset" is to avoid the need for that second paragraph. ;-) |
I’ll divide my response into 2 parts:
Any argument that calls upon “the 99% case” is prima facie spurious unless actually providing data and evidence to support the statement that this is in fact “the 99% case”. The argument above fails to do so. To quote C. Hitchens: “Any argument presented without evidence can be dismissed without evidence.” But things get worse: the little bit of evidence provided above:
could just as well serve as evidence for the other side of the argument; the fact that no DBMS has such a type indicates that “the 99% case” is that people don’t care about doing DST-correct calculations; on the contrary, that is something only the 1% of people writing calendaring & scheduling software ever care about. The “actual 99% case” is the one where “correct DST” calculations are actually unintended. Note: I did not provide any evidence for this assertion and am fine with it being dismissed without evidence. Its sole intent was to demonstrate that the argument above was just as spurious and to be dismissed. The argument to the merit is slightly different. Here are a few of the assumptions I am making:
Based on these assumptions the conclusion must be that all temporal types should be operable without the bracket extension. This most definitely includes Beyond that your use-case for
seems blatantly incorrect to me; it’s simply not a use-case you operate with. For the same reason I also object to the statement (from proposed documentation):
To my mind the only correct part of that “second paragraph” is:
Which is basically just saying:
and that’s needed no matter what because of the fact that we defined the non-standard bracket extension. So that “second paragraph” is necessary either way. One could even argue that this “second paragraph” would be even more necessary for the case where we don’t accept pure ISO strings, except it would have to read:
And that in my mind would be the point where we’d have to reevaluate whether such a non-standard thing should be in Temporal at all. Given the general usefulness of I hope this exposition makes as much sense to people reading it as it did when I was writing it in 33C weather. If not, I’m happy to discuss live. |
If the IANA "Etc" time zones cover the nautical time use case, can we just remove the concept of arbitrary offset time zones from the spec? People can still implement them as a custom time zone if they really want them. |
I admit that I'm having trouble figuring out exactly where we agree vs. disagree on this thread. Could we try to clarify? I'll list a few statements below; let me know which you agree vs. disagree with. They're roughly in order: if you don't agree with one, then you probably won't agree with ones below it.
Assumption 1: Offsets are different than time zones
An "offset" applies to only one single DateTime, but it has nothing to say about the offsets of other DateTime values. Knowing the offset of one DateTime does not let you know what the offset will be one day later. On the other hand, a "time zone" can be used to calculate the offsets of other values derived from this one, e.g. via An "offset time zone" is a time zone that always has the same constant offset.
Assumption 2: Offsets act just like time zones for static date/time values
Offsets and offset time zones act the same unless mutation is involved. Time zones only matter if you change the value, so operations that don't change the value (e.g.
Assumption 3: Offsets that are not timezones are unsafe for DST-sensitive operations like
Operations that create new values, like SELECT DATEADD(day, 1, arrival_time) from flights
Assumption 4: many (most?) developers won't understand the subtle difference between offsets and offset time zones
Given developer confusion about time zones and DST in general, I expect that many (maybe most) developers won't understand the subtle difference between offsets and time zones, and specifically they may not understand why math with a timezone-less offset is buggy.
Assumption 5: There are two main use cases for offsets
AFAIK, there are two main use cases using offsets:
Assumption 6: the "Lossy Storage" use case's offset is not a time zone
If a DateTime+offset is representing a value that originally had a time zone, but the time zone was lost in persistence, then its offset is just an offset, not a time zone. This means that it's not safe to do math or other DST-sensitive operations using that offset.
Assumption 7: The "Lossy Storage" use case is much more common than "Ocean Shipping" use case, but both are important
We don't need research to know that storing temporal data in a SQL database is much more popular than oceanic transport and other similar use cases. We can argue about whether the ratio is 20:1, 100:1, or 1000:1, but I'm not sure that the actual ratio matters much, as long as we can agree that:
It's not 33C here (@pipobscure where are you vacationing?) but I have had a few lagers during Zoom calls this evening so I'll end this here before I start assuming even crazier things. ;-) If we disagree on any of above, let's try to resolve those disagreements before moving on to debating conclusions. |
OK I updated the text which should cure the ambiguity. Now
|
Actually, my revision above wasn't accurate. I replaced with this one:
|
The 'calendar' option takes values 'auto', 'always', and 'never', and controls whether to display the calendar in toString() for all the calendar types. See: #703
These options control whether the time zone name annotation and the time zone offset, respectively, are printed in the output string. See: #703
I've taken the liberty of renaming this, otherwise it has a conflicting meaning compared with the 'timeZone' option passed to Instant.toString() (which will be added in #741.) Making the distinction between 'timeZone' and 'timeZoneName' in this way aligns exactly with what Intl.DateTimeFormat does. See: #703
I've taken the liberty of renaming |
Sounds good. I edited #703 (comment) accordingly. |
From #700 open issues:
Example:
The argument to repeat the offset in brackets is to prevent LocalDateTime from parsing ISO strings that lack a time zone identifier, in order to prevent implicitly treating a timezone-less value as if it had a time zone. The whole idea behind LocalDateTime is that time zones should never be implicit (with now.localDateTime as the only exception) because implicitly assuming a time zone is the path to timezone issues and DST bugs like we get with legacy Date.
But what's the argument on the other side to avoid emitting the duplicated offset in brackets? And if we did that, then should LocalDateTime still forbid parsing of bracket-less ISO strings?