Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISO 8601 date-time side-issue (see #119) #887

Closed
wants to merge 1 commit into from

Conversation

hjoliver
Copy link
Member

An empty pull request to get this show (#119) on the road.

@hjoliver
Copy link
Member Author

Some initial thoughts on implementation (for discussion, if controversial...)

  1. date-time format parsing and arithmetic (done, by Ben)
  2. parse cycle-times in cylc command lines etc. (easy, via 1)
  3. universal date-time cycling module to replace the current HoursOfTheDay, Daily, Monthly, and Yearly cycler modules.
    • should handle any repeated intervals, including sub-hourly Sub-hourly cycling intervals #68
    • can we absorb cycler functionality into the cycle time object held by each task proxy?
    • presumably Ben has all the necessary arithmetic sorted, we just need to do the particular computations required in cylc (such as what is the next valid cycle time, given one or more repeated interval specs)
    • Currently each task proxy holds multiple cyclers, one of each different cycling graph section it belongs to. Probably better to have a single cycler (or cycle time object?) hold multiple repeating intervals.
  4. need to support "classic mode" for backward compatibility
    • translate old-style cycle-times to ISO at the outset.
      • will still need a switch for old vs new format in filenames etc.
  5. Replace the current cylc cycletime date-time arithmetic utility
  6. more-or-less major overhaul of dependency graph parsing may be required, depending on (c) below

Some more or less related issues that may or may not be tackled initially -

  1. can we avoid the current frequent conversion of the cycle times held by task proxies in and out of object form (used for arithmetic), string form, (and integer form for comparisons?) (cycle time objects should do it all internally) cycle time handling and asynchronous tasks #650
  2. can we give task proxies a cycle time object that can transparently handle simple integers as well as proper date-times, and thereby avoid a lot of special case treatment of asynchronous tasks? Also integer cycling Integer cycling tasks for iterative processes #116
  3. generalize current start-up tasks to allow tasks that run only at shutdown or over some range of cycles during the suite run. Are any of my rantings toward the bottom of Validation error with start-up and cycling tasks in same conditional #434 still relevant?

@m214089
Copy link

m214089 commented Feb 12, 2014

I'm glad to provide full integer routines for Julian day to proleptic Gregorian and back conversion, which does run as well with negative Julian days and spans the entire lifetime of Earth. It took me quite a while to get this running and I haven'T found something like this before. It is in C. But there should be no problem porting it to python. I'll sen the file to Hilary for further distribution...

@hjoliver
Copy link
Member Author

Luis - thanks, but (excuse my ignorance!) can you explain why we need that - I thought ISO 8601 could represent the full range of dates etc.?

@m214089
Copy link

m214089 commented Feb 12, 2014

Hi Hilary,
if you need to do calculations on dates. Eg.

what date do you get for -60010317T134723Z by cycling R2347/PT3M ?

you need to have a clean time axis. So, if you like to support all
reasonable time strings, you almost always end up with the need to do
calculations back on JD. As long events are sub day range things are
simple, but if not it is getting messy ... With the two routines you get:

<=> days/milliseconds

And all cycling activities are trivial on days/milliseconds

Cheerio,
Luis

On 12/02/14 13:52, Hilary James Oliver wrote:

Luis - thanks, but (excuse my ignorance!) can you explain why we need
that - I thought ISO 8601 could represent the full range of dates etc.?


Reply to this email directly or view it on GitHub
#887 (comment).

@hjoliver
Copy link
Member Author

I see, thanks. Maybe @benfitzpatrick can comment on the calculations...

@benfitzpatrick
Copy link
Contributor

@m214089, the data model we put in in #597 supports all negative
proleptic Gregorian years, just like the ISO 8601 standard does.
I don't think there's a need to have to convert away from that
if the maths is right - after all, the proleptic Gregorian calendar
is much less complicated than some of the predecessors, e.g.:
http://en.wikipedia.org/wiki/Roman_calendar#Calendar_of_Numa

If running a suite based around a particular historical date that was
pre-Gregorian (probably not a common use case), you would
need to convert it - but that should be a one-time thing on suite
setup, outside cylc.

We can definitely support your example using the above library
that's now in cylc:

>>> import isodatetime
>>> recurrence = isodatetime.TimeRecurrenceParser().parse("R2347/-0060010317T134723Z/PT3M")
>>> recurrence.get_next(recurrence.start_point)
-006001-03-17T13:50:23Z

Days and milliseconds since a given date are useful, but they
don't help for monthly or yearly cycling (months and years
being such bad units).

@m214089
Copy link

m214089 commented Feb 12, 2014

Great. If you could keep it simpler fine with me.

@arjclark
Copy link
Contributor

@hjoliver - when this is implemented it'll need doing so that all dependencies on incrementing/generating next date/comparing dates/etc are dependent on the isodatetime library only so that the climate calendar extension (see our email chain) can just be slotted in with minimal changes to cylc.

@hjoliver
Copy link
Member Author

@arjclark - yes indeed (360-day calendar)

@matthewrmshin matthewrmshin added this to the soon milestone Feb 14, 2014
@benfitzpatrick
Copy link
Contributor

@hjoliver, re: your points above:

  1. I think you should keep the cycle time attributes of tasks as labels
    to make it simple - which allows the memoisation in cycle_time.py
    to work quickly - but maybe store the cycle time/TimePoint object
    itself alongside, or have a cache of
    task label -> cycle time/TimePoint objects or similar.

  2. is a difficult one - you can get by without a wrapper class
    by only performing comparisons between asynchronous
    time labels and cycling time labels in dedicated functions
    like the ctime_ge ones. These can have extra logic to
    return the right result.

  3. Yes, I think we need all cycling tasks to behave like your Validation error with start-up and cycling tasks in same conditional #434 example:

[[[2010080800]]]                   # initial one-off
    graph = "prep => foo"
[[[HoursOfTheDay(0,12) from 2010080800 until 2010090800]]]
    graph = "foo => bar"
[[[2010080112]]]                   # internal one-off
    graph = "bar => surprise"
[[[2010090800]]]                   # final one-off
    graph = "bar => cleanup"

(obviously with different syntax)

@benfitzpatrick
Copy link
Contributor

(I agree with your implementation thoughts)

@hjoliver
Copy link
Member Author

@benfitzpatrick - thanks for your comments. I had forgotten you had "memoized" cycle_time.py - a nice technique!

benfitzpatrick added a commit to benfitzpatrick/cylc that referenced this pull request Feb 20, 2014
@hjoliver
Copy link
Member Author

Implementation Plan

Here's how I intend to start, early next week.

  1. develop a new cycler module lib/cylc/cycling/iso.py that uses isodatetime.py to implement the interface in lib/cylc/cylcling/base.py. At first this will convert existing cycle time inputs into ISO8601 format inside the cycler class - later on this may provide completely centralized backward compatibility for old-style cycling.
  2. get cylc to take ISO8601 format input directly, command line and suite definition (including graph section headings).

If there are no unforeseen problems this should give us complete ISO8601 cycling capability very quickly (including sub-hourly etc.). Then:
3. the icing on the cake:

  1. shutdown tasks, finite-cycling-range tasks, etc., which may require extensive changes to graph parsing
  2. attempt to tidy up some of the cruft-accruing legacy code associated with cycling, as mentioned above.

@matthewrmshin et.al. - any disagreement with this approach, please let me know Friday (UK time) - I'm taking the family on a beach holiday Friday evening though Sunday, but if my enthusiasm gets the better of me I may make a start on this Sunday night.

benfitzpatrick added a commit to benfitzpatrick/cylc that referenced this pull request Feb 24, 2014
@hjoliver
Copy link
Member Author

hjoliver commented Mar 4, 2014

I'm closing this as a Pull Request is not really the right way to handle the early stages of such a big change - too much rebasing and/or updating from master will be required (and there are no real commits on the branch yet). Better to use feature branches on github that can be turned into Pull Requests when ready.

Discussion reverting to #119.

@hjoliver hjoliver closed this Mar 4, 2014
@hjoliver hjoliver deleted the ISO-8601 branch March 4, 2014 07:32
@matthewrmshin matthewrmshin removed this from the soon milestone Mar 4, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This is a duplicate of something else
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants