Consistent use of timestamps in API #522

Closed
candlerb opened this issue Apr 26, 2019 · 3 comments

Comments

@candlerb
Contributor

Is your feature request related to a problem? Please describe.

The Loki API appears inconsistent in how it handles timestamps:

  • start and end query offsets are numeric, nanosecond-based epoch times
  • "ts" values for logs being pushed in or queried are string ISO8601 format (with nanosecond resolution)

This makes the API uncomfortable to use, and may require unnecessary levels of processing. Examples:

  1. loki may return an ISO8601 UTC timestamp, but if the browser wants to display it in local time, it will have to reparse it before reconstructing local time. (Or vice versa - the API doesn't define whether it returns timestamps in UTC or the host system local timezone)

  2. If you are reading a batch of logs, and the call terminates due to limit being reached, then calculating the next start or end time involves having to parse the last log's timestamp.

    (Iteration might be better handled with an opaque "next event" token, but that's a separate issue)

  3. Converting ISO8601 to nanosecond start and end values accurately in Javascript is problematic. The built-in Date() function only handles integral numbers of milliseconds; and even if it used floating point, the 53-bit mantissa would limit resolution to slightly better than a microsecond.

    I think you have to do the following: pad the string timestamp to ensure it has 9 digits after the period; trim off the last 6 digits; parse the remainder; convert integer milliseconds to string; and then stick the last 6 digits back on again. Yuk.
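
The string-splicing steps described in point 3 could be sketched roughly as follows. This is only an illustration: `isoToEpochNanos` is a hypothetical helper name, and it assumes the timestamp is in UTC with a trailing `Z`, is not earlier than 1970, and has at most 9 fractional digits. The result is returned as a decimal string, because a JavaScript number cannot hold it exactly.

```javascript
// Convert an ISO 8601 UTC timestamp with up to 9 fractional digits into a
// decimal string of nanoseconds since the epoch.
// Assumptions: trailing "Z" only, no other offsets, no pre-1970 times.
function isoToEpochNanos(iso) {
  const m = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$/.exec(iso);
  if (!m) {
    throw new Error("unsupported timestamp: " + iso);
  }
  // Pad the fraction so it always has exactly 9 digits.
  const frac = (m[2] || "").padEnd(9, "0");
  const millis = frac.slice(0, 3);    // digits Date can represent
  const subMillis = frac.slice(3);    // last 6 digits, kept as text
  // Parse the remainder at millisecond resolution.
  const ms = Date.parse(`${m[1]}.${millis}Z`);
  if (Number.isNaN(ms)) {
    throw new Error("invalid date: " + iso);
  }
  // Integer milliseconds as a string, with the last 6 digits stuck back on.
  return String(ms) + subMillis;
}

console.log(isoToEpochNanos("2019-04-26T16:20:24.123456789Z"));
// prints "1556295624123456789"
```

The same helper would be what a client needs in order to turn the last log's "ts" into the next start value when paging through results (point 2).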

Describe the solution you'd like

I think that log timestamps should be milliseconds from epoch everywhere. These are easy to use, native to Javascript, and perfectly accurate enough for system logs (more accurate than most machines' clock sync anyway).

As far as I can tell, cortex and loki use epoch timestamps internally. e.g. I can see the "from" and "to" timestamps in base64-encoded chunk filenames are millisecond-based epoch times:

fake/7a257c9eb6a62090:169f316b816:169f316b862:7d7324d2

Time.at(0x169f316b816 / 1000.0) = 2019-04-06 14:39:06 +0000
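
For reference, the same decoding in Javascript, as a rough sketch. The helper name `decodeChunkTimes` is made up for illustration, and the assumption that the second and third colon-separated fields are the "from" and "to" times is inferred only from the single example above, not from any documented format.

```javascript
// Decode the two hex fields of a chunk key as millisecond epoch times.
// Assumed layout (inferred from one example): <prefix>:<from>:<to>:<suffix>
function decodeChunkTimes(key) {
  const parts = key.split(":");
  if (parts.length !== 4) {
    throw new Error("unexpected chunk key: " + key);
  }
  const fromMs = parseInt(parts[1], 16);
  const toMs = parseInt(parts[2], 16);
  return { from: new Date(fromMs), to: new Date(toMs) };
}

const t = decodeChunkTimes("fake/7a257c9eb6a62090:169f316b816:169f316b862:7d7324d2");
console.log(t.from.toISOString()); // prints "2019-04-06T14:39:06.262Z"
console.log(t.to.toISOString());   // prints "2019-04-06T14:39:06.338Z"
```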

Also, the Loki design appears to be loosely based on AWS Cloudwatch Logs, which uses millisecond epoch timestamps for both queries and responses.

It's a breaking change, but loki is pre-beta.

Describe alternatives you've considered

  1. Having millisecond timestamps does make it more likely that successive log lines will have identical timestamps, but that has always been a possibility.

    To eliminate that problem, you could define that loki timestamps have two parts: a millisecond timestamp plus a sequence number which resets to zero on every new millisecond. That happens to be exactly how redis streams generates its message IDs. It makes iterating over logs trouble-free, and they are trivial to process: e.g.

    > "1556295624536-17".split("-")
    
  2. If it's considered important to retain nanosecond resolution, then "ts" values could be nanoseconds from epoch (as numbers). These can be processed on systems which have 64-bit integers; however, Javascript will only achieve slightly better than microsecond accuracy, due to the 53-bit mantissa. JSON parsers in other languages which treat numbers as floating point will be similarly affected.

  3. Return "ts" values as numeric nanoseconds, but in JSON decimal strings rather than numbers. This makes it moderately easy to split off the last 6 digits, thus giving millisecond time plus sub-millisecond nanoseconds (0-999999), which can be reassembled later:

    > x="1556295624123456789"
    > [Number(x.slice(0,x.length-6)), Number(x.slice(x.length-6))]
    

    Seems messy to me, but is similar work to option 1 (millisecond plus sequence).

  4. Keep "ts" as ISO8601, but change start and end query parameters to be ISO8601 strings as well for consistency. This makes it easier when querying logs in batches, as you don't need to parse the time to create the start or end time of the next batch (which as mentioned above, is hard to do in Javascript as it has insufficient numeric accuracy).

    However it still requires lots of parsing and conversion both in Loki and in its clients, between ISO and epoch formats, and the sub-milliseconds still have to be handled separately.
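
To make the trade-offs between options 1 to 3 concrete, here is a small sketch. The helper names (`compareIds`, `splitNanos`, `joinNanos`) are hypothetical, and the "millisecond-sequence" ID format is only the one proposed in option 1, not anything Loki currently emits.

```javascript
const nanos = "1556295624123456789";

// Option 2: parsed as a JSON number, the value is rounded to the nearest
// double, so the low-order digits are lost.
const asNumber = JSON.parse(nanos);
console.log(Number.isSafeInteger(asNumber));       // prints false
console.log(BigInt(asNumber) === BigInt(nanos));   // prints false

// Option 3: keep the value as a decimal string and split off the last
// 6 digits, giving integer milliseconds plus 0-999999 extra nanoseconds.
function splitNanos(s) {
  if (!/^\d{7,}$/.test(s)) {
    throw new Error("expected a decimal nanosecond string: " + s);
  }
  return { ms: Number(s.slice(0, -6)), subMsNanos: Number(s.slice(-6)) };
}

// Reassemble the original string; the sub-millisecond part must be
// zero-padded back to 6 digits.
function joinNanos({ ms, subMsNanos }) {
  return String(ms) + String(subMsNanos).padStart(6, "0");
}

// Option 1: order "ms-seq" IDs numerically, first by millisecond and then
// by sequence number (a plain string comparison would put "-17" before "-2").
function compareIds(a, b) {
  const [aMs, aSeq] = a.split("-").map(Number);
  const [bMs, bSeq] = b.split("-").map(Number);
  return aMs !== bMs ? aMs - bMs : aSeq - bSeq;
}

console.log(joinNanos(splitNanos(nanos)) === nanos);                  // prints true
console.log(compareIds("1556295624536-17", "1556295624536-2") > 0);   // prints true
```

On runtimes that have `BigInt`, the string in option 3 could also be handled as a single exact integer, but the value still has to travel through JSON as a string to avoid the rounding shown at the top.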

Additional context
AWS GetLogEvents API

@cyriltovena
Contributor

We took another approach: we now allow ISO strings for the start and end query parameters. Does that work for you? If not, please reopen.

@candlerb
Contributor Author

For me, it's not as good as the AWS Cloudwatch approach (ms timestamp); but at least it's consistent so that's acceptable.

@candlerb
Contributor Author

candlerb commented Jul 9, 2019

Relates to: #656, #597
