Stats API #679

ukutaht · 2021-02-04T15:14:21Z

Changes

Relevant discussion: #95

This adds a read-only API to retrieve stats from your Plausible dashboard. I will create a PR with full documentation on our docs repository soon. Will also publish a Postman collection for testing. Some relevant points to discuss:

Authentication

I decided to go with a simple Bearer token for authentication. The API key is generated with Erlang's :crypto.strong_rand_bytes(64) |> Base.url_encode64(). This produces 64 cryptographically strong random bytes, URL-safe Base64 encoded. Before storing in the database, this value is hashed together with the server secret using SHA-256. The first 6 characters are kept as plaintext to make it easier to recognise the API key in the UI.
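A minimal sketch of the key handling described above, not the code from this PR. The module and function names, the order in which the secret and key are fed to the hash, and the hex encoding of the digest are all assumptions made for illustration.

```elixir
defmodule ApiKeySketch do
  # Hypothetical module: names and storage layout are illustrative only.

  # 64 strong random bytes, URL-safe Base64 encoded (as described above).
  def generate do
    :crypto.strong_rand_bytes(64) |> Base.url_encode64()
  end

  # One-way SHA-256 digest over the server secret and the key.
  # Concatenation order and hex encoding are assumptions.
  def hash(secret, key) do
    :crypto.hash(:sha256, [secret, key]) |> Base.encode16(case: :lower)
  end

  # First 6 characters kept in plaintext so the key can be recognised in the UI.
  def prefix(key), do: binary_part(key, 0, 6)
end

key = ApiKeySketch.generate()

# Only the prefix and the digest would be stored; the full key is shown once.
record = %{
  key_prefix: ApiKeySketch.prefix(key),
  key_hash: ApiKeySketch.hash("server-secret", key)
}
```

Nothing in the stored record can be reversed into the full key, which is why the key can only be displayed at creation time.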

I didn't want to use Bcrypt here because it adds an artificial delay of about 250ms to each request. That is OK when you're logging in, but not OK for API requests. Since the API 'password' is 64 random bytes, it has much more entropy than a user-generated password, so a brute-force attack is far less likely to succeed. It would be good for someone to confirm whether this is a reasonably secure approach.
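To illustrate the per-request cost, here is a rough sketch of how a Bearer token could be checked with a single SHA-256 call. It is an assumption-laden example, not the PR's implementation: the module name, the helper names, and the in-memory map standing in for the database are all hypothetical.

```elixir
defmodule BearerAuthSketch do
  # Hypothetical helpers; a plain map stands in for the api_keys table.

  def hash(secret, key) do
    :crypto.hash(:sha256, [secret, key]) |> Base.encode16(case: :lower)
  end

  # Pulls the token out of an "Authorization: Bearer <token>" header value.
  def extract_token("Bearer " <> token) when token != "", do: {:ok, token}
  def extract_token(_other), do: {:error, :missing_token}

  # One SHA-256 call per request, then a lookup by digest.
  def authenticate(header_value, secret, keys_by_hash) do
    with {:ok, token} <- extract_token(header_value),
         {:ok, owner} <- Map.fetch(keys_by_hash, hash(secret, token)) do
      {:ok, owner}
    else
      _ -> {:error, :unauthorized}
    end
  end
end

secret = "server-secret"
store = %{BearerAuthSketch.hash(secret, "abc123") => "user-1"}

ok_result = BearerAuthSketch.authenticate("Bearer abc123", secret, store)
bad_result = BearerAuthSketch.authenticate("Bearer wrong", secret, store)
```

Because the lookup is by digest, no slow key-derivation step sits on the request path.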

Endpoints

Endpoints are namespaced with /api/v1/stats. The API is quite low level and flexible, almost like a restricted database interface. I worry about getting the abstractions right at this stage. I am building it with the aim of being able to rebuild the main dashboard using only the public API endpoints. The endpoints are:

/api/v1/stats/realtime/visitors for current visitors
/api/v1/stats/aggregate for aggregated stats like the top row of the dashboard (visitors, bounce rate, etc)
/api/v1/stats/timeseries for graph data like the main graph on the dashboard
/api/v1/stats/breakdown for breaking down properties like 'Top sources', 'Top pages' etc. Could also be called group-by. Not part of this PR, under construction still 🚧
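As a rough illustration of how a client might address these endpoints, the sketch below builds a URL and a Bearer header without sending anything. The host, the module name, and the site_id and period query parameters are assumptions; the full parameter list is left to the upcoming documentation.

```elixir
defmodule StatsRequestSketch do
  # Placeholder host, not a real deployment.
  @base "https://plausible.example"

  # Returns the request URL and headers for one of the stats endpoints.
  def build(endpoint, params, api_key) do
    url = @base <> "/api/v1/stats/" <> endpoint <> "?" <> URI.encode_query(params)
    headers = [{"authorization", "Bearer " <> api_key}]
    {url, headers}
  end
end

# site_id and period are hypothetical parameter names.
{url, headers} =
  StatsRequestSketch.build("aggregate", [site_id: "example.com", period: "6mo"], "my-api-key")
```

The resulting pair can be handed to whichever HTTP client the caller prefers.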

Again, full documentation will be available soon on the docs repo.

The endpoints are built to accommodate features that we have hard plans to build in the near future:

Different filter operators (currently only == supported)
Combining filters with AND and OR (currently only AND supported)
Different comparison modes (previous period, month-over-month, year-over-year, etc)
Configurable interval for timeseries data (e.g. view last year of data but group daily instead of monthly)
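One way to picture how the current restrictions (only ==, only AND) leave room for those extensions is a small filter parser that produces a tagged tree. This is a sketch only: the "property==value" clause syntax, the ";" separator, and the output shape are hypothetical and not taken from this PR.

```elixir
defmodule FilterSketch do
  # Hypothetical syntax: "property==value" clauses joined by ";" (implicit AND).
  # Tagging each clause (:is) and the group (:and) leaves room for other
  # operators and for OR later on.
  def parse(expr) do
    expr
    |> String.split(";", trim: true)
    |> Enum.reduce_while({:ok, []}, fn clause, {:ok, acc} ->
      case String.split(clause, "==", parts: 2) do
        [prop, value] when prop != "" and value != "" ->
          {:cont, {:ok, [{:is, prop, value} | acc]}}

        _ ->
          {:halt, {:error, {:invalid_filter, clause}}}
      end
    end)
    |> case do
      {:ok, filters} -> {:ok, {:and, Enum.reverse(filters)}}
      error -> error
    end
  end
end

parsed = FilterSketch.parse("visit:source==Google;visit:country==EE")
invalid = FilterSketch.parse("visit:source")
```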

Tests

Automated tests have been added

Changelog

Entry has been added to changelog

Documentation

Docs have been updated

oliver-kriska · 2021-02-04T19:39:49Z

What about Phoenix Token for API authentication? Performance should be fine, and you can even use fewer key iterations than the default 1000. Security might be better too. You can store the api_key id in the token and do some verification, or even skip it, whatever you want. If performance becomes critical, you can use cachex to cache the keys in use and their verification.

https://hexdocs.pm/phoenix/Phoenix.Token.html#decrypt/4
Here is the code showing how it works: https://github.com/elixir-plug/plug_crypto/blob/master/lib/plug/crypto/message_encryptor.ex#L58-L67

oliver-kriska · 2021-02-04T19:43:14Z

The best practice is to show the API key just once. But you can store it with Cloak and allow it to be shown again. This storage type is not good for searching, but you don't have to search because the token can contain the regular API key id inside.

ukutaht · 2021-02-05T08:04:54Z

> What about Phoenix Token for API authentication? Performance should be fine, and you can even use fewer key iterations than the default 1000. Security might be better too. You can store the api_key id in the token and do some verification, or even skip it, whatever you want. If performance becomes critical, you can use cachex to cache the keys in use and their verification.
>
> https://hexdocs.pm/phoenix/Phoenix.Token.html#decrypt/4
> Here is the code showing how it works: https://github.com/elixir-plug/plug_crypto/blob/master/lib/plug/crypto/message_encryptor.ex#L58-L67

Like you mentioned, the best practice is to show the key just once. After the key is generated, I would prefer to treat it like a password. This means that even if the database and secrets leak, it should not be possible to retrieve the plaintext API keys.

To achieve this I wanted to use a one-way hash function instead of a two-way encryption function. This is why I didn't use Phoenix Token: if the database and secrets leak, the attacker can just Phoenix.Token.decrypt/4 them all and get access to the API keys.

With the current SHA-256 approach, the attacker would not have an easy way to retrieve the plaintext value of the API key. Even if they knew the server secret, they would have to brute force on the order of 10^77 (roughly 2^256) possible combinations.
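For reference, the 10^77 figure matches the size of the SHA-256 output space. The short calculation below shows the arithmetic and, for comparison, the size of the raw 64-byte key space; the variable names are illustrative.

```elixir
# Decimal exponent of 2^bits, i.e. bits * log10(2).
digest_bits = 256
key_bits = 64 * 8

digest_exp = digest_bits * :math.log10(2)
key_exp = key_bits * :math.log10(2)

IO.puts("SHA-256 digest space: about 10^#{Float.round(digest_exp, 1)}")
# prints "SHA-256 digest space: about 10^77.1"
IO.puts("64-byte key space:    about 10^#{Float.round(key_exp, 1)}")
# prints "64-byte key space:    about 10^154.1"
```

Either way, exhaustive search is far beyond reach, which is the point of the comparison with user-chosen passwords.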

oliver-kriska · 2021-02-05T08:33:38Z

@ukutaht Got it, I understand what you want to achieve. We will see how it works in production. BTW: it will be a bigger problem if the DB and the phx secret are both compromised ;)

ukutaht added 13 commits January 28, 2021 10:44

5cf0732 WIP
f31886a Add ability to filter by anything
2efa836 Add API keys
d1f8e27 Add version to api endpoint
2a9b0ea Fix API test route
c380718 Fix API tests
26eb947 Allow 'date' parameter in '6mo' and '12mo'
372efe2 Rename session -> visit in API filters
34fcf3e Filter expressions in the API
87f018c Implement filters in aggregate call
87a79bd Add compare option to aggregate call
f1deb30 Add way to manage API keys through the UI
639d146 Authenticate with API key
1b7e690 Use API key in tests

ukutaht marked this pull request as ready for review February 5, 2021 08:32

ukutaht merged commit 5acb5b7 into master Feb 5, 2021

ukutaht deleted the api branch February 5, 2021 09:23
