[Telemetry] Add eventlog endpoint for collecting client-side events #501

Closed
wants to merge 52 commits into from
Changes from 50 commits
344bb57
Add initial eventlog hook
yuvipanda Jul 7, 2019
a0f40ea
Install jupyter_telemetry from source
yuvipanda Jul 7, 2019
96bf2f0
Set up an eventlog API endpoint
yuvipanda Jul 7, 2019
06b91e0
Use different naming convention & add test for it
yuvipanda Jul 7, 2019
716ff1b
Don't use f-strings
yuvipanda Jul 7, 2019
8e122fc
Derive JSON Schema files from YAML files
yuvipanda Jul 9, 2019
f9a0dfb
Keep event schemas in YAML
yuvipanda Jul 9, 2019
c7428e8
Depend on the jupyter_telemetry package
yuvipanda Jul 9, 2019
9437e88
read schemas from new utils function
Zsailer Oct 1, 2019
6e3c80c
Add fix for tables in RTD theme sphinx docs.
Zsailer Oct 1, 2019
4035fd5
add event schema auto-documentation to jupyter notebook docs
Zsailer Oct 1, 2019
23d50a3
format paths in recorded events
Zsailer Oct 1, 2019
3c94970
add documentation for eventlog endpoint
Zsailer Oct 1, 2019
e76c91b
return exception as 400 error in eventlog endpoint
Zsailer Oct 1, 2019
2ce7c54
normalize path in emitted event
Zsailer Oct 1, 2019
5794d31
initial tests
Zsailer May 19, 2020
7c9d3d5
add initial telemetry docs
Zsailer May 19, 2020
ef8573d
fix jupyter_telemetry dependency
Zsailer May 19, 2020
ea9e352
point telemetry at correct dev branch
Zsailer May 19, 2020
b06f7d6
add tests for eventlog
kiendang Oct 20, 2020
4d7fc23
Merge branch 'master' into jupyter_telemetry
kiendang Dec 17, 2020
7302396
Use correct fixture names
kiendang Dec 17, 2020
2c79b1a
Merge branch 'master' into jupyter_telemetry
kiendang Mar 9, 2021
0ac16dc
Fix import
kiendang Mar 10, 2021
7c81b23
Remove redundant call
kiendang Mar 18, 2021
b8ca484
Update telemetry
kiendang Mar 20, 2021
ac452cd
Add note about security
kiendang Mar 22, 2021
70f9275
Register client telemetry schemas using entry_points
kiendang Mar 31, 2021
7f50c85
Add working telemetry commit for testing
kiendang Apr 6, 2021
99439b6
Use backported importlib_metadata
kiendang Apr 6, 2021
cdc92e9
Merge remote-tracking branch 'upstream/master' into jupyter_telemetry
kiendang Apr 6, 2021
8e69ab0
Ignore errors while registering client events
kiendang Apr 23, 2021
e2db0ad
Add client eventlog to list services
kiendang Apr 23, 2021
66accdc
Add tests for client telemetry events
kiendang Apr 23, 2021
4dcd258
Add client telemetry eventlog tests to CI
kiendang Apr 23, 2021
87dca57
Merge branch 'master' into jupyter_telemetry
kiendang Apr 23, 2021
9fd2a5f
Clean up
kiendang Apr 23, 2021
f692488
Fix eventlog test
kiendang Apr 23, 2021
21117a6
Use standard lib instead of backport when possible
kiendang Apr 24, 2021
1936503
Fix docs
kiendang Apr 25, 2021
e099ccc
Merge branch 'master' into jupyter_telemetry
kiendang Apr 25, 2021
bc94f13
Fix docs
kiendang Apr 25, 2021
bfbdd17
Refine example
kiendang Apr 25, 2021
0974231
Use same interface for registering file and file object
kiendang Apr 25, 2021
035eb6e
Fix client test ci
kiendang Apr 25, 2021
8179c47
Remove redundant check
kiendang Apr 26, 2021
c307a8d
Add docs on registering client events
kiendang Apr 26, 2021
e206188
Remove unrelated doc change
kiendang Apr 26, 2021
cce5c71
Add .yml extension
kiendang May 3, 2021
19214e8
No longer use pathlib .suffix to check file extension
kiendang May 3, 2021
c67401d
Doc change notebook server to jupyter server
kiendang May 3, 2021
e15dd9b
Merge remote-tracking branch 'upstream/master' into telemetry_client
kiendang May 3, 2021
6 changes: 6 additions & 0 deletions .github/workflows/python-linux.yml
@@ -56,6 +56,12 @@ jobs:
- name: Run the tests for the examples
run: |
pytest examples/simple/tests/test_handlers.py
- name: Install the Python dependencies for the client telemetry eventlog example
run: |
cd examples/client_eventlog_example && pip install -e .
- name: Run the tests for the client telemetry eventlog example
run: |
pytest examples/client_eventlog_example/tests
- name: Coverage
if: ${{ matrix.python-version != 'pypy3' }}
run: |
6 changes: 6 additions & 0 deletions .github/workflows/python-macos.yml
@@ -56,6 +56,12 @@ jobs:
- name: Run the tests for the examples
run: |
pytest examples/simple/tests/test_handlers.py
- name: Install the Python dependencies for the client telemetry eventlog example
run: |
cd examples/client_eventlog_example && pip install -e .
- name: Run the tests for the client telemetry eventlog example
run: |
pytest examples/client_eventlog_example/tests
- name: Coverage
if: ${{ matrix.python-version != 'pypy3' }}
run: |
6 changes: 6 additions & 0 deletions .github/workflows/python-windows.yml
@@ -56,3 +56,9 @@ jobs:
- name: Run the tests for the examples
run: |
pytest examples/simple/tests/test_handlers.py
- name: Install the Python dependencies for the client telemetry eventlog example
run: |
cd examples/client_eventlog_example && pip install -e .
- name: Run the tests for the client telemetry eventlog example
run: |
pytest examples/client_eventlog_example/tests
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,11 +1,13 @@
MANIFEST
docs/source/operators/events
build
dist
_build
docs/man/*.gz
docs/source/api/generated
docs/source/config.rst
docs/gh-pages
docs/source/events
jupyter_server/i18n/*/LC_MESSAGES/*.mo
jupyter_server/i18n/*/LC_MESSAGES/nbjs.json
jupyter_server/static/style/*.min.css*
@@ -36,4 +38,3 @@ config.rst

# copied changelog file
docs/source/other/changelog.md

3 changes: 3 additions & 0 deletions MANIFEST.in
@@ -9,6 +9,9 @@ include package.json
# include everything in package_data
recursive-include jupyter_server *

# Event Schemas
graft jupyter_server/event-schemas

# Documentation
graft docs
exclude docs/\#*
3 changes: 3 additions & 0 deletions docs/doc-requirements.txt
@@ -10,3 +10,6 @@ sphinxcontrib-openapi
sphinxemoji
myst-parser
pydata_sphinx_theme
git+https://github.com/pandas-dev/pydata-sphinx-theme.git@master
sphinx-jsonschema
jupyter_telemetry_sphinxext
13 changes: 13 additions & 0 deletions docs/source/_static/theme_overrides.css
@@ -0,0 +1,13 @@
/* override table width restrictions */
@media screen and (min-width: 767px) {

.wy-table-responsive table td {
/* !important prevents the common CSS stylesheets from overriding
this as on RTD they are loaded after this stylesheet */
white-space: normal !important;
}

.wy-table-responsive {
overflow: visible !important;
}
}
15 changes: 14 additions & 1 deletion docs/source/conf.py
@@ -77,7 +77,9 @@
'IPython.sphinxext.ipython_console_highlighting',
'sphinxcontrib_github_alt',
'sphinxcontrib.openapi',
-    'sphinxemoji.sphinxemoji'
+    'sphinxemoji.sphinxemoji',
+    'sphinx-jsonschema',
+    'jupyter_telemetry_sphinxext'
]

myst_enable_extensions = ["html_image"]
@@ -216,6 +218,12 @@
# since it is needed to properly generate _static in the build directory
html_static_path = ['_static']

html_context = {
'css_files': [
'_static/theme_overrides.css', # override wide tables in RTD theme
],
}

# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
@@ -380,6 +388,11 @@
# import before any doc is built, so _ is guaranteed to be injected
import jupyter_server.transutils

# Jupyter telemetry configuration values.
jupyter_telemetry_schema_source = osp.join(HERE, '../../jupyter_server/event-schemas')
jupyter_telemetry_schema_output = osp.join(HERE, 'operators/events')
# Title of the index page that lists all found schemas
jupyter_telemetry_index_title = 'Telemetry Event Schemas'

def setup(app):
    dest = osp.join(HERE, 'other', 'changelog.md')
3 changes: 2 additions & 1 deletion docs/source/operators/index.rst
@@ -12,4 +12,5 @@ These pages are targeted at people using, configuring, and/or deploying multiple
configuring-extensions
migrate-from-nbserver
public-server
-security
+security
+telemetry
98 changes: 98 additions & 0 deletions docs/source/operators/telemetry.rst
@@ -0,0 +1,98 @@
Eventlogging and Telemetry
==========================

The Notebook Server can be configured to record structured events from a running server using Jupyter's `Telemetry System`_. The events that the Notebook Server emits are recorded as JSON data, and each event is defined and validated by one of the `JSON schemas`_ listed below_.


.. _logging: https://docs.python.org/3/library/logging.html
.. _`Telemetry System`: https://github.com/jupyter/telemetry
.. _`JSON schemas`: https://json-schema.org/

.. warning::
   Do NOT rely on this feature for security or auditing purposes. Neither `server <#emitting-server-events>`_ nor `client <#the-eventlog-endpoint>`_ events are protected against meddling. For server events, anyone with access to the environment can change the server code to emit whatever they want. The same goes for client events: nothing prevents users from sending spurious data to the ``eventlog`` endpoint.

Emitting server events
----------------------

Event logging is handled by the server's ``EventLog`` object, which uses Python's standard logging_ library to emit, filter, and collect event data.

To begin recording events, you'll need to set two configuration options:

1. ``handlers``: tells the EventLog *where* to route your events. This trait is a list of Python logging handlers to which events are routed.
2. ``allowed_schemas``: tells the EventLog *which* events should be recorded. No events are emitted by default; all recorded events must be listed here.

Here's a basic example for emitting events from the ``contents`` service:

.. code-block:: python

    import logging

    c.EventLog.handlers = [
        logging.FileHandler('event.log'),
    ]

    c.EventLog.allowed_schemas = [
        'hub.jupyter.org/server-action'
    ]

The output is a file, ``"event.log"``, with events recorded as JSON data.
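
As a rough illustration of consuming that file, the sketch below parses the log into Python dictionaries. It assumes one JSON object per line, which is how a JSON formatter attached to a ``logging.FileHandler`` would typically write records. The helper name ``read_events`` and the sample records are hypothetical and not part of ``jupyter_server``.

```python
import json


def read_events(lines):
    """Parse an iterable of log lines into a list of event dicts.

    Assumes one JSON object per line; blank lines are skipped.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        events.append(json.loads(line))
    return events


# Hypothetical sample records standing in for the contents of event.log.
sample_log = [
    '{"action": "get", "path": "Untitled.ipynb"}\n',
    '\n',
    '{"action": "save", "path": "Untitled.ipynb"}\n',
]
events = read_events(sample_log)
```

With a real log, an open file object can be passed instead of the list, since iterating over a text file yields its lines.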

Server event schemas
--------------------

.. toctree::
   :maxdepth: 2

   events/index

The ``eventlog`` endpoint
-------------------------

The Notebook Server provides a public REST endpoint for external applications to validate and log events
through the Server's Event Log.

To log events, send a ``POST`` request to the ``/api/eventlog`` endpoint. The body of the request should be a
JSON blob and is required to have the following keys:

1. ``'schema'``: the event's schema ID.
2. ``'version'``: the version of the event's schema.
3. ``'event'``: the event data in JSON format.

Events that are validated by this endpoint must have their schema listed in the ``allowed_schemas`` trait described above.
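
A minimal client-side sketch of such a request, using only the standard library, might look as follows. The helper name ``build_eventlog_request``, the base URL, the token, and the ``Authorization: token ...`` header are assumptions made for illustration; the schema ID and event fields are taken from the example client schema in this change. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request

REQUIRED_KEYS = ("schema", "version", "event")


def build_eventlog_request(base_url, token, payload):
    """Build (but do not send) a POST request for the eventlog endpoint."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError("missing required keys: %s" % ", ".join(missing))
    return Request(
        base_url.rstrip("/") + "/api/eventlog",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed authentication scheme; adjust to your deployment.
            "Authorization": "token " + token,
        },
        method="POST",
    )


payload = {
    "schema": "https://example.jupyter.org/client-event",
    "version": 1,
    "event": {"user": "user", "thing": "thing"},
}
# Hypothetical server address and token.
request = build_eventlog_request("http://localhost:8888/", "my-token", payload)
```

Passing the result to ``urllib.request.urlopen`` would send it; the example test in this change expects a ``204`` response on success.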

.. _below:

Register client event schemas
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``jupyter_server`` discovers schema files provided by external packages through the ``jupyter_telemetry`` entry point group and loads the files with the standard library's ``importlib.resources`` module.

For example, suppose there is a ``client_events`` package which wants to send events with schemas ``schema1.yaml``, ``schema2.yaml`` and ``extra_schema.yaml`` to the ``eventlog`` endpoint and has the following package structure:

.. code-block:: text

    client_events/
        __init__.py
        schemas/
            __init__.py
            schema1.yaml
            schema2.yaml
        extras/
            __init__.py
            extra_schema.yaml

``schema1.yaml`` and ``schema2.yaml`` are resources under ``client_events.schemas`` and ``extra_schema.yaml`` under ``client_events.extras``. To make these schemas discoverable by ``jupyter_server``, create an entry point under the ``jupyter_telemetry`` group which resolves to a list containing their locations, in this case ``['client_events.schemas', 'client_events.extras']``:

In :file:`setup.cfg`

.. code-block:: ini

    [options.entry_points]
    jupyter_telemetry =
        my-event-entry-point = client_events:JUPYTER_TELEMETRY_SCHEMAS

In :file:`client_events/__init__.py`

.. code-block:: python

    JUPYTER_TELEMETRY_SCHEMAS = ['client_events.schemas', 'client_events.extras']
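
The discovery step described above could look roughly like the following sketch. It is not the code from ``jupyter_server``: the function names and the accepted file extensions are assumptions, and ``importlib.resources.files`` requires Python 3.9 or later.

```python
from importlib import resources
from importlib.metadata import entry_points

# Assumed set of schema file extensions.
SCHEMA_EXTENSIONS = (".yaml", ".yml", ".json")


def schema_packages(group="jupyter_telemetry"):
    """Return the package names advertised under the entry point group."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        selected = eps.select(group=group)
    else:  # Python 3.8 and 3.9 return a dict keyed by group
        selected = eps.get(group, [])
    packages = []
    for ep in selected:
        # Each entry point resolves to a list such as ['client_events.schemas'].
        packages.extend(ep.load())
    return packages


def schema_files(package):
    """List schema resource names found directly inside a package."""
    return sorted(
        item.name
        for item in resources.files(package).iterdir()
        if item.name.endswith(SCHEMA_EXTENSIONS)
    )


# A group nobody registers yields no packages (hypothetical group name).
no_packages = schema_packages("hypothetical_unregistered_group")
```

For the package layout shown earlier, ``schema_files('client_events.schemas')`` would be expected to list ``schema1.yaml`` and ``schema2.yaml`` once ``client_events`` is installed.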
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
__version__ = '0.1.0'

JUPYTER_TELEMETRY_SCHEMAS = ['client_eventlog_example.schemas']
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
$id: https://example.jupyter.org/client-event
version: 1
title: Client event
description: |
  An example client event
type: object
properties:
  thing:
    title: Thing
    categories:
      - category.jupyter.org/unrestricted
    description: A random thing
  user:
    title: User name
    type: string
    categories:
      - category.jupyter.org/user-identifier
    description: Name of user who initiated event
required:
- thing
- user
26 changes: 26 additions & 0 deletions examples/client_eventlog_example/setup.cfg
@@ -0,0 +1,26 @@
[metadata]
name = client_eventlog_example
version = attr: client_eventlog_example.__version__
description = a dummy module for testing client telemetry eventlog entrypoint
long_description = file: README.md
long_description_content_type = text/markdown
url = https://jupyter.org
author = Jupyter Development Team
author_email = jupyter@googlegroups.org
license = BSD
license_file = COPYING.md
classifiers =
    Intended Audience :: Developers
    Intended Audience :: System Administrators
    Intended Audience :: Science/Research
    License :: OSI Approved :: BSD License
    Programming Language :: Python

zip_safe = False
include_package_data = True
packages = find:
python_requires = >=3.6

[options.entry_points]
jupyter_telemetry =
    example-client-eventlog-entry-point = client_eventlog_example:JUPYTER_TELEMETRY_SCHEMAS
2 changes: 2 additions & 0 deletions examples/client_eventlog_example/setup.py
@@ -0,0 +1,2 @@
import setuptools
setuptools.setup()
3 changes: 3 additions & 0 deletions examples/client_eventlog_example/tests/conftest.py
@@ -0,0 +1,3 @@
pytest_plugins = [
'jupyter_server.pytest_plugin'
]
36 changes: 36 additions & 0 deletions examples/client_eventlog_example/tests/test_client_eventlog.py
@@ -0,0 +1,36 @@
import json


EVENT = {
'schema': 'https://example.jupyter.org/client-event',
'version': 1.0,
'event': {
'user': 'user',
'thing': 'thing'
}
}


async def test_client_eventlog(jp_eventlog_sink, jp_fetch):
    serverapp, sink = jp_eventlog_sink
    serverapp.eventlog.allowed_schemas = {
        EVENT['schema']: {
            'allowed_categories': [
                'category.jupyter.org/unrestricted',
                'category.jupyter.org/user-identifier'
            ]
        }
    }

    r = await jp_fetch(
        'api',
        'eventlog',
        method='POST',
        body=json.dumps(EVENT)
    )
    assert r.code == 204

    output = sink.getvalue()
    assert output
    data = json.loads(output)
    assert EVENT['event'].items() <= data.items()
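
The final assertion relies on dictionary item views behaving like sets, so ``a.items() <= b.items()`` is true when every key/value pair of ``a`` also appears in ``b``. A small standalone illustration follows; the extra ``__schema__`` key is only a hypothetical stand-in for whatever metadata the event log adds to a recorded event.

```python
event = {'user': 'user', 'thing': 'thing'}

# Hypothetical recorded output: the event fields plus extra metadata.
recorded = {
    'user': 'user',
    'thing': 'thing',
    '__schema__': 'https://example.jupyter.org/client-event',
}

# True when every (key, value) pair of `event` is present in `recorded`.
is_subset = event.items() <= recorded.items()

# Changing a value breaks the subset relation.
altered = dict(recorded, thing='other')
is_subset_after_change = event.items() <= altered.items()
```

This lets the test ignore any additional fields in the recorded output while still checking that the submitted event data survived unchanged.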
4 changes: 4 additions & 0 deletions jupyter_server/base/handlers.py
@@ -205,6 +205,10 @@ def jinja_template_vars(self):
        """User-supplied values to supply to jinja templates."""
        return self.settings.get('jinja_template_vars', {})

    @property
    def eventlog(self):
        return self.settings.get('eventlog')

    #---------------------------------------------------------------
    # URLs
    #---------------------------------------------------------------