feat(explore): dataset macro: dttm filter context #25950
Conversation
Note that I tried to capture this scenario in a test case, but since it's quite complex, with one dataset referring to another and needing to render jinja in both, I couldn't see a sensible way of doing it in a unit test -- other unit tests for the dataset macro mock the method which would be rendering the jinja in the underlying dataset to begin with. I also found that the integration tests for jinja were much simpler and didn't cover this scenario.

Edit: I also initially broke a whole slew of integration tests by getting the time format wrong, so general coverage seems quite good here.
force-pushed from 1e5f843 to f0b719b
There's a slight wrinkle here in that the datetime format actually varies depending on the engine, which the integration tests have highlighted. Unfortunately, to preserve the information to the underlying dataset we need to parse it back into a datetime. That means I've had to be a little bit flexible in how the datetime is parsed, so I've handled the most common formats and done it in a safe way, falling back to omitting it if not present. Not sure if we can somehow preserve this information better; I think this is the best we can do without considering some major changes to the way this is working.
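The flexible parsing described above can be sketched roughly like this (the function name and the specific candidate formats are illustrative, not the actual Superset implementation):

```python
from datetime import datetime
from typing import Optional

# Candidate formats, since the rendered datetime format varies by engine.
# These particular formats are an assumption for illustration.
_DTTM_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_dttm(value: Optional[str]) -> Optional[datetime]:
    """Try each known format; fall back to None (omit) rather than fail."""
    if not value:
        return None
    for fmt in _DTTM_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None
```

The key property is the safe fallback: an unrecognised format results in the value being omitted rather than a hard error in chart rendering.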
force-pushed from f0b719b to b3ce064
Thanks @giftig for the PR. One question I had is why restrict this to only `from_dttm` and `to_dttm`, i.e., should the entire context be exposed?
Additionally, would you mind sharing in your PR description a screenshot of how your dataset is configured with Jinja2 templating? I think this would give reviewers more context.
Thanks for taking a look at this so quickly.

re exposing the entire context: I'll take a look at what else is available which would make sense to pass in. I'm not sure how easy it'd be to expose and test the whole context, since with these parameters I had to do the date parsing to get them back to their original form so they worked seamlessly, but if there are a couple of extra obvious parameters to include I'll add those too. If I add the whole context without being aware of what's in those fields, we might see the wrong types for other fields too, and that may be more confusing than just not having the context.

re dataset configuration: the testing instructions include a (fairly contrived) example dataset which shows when and why this causes issues; hopefully that provides enough context. Our actual use case is a custom macro for querying a table which applies a sensible default filter to prevent accidentally querying too much data, using the dashboard filter instead of a short default when available, but I think the simpler example illustrates what's currently missing more clearly and also makes it easy to reproduce / test the change.
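As a rough sketch of that use case (the function name and the default window are hypothetical, not the actual macro):

```python
from datetime import datetime, timedelta


def default_time_bounds(from_dttm=None, to_dttm=None, default_window=timedelta(days=7)):
    """Prefer the dashboard/chart filter bounds when present; otherwise fall
    back to a short default window so queries don't scan too much data."""
    end = to_dttm or datetime.now()
    start = from_dttm or end - default_window
    return start, end
```

Before this PR, a macro like this would always see `from_dttm`/`to_dttm` as `None` when invoked through the `dataset` macro, so the default window was applied even when the chart had a time filter.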
Pass from_dttm and to_dttm as additional context to the dataset macro, allowing any jinja templating in the underlying dataset query text to refer to this context. This is useful in situations where virtual dataset A incorporates from_dttm and/or to_dttm into the query, in order to make the underlying query aware of chart filter contexts, and then virtual dataset B is built on top of dataset A and used in charts.
force-pushed from b3ce064 to 8adc63d
@john-bodley I tried passing the rest of the context alongside these parameters, but we're actually using this data to add to a call to `get_sqla_query`:

```python
def get_sqla_query(  # pylint: disable=too-many-arguments,too-many-locals,too-many-branches,too-many-statements
    self,
    apply_fetch_values_predicate: bool = False,
    columns: Optional[list[Column]] = None,
    extras: Optional[dict[str, Any]] = None,
    filter: Optional[  # pylint: disable=redefined-builtin
        list[utils.QueryObjectFilterClause]
    ] = None,
    from_dttm: Optional[datetime] = None,
    granularity: Optional[str] = None,
    groupby: Optional[list[Column]] = None,
    inner_from_dttm: Optional[datetime] = None,
    inner_to_dttm: Optional[datetime] = None,
    is_rowcount: bool = False,
    is_timeseries: bool = True,
    metrics: Optional[list[Metric]] = None,
    orderby: Optional[list[OrderBy]] = None,
    order_desc: bool = True,
    to_dttm: Optional[datetime] = None,
    series_columns: Optional[list[Column]] = None,
    series_limit: Optional[int] = None,
    series_limit_metric: Optional[Metric] = None,
    row_limit: Optional[int] = None,
    row_offset: Optional[int] = None,
    timeseries_limit: Optional[int] = None,
    timeseries_limit_metric: Optional[Metric] = None,
    time_shift: Optional[str] = None,
```

so we can't pass everything; in my case the context contained:

```python
{'columns': ['ds', 'gender', 'name', 'num_boys', 'num_girls', 'from_dttm'],
 'filter': [{'col': 'ds',
             'op': 'TEMPORAL_RANGE',
             'val': 'DATEADD(DATETIME("now"), -30, year) : now'}],
 'from_dttm': '1993-11-14T09:16:20',
 'groupby': None,
 'metrics': None,
 'row_limit': 1000,
 'row_offset': 0,
 'table_columns': ['ds',
                   'gender',
                   'name',
                   'num_boys',
                   'num_girls',
                   'from_dttm'],
 'time_column': None,
 'time_grain': 'P1D',
 'to_dttm': '2023-11-14T09:16:20'}
```

So at a glance I see basic properties of the dataset are present, e.g. `columns` and `table_columns`.
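To illustrate the type mismatch: the context above carries `from_dttm`/`to_dttm` as ISO strings, while `get_sqla_query` expects `datetime` objects, so only the fields we can safely convert are passed through. A minimal sketch of that filtering step (the helper name is made up for illustration):

```python
from datetime import datetime
from typing import Any


def extract_dttm_context(context: dict[str, Any]) -> dict[str, datetime]:
    """Pull out only from_dttm/to_dttm, parsed back into datetimes;
    silently omit anything missing or unparseable."""
    out: dict[str, datetime] = {}
    for key in ("from_dttm", "to_dttm"):
        val = context.get(key)
        if isinstance(val, str):
            try:
                out[key] = datetime.fromisoformat(val)
            except ValueError:
                pass
    return out
```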
SUMMARY
Pass from_dttm and to_dttm as additional context to the dataset macro, allowing any jinja templating in the underlying dataset query text to refer to this context.
This is useful in situations where virtual dataset A incorporates from_dttm and/or to_dttm into the query, in order to make the underlying query aware of chart filter contexts, and then virtual dataset B is built on top of dataset A and used in charts.
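A minimal illustration of what this enables (the table name and SQL are invented; the point is that Jinja in the underlying dataset's query text can now see the chart's time-filter bounds):

```python
from datetime import datetime
from jinja2 import Environment

# SQL text of a hypothetical virtual dataset that references the
# from_dttm/to_dttm context this PR passes through the dataset macro.
dataset_sql = (
    "SELECT * FROM events "
    "WHERE ts >= '{{ from_dttm }}' AND ts < '{{ to_dttm }}'"
)

rendered = Environment().from_string(dataset_sql).render(
    from_dttm=datetime(1993, 11, 14, 9, 16, 20),
    to_dttm=datetime(2023, 11, 14, 9, 16, 20),
)
```

Without the extra context, `from_dttm` and `to_dttm` render as empty when the dataset is queried via the `dataset` macro, even if the chart has a time filter.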
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before
After
TESTING INSTRUCTIONS
Make sure `ENABLE_TEMPLATE_PROCESSING` is turned on, and use a standard dev superset environment with the samples database available. You can also tweak these steps to use other data or backends as long as you apply a time filter in your dashboard.

1. In SQL Lab, run the following query and save it as a virtual dataset called `from_dttm`. Currently the `from_dttm` field will show `empty` because it won't be populated until a filter is applied on a chart in a later step.
2. Create a table chart from this dataset, select all fields, and add a time filter on `ds`. Add this chart to a new dashboard, and configure a time filter for `ds` on that dashboard. Set that filter to the past 30 years so we can see some data. Notice that now `from_dttm` displays a datetime value.
3. In SQL Lab again, run the following query (you'll need to make sure `25` here is the ID of your virtual dataset `from_dttm` from step 1). Create a chart exactly like the first one, and add it to the same dashboard so you can see the two charts side by side.
4. `from_dttm` is correctly populated in both charts; prior to this change the second one, using the `dataset` macro, will show `empty` instead, as it does not propagate sufficient context from the chart through the macro.

ADDITIONAL INFORMATION
Requires `ENABLE_TEMPLATE_PROCESSING` to be enabled.