Adds data-exclude support and localStorage.plausible_ignore support #489
Looks good overall @Vigasaurus
I haven't tested yet but my only worry is someone doing
I think it's great, much preferred to regular Regex syntax and I think using stars in URLs is very widely accepted/supported.
This is the raw size but we advertise the size with Brotli compression which is what you actually get over the wire. With compression it's probably not over 1KB yet. Since adding page exclusions requires one to re-deploy their script anyways, I think we should add it as an extra script option that doesn't get included in the standard script. See how I've done |
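To compare the raw size with the over-the-wire size mentioned here, something like the following Node.js sketch could be used. It relies on Node's built-in `zlib.brotliCompressSync`. The `sampleScript` string is a placeholder and not the actual tracker source, so the printed numbers say nothing about the real script size.

```javascript
const zlib = require('zlib');

// Placeholder content standing in for a minified tracker script (assumption:
// in practice you would read the real file with fs.readFileSync instead).
var sampleScript = 'window.plausible=function(e){/* ... */};'.repeat(30);

// Return raw and Brotli-compressed byte counts for a string of source code.
function scriptSizes(source) {
  var raw = Buffer.from(source, 'utf8');
  var compressed = zlib.brotliCompressSync(raw);
  return { rawBytes: raw.length, brotliBytes: compressed.length };
}

console.log(scriptSizes(sampleScript));
```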
Coolio, thanks for the detailed feedback on all the PRs :D I'll get going on all the comments some time today or tomorrow 😀 |
I'm unsure about the formatting decision in tracker.ex - lmk.
This will certainly have conflicts with my other PRs related to the tracking scripts right now, I'll make one extra PR after both are done to ensure they're consolidated into the compiled scripts.
@ukutaht This should be ready to go - I'm happy with most everything here, but I'm unsure about the formatting in I'll get started on the docs update too. |
Will review later today but wanted to jot down a thought so I'll remember to test later on. What's the behaviour of the pseudo-regex with trailing slashes and subdirectories? What I mean is:
The 'Excluded' column is what I intuitively expect to happen, question marks mean that I am not 100% certain this is the expected/correct behaviour. However it is implemented, it should be well documented and as easy to understand as possible. |
Yeah this is definitely a sticking point for me too, as far as documentation is concerned. I feel as though a technical explanation of what the regex becomes would be best for power-users, but would likely confuse others. I think I kept it pretty well explained in plausible/docs#45 but it could definitely use more detail if you think its unclear. (Maybe this table you've laid out can help actually) Honestly, I think the biggest confusion point will be trailing slashes - so maybe a note on there saying
Same table as yours, last column is whether or not the current implementation excludes it

So most of them work how you'd expect - the last one is the only one which doesn't and that's mostly just a decision to be made here. Do we allow zero-length

The way I tested these was by just extracting the regex constructor from the script and running it manually. Here's what it became if you want to try it at all.

`"actual_pathname_to_test".match(new RegExp('^' + "excluded-string".trim().replace(/\*/g, '.*') + '\/?$'))`

so I tested your examples by doing for example

`"/page/nested".match(new RegExp('^' + "/page/*".trim().replace(/\*/g, '.*') + '\/?$'))`
I'm actually going to make a change here (tomorrow, that's why I'm moving to draft) to make docs and explaining it simpler, in that the If this format all works well etc, I'll eventually add this same matching to page goals if wanted (for stuff like WooCommerce's |
This is now updated to not let the |
I added a note in the docs saying that trailing slashes are added automatically and are not needed in data-exclude.
I've changed it such that a
More research found that zero-length asterisks in URLs in this format are non-standard, so I've changed them to be 1-length or greater. Updated page path testing regex:
Overall, I added much more detail in plausible/docs#45, and an option for tables in the docs. Please do let me know your thoughts :D |
Again sorry about the super long wait on this :) Overall this is in a great place. Good job on the documentation side as well.
Sounds good. I think we should use bash glob as the main inspiration for the syntax here. It's well known. In glob, a regular asterisk does not match a forward slash as you've done here. But I think we should add a way to exclude a whole subdirectory. In glob, this can be done with Also in glob both single and double asterisks do match a zero-length character. So in glob
Love it. The same syntax could also be used to roll up pageviews in regular reporting (i.e. in the Top Pages report). I know many people want to group pageviews with similar URLs. |
Love the ideas. I think the double glob mechanic would definitely be useful too (and should be easy to implement, since it's just the old implementation tacked on). I have no particular issues with either implementation as far as zero-length goes - I think it just comes down to consistency and documentation. I'll make the changes 👍
Looks good, requested a few changes. Once they're done I will merge this
Got a question on Twitter: they're using the same script on both the main domain and subdomain but want to exclude the |
I'm a tiny bit confused by the use case/end goal, but can confirm that "/**" would exclude any path If you have more details about what they're trying to exclude/the subdomain structure, I'm sure this could support it to an extent |
thanks @Vigasaurus! i've asked them to clarify. btw if you exclude a specific page does that mean that any custom goals on that specific page won't be counting? |
Nope, the exclusion only happens for triggering pageviews - outbound links and other custom events still fire. |
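The distinction described here (pageviews are skipped on excluded paths, custom events are not) can be sketched roughly as follows. `makeTrigger`, `isExcluded`, and `send` are hypothetical names for illustration and are not taken from the tracker source.

```javascript
// Build a trigger function. All three parameters are injected so the sketch
// runs outside a browser (hypothetical shape, not the real tracker API).
function makeTrigger(pathname, isExcluded, send) {
  return function trigger(eventName) {
    // Only pageviews are subject to the path exclusion.
    if (eventName === 'pageview' && isExcluded(pathname)) {
      return false;
    }
    send({ name: eventName, path: pathname });
    return true;
  };
}

var sent = [];
var excludeAll = function () { return true; }; // behaves like data-exclude="/**"
var trigger = makeTrigger('/signup', excludeAll, function (e) { sent.push(e); });

trigger('pageview'); // skipped because the path is excluded
trigger('Signup');   // custom event, still sent
console.log(sent.map(function (e) { return e.name; })); // [ 'Signup' ]
```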
Also reading more of the Twitter thread, the use case makes sense to me. I think excluding "/**" is definitely the way to do it for them, if they want all the marketing pages on the www or none subdomain to track normally, and then in app subdomain they run the full exclusion and fire custom events as needed. |
thanks! so by excluding "/**" no visitors / pageviews will be counted both from the main domain and from the subdomain? |
Well, I assume they have separate html outputs for each subdomain, so they'd likely only put the exclude on one of the subs' html - the main domain's html (the marketing site) should just not have any exclusion |
ahh yes! i see. makes sense to me now. thanks! |
Sweet! Feel free to ping me here or on the Twitter thread if they have other concerns/questions :D |
they responded with more details so i think it's more clear now: https://twitter.com/tim_nolet/status/1354048423742341120 last question: if they do as described will the referral source still show for custom event conversions? |
Mm, I don't think so, because from my understanding the referral source for custom events comes by linking the entry page+referrer for the entire session to the event session, and if there are no pageviews there are no entry pages or referrers. But to be honest I haven't looked a whole lot at how the referrer and sessions stuff works, so Uku would know better/more concretely. |
ok cool, thanks! |
I think the referrer stuff and everything should work normally. In their setup the normal funnel seems to:
As long as they're sending the stats to the same dashboard, the session should stay intact and the referrer source for the initial pageview on www.domain.com will be visible in the dashboard. |
perfect! thanks for confirming! |
Oh yeah that makes sense! I guess the generated user hash would line up the same too since the data-domain would be the same. I guess the break down would happen if someone was referred directly to app.domain.com/signup, and that wouldn't get any referrer on load, but that's probably a bit unlikely/could be planned around from a subdomain design standpoint |
…lausible#489)
* Adds data-exclude support and plausible_ignore support
* Splits exclusion into separate script option
* localStorage parsing upgrades
* Additional script type additions
  I'm unsure about the formatting decision in tracker.ex - lmk.
* Adds new compiled files
  This will certainly have conflicts with my other PRs related to the tracking scripts right now, I'll make one extra PR after both are done to ensure they're consolidated into the compiled scripts.
* Moves localStorage blocker out of special script
* Changelog
* Second thoughts on localStorage exclusion
* Updates `*` to not match `/` or whitespace
* Fix formatting
* Removes zero-length asterisks
* Adds support for double glob, zero-length replacements
* Update to reduce size+allow localStorage exclude

Co-authored-by: Uku Taht <Uku.taht@gmail.com>
A (potential) bug raised by one of our clients: Expected to match but this is the result:

`"/l/testing".match(new RegExp("/l/*".trim().replace(/\*\*/g, '.*').replace(/[^\.]\*/g, '[^\\s\/]*') + '\/?$'))`

`<- null`

@Vigasaurus any idea what's going on here?
@ukutaht Ah yeah that's actually a pretty big bug, that I caught when I was doing pageview goal globs, but completely forgot to update in the tracker script. Notice here the regex for the second replace was appropriately changed to a negative lookbehind, instead of the negative replacement, but in the tracker I didn't make this change. The appropriate new version in the tracker would be:

`"/l/testing".match(new RegExp("/l/*".trim().replace(/\*\*/g, '.*').replace(/(?<!\.)\*/g, '[^\\s\/]*') + '\/?$'))`

`<- Array [ "/l/testing" ]`

which is the correct result, notice the swap from
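To reproduce the difference between the two replacements outside the tracker, the sketch below wraps each version in a function. `buggyGlobRegex` and `fixedGlobRegex` are names made up for this example. The regex bodies are copied from the two snippets in this thread. Lookbehind assertions (`(?<!...)`) are an ES2018 feature, so this assumes an engine that supports them.

```javascript
// Original tracker version: /[^\.]\*/ also consumes the character in front of
// the asterisk, so "/l/*" loses its second slash and becomes "/l[^\s/]*".
function buggyGlobRegex(glob) {
  return new RegExp(glob.trim()
    .replace(/\*\*/g, '.*')
    .replace(/[^\.]\*/g, '[^\\s\/]*') + '\/?$');
}

// Proposed fix: the negative lookbehind only checks that the asterisk is not
// preceded by a dot, without consuming anything, so the slash is kept.
function fixedGlobRegex(glob) {
  return new RegExp(glob.trim()
    .replace(/\*\*/g, '.*')
    .replace(/(?<!\.)\*/g, '[^\\s\/]*') + '\/?$');
}

console.log('/l/testing'.match(buggyGlobRegex('/l/*')));      // null
console.log(fixedGlobRegex('/l/*').test('/l/testing'));       // true
console.log(fixedGlobRegex('/l/*').test('/l/testing/deep'));  // false: single * stops at "/"
console.log(fixedGlobRegex('/l/**').test('/l/testing/deep')); // true: ** crosses "/"
```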
Changes
Adds support in the plausible.js script for excluding specific routes from counting towards events, as well as preventing all plausible events if a localStorage flag `plausible_ignore` is present.

@ukutaht This isn't technically done yet, I wanted your feedback on methodology and usability before finalizing it. I'm also mostly unsure what the `p.js` script is for or when it's used (I found no meaningful reference to it), so I'm unsure whether my changes need to be added to it.

Once we can agree on the format/the method - I can write up the docs/changelog as needed.
Basic premise is that we have two new exclusions on the client-side, similar to localhost or phantom.

First, `localStorage.plausible_ignore` - if this exists at all (i.e. is truthy in its string representation), all plausible events are prevented.

Second, if a string is passed in to `data-exclude` with the format `data-exclude="/tou, /rule/*, /how-to-play, /*/admin, */secret, */priv/*"`, pages that match those routes will be excluded. In this example, we read those routes as the following table. I decided to use a format like this in pseudo-regex to make it a bit more usable/easy to understand for an end user - lmk if you'd prefer just a straight regex or something.

| Route | Regex |
| --- | --- |
| `/tou` | `/^\/tou\/?$/` |
| `/rule/*` | `/^\/rule\/.*\/?$/` |
| `/how-to-play` | `/^\/how-to-play\/?$/` |
| `/*/admin` | `/^\/.*\/admin\/?$/` |
| `*/secret` | `/^.*\/secret\/?$/` |
| `*/priv/*` | `/^.*\/priv\/.*\/?$/` |
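A rough sketch of how the two checks described above might fit together. This is an illustration under assumptions, not the code in this PR: `parseExcludes`, `shouldIgnore`, and the injected `storage` object are hypothetical, and the pattern conversion is the simple `*` to `.*` mapping from the table (other regex metacharacters in a route are not escaped).

```javascript
// Split a data-exclude attribute value into anchored regular expressions,
// using the table's mapping: "*" becomes ".*", with an optional trailing slash.
function parseExcludes(attr) {
  return (attr || '').split(',')
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s.length > 0; })
    .map(function (s) {
      return new RegExp('^' + s.replace(/\*/g, '.*') + '\\/?$');
    });
}

// `storage` stands in for window.localStorage so the sketch runs anywhere.
// localStorage values are strings, so any non-empty value (even "false")
// is truthy and switches tracking off.
function shouldIgnore(storage, excludeAttr, pathname) {
  if (storage.plausible_ignore) return true;
  return parseExcludes(excludeAttr).some(function (re) {
    return re.test(pathname);
  });
}

var exampleAttr = '/tou, /rule/*, /how-to-play, /*/admin, */secret, */priv/*';
console.log(shouldIgnore({}, exampleAttr, '/tou/'));    // true
console.log(shouldIgnore({}, exampleAttr, '/rule/5'));  // true
console.log(shouldIgnore({}, exampleAttr, '/rules'));   // false
console.log(shouldIgnore({ plausible_ignore: 'true' }, exampleAttr, '/rules')); // true
```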
Also, a sidenote - it seems like the plausible script has crept above 1KB - sitting at 1268B (1.24 KiB = 1.27 KB) on master - and with this update it creeps up to 1591B (1.55 KiB = 1.59 KB). I don't think it matters much, since we are adding features to get to this size - but I figure some branding/copy on the landing page might need updating to match it.
Tests
Changelog
Documentation
Adds docs for client-side exclusions docs#45