Commit: Tested and ready for deployment, I think.

Showing 17 changed files with 1,856 additions and 317 deletions.
@@ -0,0 +1,99 @@
File Format used for Telemetry Dashboard
========================================

_All formats described here are internal, not for external consumption._

External users should include `telemetry.js` and consume data through this
interface. Reading the raw data is hard and these data formats may change, but
the JavaScript interface is designed to be reasonably stable.

Processor Output Format
-----------------------

    /my/dim/../ JSON

JSON:

    revision:
    buildid:
    histogram:
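
A minimal parsing sketch, assuming each processor output line is the dimension
path and the JSON payload separated by a single space (the delimiter is an
assumption; the field names come from above):

```python
import json

def parse_processor_line(line):
    """Split '/my/dim/../ JSON' into dimensions and the JSON payload."""
    path, _, payload = line.partition(" ")
    record = json.loads(payload)  # carries revision, buildid, histogram
    dimensions = [d for d in path.split("/") if d]
    return dimensions, record
```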

Web Facing Bucket Layout
------------------------

    LATEST_VERSION = v2
    v1/
        data
    v2/
        check-points.json
        check-points/                  (one for every week)
            YYYYMMDDhhmmss/
                versions.json
                FILES_PROCESSED
                FILES_MISSING
                <channel>/<version>/
                    MEASURE-by-build-date.json
                    MEASURE-by-submission-date.json
                    filter-tree.json
                    histograms.json
                    revisions.json
        latest-current.json            = most recent current, contents of versions.json
        current/
            YYYYMMDDhhmmss/
                versions.json
                FILES_PROCESSED
                FILES_MISSING
                <channel>/<version>/
                    MEASURE-by-build-date.json
                    MEASURE-by-submission-date.json
                    filter-tree.json
                    histograms.json
                    revisions.json
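
A sketch of building a web-facing path under the v2 layout, assuming the
snapshot folder name (YYYYMMDDhhmmss) was already read via
`latest-current.json`; the concrete values in the comment are hypothetical:

```python
def measure_path(snapshot, channel, version, measure, by="build-date"):
    """Path to one measure file inside a current/ snapshot."""
    return "v2/current/%s/%s/%s/%s-by-%s.json" % (
        snapshot, channel, version, measure, by)

# measure_path("20131104123000", "nightly", "27", "GC_MS")
# -> "v2/current/20131104123000/nightly/27/GC_MS-by-build-date.json"
```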

Web Facing Format
-----------------

    /<channel>/<version>

MEASURE.json

    {
      <filter_id>: [
        bucket0,
        bucket1,
        ...,
        bucketN,
        sum,               # -1, if missing
        log_sum,           # -1, if missing
        log_sum_squares,   # -1, if missing
        sum_squares_lo,    # -1, if missing
        sum_squares_hi,    # -1, if missing
        count
      ],
      <filter_id>...
    }
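
A sketch of unpacking one such entry, relying only on the layout above
(buckets followed by five statistics and a count, with -1 marking a missing
statistic):

```python
def unpack_entry(entry):
    """Split a per-filter array into buckets and summary statistics."""
    buckets = entry[:-6]
    total, log_sum, log_sum_squares, sq_lo, sq_hi, count = entry[-6:]
    mean = float(total) / count if count > 0 and total != -1 else None
    return {"buckets": buckets, "count": count, "mean": mean}
```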

filters.json

    {
      _id: filter_id,
      name: "filter-name",
      <option>: {
        <subtree>
      }
    }
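
A sketch of resolving a chain of filter options to a filter id by walking the
nested `<option>` subtrees; the example option names in the comment are
hypothetical:

```python
def resolve_filter(node, options):
    """Descend one subtree per option and return the _id reached."""
    for option in options:
        node = node[option]  # raises KeyError for an unknown option
    return node["_id"]

# resolve_filter(filter_tree, ["saved_session", "Firefox", "WINNT"])
```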

histograms.json

    {
      MEASURE: {
        description: ...
        ...
      }
    }
This file was deleted.
@@ -1,17 +1,5 @@
FILES = histogram_tools.py Histograms.json specs.py dashboard.zip

all: $(FILES)

Histograms.json:
	wget -c http://hg.mozilla.org/mozilla-central/raw-file/tip/toolkit/components/telemetry/Histograms.json -O $@

histogram_tools.py:
	wget -c http://hg.mozilla.org/mozilla-central/raw-file/tip/toolkit/components/telemetry/histogram_tools.py -O $@

specs.py: Histograms.json
	python specgen.py $< > $@

dashboard.zip: specs.py processor.py auxiliary.py
	zip $@ $?

egg:
	python setup.py bdist_egg

clean:
	rm -f $(FILES) *.pyc
	rm -rf dist build telemetry_dashboard.egg-info
@@ -1,57 +1,42 @@
# Telemetry Dashboard

Generate static files for a telemetry dashboard.

# How to Run

You'll need `mango` set up in your `.ssh_config` to connect to the Hadoop node
where you'll run jydoop.

```
Run `script/bootstrap`
Serve the `html/` dir
```

## Histogram View

There are a number of fields to narrow a query by.

Have a category table that stores the category tree; each node has a unique
id:

* Level 1, Product: Firefox | Fennec | Thunderbird
* Level 2, Platform: Windows | Linux
* Level 3, etc.

The size of this table can be kept in check by reducing common video cards to
a family name, etc. We can also customize what shows up under different
levels; for example, we could restrict Thunderbird to have fewer child nodes.

Store the tree in a table, but keep it read into memory for queries and for
inserting new records.

Then have a histogram table with columns:

    histogram_id | category_id | value

where histogram_id is an id like SHUTDOWN_OK, category_id is a key from the
category table, and value is the sum of histograms in that category (it can
be represented with some binary value).
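
As a minimal sketch of this schema (column names follow the notes; the types
and the parent_id column are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,  -- unique node id
    parent_id INTEGER,              -- NULL at the root of the tree
    level     INTEGER,              -- 1 = product, 2 = platform, ...
    name      TEXT                  -- e.g. 'Firefox' or 'Windows'
);
CREATE TABLE histogram (
    histogram_id TEXT,              -- e.g. 'SHUTDOWN_OK'
    category_id  INTEGER REFERENCES category(id),
    value        BLOB               -- summed histogram, binary-packed
);
""")
```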

## Misc

Evolution can be implemented by adding a build_date field to the histogram
table.

TODO: how big would the category tree table be? Surely there is a finite size
for that.

The histogram table would be |category_table| * |number of histograms| rows,
which is pretty compact.

### Map + Reduce

The mapper should turn each submission into a <key> <data> pair that looks
like:

    buildid/channel/reason/appName/appVersion/OS/osVersion/arch {histograms: {A11Y_CONSUMERS: {histogram_data}, ...}, simpleMeasures: {firstPaint: [100, 101, 1000, ...]}}

where the key identifies where in the filter tree the data should live. Note
that a single packet could produce more than one such entry if we want to get
into detailed breakdowns of gfxCard vs. some Firefox UI animation histogram.

The reducer would then take the above data, sum up the histograms, and append
to the simple-measure lists based on the key; a sketch of this pair follows
below.

This should produce a fairly small file per day per channel (~200 records),
which will then be quick to pull out and merge into the per-build,
per-histogram JSON that can be rsynced to some webserver. This is basically a
final iterative reduce on top of map-reduce for new data. Hadoop does not
feel like the right option for that, but I could be wrong.
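
A sketch of that mapper/reducer pair, treating each histogram as a flat list
of bucket counts; the submission field names are assumptions for
illustration, and only the key layout comes from the notes above:

```python
def map_submission(submission):
    """Emit one (filter-key, payload) pair per telemetry submission."""
    info = submission["info"]
    key = "/".join(str(info[field]) for field in (
        "appBuildID", "appUpdateChannel", "reason", "appName",
        "appVersion", "OS", "version", "arch"))
    yield key, {"histograms": submission.get("histograms", {}),
                "simpleMeasures": submission.get("simpleMeasurements", {})}

def reduce_payloads(key, payloads):
    """Sum histogram buckets and extend simple-measure lists per key."""
    histograms, simple = {}, {}
    for payload in payloads:
        for name, buckets in payload["histograms"].items():
            acc = histograms.setdefault(name, [0] * len(buckets))
            for i, v in enumerate(buckets):
                acc[i] += v
        for name, values in payload["simpleMeasures"].items():
            simple.setdefault(name, []).extend(values)
    return key, {"histograms": histograms, "simpleMeasures": simple}
```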

### todo:

* oneline local testing using Jython's FileDriver.py
Telemetry Dashboard
===================

Telemetry dashboard is an analysis job that aggregates telemetry histograms
and simple measures, and offers a decent presentation. The default dashboard
developed in this repository is hosted at
[telemetry.mozilla.com](http://telemetry.mozilla.com), but the aggregated
data is also available for consumption by third-party applications, so you
don't need to do the aggregation on your own.

Consuming Telemetry Aggregations
--------------------------------

Include `http://telemetry.mozilla.com/js/telemetry.js` in your code, and feel
free to use the other modules too. Don't go reading the raw JSON files; they
are not designed for human consumption!

Hacking Telemetry Dashboard
---------------------------

If you want to improve the user interface for the telemetry dashboard, clone
this repository, set up a static server that hosts the `html/` folder on your
localhost, and start hacking. This is easy!
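
Any static file server will do; one minimal option, assuming Python 3.7+ is
installed, is the standard-library server:

```
python -m http.server 8000 --directory html
```

Then browse to `http://localhost:8000/`.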

If you want to add new aggregations, improve upon existing aggregations, or
change the storage format, take a look at `Formats.mkd` and talk to whoever
maintains the telemetry dashboard.

The basic flow is as follows:

1. An `.egg` file is generated with `make egg`.
2. Analysis tasks are created with telemetry-server.
3. `DashboardProcessor` from `analysis.py` aggregates telemetry submissions;
   this process is driven by telemetry-server.
4. `Aggregator` from `aggregator.py` collects results from analysis tasks by:
    1. downloading existing data from S3,
    2. fetching task-finished messages from SQS,
    3. downloading `result.txt` files in parallel,
    4. updating results on disk,
    5. publishing updated results in a new subfolder of `current/` on S3
       every once in a while,
    6. check-pointing all aggregated data to a subfolder of `check-points/`
       on S3 every once in a while, and
    7. repeating. (A rough sketch of the S3/SQS half of this loop is shown
       below.)
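
As a rough sketch of steps 4.1-4.3, assuming boto3 and hypothetical queue and
bucket names (the real logic lives in `aggregator.py`):

```python
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

# Hypothetical names, for illustration only.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/000000000000/dashboard-tasks"
BUCKET = "telemetry-dashboard-example"

def poll_finished_tasks():
    """Fetch task-finished messages from SQS (step 4.2)."""
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    return resp.get("Messages", [])

def download_result(key, dest):
    """Download one result.txt from S3 (step 4.3)."""
    s3.download_file(BUCKET, key, dest)
```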