-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add base mode and footprint to the labeled trips dataframe #152
Conversation
I can't see anything obviously wrong. Let me catch up on a few emails and then try this as a separate snippet... |
I think the issue is the way I iterate over them, not the way that I insert information. I tried iterating over them and not modifying them, and had the same issue |
this works for me
I wonder if the issue is that you are trying to iterate over the iterator twice. Note that |
This was exactly the what I was missing. I converted the iterator to a list, like in your snippet, and things work now. I have successfully added |
the iterator is consumed on first iteration, must use a list to update items instead
The data is coming together! There is definitely room for polishing especially in regard to what further unification could take place with |
having separate columns makes accessing the data easier, no longer need to break a tuple
remove most of the old footprint calculations unpack the footprints into energy and emissions calculate difference between mode and replaced mode
data function is now async because of waiting on returns from e-mission-common
using older names allows for old code to run
only calculating the footprint when needed in a given notebook saves time
updated the version in the build script and updated the calls in scaffolding.py to use the new call pattern some slight data formatting required
I still need to make this compatible with dynamic labels! I might need help testing that piece unless I can get my database to work better than it currently is... |
@Abby-Wheelis I think I have merged all the dependencies (base mode changes and inferred labels). You should be unblocked to finish up this change. |
Not broken after merge! On to trying to rip out the dictionaries so we can pass the labels the same way for all deployments - default or custom per-deployment |
Now relying on the default labels from emcommon
Ok, I've removed the csvs now, I think we could further reduce duplication by moving the choice of labels into |
… for program, not study.
Testing Stacked Bar Charts for Cortez ebikes with the merged and current changes after ea3cd89: Comparison between Cortez ebikes current and merged
Looks identical. Note: The color variants from base mode appears to be picked up differently each time the charts are being generated. Might be something about dedupe_colors() which needs to be explored. |
Test Scenario: Dataset used: Executed the following notebooks:
Changes in the config file:
In
Details of execution of the above notebooks:
Could not run the below notebooks:
because of the expected error:
Results:
All charts generated properly. Note: There is issue with running generic_metrics notebook with dfc-fermata data, it looks up for |
…ated data for generic_metrics noteboook.
Incorporated - Do not execute generic_metrics notebook for survey info. Details
|
I have tested with all three configurations (1), (2), & (3). I have not tallied the data representation in the charts against older version. |
viz_scripts/generic_metrics.ipynb
Outdated
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Do not run this notebook if it has survey_info; nbclient will run up through this cell\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't filter on if survey info exists -
Some deployments have trip-level surveys: dfc-fermata
And others have records of the onboarding survey: cortezebikes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 468d40e by filtering as
if not survey_info.get('trip-labels', None) == 'MULTILABEL':
\\ Do not execute the generic_metrics notebook.
…TILABEL for generic_metrics notebook
Testing Scenario: Dataset: Execution of generate_plots.py for generic_metrics notebook Execution details alongside the git diff for changing `STUDY_CONFIG` and `DB_HOST` in docker-compose.yml :
When running the generate_plots.py for generic_metrics for
This addresses this comment #148 (comment) too |
Testing Scenario: Dataset: Execution of generate_plots for all the notebooks:
Results:
Stacked Bar Charts issues is being resolved here: #155 Apart from the generation of Stacked Bar Charts issues, everything looks good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few notes - the removal of the server import and the change from "not mulitlabel" to "is enketo" are the only two blocking changes in my opinion
disp.display(expanded_labeled_inferred_ct.head()) | ||
return expanded_labeled_inferred_ct | ||
|
||
# CASE 2 of https://github.com/e-mission/em-public-dashboard/issues/69#issuecomment-1256835867 | ||
unique_users = lambda df: len(df.user_id.unique()) if "user_id" in df.columns else 0 | ||
trip_label_count = lambda s, df: len(df[s].dropna()) if s in df.columns else 0 | ||
|
||
async def load_viz_notebook_data(year, month, program, study_type, dynamic_labels, dic_re, dic_pur=None, include_test_users=False): | ||
async def load_viz_notebook_data(year, month, program, study_type, dynamic_labels, include_test_users=False, add_footprint=False): | ||
#TODO - see how slow the loading the footprint is compared to just the baseMode, and evaluate if passing param around is needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO - separate adding base mode and adding footprint?
rather than halt on survey info that does not have multilabel halt on survey info that does have enketo blank survey info means default, which means multilabel, and we should still proceed
Looks good! |
I have added a few changes -
Now, we can display the CO2 footprint and energy emission even for the ones which uses custom labels. |
Test Scenario: Dataset used: Execution of notebooks: Details of execution:
Results:
|
Test Scenario: Dataset used: Executed generic_timeseries notebook: Details
|
Good idea! Looks good to me |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually a really clean PR and I'm happy to merge it! I think we did a lot of the heavy lifting early with the basemode changes, and so this is so much nicer 😄
I do have a few cleanups that we can roll into a cleanup change later.
At a high level, I think it would be helpful, after we have had a chance to look through what the inferred labels look like, to see if we can have labeled
or labeled_versus_inferred
energy and emission calculations. Aka actually implement "count every trip". But we should do so on both the dashboard and the phone, so this is a project for CHEER 2025 or similar.
// Note: We're disabling energy metrics on public dashboard when dynamic labels are available. | ||
// TODO: Remove the if (data.label_options) in future when energy computation is handled properly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! So happy to handle the TODO!
@@ -1,4 +1,3 @@ | |||
0 7 * * * python bin/update_mappings.py mapping_dictionaries.ipynb >> /var/log/intake.stdinout 2>&1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am so glad that we were able to delete this as well. Fun fact: when people start working with the public dashboard, they would typically not read the instructions to run the mapping_dictionaries first and would report that the run failed.
Glad to see that we won't have that issue any more
"if survey_info.get('trip-labels', None) == 'ENKETO':\n", | ||
" ipython = get_ipython()\n", | ||
" ipython._showtraceback = scaffolding.no_traceback_handler\n", | ||
" raise Exception(\"The plots in this notebook are not relecant for ENKETO trip-labels\")" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: spelling
except: | ||
print("hit exception") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future fix: I would suggest logging what the exception was (ideally using logging.exception
) so that we can check and fix as needed. We may also want to add an entry into the client stats of how many trips were affected. This will allow us to check on why the builds are failing
def extract_kwh(footprint_dict): | ||
if 'kwh' in footprint_dict.keys(): | ||
return footprint_dict['kwh'] | ||
else: | ||
print("missing kwh", footprint_dict) | ||
return np.nan | ||
|
||
def compute_CO2_footprint_dynamic(expanded_ct, dynamic_labels, label_type): | ||
conversion_meter_to_kilometer = 0.001 | ||
conversion_kilogram_to_lbs = 2.20462 | ||
def extract_co2(footprint_dict): | ||
if 'kg_co2' in footprint_dict.keys(): | ||
return footprint_dict['kg_co2'] | ||
else: | ||
print("missing co2", footprint_dict) | ||
return np.nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future fix: it seems like this should be combinable into a single parameterized function
" \"Labeled by user\\n (Confirmed trips)\", ax[1], text_results[1], colors_mode, debug_df_inferred, values_to_translations)\n", | ||
" \"Labeled and Inferred by OpenPATH\\n\"+stacked_bar_quality_text_commute_inferred, ax[1], text_results[1], colors_mode, debug_df_inferred, values_to_translations)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future fix: I think I flagged this somewhere else, but it would be good to change this to "inferred from prior labels" here because there are multiple inference algorithms in OpenPATH
See initial discussion in #146