-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GID missing in L3 #179
GID missing in L3 #179
Conversation
@i-be-snek , I think it's good to add a fix for American states, but we can handle it in V2 if the data gap process is not replying on it, and currently the visualization of L2 and L3 are also not replying on it, thanks! |
I don't think we should leave this for V2 because no American states have GIDs now, so it would affect the V1 release. |
|
@liniiiiii if you are happy with everything and have checked everything, please approve it so we can merge it. |
@liniiiiii This is now ready to review :) whenever you have time |
Okay, unlike github may like to claim, I am not cancelling any github action runs. Not sure what the problem exactly is but we can run all tests locally with |
@i-be-snek The India case is solved, and I found that for India, there are several GIDs, potentially from its historical GIDs, which in visualization process, I also found some GIDs for historical regions. The PR is passed locally with But I have a question about the parsing process now, because when I run the command below poetry run python3 Database/parse_events.py --raw_dir Database/raw/ESSD_2024_V3/dev --filename wiki_dev_whole_infobox_20240729_70single_events_all_categories_V3_gpt-4o-2024-05-13_rawoutput_data_time_nums.json --output_dir Database/output/V3_branch_gid_test --event_levels l1,l2,l3 I got response like parse_events: 2024-10-23 12:01:58 INFO Normalizing nulls for l2 Instance_Per_Administrative_Areas_Homeless
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 70/70 [00:00<00:00, 283947.08it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 101/101 [00:00<00:00, 5468.81it/s]
parse_events: 2024-10-23 12:01:58 INFO Dropping any columns with non-str column names due to None types in the dicts [] After that, I only get the L2 and L3 with damage and insured damage parsed results. To double check, I switch to main and run |
@liniiiiii Thanks for the review. |
@liniiiiii btw, the "several GIDs" are disputed areas that India claims, so this is not an error. India, China, and many countries are represented by multiple GIDs. |
@liniiiiii Nevermind, now I understand that some categories are missing. :) |
@liniiiiii you discovered a really small typo that causes a really big problem. The code was tabbed wrong 🥲 big thanks! I pushed a fix for that now and added |
@i-be-snek Thanks for this commit wikimpacts-py3.11nl@F-IR-HYDRPC135:~/Wikimpacts$ poetry run pytest
================================================================= test session starts ==================================================================
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0
rootdir: /home/nl/Wikimpacts
configfile: pytest.ini
testpaths: tests
plugins: anyio-4.4.0
collected 146 items
tests/test_normalize_locations.py .............................. [ 20%]
tests/test_normalize_numbers.py ............................................................................................................. [ 95%]
tests/test_specific_instance_matcher.py ....... [100%]
=================================================================== warnings summary ===================================================================
../.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/pycountry/__init__.py:10
/home/nl/.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/pycountry/__init__.py:10: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
../.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/pkg_resources/__init__.py:2825
/home/nl/.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/pkg_resources/__init__.py:2825: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
tests/test_normalize_numbers.py::TestNormalizeNumbers::test__preprocess[EUR19-19]
/home/nl/.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
tests/test_normalize_numbers.py::TestNormalizeNumbers::test__preprocess[EUR19-19]
/home/nl/.cache/pypoetry/virtualenvs/wikimpacts-jRbwdZbh-py3.11/lib/python3.11/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================================== 146 passed, 4 warnings in 213.69s (0:03:33) ====================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passed the poetry run pytest
This PR fixes an issue that cased missing GIDs in L3. The issue was related to a function not returning properly. The PR also includes minimal changes that improve what functions return.
While fixing this, I am reminded of American states not returning the same GID.
(the screenshot is from a test run on some of the full run data for "l3/Specific_Instance_Per_Administrative_Area_Damage")
@liniiiiii do you think it's a good idea to add a fix for American states not being converted to the right GID? I yes, then I'll add a fix for that and tag you as a reviewer.