Adding DataFrame for the RPacket Tracking Functionality #1776

Closed
35 commits
1904924
Added functionality to track properties for r_packets, configured fro…
DhruvSondhi Jul 26, 2021
cabea5f
Tried implementation of record based tracking of the r_packet properties
DhruvSondhi Jul 27, 2021
aed2e56
Implemented array based packet property tracking
DhruvSondhi Jul 29, 2021
09e7f36
Added functionality to reduce the array size inside the RPacketCollec…
DhruvSondhi Jul 29, 2021
5fbf57e
Reverted formatting changes within montecarlo_numba/base.py
DhruvSondhi Jul 29, 2021
db2a023
Changed the searching implementation for finding the exact size of th…
DhruvSondhi Aug 2, 2021
2e92303
Added docstring for the newly added RPacketCollection Class
DhruvSondhi Aug 2, 2021
57c3ed8
Made array length for the properties in the RPacketCollection configu…
DhruvSondhi Aug 2, 2021
580fc32
Added tests for r_packet tracking
DhruvSondhi Aug 2, 2021
4ae5507
Changed initial_array_length to const value,
DhruvSondhi Aug 3, 2021
96e16cc
Added demarcation for Setup & Teardown in tests
DhruvSondhi Aug 4, 2021
2b2de2b
Changed set_properties func to track,
DhruvSondhi Aug 5, 2021
7762fa1
Renamed r_packet_tracking to track_rpacket for consistency
DhruvSondhi Aug 5, 2021
61b39bf
Added Documentation for RPacket Tracking
DhruvSondhi Aug 5, 2021
de7cdfb
[build docs]
DhruvSondhi Aug 9, 2021
d596a96
Moved tracked_rpacket outside montecarlo_main_loop,
DhruvSondhi Aug 10, 2021
b56fee6
Added functionality to display dataframe when accessed by sim.runner.…
DhruvSondhi Aug 10, 2021
947d02b
Added functionality for single DataFrame to hold all the values for a…
DhruvSondhi Aug 11, 2021
dddce2f
Added functionality to append DF
DhruvSondhi Aug 12, 2021
bbba997
Removed redundant code,
DhruvSondhi Aug 12, 2021
8d496f6
Fixed Iteration values getting converted to float from int,
DhruvSondhi Aug 16, 2021
b3bf9f2
Restructured code for multi threaded runs & changed the way of the ge…
DhruvSondhi Nov 9, 2021
cbe0b79
Added Docstring along with few changes to the way the arrays are gene…
DhruvSondhi Dec 3, 2021
3d0b130
Refactored original tests to be consistent with the new changes
DhruvSondhi Dec 9, 2021
5a43301
Restoring some changes which are not fixed when rebasing
DhruvSondhi Dec 11, 2021
789de23
Added tests for generated tracking dataframe
DhruvSondhi Dec 17, 2021
5638ac7
Added config for the rpacket tracking tests
DhruvSondhi Dec 17, 2021
154abcb
Updated documentation for RPacket Tracking DataFrame feature
DhruvSondhi Dec 21, 2021
3f1326a
[build docs]
DhruvSondhi Dec 21, 2021
ec9b478
Changed the way generation of DataFrame happens, Sped up the process …
DhruvSondhi Dec 23, 2021
9d17d08
Reverted documentation positioning in index.rst
DhruvSondhi Jan 10, 2022
ead38b5
Changed the process of generation of the tracking dataframe into more…
DhruvSondhi Jan 17, 2022
135b22f
Fixing orphaned imports from single_packet_loop when rebased
DhruvSondhi Jan 30, 2022
2d6292f
Renamed `rpacket_collections` to `tracked_rpacket` for remaining occu…
DhruvSondhi Feb 3, 2022
6a4ad39
Merge remote-tracking branch 'upstream/master' into packet_interactio…
DhruvSondhi Feb 3, 2022
68 changes: 44 additions & 24 deletions docs/io/output/rpacket_tracking.ipynb
Expand Up @@ -13,7 +13,7 @@
"id": "c103617c",
"metadata": {},
"source": [
"**TARDIS** has the functionality to track the properties of the *RPackets* that are generated when running the Simulation. The `rpacket_tracker` can track all the interactions a packet undergoes & thus keeps a track of the various properties, a packet may have.<br>Currently, the `rpacket_tracker` tracks the properties of all the rpackets in the *Last Iteration of the Simulation*. It generates a `List` that contains the individual instances of `RPacketCollection`{`Numba JITClass`}, for storing all the interaction properties as listed below."
"**TARDIS** can track the properties of the *RPackets* that are generated when running the Simulation. The `rpacket_tracker` records every interaction a packet undergoes and thus keeps track of the properties the packet has at each interaction.<br>The `rpacket_tracker` tracks the properties of all the rpackets in *all the iterations of the simulation*. It generates a `pandas.DataFrame` that contains the properties of all the interactions that a particular `RPacket` undergoes as it is propagated through the simulation. This is done for all the packets of each `Iteration`, and the properties are stored along with the `iteration number`. A sample of the *RPacket Tracking DataFrame* can be seen at the end of this tutorial."
]
},
{
Expand Down Expand Up @@ -47,7 +47,7 @@
"\n",
"Warning\n",
"\n",
"Current implementation stores all the data for the interaction of the packets in a `list`, so it needs to accessed with a `list index` for each property for a particular `rpacket`. Examples for the same are shown as follows. \n",
"Turning on the `tracking` option in the config increases the run time of the simulation. Please take this into consideration, as processing the output is time-consuming and can lead to long simulation times even for simple runs.<br>Properties can be accessed easily through `DataFrame` indexing.\n",
"</div>"
]
},
Expand All @@ -64,7 +64,7 @@
"id": "29e14475",
"metadata": {},
"source": [
"**TARDIS**' `rpacket_tracker` is configured via the `YAML` file. This functionality of tracking the packets is turned **off**, by default. This is due to that fact that using this property, may slow down the execution time for the Simulation. An example configuration can be seen below for setting up the *tracking*:\n",
"**TARDIS**' `rpacket_tracker` is configured via the `YAML` file. Tracking of the packets is turned **off** by default, because enabling it leads to a longer execution time for the Simulation. An example configuration for setting up the *tracking* can be seen below:\n",
"\n",
"```yaml\n",
"... \n",
Expand All @@ -80,7 +80,7 @@
"id": "13b6420b",
"metadata": {},
"source": [
"The `montecarlo` section of the **YAML** file now has a `tracking` sub section which holds the configuration properties for the `track_rpacket` & the `initial_array_length` (discussed later in the tutorial)."
"The `montecarlo` section of the **YAML** file has a `tracking` subsection which holds the configuration properties `track_rpacket` and `initial_array_length` (discussed later in the tutorial)."
]
},
{
Expand Down Expand Up @@ -110,7 +110,7 @@
"source": [
"# Reading the Configuration stored in `tardis_tracking_example.yml` into config\n",
"\n",
"config = Configuration.from_yaml(\"tardis_example.yml\")"
"config = Configuration.from_yaml(\"tardis_tracking_example.yml\")"
]
},
{
Expand Down Expand Up @@ -168,7 +168,7 @@
"source": [
"# Running the simulation from the config\n",
"\n",
"sim = run_tardis(config, show_convergence_plots=False, show_progress_bars=False)"
"sim = run_tardis(config, log_level=\"Debug\", show_convergence_plots=False, show_progress_bars=False)"
]
},
{
Expand All @@ -194,7 +194,7 @@
"id": "4771d92a",
"metadata": {},
"source": [
"It can be seen from the above code, that the `sim.runner.rpacket_tracker` is an instance of the `List` specifically *Numba Typed List*. The `RPacketCollection` class has the following structure for the properties : {More information in the **TARDIS API** for `RPacketCollection` class}"
"It can be seen from the above code that `sim.runner.rpacket_tracker` is an instance of the `pandas.DataFrame` class. The `RPacketCollection` class has the following structure for its properties: {More information in the **TARDIS API** for the `RPacketCollection` class}"
]
},
{
Expand Down Expand Up @@ -227,6 +227,24 @@
"len(sim.runner.rpacket_tracker)"
]
},
{
"cell_type": "markdown",
"id": "7b3ee39f",
"metadata": {},
"source": [
"The generated DataFrame can be accessed with `sim.runner.rpacket_tracker`. The DataFrame for this particular *simulation configuration* can be seen as follows :"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbec15ec",
"metadata": {},
"outputs": [],
"source": [
"sim.runner.rpacket_tracker"
]
},
{
"cell_type": "markdown",
"id": "411f2ef9",
Expand All @@ -241,7 +259,7 @@
"id": "a4772b00",
"metadata": {},
"source": [
"- Accessing the `index` property for the packet {`10`}:"
"- Accessing the `index` property for the packets :"
]
},
{
Expand All @@ -251,15 +269,15 @@
"metadata": {},
"outputs": [],
"source": [
"sim.runner.rpacket_tracker[10].index"
"sim.runner.rpacket_tracker[\"Packet Index\"]"
]
},
{
"cell_type": "markdown",
"id": "d81fbbf7",
"metadata": {},
"source": [
"- Accessing the `seed` property for the packet {`10`}:"
"- Accessing the `seed` property for the packets :"
]
},
{
Expand All @@ -269,15 +287,15 @@
"metadata": {},
"outputs": [],
"source": [
"sim.runner.rpacket_tracker[10].seed"
"sim.runner.rpacket_tracker[\"Packet Seed\"].unique()"
]
},
{
"cell_type": "markdown",
"id": "7afe2110",
"metadata": {},
"source": [
"- Accessing the `status` property for the packet {`10`}:"
"- Accessing the `status` property for the packets :"
]
},
{
Expand All @@ -287,33 +305,35 @@
"metadata": {},
"outputs": [],
"source": [
"sim.runner.rpacket_tracker[10].status"
"sim.runner.rpacket_tracker[\"Packet Status\"]"
]
},
{
"cell_type": "markdown",
"id": "ea308a55",
"cell_type": "code",
"execution_count": null,
"id": "8e66c2f7",
"metadata": {},
"outputs": [],
"source": [
"Thus, all other properties {`r`, `nu`, `mu`, `energy`, `shell_id`} can be accessed accordingly."
"sim.runner.rpacket_tracker.loc[sim.runner.rpacket_tracker[\"Iteration\"] == 1]"
]
},
{
"cell_type": "markdown",
"id": "c83dd906",
"cell_type": "code",
"execution_count": null,
"id": "543507be",
"metadata": {},
"outputs": [],
"source": [
"We can also see the total number of interactions of index `10` packet under went, with the following example:"
"sim.runner.rpacket_tracker.loc[sim.runner.rpacket_tracker[\"Iteration\"] == 9]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "090b1517",
"cell_type": "markdown",
"id": "ea308a55",
"metadata": {},
"outputs": [],
"source": [
"len(sim.runner.rpacket_tracker[10].shell_id)"
"Thus, all other properties {`r`, `nu`, `mu`, `energy`, `shell_id`} can be accessed accordingly."
]
},
{
Expand Down
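The notebook changes above switch from per-packet list indexing to DataFrame indexing. As a rough illustration of what that access pattern looks like, here is a minimal sketch that uses a small hand-made frame in place of `sim.runner.rpacket_tracker`. The column names are taken from the diff; the values are invented, and the assumption that each row corresponds to one recorded interaction is mine, not something the PR states explicitly.

```python
import pandas as pd

# Hypothetical stand-in for sim.runner.rpacket_tracker; all values are invented.
tracker = pd.DataFrame(
    {
        "Iteration": [1, 1, 1, 2, 2],
        "Packet Index": [0, 0, 1, 0, 1],
        "Packet Seed": [11, 11, 12, 21, 22],
        "Packet Status": [0, 1, 2, 0, 1],
    }
)

# Rows recorded during a single iteration (same pattern as in the notebook).
first_iteration = tracker.loc[tracker["Iteration"] == 1]

# Number of recorded rows per packet within each iteration.
rows_per_packet = tracker.groupby(["Iteration", "Packet Index"]).size()

print(first_iteration)
print(rows_per_packet)
```

The `groupby(...).size()` call is one possible replacement for the removed `len(sim.runner.rpacket_tracker[10].shell_id)` example, under the row-per-interaction assumption stated above.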
57 changes: 57 additions & 0 deletions docs/io/output/tardis_tracking_example.yml
@@ -0,0 +1,57 @@
# Example YAML configuration for TARDIS
Member

this feels like the standard example?

Contributor Author

This is not the standard example. The number of packets has been changed so that it runs fast with tracking enabled. More changes will be made in the future to make it faster still and to reduce the time spent in testing.

Member

let's change this after

tardis_config_version: v1.0

supernova:
  luminosity_requested: 9.44 log_lsun
  time_explosion: 13 day

atom_data: kurucz_cd23_chianti_H_He.h5

model:
  structure:
    type: specific
    velocity:
      start: 1.1e4 km/s
      stop: 20000 km/s
      num: 20
    density:
      type: branch85_w7

  abundances:
    type: uniform
    O: 0.19
    Mg: 0.03
    Si: 0.52
    S: 0.19
    Ar: 0.04
    Ca: 0.03

plasma:
  disable_electron_scattering: no
  ionization: lte
  excitation: lte
  radiative_rates_type: dilute-blackbody
  line_interaction_type: macroatom

montecarlo:
  seed: 23111963
  no_of_packets: 100
  iterations: 10
  nthreads: 1

  last_no_of_packets: 1000
  no_of_virtual_packets: 2

  convergence_strategy:
    type: damped
    damping_constant: 1.0
    threshold: 0.05
    fraction: 0.8
    hold_iterations: 3
    t_inner:
      damping_constant: 0.5

spectrum:
  start: 500 angstrom
  stop: 20000 angstrom
  num: 10000
112 changes: 112 additions & 0 deletions tardis/io/logger/tests/test_logging.py
@@ -1,5 +1,8 @@
import pytest
import logging
import os
import pandas as pd
import numpy as np

from tardis.io.config_reader import Configuration
from tardis.simulation import Simulation
Expand Down Expand Up @@ -105,3 +108,112 @@ def test_logging_both_specified(
assert record.levelno == LOGGING_LEVELS[log_level.upper()]
else:
assert record.levelno >= LOGGING_LEVELS[log_level.upper()]


@pytest.fixture
def config():
    return Configuration.from_yaml(
        "tardis/io/tests/data/tardis_configv1_verysimple_tracking.yml"
    )


@pytest.fixture
def tracker_ref_path(tardis_ref_path):
    return os.path.abspath(os.path.join(tardis_ref_path, "rpacket_tracking.h5"))


@pytest.fixture
def tracking_refdata(
    config, atomic_data_fname, tracker_ref_path, generate_reference
):
    config["atom_data"] = atomic_data_fname

    simulation = Simulation.from_config(config)
    simulation.run()

    track_df = simulation.runner.rpacket_tracker
    key = "tracking"

    if not generate_reference:
        return simulation
    else:
        track_df.to_hdf(tracker_ref_path, key=key, mode="w")
        pytest.skip("Reference data was generated during this run.")


@pytest.fixture
def read_comparison_refdata(tracker_ref_path):
    return pd.read_hdf(tracker_ref_path)


# @pytest.mark.parametrize(
Contributor

Clean out these comments

# ["no_of_packets", "initial_seed", "last_seed", "iterations"],
# [(1200, 2850180890, 2683780343, 3)],
# )
def test_tracking_dataframe(
    config,
    tracking_refdata,
    # no_of_packets,
    # initial_seed,
    # last_seed,
    # iterations,
):
    sim = tracking_refdata

    # Initial Test to check if the data frame is generated or not
    assert config["montecarlo"]["tracking"]["track_rpacket"] == True
    assert isinstance(sim.runner.rpacket_tracker, pd.DataFrame)
    # # assert (
    # #     len(sim.runner.rpacket_tracker["Packet Seed"].unique()) == no_of_packets
    # # )
    # assert sim.runner.rpacket_tracker["Packet Seed"].iloc[0] == initial_seed
    # assert sim.runner.rpacket_tracker["Packet Seed"].iloc[-1] == last_seed
    # assert len(sim.runner.rpacket_tracker["Iteration"].unique()) == iterations


def test_compare_dataframe(
    tracking_refdata,
    read_comparison_refdata,
):
    sim = tracking_refdata
    comparison_df = read_comparison_refdata

    pd.testing.assert_frame_equal(
        sim.runner.rpacket_tracker,
        comparison_df,
        check_dtype=True,
        check_column_type=True,
        check_exact=True,
    )

    assert isinstance(comparison_df, pd.DataFrame)
    assert len(comparison_df["Packet Seed"].unique()) == len(
        sim.runner.rpacket_tracker["Packet Seed"].unique()
    )
    assert len(comparison_df["Iteration"].unique()) == len(
        sim.runner.rpacket_tracker["Iteration"].unique()
    )


def test_parallel_dataframe(
    config,
    atomic_data_fname,
    read_comparison_refdata,
):
    comparison_df = read_comparison_refdata

    config["atom_data"] = atomic_data_fname
    config["montecarlo"]["nthreads"] = 3

    sim = Simulation.from_config(config)
    sim.run()

    assert isinstance(sim.runner.rpacket_tracker, pd.DataFrame)
    assert len(comparison_df["Packet Seed"].unique()) == len(
        sim.runner.rpacket_tracker["Packet Seed"].unique()
    )
    assert len(comparison_df["Iteration"].unique()) == len(
        sim.runner.rpacket_tracker["Iteration"].unique()
    )

    config["montecarlo"]["nthreads"] = 1
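The tests above use two levels of strictness: `test_compare_dataframe` requires an exact match against the reference frame, while `test_parallel_dataframe` only compares the number of unique seeds and iterations. The following sketch, built on small invented frames rather than real TARDIS output, shows how the strict check reacts when a column silently changes from integer to float (the kind of issue one of the commits in this PR mentions fixing for the `Iteration` column), and how the looser check does not notice it.

```python
import pandas as pd

# Hypothetical reference frame and freshly generated frame; values are invented.
reference = pd.DataFrame({"Iteration": [1, 1, 2], "Packet Seed": [11, 12, 21]})
generated = reference.copy()

# Identical frames pass the strict comparison silently.
pd.testing.assert_frame_equal(
    generated, reference, check_dtype=True, check_exact=True
)

# Same values, but "Iteration" stored as float: the strict comparison fails.
drifted = generated.astype({"Iteration": "float64"})
try:
    pd.testing.assert_frame_equal(
        drifted, reference, check_dtype=True, check_exact=True
    )
    dtype_mismatch_detected = False
except AssertionError:
    dtype_mismatch_detected = True

# The looser check only compares the number of unique values.
same_unique_counts = (
    drifted["Packet Seed"].nunique() == reference["Packet Seed"].nunique()
)

print(dtype_mismatch_detected, same_unique_counts)
```

Whether the looser check is the right choice for the multi-threaded run depends on how deterministic the row order and contents are with `nthreads > 1`, which the PR page does not spell out.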