
Extract unittests #19

Merged
merged 50 commits on Aug 19, 2024
Changes from 48 commits
Commits
50 commits
1de5521
removed unnecessary import
Aug 16, 2024
003a3ab
try to add workflow yml
Aug 16, 2024
b4b559d
Add GitHub Actions workflow
Aug 16, 2024
3886afa
linted imports
Aug 16, 2024
c5a21bf
linted imports
Aug 16, 2024
b14e398
updated actions workflow
Aug 16, 2024
8ebaeb4
made .toml
Aug 16, 2024
199da96
updated actions workflow
Aug 16, 2024
41095fe
updated actions workflow
Aug 16, 2024
054afaa
updated actions workflow
Aug 16, 2024
036e025
updated actions workflow
Aug 16, 2024
9f6ce0e
updated actions workflow
Aug 16, 2024
625f1c8
tweak ci.yml & setup.py
Aug 19, 2024
864e53e
removed logging from requirements as it is a standard lib
Aug 19, 2024
88e26f7
modify .yml
Aug 19, 2024
823a1cb
modify .yml
Aug 19, 2024
920d3b2
modify .yml
Aug 19, 2024
29fe2ee
modify .yml
Aug 19, 2024
db0b936
modify .yml
Aug 19, 2024
678a439
modify .yml
Aug 19, 2024
abe2cc6
fix import
Aug 19, 2024
5564084
fix import
Aug 19, 2024
15d9ea1
switched to lazy formatting for logging
Aug 19, 2024
caf7bcf
switched to lazy formatting for logging, changed setup.py python version
Aug 19, 2024
329c8ba
switched to lazy formatting for logging
Aug 19, 2024
8d667a1
switched to lazy formatting for logging
Aug 19, 2024
a6aaafe
switched to lazy formatting for logging
Aug 19, 2024
1671d95
aligned tests to match new logging structure
Aug 19, 2024
78a22de
update workflow name
Aug 19, 2024
1befa35
modified .yml so it can generate badges
Aug 19, 2024
a4886ea
modified .yml so it can generate badges
Aug 19, 2024
312c372
modified .yml so it can generate badges
Aug 19, 2024
74d6dea
modified .yml so it can generate badges
Aug 19, 2024
256440e
modified .yml so it can generate badges
Aug 19, 2024
fd18833
modified .yml so it can generate badges, fixed req.txt
Aug 19, 2024
6ff4552
modified .yml so it can generate badges
Aug 19, 2024
1c960ed
modified .yml to get it working again
Aug 19, 2024
9015d39
modified .yml to get it working again
Aug 19, 2024
f2089b0
modified .yml to get it working again
Aug 19, 2024
ea0341e
modified .yml to get it working again
Aug 19, 2024
2836824
modified .yml to get it working again
Aug 19, 2024
005e5fd
modified .yml to get correct coverage version
Aug 19, 2024
c0992e2
modified .yml to get correct coverage version
Aug 19, 2024
c51788c
modified .yml to get correct coverage version
Aug 19, 2024
de278d7
modified .yml to get correct coverage version
Aug 19, 2024
cd16d8a
roll back .yml to just run tests
Aug 19, 2024
d62aad4
adds a license.txt
Aug 19, 2024
d9d938c
Merge branch 'main' into extract_unittests
JoshuaMarden Aug 19, 2024
b8d64e4
try to remove logs
Aug 19, 2024
743a019
fix .gitignore
Aug 19, 2024
Binary file modified .DS_Store
Binary file not shown.
Binary file added .github/.DS_Store
Binary file not shown.
Binary file added .github/workflows/.DS_Store
Binary file not shown.
34 changes: 34 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,34 @@
name: CI

on: [push, pull_request]

jobs:
run-unittests:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3

- name: Set up Python
uses: actions/setup-python@v4

- name: Create and activate virtual environment
run: |
python -m venv venv # Create a virtual environment
source venv/bin/activate # Activate the virtual environment

- name: Install dependencies
run: |
source venv/bin/activate # Ensure the virtual environment is active
pip install --upgrade pip # Upgrade pip within the virtual environment
pip install -r requirements.txt # Install dependencies from requirements.txt

- name: Set up environment
run: |
source venv/bin/activate # Ensure the virtual environment is active
source ./add_root_to_path.sh # Run the script to modify PYTHONPATH

- name: Run tests
run: |
source venv/bin/activate # Ensure the virtual environment is active
pytest # Run tests directly with pytest
1 change: 1 addition & 0 deletions .gitignore
@@ -10,6 +10,7 @@ data/*
*.feather
*.log


# Python Module Data
# (e.g. __pycache__ directories)
__pycache__/
21 changes: 21 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 J J Marden

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
12 changes: 6 additions & 6 deletions pipeline/extract_carbon.py
@@ -52,7 +52,7 @@ def fetch_data(self) -> Optional[Dict[str, Any]]:
response.raise_for_status()
return response.json()
except requests.exceptions.RequestException as e:
self.logger.error(f"An error occurred: {e}")
self.logger.error("An error occurred: %s", e)
Collaborator comment:
Nice updating all logs to use %s
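For readers unfamiliar with the pattern being praised here, a minimal sketch of eager vs. lazy logging (the logger name is invented for illustration): with `%s`-style arguments, the `logging` module defers string interpolation until it knows the record will actually be emitted, so calls below the configured level cost almost no formatting work.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("demo")  # hypothetical logger name

data = {"forecast": 123}

# Eager: the f-string is built unconditionally, even though DEBUG
# records are discarded at INFO level.
logger.debug(f"Fetched Data: {data}")

# Lazy: the "%s" placeholder is only interpolated if the record
# passes the level check, so this call skips formatting entirely here.
logger.debug("Fetched Data: %s", data)
```

This is also why linters such as pylint flag `logging-fstring-interpolation`, which matches the string of commits above switching every log call over to `%s`.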

return None


@@ -135,19 +135,18 @@ def execute(self) -> Optional[pd.DataFrame]:

if data:
self.logger.info("Data fetched successfully from API.")
self.logger.debug(f"Fetched Data: {data}")
self.logger.debug("Fetched Data: %s", data)

self.logger.info("Processing the fetched data.")
df = self.data_processor.process_data(data)

if df is not None:
self.logger.info("Data processed successfully into DataFrame.")
self.logger.debug("DataFrame of Carbon Forecast Data:")
self.logger.debug(df.to_string())
self.logger.debug("DataFrame of Carbon Forecast Data:\n%s", df.to_string())

self.logger.info("Saving the processed data locally.")
self.data_processor.save_data_locally(df)
self.logger.info(f"Data saved locally at `{self.data_processor.save_location}`.")
self.logger.info("Data saved locally at `%s`.", self.data_processor.save_location)

self.logger.info("Attempting to get S3 client.")
s3_client = self.data_processor.get_s3_client()
@@ -156,7 +155,8 @@ def execute(self) -> Optional[pd.DataFrame]:
self.logger.info("S3 client retrieved successfully.")
self.logger.info("Uploading the data to S3.")
self.data_processor.save_data_to_s3()
self.logger.info(f"Data successfully uploaded to S3 bucket `{self.s3_bucket}` as `{self.s3_file_name}`.")
self.logger.info("Data successfully uploaded to S3 bucket `%s` as `%s`.",
self.s3_bucket, self.s3_file_name)
else:
self.logger.error("Failed to get S3 client. Data was not uploaded to S3.")
return df
15 changes: 7 additions & 8 deletions pipeline/extract_demand.py
@@ -64,7 +64,7 @@ def fetch_data(self) -> Optional[Dict[str, Any]]:
response.raise_for_status()
return response.json()
except requests.exceptions.RequestException as e:
self.logger.error(f"An error occurred: {e}")
self.logger.error("An error occurred: %s", e)
return None


@@ -100,7 +100,6 @@ def process_data(self, data: Dict[str, Any],
a pd.DataFrame, the second element is a dictionary containing the
time window over which the data was fetched.
"""
logger = logger or self.logger

if not data or "data" not in data:
logger.warning("No data found in response.")
@@ -159,16 +158,16 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
df, time_period = result

self.logger.debug("DataFrame of Demand Data:")
self.logger.debug(df.to_string()) # Log the entire DataFrame as a string
self.logger.debug("%s", df.to_string()) # Log the entire DataFrame as a string
self.logger.info("Head of the DataFrame:")
self.logger.info("\n" + df.head().to_string())
self.logger.info("\n%s", df.head().to_string())
self.logger.info("Time Period of Data:")
self.logger.info(time_period)
self.logger.info("%s", time_period)

# Saving data locally
self.logger.info("Saving data locally.")
local_save_path = self.data_processor.save_data_locally(df)
self.logger.info(f"Data successfully saved locally at {local_save_path}.")
self.logger.info("Data successfully saved locally at %s.", local_save_path)

# Uploading data to S3
self.logger.info("Preparing to upload data to S3.")
@@ -177,7 +176,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
if s3_client:
self.logger.info("S3 client initialized successfully.")
self.data_processor.save_data_to_s3()
self.logger.info(f"Data successfully uploaded to S3 at `{self.s3_file_name}`.")
self.logger.info("Data successfully uploaded to S3 at `%s`.", self.s3_file_name)
else:
self.logger.error("Failed to initialize S3 client.")

@@ -187,7 +186,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
else:
self.logger.error("Failed to retrieve data from API.")
except Exception as e:
self.logger.error(f"An error occurred during the execution: {e}")
self.logger.error("An error occurred during the execution: %s", e)

self.logger.info("Execution of the workflow completed.")
return None
22 changes: 8 additions & 14 deletions pipeline/extract_generation.py
@@ -67,7 +67,7 @@ def fetch_data(self) -> Optional[Dict[str, Any]]:
response.raise_for_status()
return response.json()
except requests.exceptions.RequestException as e:
self.logger.error(f"An error occurred: {e}")
self.logger.error("An error occurred: %s", e)
return None


@@ -103,7 +103,7 @@ def process_data(self, data: Dict[str, Any]) -> Optional[Tuple[pd.DataFrame, Dic
"""

if not data or "data" not in data:
logger.warning("No data found in response.")
self.logger.warning("No data found in response.")
return None

df = pd.DataFrame(data["data"])
@@ -159,19 +159,14 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
if result is not None:
df, time_period = result

self.logger.debug("DataFrame of Demand Data:")
# Log the entire DataFrame as a string
self.logger.debug(df.to_string())
self.logger.info("Head of the DataFrame:")
self.logger.info("\n" + df.head().to_string())
self.logger.info("Time Period of Data:")
self.logger.info(time_period)
self.logger.debug("DataFrame of Demand Data:\n%s", df.to_string())
Collaborator comment:
Great reduction in Logging!

self.logger.info("Head of the DataFrame:\n%s", df.head().to_string())
self.logger.info("Time Period of Data: %s", time_period)

# Saving data locally
self.logger.info("Saving data locally.")
local_save_path = self.data_processor.save_data_locally(df)
self.logger.info(f"Data successfully saved locally at {
local_save_path}.")
self.logger.info("Data successfully saved locally at %s.", local_save_path)

# Uploading data to S3
self.logger.info("Preparing to upload data to S3.")
@@ -180,8 +175,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
if s3_client:
self.logger.info("S3 client initialized successfully.")
self.data_processor.save_data_to_s3()
self.logger.info(f"Data successfully uploaded to S3 at `{
self.s3_file_name}`.")
self.logger.info("Data successfully uploaded to S3 at `%s`.", self.s3_file_name)
else:
self.logger.error("Failed to initialize S3 client.")

@@ -191,7 +185,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, Dict[str, datetime]]]:
else:
self.logger.error("Failed to retrieve data from API.")
except Exception as e:
self.logger.error(f"An error occurred during the execution: {e}")
self.logger.error("An error occurred during the execution: %s", e)

self.logger.info("Execution of the workflow completed.")
return None
27 changes: 11 additions & 16 deletions pipeline/extract_price.py
@@ -52,7 +52,7 @@ def get_settlement_periods(self,
df = pd.read_feather(path_to_reference_data)
periods = df.groupby('settlementDate')['settlementPeriod'].unique().to_dict()
periods = {k: list(v) for k, v in periods.items()}
logger.info(f"Getting price data for {periods}")
logger.info("Getting price data for %s", periods)
return periods
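As an aside for reviewers, the groupby line in `get_settlement_periods` above can be sketched in isolation. The frame below is an invented stand-in for the feather reference data — the column names come from the diff, the dates and periods are made up:

```python
import pandas as pd

# Toy stand-in for the feather reference data (values are invented).
df = pd.DataFrame({
    "settlementDate": ["2024-08-19", "2024-08-19", "2024-08-20"],
    "settlementPeriod": [1, 2, 1],
})

# Map each settlement date to its unique settlement periods,
# mirroring the two lines in get_settlement_periods.
periods = df.groupby("settlementDate")["settlementPeriod"].unique().to_dict()
periods = {k: list(v) for k, v in periods.items()}
```

The `{k: list(v) ...}` comprehension converts the numpy arrays returned by `.unique()` into plain lists, which keeps the logged output readable.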

def construct_default_params(self, date: str, period: int) -> str:
@@ -76,8 +76,8 @@ def fetch_data(self, periods: Dict[str, List[int]]) -> List[Dict[str, Any]]:
response.raise_for_status()
response_list.append(response.json())
except requests.exceptions.RequestException as e:
self.logger.warning(f"An error occurred when requesting fuel data for {date}, period {period}!")
self.logger.warning(f"{e}")
self.logger.warning("An error occurred when requesting fuel data for %s, period %d!", date, period)
self.logger.warning("%s", e)

return response_list

@@ -110,7 +110,6 @@ def process_data(self, response_list: List[Dict[str, Any]],
"""
Takes a list of responses, merges them into a DataFrame, and returns it.
"""
logger = logger or self.logger

if not response_list:
logger.warning("No data found in response.")
@@ -153,7 +152,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, int]]:

try:
periods = self.api_client.get_settlement_periods(self.reference_data_path)
self.logger.info(f"Retrieved settlement periods: {periods}")
self.logger.info("Retrieved settlement periods: %s", periods)

response_list = self.api_client.fetch_data(periods)
if response_list:
Expand All @@ -163,17 +162,14 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, int]]:
if result is not None:
df, number_of_settlement_periods = result

self.logger.debug("DataFrame of Price Data:")
self.logger.debug(df.to_string()) # Log the entire DataFrame as a string
self.logger.info("Head of the DataFrame:")
self.logger.info("\n" + df.head().to_string())
self.logger.info("Number of Settlement Periods:")
self.logger.info(number_of_settlement_periods)
self.logger.debug("DataFrame of Price Data:\n%s", df.to_string())
self.logger.info("Head of the DataFrame:\n%s", df.head().to_string())
self.logger.info("Number of Settlement Periods: %d", number_of_settlement_periods)

# Saving data locally
self.logger.info("Saving data locally.")
local_save_path = self.data_processor.save_data_locally(df)
self.logger.info(f"Data successfully saved locally at {local_save_path}.")
self.logger.info("Data successfully saved locally at %s.", local_save_path)

# Uploading data to S3
self.logger.info("Preparing to upload data to S3.")
@@ -182,7 +178,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, int]]:
if s3_client:
self.logger.info("S3 client initialized successfully.")
self.data_processor.save_data_to_s3()
self.logger.info(f"Data successfully uploaded to S3 at `{self.data_processor.s3_file_name}`.")
self.logger.info("Data successfully uploaded to S3 at `%s`.", self.data_processor.s3_file_name)
else:
self.logger.error("Failed to initialize S3 client.")

@@ -192,7 +188,7 @@ def execute(self) -> Optional[Tuple[pd.DataFrame, int]]:
else:
self.logger.error("Failed to retrieve data from the API.")
except Exception as e:
self.logger.error(f"An error occurred during the execution: {e}")
self.logger.error("An error occurred during the execution: %s", e)

self.logger.info("Execution of the workflow completed.")
return None
@@ -227,6 +223,5 @@ def main() -> None:
logger.info("---> Data inserted and process completed for %s.", script_name)



if __name__ == "__main__":
main()
main()
5 changes: 3 additions & 2 deletions pipeline/extract_to_s3.py
@@ -5,9 +5,10 @@
from extract_demand import main as extract_demand
from extract_price import main as extract_price


from constants import Constants as ct
import config as cg
save_directory = cg.DATA

save_directory = ct.DATA
if not os.path.exists(save_directory):
os.makedirs(save_directory)
