Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PI invoices are now exported as PDFs #129

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

QuanMPhm
Copy link
Contributor

@QuanMPhm QuanMPhm commented Dec 31, 2024

Closes #84. This PR consists of the last commit.
This PR mostly involved changes to the pi-specific invoice class. More details in the commit message. Two Python packages were crucial for this task, Jinja and Selenium

@knikolla @larsks I am aware that running the invoice script with this change will make pandas print a bunch of FutureWarnings for each PDF printed. Should I address them after the current state of the PR is approved?

@QuanMPhm QuanMPhm marked this pull request as ready for review January 2, 2025 20:08
Copy link
Member

@larsks larsks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using selenium is overkill here. You can drive chrome's print-to-pdf functionality directly from the command line:

google-chrome --headless --print-to-pdf=output.pdf --no-pdf-header-footer document.html

This takes a single subprocess.run() call. You can control the page size via CSS in your source document. Note that in the US, "Letter" size (8.5x11 inches) is much more common than A4.

@QuanMPhm
Copy link
Contributor Author

QuanMPhm commented Jan 9, 2025

@larsks I have removed the selenium dependancy and followed your approach. However, know I have a question regarding the unit test...

@larsks @knikolla The unit test failed because it did not find the chromium binary. Should I change the github action file to install Chrome, or should I just mock the subprocess.run() call to just write a pdf file?

@knikolla
Copy link
Contributor

@larsks @knikolla The unit test failed because it did not find the chromium binary. Should I change the github action file to install Chrome, or should I just mock the subprocess.run() call to just write a pdf file?

Mock os.path.exists and subprocess.run and verify that it was called with the parameters you expected it to be called with.

@larsks
Copy link
Member

larsks commented Jan 17, 2025

@larsks @knikolla The unit test failed because it did not find the chromium binary. Should I change the github action file to install Chrome, or should I just mock the subprocess.run() call to just write a pdf file?

Unit tests shouldn't have external dependencies, which means you should mock out subprocess.run.

It might be nice to have an integration test that calls out to chrome; this would ensure that our chrome command line is valid. If @knikolla agrees that's a good idea, that could be a future pull request.

@knikolla
Copy link
Contributor

@larsks @knikolla The unit test failed because it did not find the chromium binary. Should I change the github action file to install Chrome, or should I just mock the subprocess.run() call to just write a pdf file?

Unit tests shouldn't have external dependencies, which means you should mock out subprocess.run.

It might be nice to have an integration test that calls out to chrome; this would ensure that our chrome command line is valid. If @knikolla agrees that's a good idea, that could be a future pull request.

Considering the rapid pace of chrome development and inability to rely on semantic versioning as they bump the major version number too often, I think it's worth having a test in the future that verifies that the command line arguments that we're passing are still accepted by the current version of chrome.

For the purposes of this PR, a unit test is enough.

The PI-specific dataframes will first be converted to HTML
tables using Jinja templates, and then converted to PDFs using
Chromium. Now, users of the script must provide a path to the
Chromium/Chrome binary throught the env var `CHROME_BIN_PATH`

A html template folder has been added, and the test cases
for the PI-specific invoice will now both check whether the
dataframe is formatted correctly and if the PDFs are
correctly generated. The dockerfile has been to install chromium
@QuanMPhm
Copy link
Contributor Author

I have created a new issue to create the integration test at #145. I have also implemented all feedback so far.

@QuanMPhm QuanMPhm requested a review from larsks January 22, 2025 15:18
Comment on lines +77 to +78
else:
return "$" + str(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't release need the else clause here, and prefer f-strings over string concatenation:

Suggested change
else:
return "$" + str(data)
return f"${data}"

But see my comment later on about whether or not we even need this function.

Comment on lines +95 to +96
column_sums = list()
sum_columns_list = list()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, prefer [] over list() to initialize list variables:

Suggested change
column_sums = list()
sum_columns_list = list()
column_sums = []
sum_columns_list = []


def _create_pdf_invoice(temp_fd_name):
chrome_binary_location = os.environ.get(
"CHROME_BIN_PATH", "usr/bin/chromium"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this actually be /usr/bin/chromium?

Suggested change
"CHROME_BIN_PATH", "usr/bin/chromium"
"CHROME_BIN_PATH", "/usr/bin/chromium"

"--no-sandbox",
f"--print-to-pdf={invoice_pdf_path}",
"--no-pdf-header-footer",
"file://" + temp_fd_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer an f-string (especially since you're using this syntax just a few lines earlier):

Suggested change
"file://" + temp_fd_name,
f"file://{temp_fd_name}",

Comment on lines 154 to 157
if not os.path.exists(
self.name
): # self.name is name of folder storing invoices
os.mkdir(self.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify this using os.makedirs:

Suggested change
if not os.path.exists(
self.name
): # self.name is name of folder storing invoices
os.mkdir(self.name)
os.makedirs(self.name, exist_ok=True)

Comment on lines +42 to +46
def add_dollar_sign(data):
if pandas.isna(data):
return data
else:
return "$" + str(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(See my earlier comment about this function.)

Comment on lines +146 to +147
for answer_arg in answer_arglist:
self.assertTrue(answer_arg in chrome_arglist[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be more constrained in your check:

Suggested change
for answer_arg in answer_arglist:
self.assertTrue(answer_arg in chrome_arglist[0])
self.assertTrue(answer_arglist == chrome_arglist[0][:-1])

self.assertIn("ProjectC", pi_df["Project - Allocation"].tolist())
mock_filter_cols.return_value = test_invoice
mock_path_exists.return_value = True
output_dir = tempfile.TemporaryDirectory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to create a temporary directory -- because we're mocking out subprocess.run, we never create any output.

You can do this instead:

        pi_inv = test_utils.new_pi_specific_invoice(
            "/fakedir", invoice_month, data=test_invoice
        )
        pi_inv.process()
        pi_inv.export()
        pi_pdf_1 = f"/fakedir/BU_PI1_{invoice_month}.pdf"
        pi_pdf_2 = f"/fakedir/HU_PI2_{invoice_month}.pdf"

pi_inv = test_utils.new_pi_specific_invoice(
output_dir.name, invoice_month=self.invoice_month, data=self.dataframe
if not group_name:
group_name = [None for _ in range(len(pi))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write instead:

Suggested change
group_name = [None for _ in range(len(pi))]
group_name = [None] * len(pi)

Compare:

>>> pi=[1,2,3,4,5]
>>> [None for _ in range(len(pi))]
[None, None, None, None, None]
>>> [None] * len(pi)
[None, None, None, None, None]

)
if not os.path.exists(chrome_binary_location):
sys.exit(
f"Chrome binary does not exist at {chrome_binary_location}. Make sure the env var CHROME_BIN_PATH is set correctly or that Google Chrome is installed"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message directs the user to ensure that Google Chrome is installed, but the code defaults to Chromium (which could lead to the situation in which Google Chrome is installed but the user still receives this error message). It makes more sense to replace or with and:

Suggested change
f"Chrome binary does not exist at {chrome_binary_location}. Make sure the env var CHROME_BIN_PATH is set correctly or that Google Chrome is installed"
f"Chrome binary does not exist at {chrome_binary_location}. Make sure that Google Chrome is installed and the environment variable CHROME_BIN_PATH is set correctly."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generate PDFs from the PI-specific invoices
3 participants