Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate More of the Invoice post processing steps #6

Closed
joachimweyl opened this issue Jan 2, 2024 · 7 comments · Fixed by #12
Closed

Automate More of the Invoice post processing steps #6

joachimweyl opened this issue Jan 2, 2024 · 7 comments · Fixed by #12
Assignees

Comments

@joachimweyl
Copy link
Contributor

joachimweyl commented Jan 2, 2024

Motivation

After the OpenShift, OpenStack, and storage invoice data is pulled it is then combined together. Then the next step is that it is processed in a few ways in Google Sheets. We would like to automate this after processing as much of the last step as we can easily.
We are not trying to automate all of the steps, just the steps leading up to what we send to MGHPCC. The other steps such as updating PDF's, Rollups, Orran course charges, and Utilized billed and non billed can wait as @joachimweyl has automated them mostly in Google Sheets and they are not on the same time crunch.

Completion Criteria

Provide details of the steps of the Invoice after-process

Description

  1. Add 3 new columns, Credit, Credit Code, and Balance to billable CSV

    1. Add Credits for new PI
      1. Compare a list of all PIs that have used OpenStack and OpenShift in the past, find out if there are any new PIs this month, and if there are add the current Cost value (up to $1k across the whole project) to the Credit column of the billable CSV
      2. this means that if there are 4 projects and they sum up to 1.5k we only add up to 1k in the Credit column and the rest will be invoiced directly
    2. Add 0002 to the Credit code field for all rows that are given New PI Credits
    3. Set the Balance column for all rows to be Cost minus Credit
      1. this means for non-credit rows balance will equal cost and for credit rows it will be the difference.
  2. Update the Institution column

    1. You can use the following excel code to convert, knowing that this can change we might want this to be a table in the invoicing repo. Keep in mind the code below is order-specific because of Harvard affiliates.
    =IF(REGEXMATCH(D2,"northeastern.edu"),"Northeastern University",IF(REGEXMATCH(D2,"bu.edu"),"Boston University",IF(REGEXMATCH(D2,"bentley.edu"),"Bentley",IF(REGEXMATCH(D2,"uri.edu"),"University of Rhode Island",IF(REGEXMATCH(D2,"redhat.com"),"Red Hat",IF(REGEXMATCH(D2,"childrens.harvard.edu"),"Boston Childrens Hospital",IF(REGEXMATCH(D2,"mclean.harvard.edu"),"McLean Hospital",IF(REGEXMATCH(D2,"meei.harvard.edu"),"Massachusetts Eye & Ear",IF(REGEXMATCH(D2,"dfci.harvard.edu"),"Dana-Farber Cancer Institute",IF(REGEXMATCH(D2,"bwh.harvard.edu"),"Brigham and Women's Hospital",IF(REGEXMATCH(D2,"bidmc.harvard.edu"),"Beth Israel Deaconess Medical Center",IF(REGEXMATCH(D2,"harvard.edu"),"Harvard University",IF(REGEXMATCH(D2,"wpi.edu"),"Worcester Polytechnic Institute",IF(REGEXMATCH(D2,"mit.edu"),"Massachusetts Institute of Technology",IF(REGEXMATCH(D2,"umass.edu"),"University of Massachusetts Amherst",IF(REGEXMATCH(D2,"uml.edu"),"University of Massachusetts Lowell",IF(REGEXMATCH(D2,"codeforboston.org"),"Code For Boston",IF(REGEXMATCH(D2,"mmsh"),"Harvard University",IF(REGEXMATCH(D2,"gstuart"),"University of Massachusetts Amherst",IF(REGEXMATCH(D2,"rudolph"),"Boston Childrens Hospital",IF(REGEXMATCH(D2,"robbaron"),"Boston University",IF(REGEXMATCH(D2,"kmdalton"),"Harvard University",IF(REGEXMATCH(D2,"mzink"),"University of Massachusetts Amherst",IF(REGEXMATCH(D2,"@yale.edu"),"Yale University",IF(REGEXMATCH(D2,"francesco.pontiggia"),"Harvard University","Please add new Institution to function")))))))))))))))))))))))))
    
  3. Create a CSV of only the HU rows named NERC-YYYY-MM-Harvard-Invoice.csv where YYYY is the year and MM is the month

  4. create a CSV of HU and BU only named NERC Invoice - HU&BU.csv where is the name of the month of the data processed.

  5. Save 2 CSVs above to the CSV folder or to a shared drive that @joachimweyl has access to

Reference

  1. February data

Completion dates

Desired - 2024-01-25
Required - TBD

@QuanMPhm
Copy link
Contributor

QuanMPhm commented Apr 1, 2024

@joachimweyl @knikolla Can I take on this issue?

@joachimweyl
Copy link
Contributor Author

@QuanMPhm how comfortable are you with how the current invoicing scripts are working?

@QuanMPhm
Copy link
Contributor

QuanMPhm commented Apr 1, 2024

@joachimweyl I don't have any problems with it so far, such questions for context and how it all works.

@joachimweyl
Copy link
Contributor Author

Sounds good, take this issue as yours.

@QuanMPhm QuanMPhm self-assigned this Apr 1, 2024
@QuanMPhm
Copy link
Contributor

QuanMPhm commented Apr 1, 2024

@joachimweyl To address each of the things you have listed so far in the issue description:

  1. I'll use the data on new and pre-existing PIs you've shared with me to add these columns, which should be a straightforward procedure. I assume at some point we would want to keep a single file or some database keeping track of all NERC PIs in the past. I'll leave the design of that file or database to @knikolla
  2. We either have the option of adding hard coded rules in the invoice processing script to determine the institution of each PI, or to obtain each PI's institution through some other, preferably pre-existing, source of data. I'll ask @knikolla if such a data source exists
  3. Fairly simple
  4. Fairly simple
  5. I'll look only possible Google Drive integrations in Python.

@joachimweyl
Copy link
Contributor Author

@QuanMPhm please note the update to the Institution filtering, of course, if you can find a way to gather Institution we don't have to do any of the filtering email addresses.

@knikolla
Copy link
Contributor

knikolla commented Apr 2, 2024

@joachimweyl I created nerc-project/coldfront-plugin-api#39 to have ColdFront return the email in all cases, so as to remove the username exceptions from your logic. But until we have a case where deducing the institution from the email address doesn't work, using the email address is the easiest path forward.

@QuanMPhm Additionally, I want to add that don't be constrained by Google Drive implementations and uploading the results to S3 storage is fine.

QuanMPhm pushed a commit to QuanMPhm/process_csv_report that referenced this issue Apr 2, 2024
Add openapi.yaml and update README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants