Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(connector):from_key parameter validation #407

Merged
merged 2 commits into from
Dec 31, 2020

Conversation

pallavibharadwaj
Copy link
Contributor

@pallavibharadwaj pallavibharadwaj commented Nov 3, 2020

The PR is addressing two issues:

1. Parameter validation for template params:

Parameter checking happens in dataprep connector.py where we check for keys in allowed_params. This check had to include the from_key to accommodate for the template params.

Also Fixed: Finnhub config files had the template keys mentioned differently (the fromKey was being passed as part of the query). This has been fixed now on the DataConnector repo along with the corresponding changes in the DBLP config files. [Needs merge of the corresponding PR in the Data Connector repo.]

Testing:
fromKey tested in both DBLP and Finnhub.
The following was throwing an error because first_name and last_name are not directly specified as params in the config file.
df = await dc.query("publication", first_name="Jian", last_name="Pei")

Selection_050

Consequence:
In the config files, the fromKey in the template param FieldDef is now to be specified as a List instead of string containing all the params the template needs.
Ex:
"fromKey": ["first_name", "last_name"]
Note: Single parameter for template should have the key specified an array of one element.


2. Reducing Dependency on Requests Library [#396]:

Replacing the python requests library with our own implementation serving as a wrapper around the standard http.client library

Checklist:

  • My code follows the style guidelines of this project
  • I have already squashed the commits and make the commit message conform to the project standard.
  • I have already marked the commit with "BREAKING CHANGE" or "Fixes #" if needed.
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Copy link
Contributor

@peiwangdb peiwangdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for us to write our own request function in a separate utils file and make the interface similar to the original request function, instead of substituting the code in all the places? @pallavibharadwaj

@pallavibharadwaj
Copy link
Contributor Author

Is it possible for us to write our own request function in a separate utils file and make the interface similar to the original request function, instead of substituting the code in all the places? @pallavibharadwaj

Is this for the requests library substitution? Yes, sure. I'll work on that.

@peiwangdb
Copy link
Contributor

@pallavibharadwaj would you fix the check failures?

@pallavibharadwaj
Copy link
Contributor Author

pallavibharadwaj commented Dec 1, 2020

@pallavibharadwaj would you fix the check failures?

@peiwangdb The check has failed for code formatting with connector.py that does not include my changes. Re-formatting the file will result in a lot of code change in this file. I did not do it so it will be easier to review the particular changes pertaining to this issue. Do you want me to go ahead and reformat the whole file?

@pallavibharadwaj pallavibharadwaj force-pushed the develop branch 3 times, most recently from 255786f to bd63dc5 Compare December 1, 2020 17:03
@peiwangdb
Copy link
Contributor

@pallavibharadwaj thanks for the information. I am not sure why this formatting issue emerged. Do you have an idea who made the changes? If it is not a hard fix, could you open a new PR to solve the issue and then do a rebase? I can quickly approve of it.

@nick-zrymiak
Copy link

@pallavibharadwaj would you fix the check failures?

@peiwangdb The check has failed for code formatting with connector.py that does not include my changes. Re-formatting the file will result in a lot of code change in this file. I did not do it so it will be easier to review the particular changes pertaining to this issue. Do you want me to go ahead and reformat the whole file?

What would be the alternative to this?

@pallavibharadwaj pallavibharadwaj force-pushed the develop branch 3 times, most recently from ceee2c1 to 8850fd8 Compare December 9, 2020 10:46
@pallavibharadwaj
Copy link
Contributor Author

@peiwangdb @nick-zrymiak

  1. It's a simple fix and I have re-formatted the connector file along with all the other files that needed reformatting as a separate PR refactor(dataprep): Reformatting all dataprep files #450

  2. There is one other issue that I had discussed on Discord regarding security warning with HTTPSConnection on older versions of Python (prior to 2.7.9 and 3.4.3). We have to make a choice here depending on what versions of Python we support for Dataprep. If we support later than the above-mentioned versions, we can probably ignore this warning.

https_security

@nick-zrymiak
Copy link

@peiwangdb @nick-zrymiak

  1. It's a simple fix and I have re-formatted the connector file along with all the other files that needed reformatting as a separate PR refactor(dataprep): Reformatting all dataprep files #450
  2. There is one other issue that I had discussed on Discord regarding security warning with HTTPSConnection on older versions of Python (prior to 2.7.9 and 3.4.3). We have to make a choice here depending on what versions of Python we support for Dataprep. If we support later than the above-mentioned versions, we can probably ignore this warning.

I've noticed the cutoff for other large Python libraries is 3.6 so may be fine.

There's also this paragraph from an article that came out Feb 6, 2019 so the numbers may be more lopsided now:

"Regarding Python 2 vs. Python 3 adoption, the survey shows 84 percent using Python 3 and 16 percent still using Python 2. Among the Python 3 users, 54 percent are using Python 3.6 and 30 percent Python 3.7, with the rest split among other versions."

On the other hand there may be a substitute for HTTPSConnection but I'm not too sure.

@peiwangdb what do you think?

@pallavibharadwaj pallavibharadwaj force-pushed the develop branch 8 times, most recently from 9bc6572 to 2c2c03e Compare December 17, 2020 19:48
@codecov
Copy link

codecov bot commented Dec 17, 2020

Codecov Report

Merging #407 (a799a6b) into develop (9b16020) will decrease coverage by 0.26%.
The diff coverage is 65.85%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #407      +/-   ##
===========================================
- Coverage    81.17%   80.90%   -0.27%     
===========================================
  Files           73       74       +1     
  Lines         5535     5625      +90     
===========================================
+ Hits          4493     4551      +58     
- Misses        1042     1074      +32     
Impacted Files Coverage Δ
dataprep/connector/schema/defs.py 52.91% <27.77%> (-4.73%) ⬇️
dataprep/connector/generator/generator.py 31.81% <37.50%> (+0.56%) ⬆️
dataprep/connector/connector.py 62.94% <52.94%> (-0.36%) ⬇️
dataprep/connector/utils.py 91.17% <91.17%> (ø)
dataprep/connector/config_manager.py 85.93% <100.00%> (+1.19%) ⬆️
dataprep/tests/connector/test_integration.py 100.00% <100.00%> (ø)
dataprep/eda/correlation/compute/nullivariate.py 98.18% <0.00%> (-0.91%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b16020...a799a6b. Read the comment docs.

@pallavibharadwaj
Copy link
Contributor Author

@peiwangdb @dovahcrow I have resolved the issues with the PR. Like I said before, there is still a security warning with HTTPSConnection on older versions of Python (prior to 2.7.9 and 3.4.3), which we discussed can be ignored.

nick-zrymiak
nick-zrymiak previously approved these changes Dec 18, 2020
Copy link

@nick-zrymiak nick-zrymiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well done!

@dovahcrow dovahcrow merged commit 9a8c6bf into sfu-db:develop Dec 31, 2020
devinllu pushed a commit to devinllu/dataprep that referenced this pull request Nov 9, 2021
feat(connector):from_key parameter validation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants