Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement easy access query parameters and request body validation based on query parameters #363

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

kliwongan
Copy link
Contributor

@kliwongan kliwongan commented Jan 17, 2023

Description

This PR fixes #289 and may help with #351 (since msgspec encoding is apparently much faster than the standard json lib)
Query validation is optional, may find another way to indicate whether to verify (or not) instead of saving it in
the FunctionInfo, or force the change in general (when the "full" type-checking and schema validation is implemented)
The schema validation method itself supports nested classes and does not require the user to use external libraries.

The general idea of the fix is:
When a request is handled, use the Python GIL to run the steps below using the Python interpreter (possible bottleneck):

  1. Get the signature of the model using the features in the inspect module. The current PR supports custom user-defined (potentially nested) objects in the handler function signature.
  2. Generate a msgspec.Struct based model for schema validation based on either the member variables of the class (if it does not have a user-defined init constructor, or the constructor signature itself)
  3. Compare the model against the received request (by decoding the JSON result into the model object) to see if the schema validation fails; if it does, an InternalServer error is raised.
  4. If schema validation succeeds, we convert the object generated by the model validation into a dictionary and pass it to the handler function as kwargs and get the output of the handler function as usual

Further extensions:

  1. Implement process/thread-safe caching of Encoder and Decoder objects for the msgspec encode and decode calls. Since a call to either method requires some setup, multiple calls can be made faster if we use an Encoder/Decoder object for the respective calls.

@netlify
Copy link

netlify bot commented Jan 17, 2023

Deploy Preview for robyn canceled.

Name Link
🔨 Latest commit 437b7de
🔍 Latest deploy log https://app.netlify.com/sites/robyn/deploys/63f666736a810c0008d3e9ee

@kliwongan kliwongan changed the title Serialize Implement easy access query parameters and request body validation based on query parameters Jan 17, 2023
Copy link
Member

@sansyrox sansyrox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kliwongan 😄

The PR looks nice from my first glance ✨ I think I will require some time to go through it but I have one question for now

@@ -2,7 +2,6 @@
flake8==4.0.1
black==21.12b0
websockets==10.1
maturin==0.12.11
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you delete this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it was simply because installing from the dev-requirements caused an error, since the requirements.txt file already had maturin in it, and so there was a double reference to maturin. I had meant to revert it back before posting the PR and then consult someone about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kliwongan , I understand now. Maybe we can just keep maturin instead of maturin==0.12.11 ? That way we won't have conflicts

Copy link
Contributor Author

@kliwongan kliwongan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important to note: If possible, I intend to run the code in dependencies.py in Rust to avoid cyclical imports, however it is much easier to write in Python. If asked to do so, I will (somehow) replicate the code there in Rust without too much overhead and optimize as much as possible.
I am simply leaving it like this to demonstrate that certain features are possible, then porting over after I finalize the method in which I implement those features.

Copy link
Contributor Author

@kliwongan kliwongan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not need ForwardRefs, my current approach with user-defined classes is actually somewhat correct, and by using a simple print statement I can confirm that evaluate_forwardref is never run, but we can leave it in there for backwards compatibility for now?

Inferred from fastapi/fastapi#5461 and also PEP 563

// Perform query param validation
let request_hashmap = request.to_hashmap(py).unwrap();
let handler = function.handler.as_ref(py);
let robyn = py.import("robyn").unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to import robyn here?

@sansyrox
Copy link
Member

Also @kliwongan , apologies for these one liner changes. I will do a thorough review either tomorrow or on Saturday 😄

@AntoineRR
Copy link
Collaborator

Hi @kliwongan, and thanks for the PR !
I took a look at this too as it aims to bring an important new feature to Robyn. I agree with you that the current architecture isn't the best as it implies that we use Robyn's Python code in Robyn's Rust code. This could be an issue for separating Robyn's web server and Robyn's web framework in the future AFAIK (see #268), and also decrease performances. I would suggest indeed to do the validation in Rust directly, which will probably and unfortunately be much harder 😞 Maybe we could use or be inspired by projects like pydantic-core which aims at migrating pydantic core validation to Rust ?
I don't know if @sansyrox agrees with me, but should we rewrite the data validation in Rust, I could try to help you with it 🙂

@kliwongan
Copy link
Contributor Author

kliwongan commented Jan 20, 2023

Hi @kliwongan, and thanks for the PR ! I took a look at this too as it aims to bring an important new feature to Robyn. I agree with you that the current architecture isn't the best as it implies that we use Robyn's Python code in Robyn's Rust code. This could be an issue for separating Robyn's web server and Robyn's web framework in the future AFAIK (see #268), and also decrease performances. I would suggest indeed to do the validation in Rust directly, which will probably and unfortunately be much harder 😞 Maybe we could use or be inspired by projects like pydantic-core which aims at migrating pydantic core validation to Rust ? I don't know if @sansyrox agrees with me, but should we rewrite the data validation in Rust, I could try to help you with it 🙂

I have a buggy branch (not sure if it's my machine or not) that somewhat addresses this problem https://github.com/kliwongan/robyn/tree/buggy-serialize; I essentially store the "validator" function in the FunctionInfo and then call it when necessary; for some reason I am getting a lot of connection errors when running tests (pytest and sending requests using Postman) and the errors are inconsistent, does anyone mind helping me investigate?

EDIT: Buggy branch works now, merged and deleted.

Anyways, if required, I would also love to rewrite the query validation in Rust! I've seen FastAPI's code and I have a good idea on how to apply other kinds of dependency injection as well (at least on the Python side). I agree on perhaps taking inspiration from pydantic-core, it might be a good idea. We can perhaps mix it in with the "validator" concept I have, where the user can also use whatever model validation function they want, which would return the **kwargs that the handler needs. However, I'm not even sure if the query validation should be a part of the server or the framework

@kliwongan kliwongan requested a review from sansyrox January 20, 2023 23:25
Comment on lines 93 to 124

# Classes for testing
class Test():
f: int
g: int

class NestedCls():
f: Test
special: Optional[str] = "Nice"

class TestCtor:
a: int
b: str

def __init__(self, a: int, b: str = "Nice"):
self.a = a
self.b = b

class Nested:
c: int
d: str

class TestQueryType:
a: int
b: str

class TestForwardRef:
a: 'Ref'

class Ref:
a: int
b: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @kliwongan , can we move this to a different file in the integration_tests folder only?

I usually tend to keep the fixtures in the conftest file and other imports can live in other files

@@ -1,8 +1,7 @@
-r requirements.txt
flake8==4.0.1
black==21.12b0
websocket-client==1.4.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might have to rebase as this is needed for us and was added in one of the latest requests

@sansyrox
Copy link
Member

sansyrox commented Jan 21, 2023

Hey @kliwongan , thank you for your great work 😄 You PR shows a lot of hard work. Especially needing you to dive deep in Pydantic and FastAPI's codebase.

But as @AntoineRR(thanks for the review btw 😄 ) suggested, we would love to have the validation written in the Rust side. As it makes the performance slower and makes it harder to segregate the responsibilities in the codebase.

Why we are emphasizing this much on writing the validation in Rust side is because we believe that we do the hard things so that the developers using Robyn don't have to. This way we will be to provide the best experience.

We are more than happy to help you out with the implementation if need be. 😄

Having said that, great work so far! ✨

@AntoineRR
Copy link
Collaborator

Hi @kliwongan, do you still have time to work on this?

@kliwongan
Copy link
Contributor Author

Hi @kliwongan, do you still have time to work on this?

Yes, I'm starting to get a bit more free time now, I will restart working on it soon.

@sansyrox
Copy link
Member

sansyrox commented May 15, 2023

Hey @lecronium#4930 👋

So, I have the comments ready for the PR. I had this idea that we can allow multiple libraries for data validation. And if the there are any custom types(not the python default types), we can set validate=True by default unless someone overrides it to False.

e.g.

@app.post("/query_validation_complex", validate=True)
async def test_validation_complex(a: int, b: str, c: MsgSpecClass, d:PydanticClass):
    return jsonify({"a": a, "b": b, "c": c})

Also, we should make the data validation imports optional(maybe). Like we do with templating - To keep the dependencies as minimal as possible.

Let me know what you think of it 😄

@sansyrox
Copy link
Member

Hey @kliwongan 👋

Welcome back 😄

We have two updated requirements:

  • Allow request members to be accessed through keyword names

-e.g. currently you can do

@app.get('/')
def fx(request):
   ....

What I am aiming for is

@app.get('/')
def fx(request, body, query_params, json, ...):
   ....

And then add validation.

Ideally, I would want the validator to be plug-and-playable and off by default. i.e. the validation overhead should be present only when the developer explicitly turns it on.

And I would want to support both pydantic and msgspec, which can be configurable.

I believe your API implementation is quite generic, which is a great thing. Would love if we can fit the above requirements in the implementation.

@sansyrox sansyrox mentioned this pull request Oct 28, 2024
1 task
@sansyrox sansyrox force-pushed the main branch 5 times, most recently from 07f28de to 0b766c9 Compare January 7, 2025 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Implement type defined easy access query parameters.
4 participants