Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unions of PlutusData and BuiltinData #397

Merged
merged 5 commits into from
Aug 24, 2024
Merged

Conversation

SCMusson
Copy link
Contributor

@SCMusson SCMusson commented Aug 5, 2024

Targeting issue #367
The following would now compile

source_code = """
from dataclasses import dataclass
from typing import Dict, List, Union
from pycardano import Datum as Anything, PlutusData

@dataclass()
class A(PlutusData):
    CONSTR_ID = 0
    foo: int

def validator(x: Union[A, int, bytes, List[int], Dict[int, int]]) -> int:
    k: int = 0
    if isinstance(x, A):
        k = 5
    elif isinstance(x, bytes):
        k = 7
    elif isinstance(x, int):
        k = 6
    elif isinstance(x, List[int]):
        k = 8
    elif isinstance(x, Dict[int, int]):
        k = 9
    return k
"""

I think it can't currently distinguish between Dict[int, int] and Dict[int, bytes] for example. I'll work on that if you think this is heading in the correct direction. I also need to sit down and think of what edge cases might break this and write some more tests.

Comment on lines 422 to 435
bind_self = node.func.typ.typ.bind_self
bound_vs = sorted(list(node.func.typ.typ.bound_vars.keys()))
args = []
for a, t in zip(node.args, node.func.typ.typ.argtyps):
for i, (a, t) in enumerate(zip(node.args, node.func.typ.typ.argtyps)):
# now impl_from_args has been chosen, skip type arg
if (
hasattr(node.func, "orig_id")
and node.func.orig_id == "isinstance"
and i == 1
):
continue
assert isinstance(t, InstanceType)
# pass in all arguments evaluated with the statemonad
a_int = self.visit(a)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of assert isinstance(t, InstanceType) for type checking is not ideal for production code. If the assertion fails, it will raise an AssertionError, which may not provide sufficient context for debugging.

Recommendation: Replace the assert statement with a proper type check and raise a more informative exception if the type check fails. This will improve error handling and maintainability.

Comment on lines +97 to +161
def impl_from_args(self, args: typing.List[Type]) -> plt.AST:
if isinstance(args[1], IntegerType):
return OLambda(
["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
plt.Bool(True),
plt.Bool(False),
),
)
elif isinstance(args[1], ByteStringType):
return OLambda(
["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
plt.Bool(True),
),
)
elif isinstance(args[1], RecordType):
return OLambda(
["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(True),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
),
)
elif isinstance(args[1], ListType):
return OLambda(
["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(False),
plt.Bool(False),
plt.Bool(True),
plt.Bool(False),
plt.Bool(False),
),
)
elif isinstance(args[1], DictType):
return OLambda(
["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(False),
plt.Bool(True),
plt.Bool(False),
plt.Bool(False),
plt.Bool(False),
),
)
else:
raise NotImplementedError(
f"Only isinstance for byte, int, Plutus Dataclass types are supported"
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The impl_from_args method in the IsinstanceImpl class has a significant issue with maintainability and extensibility. The method uses a series of elif statements to handle different types, which can become cumbersome as more types are added. This approach is not scalable and makes the code harder to maintain.

Recommended Solution:
Refactor the method to use a dictionary mapping types to their corresponding OLambda implementations. This will make the code more modular and easier to extend in the future.

n.skip_next = True
return self.visit(BoolOp(And(), [n, ntc]))
else:
return ntc
try:
tc.func = self.visit(node.func)
except Exception as e:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try block at the end of the fragment catches a generic Exception, which can obscure the root cause of errors and make debugging difficult. Catching all exceptions is generally not recommended unless you re-raise the exception or handle it in a very specific way.

Recommended Solution: Catch more specific exceptions that you expect might be raised in this block. If you must catch all exceptions, consider logging the exception details before re-raising it.

@nielstron
Copy link
Contributor

I think it would be best to only allow joining Dict[Any, Any] and List[Any] in these unions, ideally making dict and list synonyms of this type.

If a user really knows what they are doing maybe an unsafe cast from dict to Dict[int, int] can then be allowed?

Also it's important that str is not allowed in these unions but this seems to be the case already.

What happens when a function has parameter type Union[A, int] and is passed an int/A (where the type of that variable was just int / A before)? Need to wrap in PlutusData in those cases

["x"],
plt.ChooseData(
OVar("x"),
plt.Bool(True),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the take on isinstance as a polymorphic impl.

Is this enough though to ensure some x is the correct record type and not just any record type? isinstance(x, B) should not always return true for x of type Union[A, B]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are both builtin and plutusData types in a union, then when checking plutusData it first checks its a record type and then invokes the old functionality of checking CONSTR_ID:

ntc = Compare(
left=Attribute(tc.args[0], "CONSTR_ID"),
ops=[Eq()],
comparators=[Constant(target_class.record.constructor)],
)
custom_fix_missing_locations(ntc, node)
ntc = self.visit(ntc)
ntc.typ = BoolInstanceType
ntc.typechecks = TypeCheckVisitor(self.allow_isinstance_anything).visit(tc)
if isinstance(tc.args[0].typ.typ, UnionType) and any(
[
isinstance(a, (IntegerType, ByteStringType, ListType, DictType))
for a in tc.args[0].typ.typ.typs
]
):
n = copy(node)
n.skip_next = True
return self.visit(BoolOp(And(), [n, ntc]))
else:
return ntc

bound_vars={
v: self.variable_type(v)
for v in externally_bound_vars(node)
if not v in ["List", "Dict"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither List or Dict appear in self.scopes (so self.variable_type will fail) but end up in externally_bound_vars. I'm not sure what the most 'correct' solution here is. I'm guessing there would be problems if someone tried to define there own class with the name Test and Dict.

@SCMusson
Copy link
Contributor Author

SCMusson commented Aug 7, 2024

I've just realised too that isinstance(x, List[int]) and isinstance(x, Dict[int, int]) is not valid python.
The following would be thrown:

TypeError: Subscripted generics cannot be used with class and instance checks

So would the following be what you are looking for:

source_code = """
from dataclasses import dataclass
from typing import Dict, List, Union
from pycardano import Datum as Anything, PlutusData

@dataclass()
class A(PlutusData):
    CONSTR_ID = 0
    foo: int

def validator(x: Union[A, int, bytes, List[Anything], Dict[Anything, Anything]]) -> int:
    k: int = 0
    if isinstance(x, A):
        k = 5
    elif isinstance(x, bytes):
        k = 7
    elif isinstance(x, int):
        k = 6
    elif isinstance(x, List):
        k = 8
    elif isinstance(x, Dict):
        k = 9
    return k
"""

Alternatively working towards using isinstance(x, list) and isinstance(x, dict) instead or as well as isinstanec(x, List) and isinstance(x, Dict).

Let me know what you think.

Comment on lines +385 to +393
assert (
isinstance(elt.slice, Name)
and elt.slice.orig_id == "Anything"
), f"Only List[Anything] is supported in Unions. Received List[{elt.slice.orig_id}]."
if isinstance(elt, Subscript) and elt.value.id == "Dict":
assert all(
isinstance(e, Name) and e.orig_id == "Anything"
for e in elt.slice.elts
), f"Only Dict[Anything, Anything] or Dict is supported in Unions. Received Dict[{elt.slice.elts[0].orig_id}, {elt.slice.elts[1].orig_id}]."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of assert statements for type checking and validation can be problematic in production code. Assertions can be disabled with the -O (optimize) flag when running Python, which means these checks might not be executed, potentially leading to unexpected behavior or security issues.

Recommended Solution: Replace assert statements with explicit exception handling, such as raising TypeError or ValueError with appropriate error messages.

@nielstron
Copy link
Contributor

Nice catch! If isinstance(x, Dict) is valid python that should be fine, otherwise isinstance(x, dict) would be required.

Comment on lines 456 to 458
assert (
func.returns is None or func.returns.id != n.name
), "Invalid Python, class name is undefined at this stage"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of assert statements for type checking and validation can be problematic in production code. Assertions can be disabled with the -O (optimize) flag when running Python, which means these checks might not be executed, potentially leading to unexpected behavior or security issues.

Recommended Solution: Replace assert statements with explicit exception handling, such as raising TypeError or ValueError with appropriate error messages.

Comment on lines 1023 to 1025
except Exception:
# if this fails raise original error
raise e

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try block catches a generic Exception, which can obscure the root cause of errors and make debugging difficult. Catching all exceptions is generally not recommended unless you re-raise the exception or handle it in a very specific way.

Recommended Solution: Catch more specific exceptions that you expect might be raised in this block. If you must catch all exceptions, consider logging the exception details before re-raising it.

@SCMusson
Copy link
Contributor Author

I've had some free time so I've come back to this.
The following should run:

source_code = """
from dataclasses import dataclass
from typing import Dict, List, Union
from pycardano import Datum as Anything, PlutusData

@dataclass()
class A(PlutusData):
    CONSTR_ID = 0
    foo: int

def validator(x: Union[A, int, bytes, List[Anything], Dict[Anything, Anything]]) -> int:
    k: int = 0
    if isinstance(x, A):
        k = 5
    elif isinstance(x, bytes):
        k = 7
    elif isinstance(x, int):
        k = 6
    elif isinstance(x, List):
        k = 8
    elif isinstance(x, Dict):
        k = 9
    return k
"""

Only List[Anything] and Dict[Anything, Anything] are accepted in a Union now.
It will raise an error if List or Dict are subscripts inside isinstance to align with actual python.
Looks like python 3.11 CI errored because of timings not because of bugs from what I'm reading.

Copy link
Contributor

@nielstron nielstron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now. Just some small comments.

return 6
elif isinstance(x, bytes):
return 7
elif isinstance(x, list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to make really sure that Python and OpShin map to the same operaton here (i.e. that List is allowed for isinstance)

Suggested change
elif isinstance(x, list):
elif isinstance(x, List):

return 7
elif isinstance(x, list):
return 8
elif isinstance(x, dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elif isinstance(x, dict):
elif isinstance(x, Dict):

@OpShin OpShin deleted a comment from codereviewbot-ai bot Aug 23, 2024
@SCMusson
Copy link
Contributor Author

SCMusson commented Aug 23, 2024

Implemented those requested changes.
Also test for failure with Bool, Str, and duplicate constr_ids in Unions.
I added a check for duplicate CONSTR_IDs in Unions.

@OpShin OpShin deleted a comment from codereviewbot-ai bot Aug 24, 2024
@OpShin OpShin deleted a comment from codereviewbot-ai bot Aug 24, 2024
@OpShin OpShin deleted a comment from codereviewbot-ai bot Aug 24, 2024
@nielstron nielstron merged commit e927d1d into OpShin:dev Aug 24, 2024
2 of 3 checks passed
@nielstron
Copy link
Contributor

As we have discovered a few missing edge cases here (see #403) I will not be able to pay out the full bug bounty for this PR. If @SCMusson is willing to supply the additional fixes I would simply offer additional time to resolve this and claim the full reward, otherwise we will have to decide a reasonable split with another developer.

@SCMusson
Copy link
Contributor Author

SCMusson commented Aug 27, 2024

I plan to return to this problem however I will be short of available time for the next week after which I will be able to devote some time to fixing this.

@SCMusson
Copy link
Contributor Author

SCMusson commented Sep 3, 2024

I think this is now fixed in #403

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants