ValueError when sampling PII columns #1445
Labels: bug (Something isn't working), feature:sampling (Related to generating synthetic data after a model is built)
Environment Details
Description
For PII columns only, it should be OK if the input `pandas.dtype` is not the same as the output `pandas.dtype`. For example, the input data may be all 0'd out (since it's sensitive). But I expect the synthetic data to contain strings, based on the `sdtype` that I have selected in the metadata (and ultimately the Faker that is used).
Steps to reproduce
Output:
Additional Context
For PII and ID columns only, we can wrap the cast in a try/except whenever we try to cast the data back to the original `dtype`. If the cast fails, just return the data without casting, and log an INFO message when this happens.
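The proposal above could look roughly like the sketch below. It is only an illustration under stated assumptions: `cast_to_original_dtype` and its `is_pii_or_id` flag are hypothetical names, not part of the library's API, and the real fix may need to sit wherever the sampled data is currently cast back.

```python
import logging

import pandas as pd

LOGGER = logging.getLogger(__name__)


def cast_to_original_dtype(column, dtype, is_pii_or_id):
    """Cast sampled values back to the input dtype.

    Hypothetical helper for illustration only. For PII and ID columns a
    failed cast is tolerated: the column is returned as-is and an INFO
    message is logged. For every other column the error still propagates.
    """
    try:
        return column.astype(dtype)
    except (ValueError, TypeError):
        if not is_pii_or_id:
            raise
        LOGGER.info(
            "Column '%s' could not be cast back to dtype '%s'; returning it uncast.",
            column.name,
            dtype,
        )
        return column


# The input column was zeroed out (int64), but the sampled values are
# Faker-style strings, so casting back to int64 cannot succeed.
sampled = pd.Series(['jane@example.com', 'joe@example.com'], name='email')
result = cast_to_original_dtype(sampled, 'int64', is_pii_or_id=True)
```

Here `result` keeps the string values instead of raising. Passing `is_pii_or_id=False` with the same data re-raises the `ValueError`, so the relaxed behaviour stays limited to PII and ID columns.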