Pandas TypeError when trying to execute example from the getting started page #205

Closed
msgonzaga opened this issue Oct 15, 2020 · 3 comments
Labels
resolution:duplicate This issue or pull request already exists

Comments


msgonzaga commented Oct 15, 2020

  • SDV version: 0.4.5.dev (but I had the same issue with the 0.4.4 stable version)
  • Python version: 3.8
  • Operating System: Windows 10

Description

A TypeError is raised when executing the Getting Started example.

What I Did

I tried to execute the getting started example from the page https://sdv.dev/SDV/user_guides/single_table/gaussian_copula.html
and got a TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32].

from sdv.demo import load_tabular_demo
data = load_tabular_demo('student_placements')
from sdv.tabular import GaussianCopula
model = GaussianCopula()
model.fit(data)

This is the Exception traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-11-bcfd88bde046> in <module>
----> 1 model.fit(data)

~\.conda\envs\synthdatagen\lib\site-packages\sdv-0.4.5.dev0-py3.8.egg\sdv\tabular\base.py in fit(self, data)
    102                      self._metadata.name, data.shape)
    103         if not self._metadata_fitted:
--> 104             self._metadata.fit(data)
    105 
    106         self._num_rows = len(data)

~\.conda\envs\synthdatagen\lib\site-packages\sdv-0.4.5.dev0-py3.8.egg\sdv\metadata\table.py in fit(self, data)
    461 
    462         LOGGER.info('Fitting HyperTransformer for table %s', self.name)
--> 463         self._fit_hyper_transformer(constrained, extra_columns)
    464         self.fitted = True
    465 

~\.conda\envs\synthdatagen\lib\site-packages\sdv-0.4.5.dev0-py3.8.egg\sdv\metadata\table.py in _fit_hyper_transformer(self, data, extra_columns)
    362         transformers_dict = self._get_transformers(dtypes)
    363         self._hyper_transformer = rdt.HyperTransformer(transformers=transformers_dict)
--> 364         self._hyper_transformer.fit(data[list(dtypes.keys())])
    365 
    366     @staticmethod

~\.conda\envs\synthdatagen\lib\site-packages\rdt\hyper_transformer.py in fit(self, data)
    135         for column_name, transformer in self._transformers.items():
    136             column = data[column_name]
--> 137             transformer.fit(column)
    138 
    139     def transform(self, data):

~\.conda\envs\synthdatagen\lib\site-packages\rdt\transformers\datetime.py in fit(self, data)
     55             data = pd.Series(data)
     56 
---> 57         transformed = self._transform(data)
     58         self.null_transformer = NullTransformer(self.nan, self.null_column)
     59         self.null_transformer.fit(transformed)

~\.conda\envs\synthdatagen\lib\site-packages\rdt\transformers\datetime.py in _transform(datetimes)
     40         nulls = datetimes.isnull()
     41         integers = np.zeros(len(datetimes))
---> 42         integers[~nulls] = datetimes[~nulls].astype(int).astype(float).values
     43         integers[nulls] = np.nan
     44 

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5544         else:
   5545             # else, only a single dtype is given
-> 5546             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors,)
   5547             return self._constructor(new_data).__finalize__(self, method="astype")
   5548 

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    593         self, dtype, copy: bool = False, errors: str = "raise"
    594     ) -> "BlockManager":
--> 595         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    596 
    597     def convert(

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, align_keys, **kwargs)
    404                 applied = b.apply(f, **kwargs)
    405             else:
--> 406                 applied = getattr(b, f)(**kwargs)
    407             result_blocks = _extend_blocks(applied, result_blocks)
    408 

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
   2098 
   2099         # delegate
-> 2100         return super().astype(dtype=dtype, copy=copy, errors=errors)
   2101 
   2102     def _can_hold_element(self, element: Any) -> bool:

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    593             vals1d = values.ravel()
    594             try:
--> 595                 values = astype_nansafe(vals1d, dtype, copy=True)
    596             except (ValueError, TypeError):
    597                 # e.g. astype_nansafe can fail on object-dtype of strings

~\.conda\envs\synthdatagen\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    937             return arr.astype(dtype)
    938 
--> 939         raise TypeError(f"cannot astype a datetimelike from [{arr.dtype}] to [{dtype}]")
    940 
    941     elif is_timedelta64_dtype(arr):

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

Thanks!

Contributor

csala commented Oct 15, 2020

Hello @msgonzaga, thanks for reporting this.

I did a few tests and I am able to reproduce the error. However, it seems to happen only on Windows environments, which unfortunately we do not fully support yet.

On *nix environments like Mac or Linux there should be no problem. Are you able to switch to one of those?
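
A note on the likely cause, offered as a sketch and not as the maintainers' fix: the traceback shows `astype(int)` in `rdt/transformers/datetime.py` being resolved to `int32`. With NumPy versions before 2.0, the default integer on Windows is 32 bits wide, while on 64-bit Linux and macOS it is 64 bits, which would explain why the error appears only on Windows. Requesting `'int64'` explicitly avoids the platform dependence. The function name `datetimes_to_float` below is hypothetical; its body mirrors the `_transform` lines from the traceback with only the dtype changed.

```python
import numpy as np
import pandas as pd


def datetimes_to_float(datetimes: pd.Series) -> np.ndarray:
    """Convert a datetime64 Series to floats, keeping nulls as NaN.

    Hypothetical helper: same steps as the `_transform` code in the
    traceback, but with an explicit 64-bit integer dtype.
    """
    nulls = datetimes.isnull()
    integers = np.zeros(len(datetimes))
    # 'int64' is platform independent; plain `int` may mean int32 on Windows.
    integers[~nulls] = datetimes[~nulls].astype('int64').astype(float).values
    integers[nulls] = np.nan
    return integers


# Small check with one missing value; values are nanoseconds since the epoch.
sample = pd.Series(
    np.array(['2020-01-01', 'NaT', '1970-01-01'], dtype='datetime64[ns]')
)
result = datetimes_to_float(sample)
print(np.dtype(int))  # shows which width plain `int` maps to on this platform
print(result)
```

If `np.dtype(int)` prints `int32`, the environment is one where the original `astype(int)` call would request a 32-bit conversion.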

Author

msgonzaga commented Oct 15, 2020

@csala thanks for checking it out! Unfortunately, I'm not able to switch at this moment.

Contributor

csala commented Oct 22, 2020

I just opened issue #218 to keep track of this problem.
