Add detect methods to MultiTableMetadata #892

Closed
amontanez24 opened this issue Jul 8, 2022 · 0 comments · Fixed by #933
Problem Description

As a user, it would be very useful to have a way to detect a metadata template for each table and add it to my MultiTableMetadata.

Expected behavior

  • The detect methods only need to detect the sdtypes for each column.
  • Add detect_table_from_csv(table_name, filepath) method
    • Under the hood this should just call SingleTableMetadata.detect_from_csv(filepath)
    • Parameters
      • table_name: string that is the name of the table
      • filepath: string that is the full path to the csv file
    • Errors
      • If the metadata has already been detected, raise the following error
      Error: Metadata for table 'users' already exists. Specify a new table name or create a new MultiTableMetadata object for other data sources.
    • Should print the detected metadata as follows
>>> metadata = MultiTableMetadata()
>>> metadata.detect_table_from_csv(table_name='users', filepath='data/users.csv')

Detected metadata for table 'users':
{
  "columns": {
    "student_id": { "sdtype": "numerical" },
    "gender": { "sdtype": "categorical" },
    "gpa": { "sdtype": "numerical" },
    "age": { "sdtype": "numerical" },
    "education_level": { "sdtype": "categorical" }
  }
}
  • Add detect_table_from_dataframe(table_name, data) method
    • Under the hood this should just call SingleTableMetadata.detect_from_dataframe(data)
    • Parameters
      • table_name: string that is the name of the table
      • data: the dataframe
    • Errors
      • If the metadata has already been detected, raise the following error
      Error: Metadata for table 'users' already exists. Specify a new table name or create a new MultiTableMetadata object for other data sources.
    • Should also print the detected metadata
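
To make the expected behavior above concrete, here is a rough, dependency-free sketch. It is not the SDV implementation. The class names `SingleTableMetadataSketch` and `MultiTableMetadataSketch`, the helper `detect_from_columns`, the number-vs-text detection rule, and the use of `ValueError` are all assumptions made for illustration. The "dataframe" is stood in for by a plain dict of column name to list of values, so the example runs without pandas.

```python
import csv
import json


class SingleTableMetadataSketch:
    """Hypothetical stand-in for SingleTableMetadata; only sdtypes are detected."""

    def __init__(self):
        self.columns = {}

    @staticmethod
    def _is_number(value):
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    def detect_from_columns(self, columns):
        # Assumed rule: a column is numerical if every non-empty value parses
        # as a float, otherwise categorical.
        for name, values in columns.items():
            present = [v for v in values if v not in (None, "")]
            numeric = bool(present) and all(self._is_number(v) for v in present)
            self.columns[name] = {"sdtype": "numerical" if numeric else "categorical"}

    def detect_from_csv(self, filepath):
        with open(filepath, newline="") as handle:
            reader = csv.DictReader(handle)
            rows = list(reader)
            names = reader.fieldnames or []
        self.detect_from_columns({n: [row[n] for row in rows] for n in names})

    def to_dict(self):
        return {"columns": self.columns}


class MultiTableMetadataSketch:
    """Hypothetical sketch of the two detect methods requested in this issue."""

    def __init__(self):
        self.tables = {}

    def _check_new(self, table_name):
        # The issue only gives the message; ValueError is an assumed exception type.
        if table_name in self.tables:
            raise ValueError(
                f"Metadata for table '{table_name}' already exists. Specify a new "
                "table name or create a new MultiTableMetadata object for other "
                "data sources."
            )

    def _register(self, table_name, table):
        self.tables[table_name] = table
        print(f"Detected metadata for table '{table_name}':")
        print(json.dumps(table.to_dict(), indent=2))

    def detect_table_from_csv(self, table_name, filepath):
        self._check_new(table_name)
        table = SingleTableMetadataSketch()
        table.detect_from_csv(filepath)  # delegate to the single-table detector
        self._register(table_name, table)

    def detect_table_from_dataframe(self, table_name, data):
        self._check_new(table_name)
        table = SingleTableMetadataSketch()
        table.detect_from_columns(data)  # dict of columns stands in for a dataframe
        self._register(table_name, table)


metadata = MultiTableMetadataSketch()
metadata.detect_table_from_dataframe(
    table_name="users",
    data={"student_id": [1, 2], "gender": ["F", "M"], "gpa": [3.5, 3.9]},
)
```

Both methods share the duplicate-name check and the printing step, so the only difference between them is which single-table detector they delegate to. Calling either method a second time with `table_name="users"` raises the error quoted above.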
amontanez24 added the feature request label Jul 8, 2022
amontanez24 added this to the 1.0.0 milestone Aug 16, 2022
amontanez24 self-assigned this Mar 24, 2023