Problem Description
Our sampling methods assume that sample (for all models) and GaussianCopula never use reject sampling. This is incorrect: because of constraints, any sampling method for any model might need reject sampling.
We are using batch_size and batch_size_per_try as interchangeable concepts when they are opposites: batch_size is intended to be small (to control output and save progress), while batch_size_per_try is intended to be large (for reject sampling).
Expected behavior
Have the same parameters for all sampling methods (sample, sample_conditions and sample_remaining_columns) for all single table models.
batch_size, which will be used in the intended way: users should be able to split large sampling tasks into smaller batches
Requirement: batch_size <= total num_rows
Default: batch_size = total num_rows
max_tries_per_batch, which will be used when there is any kind of reject sampling involved
Default: 10
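The batch_size requirement and default above can be sketched as a small helper. This is only an illustration of the proposed behavior, not SDV's implementation: the function name plan_batches and the choice to raise ValueError are assumptions made for the example.

```python
def plan_batches(num_rows, batch_size=None):
    """Return the list of batch sizes that together cover num_rows.

    Hypothetical helper: the name and error handling are assumptions,
    not part of the SDV API.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be positive")

    # Proposed default: a single batch containing every requested row.
    if batch_size is None:
        batch_size = num_rows

    # Proposed requirement: batch_size <= total num_rows.
    if not 0 < batch_size <= num_rows:
        raise ValueError("batch_size must satisfy 0 < batch_size <= num_rows")

    # Split the task into full batches plus one smaller final batch if needed.
    full_batches, remainder = divmod(num_rows, batch_size)
    return [batch_size] * full_batches + ([remainder] if remainder else [])
```

With this sketch, plan_batches(10, 4) yields three batches of 4, 4 and 2 rows, so progress can be saved after each one.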
Logic
For every batch, for every try: dynamically update the number of rows we try to sample so that the full batch_size is reached.
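One way the per-batch logic could look is sketched below. It is a minimal illustration under stated assumptions, not the actual SDV code: draw and is_valid are hypothetical callables standing in for the model's raw sampler and the constraint check, and both the acceptance-rate scaling and the 10x cap on the request size are choices made for this example only.

```python
import itertools
import math


def sample_batch(draw, is_valid, batch_size, max_tries_per_batch=10):
    """Collect up to batch_size valid rows in at most max_tries_per_batch tries.

    draw(n) returns n candidate rows; is_valid(row) applies the constraints.
    Both are hypothetical stand-ins for illustration.
    """
    valid_rows = []
    total_drawn = 0
    num_to_draw = batch_size

    for _ in range(max_tries_per_batch):
        candidates = draw(num_to_draw)
        total_drawn += len(candidates)
        valid_rows.extend(row for row in candidates if is_valid(row))

        remaining = batch_size - len(valid_rows)
        if remaining <= 0:
            break

        # Dynamic update: size the next request from the acceptance rate so far.
        if valid_rows:
            acceptance_rate = len(valid_rows) / total_drawn
            num_to_draw = math.ceil(remaining / acceptance_rate)
        else:
            # Nothing accepted yet, so simply try a larger request.
            num_to_draw *= 2

        # Assumed safety cap so a very low acceptance rate cannot explode memory.
        num_to_draw = min(num_to_draw, 10 * batch_size)

    # May hold fewer than batch_size rows if every try was exhausted.
    return valid_rows[:batch_size]


# Toy usage: candidates are consecutive integers and only even ones are valid.
counter = itertools.count()
rows = sample_batch(
    draw=lambda n: [next(counter) for _ in range(n)],
    is_valid=lambda row: row % 2 == 0,
    batch_size=10,
)
```

In this toy run the first try accepts half of its candidates, so the second try requests enough rows to cover the shortfall at that rate. A caller would still need to decide what to do when a batch comes back short after the last try.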
TabularPreset