DataFrame support and more!

@johntmyers johntmyers released this 16 Jun 00:14
· 252 commits to master since this release

Major changes to Gretel Synthetics including native support for DataFrames and batched column training!

⚙️ Introduce a batch module that ingests a DataFrame and splits it into batches of smaller DataFrames, where each batch holds a subset of the source DataFrame's columns. This makes it possible to train on datasets with many columns while still preserving correlations and statistical properties. See our Medium blog post for details and our example dataframe_batch Notebook in the examples directory.
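To make the column-splitting idea concrete, here is a minimal sketch of it. It is not the batch module's actual API: `split_columns`, `batch_size`, and the dict-of-lists stand-in for a DataFrame are all assumptions made for illustration, chosen so the example runs without extra dependencies.

```python
from typing import Dict, List

# Hypothetical stand-in for a DataFrame: column name -> list of column values.
Table = Dict[str, List[object]]


def split_columns(table: Table, batch_size: int) -> List[Table]:
    """Split a table into smaller tables, each holding at most batch_size columns.

    Illustrative only; the real batch module may choose and order columns differently.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    columns = list(table)  # dicts preserve insertion order, so column order is kept
    batches = []
    for start in range(0, len(columns), batch_size):
        names = columns[start:start + batch_size]
        # Every batch keeps all rows but only a subset of the columns.
        batches.append({name: table[name] for name in names})
    return batches


source = {
    "id": [1, 2],
    "name": ["a", "b"],
    "city": ["x", "y"],
    "age": [30, 41],
    "score": [0.5, 0.9],
}
batches = split_columns(source, batch_size=2)
batch_columns = [list(batch) for batch in batches]
```

With five columns and a hypothetical batch size of two, this produces three batches, the last holding the single leftover column.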

📖 Massive updates to the docstrings in the config module, with details for each config parameter.

🤖 Update to generation functionality. If a validator is provided, the gen_lines config option counts only the valid lines that are generated. To stop runaway generation, a max_invalid parameter specifies the maximum number of invalid lines that can be generated. If this number is exceeded, a RuntimeError is raised and generation halts.
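The counting behaviour described above can be sketched roughly as follows. This is an illustration of the logic only, not the library's implementation: `collect_valid_lines`, `is_two_ints`, and the list of candidate lines are hypothetical, while `gen_lines` and `max_invalid` mirror the option names from the notes.

```python
from typing import Callable, Iterable, List


def collect_valid_lines(
    candidates: Iterable[str],
    validator: Callable[[str], bool],
    gen_lines: int,
    max_invalid: int,
) -> List[str]:
    """Collect lines until gen_lines valid ones are found.

    Invalid lines do not count toward gen_lines. Once more than max_invalid
    invalid lines have been seen, a RuntimeError stops the loop.
    """
    valid: List[str] = []
    invalid_count = 0
    if gen_lines <= 0:
        return valid
    for line in candidates:
        if validator(line):
            valid.append(line)
            if len(valid) == gen_lines:
                break
        else:
            invalid_count += 1
            if invalid_count > max_invalid:
                raise RuntimeError(
                    f"Exceeded max_invalid={max_invalid} invalid lines"
                )
    return valid


def is_two_ints(line: str) -> bool:
    """Hypothetical validator: accept lines made of exactly two comma-separated integers."""
    parts = line.split(",")
    return len(parts) == 2 and all(part.strip().isdigit() for part in parts)


# Stand-in for model output; a real generator would yield lines lazily.
sample_lines = ["1,2", "bad", "3,4", "oops", "5,6"]
result = collect_valid_lines(sample_lines, is_two_ints, gen_lines=2, max_invalid=1)
```

In this sketch, one invalid line is tolerated when max_invalid is 1, so two valid lines are returned. Setting max_invalid to 0 on the same input would raise a RuntimeError at the first invalid line.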