Releases: capitalone/DataProfiler
0.8.3
Profiler
- Fix req missing for typing_extensions #698
- Add profiler option for column level invalid values #704
- Updated setup.cfg mypy flags and resolved related errors. #703
Documentation
Bugs
- Fix bug with null replication metrics #702
- PSI diff() #708
- PSI diff() bug #707
- Fix bug with null replication metrics when row is all null #706
- PSI no calculation on TextProfile #711
- PSI fixing unit tests #712
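For context on the PSI entries in this release: PSI conventionally stands for Population Stability Index, a measure of how far one binned distribution has drifted from another. The sketch below shows the textbook formula only. It is not DataProfiler's implementation, and the function name `psi` and the `eps` floor for empty bins are assumptions made for illustration.

```python
import math
from typing import Sequence


def psi(
    expected_counts: Sequence[float],
    actual_counts: Sequence[float],
    eps: float = 1e-6,  # assumption: floor for empty bins to avoid log(0)
) -> float:
    """Textbook Population Stability Index over pre-binned counts."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both histograms must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("each histogram needs at least one observation")

    total = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Convert raw counts to bin proportions, floored at eps.
        expected_pct = max(expected / expected_total, eps)
        actual_pct = max(actual / actual_total, eps)
        total += (actual_pct - expected_pct) * math.log(actual_pct / expected_pct)
    return total


if __name__ == "__main__":
    # Identical histograms give 0; a shifted one gives a small positive value.
    print(psi([50, 50], [50, 50]))
    print(psi([50, 50], [40, 60]))
```

Each term is non-negative, so the result grows as the two histograms diverge. The formula needs numeric bins, so it does not apply to free-text columns.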
Other Changes
- Includes mypy in pre-commit and fixes last needed updates #696
- Quick Fix: Oxford Comma in README #697
- Pre-Commit: Default setup.cfg flags #701
- Updated setup.cfg with check-manifest #705
- Updating the version to v0.8.3 #710
- PSI blurb in Docs for 0.8.3 #714
- Generate Docs for v0.8.3 #715
Full Changelog: 0.8.2.post1...0.8.3
What's Changed
- Fix req missing for typing_extensions by @JGSweets in #698
- Pre-Commit: Default setup.cfg flags by @taylorfturner in #701
- Add Makefile to auto setup repo for developers by @tonywu315 in #699
- Quick Fix: Oxford Comma in README by @taylorfturner in #697
- Adding PSI to diff report by @taylorfturner in #688
- Updated setup.cfg with check-manifest by @Sanketh7 in #705
- Fix bug with null replication metrics by @tonywu315 in #702
- Add profiler option for column level invalid values by @tonywu315 in #704
- PSI diff() bug by @taylorfturner in #707
- PSI diff() by @taylorfturner in #708
- Fix bug with null replication metrics when row is all null by @tonywu315 in #706
- Add PSI documentation in README.md by @taylorfturner in #709
- Updated setup.cfg mypy flags and resolved related errors. by @Sanketh7 in #703
- Updating the version to v0.8.3 by @taylorfturner in #710
- PSI no calculation on TextProfile by @taylorfturner in #711
- PSI fixing unit tests by @taylorfturner in #712
0.8.2.post1
Bugs
- Fix req missing for typing_extensions #698
Full Changelog: 0.8.2...0.8.2.post1
0.8.2
Profiler
- added static typing to data_utils.py #662
- Added static typing to *_data classes in data_readers #677
- Adding the types of parameters and returns of functions #681
- Added static typing to data.py and filepath_or_buffer.py #682
- Fix typing and missing types #684
- Fix typing errors and missing return types #692
Documentation
Bugs
- Quick Fix #680
- Fix matplotlib version requirements param #686
- fix JSON bug with data reading #691
- Includes mypy in pre-commit and fixes last needed updates #696
Other Changes
- Update ghworkflow actions to use pre-commit #687
- Updating the version to v0.8.2 #694
- Generate Docs for v0.8.2 #695
Full Changelog: 0.8.1...0.8.2
What's Changed
- Add static typing to labeler models by @tonywu315 in #672
- Quick Fix by @taylorfturner in #680
- Add static typing to labelers/data_processing.py by @tonywu315 in #673
- Added static typing to data_readers/base_data.py and data_readers/json_data.py by @Sanketh7 in #666
- Added static typing to data.py and filepath_or_buffer.py by @Sanketh7 in #682
- Move contribute info to CONTRIBUTING.md by @tonywu315 in #683
- Fix typing and missing types by @tonywu315 in #684
- Fix matplotlib version requirements param by @taylorfturner in #686
- Fix typos, remove (unintended?) indentation by @bencomp in #690
- fix JSON bug with data reading by @JGSweets in #691
- Fix typing errors and missing return types by @tonywu315 in #692
- Adding the types of parameters and returns of functions by @stefanycoimbra in #681
- Update ghworkflow actions to use pre-commit by @JGSweets in #687
- Added static typing to *_data classes in data_readers by @Sanketh7 in #677
- added static typing to data_utils.py by @Sanketh7 in #662
- Updating the version to v0.8.2 by @taylorfturner in #694
- Includes mypy in pre-commit and fixes last needed updates by @JGSweets in #696
0.8.1
Profiler
- Added static typing to data_readers/avro_data.py #657
- Added static typing to data_readers/structured_mixins.py #659
- Static Typing profiler #660
- Static Typing profilers/column profile #661
- Static typing for profilers #663
- Add static typing to data labeler and abstract classes #664
- Add static typing to labeler utils #668
- Allow diff to set format options for prepare report #669
- Replace dict constructor with dict comprehension #676
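For context on #676, here is a minimal illustration of the two equivalent forms. The variable names are hypothetical and not taken from the DataProfiler codebase.

```python
# Hypothetical sample data, not from the DataProfiler codebase.
pairs = [("mean", 2.0), ("stddev", 0.5)]

# dict() constructor fed a generator of (key, value) tuples.
via_constructor = dict((name, value * 2) for name, value in pairs)

# Equivalent dict comprehension: same result without building tuples.
via_comprehension = {name: value * 2 for name, value in pairs}

print(via_constructor == via_comprehension)
```

Both produce the same mapping. The comprehension is the more idiomatic spelling.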
Documentation
Bugs
- fix: nb issues being valid #655
- Fix bug with loading data labeler from disk #665
- Fixes bug with empty data for DataLabeler Col #667
- Fixes 0 variance in a dataset #671
Other Changes
Full Changelog: 0.8.0...0.8.1
What's Changed
- Static Typing profilers/utils.py by @tonywu315 in #630
- Static Typing for Base Column Primitive Type Profilers by @tonywu315 in #645
- Added static typing to data_readers/structured_mixins.py by @Sanketh7 in #659
- Static Typing profilers/profile_builder.py by @tonywu315 in #643
- Static Typing profilers/numerical_column_stats.py by @tonywu315 in #648
- Static Typing profiler by @tonywu315 in #660
- Static Typing profilers/column profile by @tonywu315 in #661
- Added static typing to data_readers/avro_data.py by @Sanketh7 in #657
- Static typing for profilers by @tonywu315 in #663
- Add static typing to data labeler and abstract classes by @tonywu315 in #664
- Fix bug with loading data labeler from disk by @tonywu315 in #665
- Fixes bug with empty data for DataLabeler Col by @JGSweets in #667
- Allow diff to set format options for prepare report by @JGSweets in #669
- Add static typing to labeler utils by @tonywu315 in #668
- Add logo readme by @taylorfturner in #670
- Fixes 0 variance in a dataset by @JGSweets in #671
- Updating the version to v0.8.1 by @taylorfturner in #675
- Replace dict constructor with dict comprehension by @boneyag in #676
0.8.0
Profiler
- DataProfiler: hotfix for handling nan values in diff #647
- Static Typing profilers/profiler_options.py #644
- refactor: validate parameters and the returns of functions #640
- Preset option in ProfileOptions #638
- GraphProfiler: add() NotImplementedError #636
- ColumnNameLabeler Setup #635
- Fix for issue #605 #634
- GraphProfiler: diff() functionality #631
- Post Processor for ColumnNameModel #629
- Graph Profiler: save() and load() Functionality #628
- Quick Add: require_module in ColumnNameModel test #627
- New Data Labeler: ColumnNameModel Build #626
Documentation
- fix: nb issues being valid #655
- fix: remove extra commas #654
- DataProfiler: structured_profilers example fix #653
- DataProfiler: graph_data_demo update #649
- ColumnNameLabeler Notebook Example #646
- Update documentation README for gh-pages branch #619
- README unit testing docs #632
- Notebook Examples for DP + GE: expect_profile_numeric_columns_percent #625
- Notebook Examples for DP + GE: expect_profile_numeric_columns_diff #624
- Notebook Examples for DP + GE: expect_column_values_vs_profile #623
- Notebook Examples for DP + GE: expect_column_value_confidence #622
Other Changes
Full Changelog: 0.7.11...0.8.0
What's Changed
- Update TF / numpy reqs, drop py3.6 by @JGSweets in #614
- Notebook Examples for DP + GE: expect_column_value_confidence by @micdavis in #622
- Notebook Examples for DP + GE: expect_column_values_vs_profile by @micdavis in #623
- Notebook Examples for DP + GE: expect_profile_numeric_columns_diff by @micdavis in #624
- Notebook Examples for DP + GE: expect_profile_numeric_columns_percent by @micdavis in #625
- New Data Labeler: ColumnNameModel Build by @taylorfturner in #626
- Quick Add: require_module in ColumnNameModel test by @taylorfturner in #627
- Graph Profiler: save() and load() Functionality by @micdavis in #628
- README unit testing docs by @taylorfturner in #632
- GraphProfiler: diff() functionality by @micdavis in #631
- Fix for issue #605 by @vindhyanairlj in #634
- Preset option in ProfileOptions by @lovleen3112 in #638
- refactor: validate parameters and the returns of functions by @stefanycoimbra in #640
- GraphProfiler: add() NotImplementedError by @micdavis in #636
- Post Processor for ColumnNameModel by @taylorfturner in #629
- ColumnNameLabeler Setup by @taylorfturner in #635
- DataProfiler: hotfix for handling nan values in diff by @micdavis in #647
- ColumnNameLabeler Notebook Example by @taylorfturner in #646
- Static Typing profilers/profiler_options.py by @tonywu315 in #644
- DataProfiler: graph_data_demo update by @micdavis in #649
- Updating the version to v0.8.0 by @micdavis in #652
- DataProfiler: structured_profilers example fix by @micdavis in #653
- fix: remove extra commas by @JGSweets in #654
- fix: nb issues being valid by @JGSweets in #655
New Contributors
- @vindhyanairlj made their first contribution in #634
- @lovleen3112 made their first contribution in #638
- @stefanycoimbra made their first contribution in #640
- @tonywu315 made their first contribution in #644
0.7.11
Profiler
- Fixes GraphProfiler loading directly with class #602
- Graph loading bug #603
- Hot Fix: Typo in Graph Docstring #611
- fixes profile_schema bug for unnamed columns #612
- Include compact in fix for profile schema serialization #613
Documentation
- Add example notebook for graph data input #594
- Fix version for 0.7.10 #596
- Update ReadMe to include Graph Profiler #597
- lower --> upper #598
- Add Graph Github Pages #599
- Graph Pipeline Demo notebook header fix #600
- Add GraphData to data_reader notebook #601
- formatting issue for 0.7.11 docs #617
Other Changes
Full Changelog: 0.7.10...0.7.11
What's Changed
- Add example notebook for graph data input by @MisterPNP in #594
- Update ReadMe to include Graph Profiler by @MisterPNP in #597
- lower --> upper by @taylorfturner in #598
- Graph Pipeline Demo notebook header fix by @taylorfturner in #600
- Fixes GraphProfiler loading directly with class by @JGSweets in #602
- Graph loading bug by @JGSweets in #603
- Add GraphData to data_reader notebook by @taylorfturner in #601
- Hot Fix: Typo in Graph Docstring by @taylorfturner in #611
- fixes profile_schema bug for unnamed columns by @JGSweets in #612
- Include compact in fix for profile schema serialization by @JGSweets in #613
- Updating the version to v0.7.11 by @taylorfturner in #615
- formatting issue for 0.7.11 docs by @taylorfturner in #617
v0.7.10
Profiler
- Add Graph Profiler Update to Profile Builder #587
- HOT FIX: black formatting issue #588
- Implement GraphData/GraphProfiler into DataProfiler Pipeline #581
- fix: make prepare report keys json serializable #580
- Add property array to continuous distribution profile in GraphProfiler/ Integer Node ID identified as Integer in GraphData #579
- Reformatted codebase using flake8. #578
- SyntheticDataOptions and metrics calculations for NaN replication #571
- Reformatted batch 4 (see comment) using flake8. #569
- Reformatted batch #3 (see comment) using flake8. #567
- Reformatted batch 2 (see comment) with flake8. #566
- Reformatted big batch #1 (see comment) using flake8. #565
- Reformatted dataprofiler/data_readers/graph_data.py using flake8. #564
- Reformatted histogram_utils.py using flake8. #563
- Reformatted dataprofiler/profilers/profile_options.py using flake8. #562
- Reformatted dataprofiler/profilers/numerical_column_stats.py using flake8. #561
- Reformatted dataprofiler/profilers/unstructured_labeler_profile.py us… #558
- Reformatted dataprofiler/profilers/int_column_profile.py using flake8. #557
- Reformatted dataprofiler/labelers/base_data_labeler.py using flake8. #556
- Reformatted init file of profilers/helpers/ using flake8. #555
- Reformatted dataprofiler/profilers/unstructured_text_profile.py using flake8. #554
- Reformatted dataprofiler/labelers/data_processing.py using flake8. #553
- [UTILs] Adding top-level function for distributed merging of profiles #552
- Reformatted dataprofiler/labelers/data_labelers.py using flake8. #551
- Reformatted dataprofiler/dp_logging.py using flake8; changed first docstring in float, text, and order column profiler modules. #550
- Reformatted dataprofiler/profilers/float_column_profiler.py and dataprofiler/profilers/text_column_profiler.py using flake8. #549
- Reformatted dataprofiler/profilers/order_column_profile.py using flak… #548
- Graph Profiler class to create a profile of an input graph #546
- Calculate correlation only between selected columns #544
- HOT FIX: resolving black and isort #538
- Reformatted code using trailing-whitespace hook, excluding tests/data and speed_tests/data folders #537
- Add data labeler tf loader for any cnn softmax model #532
- Reformatted . using isort. #530
- Reformatted dataprofiler/data_readers and dataprofiler/tests using bl… #529
- Add data loader to GraphData class #528
- Reformatted dataprofiler/labelers using black 22.3.0. #526
- Reformatted dataprofiler/data_readers using black 22.3.0. #525
- Reformatted resources/__init__.py using black 22.3.0. #524
- Reformatted dataprofiler/__init__.py using black 22.3.0. #523
- Reformatted dataprofiler/validators using black 22.3.0. #522
- Reformatted dataprofiler/tests using black 22.3.0. #521
- Reformatted dataprofiler/profilers using black 22.3.0. #520
- Add a class to differentiate between Tabular and Graph CSV files #517
- Reformatted dataprofiler/version.py using black. #515
- Reformatted dataprofiler/profilers using black #513
- Reformatted dataprofiler/reports using black. #512
- Reformatted dataprofiler/tests using black. #511
- Reformatted dataprofiler/validators using black. #510
- Reformatted dataprofiler/dp_logging.py using black. #509
- Add New preprocessor for using an encoding map #506
Documentation
- fix: missing null_rep matrix info #592
- Hot fix/notebook #586
- HOT FIX: example notebook update #584
- Added documentation for null_replication_metrics #583
- Updated README to include null_replication_metrics #582
- Adding Example for merge_profile_list #559
- HOT FIX: update README and remove isort.cfg #536
Dependencies
- Hot Fix: Resolving Version in Pre-Commit File #547
- Added end-of-file-fixer hook, excluding tests/data folder from changes. #541
- Added debug-statements pre-commit hook to yaml config file. #539
Other Changes
- Generate Docs for 0.7.10 #593
- Hot fix/version change #591
- Revert "Generate Docs for 0.8.0" - > 0.7.9 #590
- Tox #570
- HOT FIX: update README and remove isort.cfg #536
- Added yaml configuration file. #534
- Reformatted setup.py using black. #514
- Update Codeowners #507
Full Changelog: 0.7.9...0.7.10
What's Changed
- Add New preprocessor for using an encoding map by @JGSweets in #506
- Update Codeowners by @micdavis in #507
- Reformatted dataprofiler/version.py using black. by @jakleh in #515
- Reformatted setup.py using black. by @jakleh in #514
- Reformatted dataprofiler/dp_logging.py using black. by @jakleh in #509
- Reformatted dataprofiler/validators using black. by @jakleh in #510
- Reformatted dataprofiler/reports using black. by @jakleh in #512
- Reformatted dataprofiler/profilers using black by @jakleh in #513
- Reformatted dataprofiler/tests using black. by @jakleh in #511
- Reformatted dataprofiler/profilers using black 22.3.0. by @jakleh in #520
- Reformatted dataprofiler/tests using black 22.3.0. by @jakleh in #521
- Reformatted dataprofiler/validators using black 22.3.0. by @jakleh in #522
- Reformatted dataprofiler/__init__.py using black 22.3.0. by @jakleh in #523
- Reformatted resources/__init__.py using black 22.3.0. by @jakleh in #524
- Reformatted dataprofiler/data_readers using black 22.3.0. by @jakleh in #525
- Reformatted dataprofiler/labelers using black 22.3.0. by @jakleh in #526
- Add a class to differentiate between Tabular and Graph CSV files by @MisterPNP in #517
- Reformatted dataprofiler/data_readers and dataprofiler/tests using bl… by @jakleh in #529
- Reformatted . using isort. by @jakleh in #530
- Added yaml configuration file. by @jakleh in #534
- HOT FIX: update README and remove isort.cfg by @taylorfturner in #536
- Add data loader to GraphData class by @MisterPNP in #528
- Add data labeler tf loader for any cnn softmax model by @JGSweets in #532
- Reformatted code using trailing-whitespace hook, excluding tests/data and speed_tests/data folders by @jakleh in #537
- HOT FIX: resolving black and isort by @taylorfturner in #538
- Added debug-statements pre-commit hook to yaml config file. by @jakleh in #539
- Added end-of-file-fixer hook, excluding tests/data folder from changes. by @jakleh in #541
- Calculate correlation only between selected columns by @Ta7ar in #544
- Hot Fix: Resolving Version in Pre-Commit File by @taylorfturner in #547
- Reformatted dataprofiler/profilers/order_column_profile.py using flak… by @jakleh in #548
- Reformatted dataprofiler/profilers/float_column_profiler.py and dataprofiler/profilers/text_column_profiler.py using flake8. by @jakleh in #549
- Reformatted dataprofiler/dp_logging.py using flake8; changed first docstring in float, text, and order column profiler modules. by @jakleh in #550
- Reformatted dataprofiler/labelers/data_labelers.py using flake8. by @jakleh in #551
- Reformatted dataprofiler/labelers/data_processing.py using flake8. by @jakleh in #553
- [UTILs] Adding top-level function for distributed merging of profiles by @taylorfturner in #552
- Reformatted dataprofiler/profilers/unstructured_text_profile.py using flake8. by @jakleh in #554
- Reformatted init file of profilers/helpers/ using flake8. by @jakleh in #555
- Reformatted dataprofiler/labelers/base_data_labeler.py using flake8. by @jakleh in #556
- Reformatted dataprofiler/profilers/int_column_profile.py using flake8. by @jakleh in #557
- Graph Profiler class to create a profile of an input graph by @MisterPNP in #546
- Reformatted dataprofiler/profilers/unstructured_labeler_profile.py us… by @jakleh in #558
- Adding Example for merge_profile_list by @taylorfturner in #559
- Reformatted dataprofiler/profilers/numerical_column_stats.py using flake8. by @jakleh in #561
- Reformatted dataprofiler/profilers/profile_options.py using flake8. by @jakleh in #562
...
v0.7.9
Profiler
- Numeric Stats Mixin: pop disabled elements from profile report #491
- FloatColumn: enable pop when remove_disabled_flag set to True #493
- TextColumn: enable pop when remove_disabled_flag set to True #494
- TextProfiler: enable pop when remove_disabled_flag set to True #495
- Profile Builder: add report for remove_disable_flag at the top level #496
- Report functionality added to BaseColumnProfile #497
- PR for column compiler reports #499
Documentation
- Added Description Documentation #486
- Added the statistic descriptions to GitHub pages #500
- Updated the pip install ghpages documentation #501
Dependencies
- build(deps): bump actions/setup-python from 2 to 4 #487
Other Changes
Full Changelog: 0.7.8...0.7.9
v0.7.8
Profiler
- DateTimeColumn: Handle Datetime day suffixes #458
Bug fixes
Other Changes
- Github issues default assignees list #457
- README updates #475
- Add Python 3.10 to GHA #476
- Configure Dependabot #477
- Updating the Version to v0.7.8 #482
- Documentation / Github Pages updated #483
Full Changelog: 0.7.7...0.7.8