Releases: capitalone/DataProfiler

0.8.3

10 Nov 20:50
79da96f

Profiler

  • Fix req missing for typing_extensions #698
  • Add profiler option for column level invalid values #704
  • Updated setup.cfg mypy flags and resolved related errors. #703

Documentation

  • Add Makefile to auto setup repo for developers #699
  • Add PSI documentation in README.md #709

Bugs

  • Fix bug with null replication metrics #702
  • PSI diff() #708
  • PSI diff() bug #707
  • Fix bug with null replication metrics when row is all null #706
  • PSI no calculation on TextProfile #711
  • PSI fixing unit tests #712

Other Changes

  • Includes mypy in pre-commit and fixes last needed updates #696
  • Quick Fix: Oxford Comma in README #697
  • Pre-Commit: Default setup.cfg flags #701
  • Updated setup.cfg with check-manifest #705
  • Updating the version to v0.8.3 #710
  • PSI blurb in Docs for 0.8.3 #714
  • Generate Docs for v0.8.3 #715

Full Changelog: 0.8.2.post1...0.8.3

0.8.2.post1

21 Oct 16:27
564ded0

Bugs

  • Fix req missing for typing_extensions #698

Full Changelog: 0.8.2...0.8.2.post1

0.8.2

19 Oct 16:36
d6bfde3

Profiler

  • added static typing to data_utils.py #662
  • Added static typing to *_data classes in data_readers #677
  • Adding the types of parameters and returns of functions #681
  • Added static typing to data.py and filepath_or_buffer.py #682
  • Fix typing and missing types #684
  • Fix typing errors and missing return types #692

Documentation

  • Move contribute info to CONTRIBUTING.md #683
  • Fix typos, remove (unintended?) indentation #690

Bugs

  • Quick Fix #680
  • Fix matplotlib version requirements param #686
  • fix JSON bug with data reading #691
  • Includes mypy in pre-commit and fixes last needed updates #696

Other Changes

  • Update ghworkflow actions to use pre-commit #687
  • Updating the version to v0.8.2 #694
  • Generate Docs for v0.8.2 #695

Full Changelog: 0.8.1...0.8.2

0.8.1

05 Oct 19:38
855544a

Profiler

  • Added static typing to data_readers/avro_data.py #657
  • Added static typing to data_readers/structured_mixins.py #659
  • Static Typing profiler #660
  • Static Typing profilers/column profile #661
  • Static typing for profilers #663
  • Add static typing to data labeler and abstract classes #664
  • Add static typing to labeler utils #668
  • Allow diff to set format options for prepare report #669
  • Replace dict constructor with dict comprehension #676

Documentation

  • fix: nb issues being valid #655
  • Add logo readme #670

Bugs

  • fix: nb issues being valid #655
  • Fix bug with loading data labeler from disk #665
  • Fixes bug with empty data for DataLabeler Col #667
  • Fixes 0 variance in a dataset #671

Other Changes

  • Updating the version to v0.8.1 #675
  • Generate Docs for v0.8.1 #679

Full Changelog: 0.8.0...0.8.1

0.8.0

20 Sep 21:15
a432ef9

Profiler

  • DataProfiler: hotfix for handling nan values in diff #647
  • Static Typing profilers/profiler_options.py #644
  • refactor: validate parameters and the returns of functions #640
  • Preset option in ProfileOptions #638
  • GraphProfiler: add() NotImplementedError #636
  • ColumnNameLabeler Setup #635
  • Fix for issue #605 #634
  • GraphProfiler: diff() functionality #631
  • Post Processor for ColumnNameModel #629
  • Graph Profiler: save() and load() Functionality #628
  • Quick Add: require_module in ColumnNameModel test #627
  • New Data Labeler: ColumnNameModel Build #626

Documentation

  • fix: nb issues being valid #655
  • fix: remove extra commas #654
  • DataProfiler: structured_profilers example fix #653
  • DataProfiler: graph_data_demo update #649
  • ColumnNameLabeler Notebook Example #646
  • Update documentation README for gh-pages branch #619
  • README unit testing docs #632
  • Notebook Examples for DP + GE: expect_profile_numeric_columns_percent #625
  • Notebook Examples for DP + GE: expect_profile_numeric_columns_diff #624
  • Notebook Examples for DP + GE: expect_column_values_vs_profile #623
  • Notebook Examples for DP + GE: expect_column_value_confidence #622

Other Changes

  • Updating the version to v0.8.0 #652
  • Generate Docs for v0.8.0 #656

Full Changelog: 0.7.11...0.8.0

0.7.11

22 Aug 18:53
511d394

Profiler

  • Fixes GraphProfiler loading directly with class #602
  • Graph loading bug #603
  • Hot Fix: Typo in Graph Docstring #611
  • fixes profile_schema bug for unnamed columns #612
  • Include compact in fix for profile schema serialization #613

Documentation

  • Add example notebook for graph data input #594
  • Fix version for 0.7.10 #596
  • Update ReadMe to include Graph Profiler #597
  • lower --> upper #598
  • Add Graph Github Pages #599
  • Graph Pipeline Demo notebook header fix #600
  • Add GraphData to data_reader notebook #601
  • formatting issue for 0.7.11 docs #617

Other Changes

  • Updating the version to v0.7.11 #615
  • Generate Docs for v0.7.11 #616

Full Changelog: 0.7.10...0.7.11

v0.7.10

09 Aug 15:37
ca5d32e

Profiler

  • Add Graph Profiler Update to Profile Builder #587
  • HOT FIX: black formatting issue #588
  • Implement GraphData/GraphProfiler into DataProfiler Pipeline #581
  • fix: make prepare report keys json serializable #580
  • Add property array to continuous distribution profile in GraphProfiler/ Integer Node ID identified as Integer in GraphData #579
  • Reformatted codebase using flake8. #578
  • SyntheticDataOptions and metrics calculations for NaN replication #571
  • Reformatted batch 4 (see comment) using flake8. #569
  • Reformatted batch #3 (see comment) using flake8. #567
  • Reformatted batch 2 (see comment) with flake8. #566
  • Reformatted big batch #1 (see comment) using flake8. #565
  • Reformatted dataprofiler/data_readers/graph_data.py using flake8. #564
  • Reformatted histogram_utils.py using flake8. #563
  • Reformatted dataprofiler/profilers/profile_options.py using flake8. #562
  • Reformatted dataprofiler/profilers/numerical_column_stats.py using flake8. #561
  • Reformatted dataprofiler/profilers/unstructured_labeler_profile.py us… #558
  • Reformatted dataprofiler/profilers/int_column_profile.py using flake8. #557
  • Reformatted dataprofiler/labelers/base_data_labeler.py using flake8. #556
  • Reformatted init file of profilers/helpers/ using flake8. #555
  • Reformatted dataprofiler/profilers/unstructured_text_profile.py using flake8. #554
  • Reformatted dataprofiler/labelers/data_processing.py using flake8. #553
  • [UTILs] Adding top-level function for distributed merging of profiles #552
  • Reformatted dataprofiler/labelers/data_labelers.py using flake8. #551
  • Reformatted dataprofiler/dp_logging.py using flake8; changed first docstring in float, text, and order column profiler modules. #550
  • Reformatted dataprofiler/profilers/float_column_profiler.py and dataprofiler/profilers/text_column_profiler.py using flake8. #549
  • Reformatted dataprofiler/profilers/order_column_profile.py using flak… #548
  • Graph Profiler class to create a profile of an input graph #546
  • Calculate correlation only between selected columns #544
  • HOT FIX: resolving black and isort #538
  • Reformatted code using trailing-whitespace hook, excluding tests/data and speed_tests/data folders #537
  • Add data labeler tf loader for any cnn softmax model #532
  • Reformatted . using isort. #530
  • Reformatted dataprofiler/data_readers and dataprofiler/tests using bl… #529
  • Add data loader to GraphData class #528
  • Reformatted dataprofiler/labelers using black 22.3.0. #526
  • Reformatted dataprofiler/data_readers using black 22.3.0. #525
  • Reformatted resources/init.py using black 22.3.0. #524
  • Reformatted dataprofiler/init.py using black 22.3.0. #523
  • Reformatted dataprofiler/validators using black 22.3.0. #522
  • Reformatted dataprofiler/tests using black 22.3.0. #521
  • Reformatted dataprofiler/profilers using black 22.3.0. #520
  • Add a class to differentiate between Tabular and Graph CSV files #517
  • Reformatted dataprofiler/version.py using black. #515
  • Reformatted dataprofiler/profilers using black #513
  • Reformatted dataprofiler/reports using black. #512
  • Reformatted dataprofiler/tests using black. #511
  • Reformatted dataprofiler/validators using black. #510
  • Reformatted dataprofiler/dp_logging.py using black. #509
  • Add New preprocessor for using an encoding map #506

Documentation

  • fix: missing null_rep matrix info #592
  • Hot fix/notebook #586
  • HOT FIX: example notebook update #584
  • Added documentation for null_replication_metrics #583
  • Updated README to include null_replication_metrics #582
  • Adding Example for merge_profile_list #559
  • HOT FIX: update README and remove isort.cfg #536

Dependencies

  • Hot Fix: Resolving Version in Pre-Commit File #547
  • Added end-of-file-fixer hook, excluding tests/data folder from changes. #541
  • Added debug-statements pre-commit hook to yaml config file. #539

Other Changes

  • Generate Docs for 0.7.10 #593
  • Hot fix/version change #591
  • Revert "Generate Docs for 0.8.0" - > 0.7.9 #590
  • Tox #570
  • HOT FIX: update README and remove isort.cfg #536
  • Added yaml configuration file. #534
  • Reformatted setup.py using black. #514
  • Update Codeowners #507

Full Changelog: 0.7.9...0.7.10

What's Changed

  • Add New preprocessor for using an encoding map by @JGSweets in #506
  • Update Codeowners by @micdavis in #507
  • Reformatted dataprofiler/version.py using black. by @jakleh in #515
  • Reformatted setup.py using black. by @jakleh in #514
  • Reformatted dataprofiler/dp_logging.py using black. by @jakleh in #509
  • Reformatted dataprofiler/validators using black. by @jakleh in #510
  • Reformatted dataprofiler/reports using black. by @jakleh in #512
  • Reformatted dataprofiler/profilers using black by @jakleh in #513
  • Reformatted dataprofiler/tests using black. by @jakleh in #511
  • Reformatted dataprofiler/profilers using black 22.3.0. by @jakleh in #520
  • Reformatted dataprofiler/tests using black 22.3.0. by @jakleh in #521
  • Reformatted dataprofiler/validators using black 22.3.0. by @jakleh in #522
  • Reformatted dataprofiler/init.py using black 22.3.0. by @jakleh in #523
  • Reformatted resources/init.py using black 22.3.0. by @jakleh in #524
  • Reformatted dataprofiler/data_readers using black 22.3.0. by @jakleh in #525
  • Reformatted dataprofiler/labelers using black 22.3.0. by @jakleh in #526
  • Add a class to differentiate between Tabular and Graph CSV files by @MisterPNP in #517
  • Reformatted dataprofiler/data_readers and dataprofiler/tests using bl… by @jakleh in #529
  • Reformatted . using isort. by @jakleh in #530
  • Added yaml configuration file. by @jakleh in #534
  • HOT FIX: update README and remove isort.cfg by @taylorfturner in #536
  • Add data loader to GraphData class by @MisterPNP in #528
  • Add data labeler tf loader for any cnn softmax model by @JGSweets in #532
  • Reformatted code using trailing-whitespace hook, excluding tests/data and speed_tests/data folders by @jakleh in #537
  • HOT FIX: resolving black and isort by @taylorfturner in #538
  • Added debug-statements pre-commit hook to yaml config file. by @jakleh in #539
  • Added end-of-file-fixer hook, excluding tests/data folder from changes. by @jakleh in #541
  • Calculate correlation only between selected columns by @Ta7ar in #544
  • Hot Fix: Resolving Version in Pre-Commit File by @taylorfturner in #547
  • Reformatted dataprofiler/profilers/order_column_profile.py using flak… by @jakleh in #548
  • Reformatted dataprofiler/profilers/float_column_profiler.py and dataprofiler/profilers/text_column_profiler.py using flake8. by @jakleh in #549
  • Reformatted dataprofiler/dp_logging.py using flake8; changed first docstring in float, text, and order column profiler modules. by @jakleh in #550
  • Reformatted dataprofiler/labelers/data_labelers.py using flake8. by @jakleh in #551
  • Reformatted dataprofiler/labelers/data_processing.py using flake8. by @jakleh in #553
  • [UTILs] Adding top-level function for distributed merging of profiles by @taylorfturner in #552
  • Reformatted dataprofiler/profilers/unstructured_text_profile.py using flake8. by @jakleh in #554
  • Reformatted init file of profilers/helpers/ using flake8. by @jakleh in #555
  • Reformatted dataprofiler/labelers/base_data_labeler.py using flake8. by @jakleh in #556
  • Reformatted dataprofiler/profilers/int_column_profile.py using flake8. by @jakleh in #557
  • Graph Profiler class to create a profile of an input graph by @MisterPNP in #546
  • Reformatted dataprofiler/profilers/unstructured_labeler_profile.py us… by @jakleh in #558
  • Adding Example for merge_profile_list by @taylorfturner in #559
  • Reformatted dataprofiler/profilers/numerical_column_stats.py using flake8. by @jakleh in #561
  • Reformatted dataprofiler/profilers/profile_options.py using flake8. by @jakleh in #562
    ...

v0.7.9

28 Jun 20:48
6384586

Profiler

  • Numeric Stats Mixin: pop disabled elements from profile report #491
  • FloatColumn: enable pop when remove_disabled_flag set to True #493
  • TextColumn: enable pop when remove_disabled_flag set to True #494
  • TextProfiler: enable pop when remove_disabled_flag set to True #495
  • Profile Builder: add report for remove_disable_flag at the top level #496
  • Report functionality added to BaseColumnProfile #497
  • PR for column compiler reports #499

Documentation

  • Added Description Documentation #486
  • Added the statistic descriptions to GitHub pages #500
  • Updated the pip install ghpages documentation #501

Dependencies

  • build(deps): bump actions/setup-python from 2 to 4 #487

Other Changes

  • Updating the Version to v0.7.9 #489
  • Generate Docs for 0.7.9 #503

Full Changelog: 0.7.8...0.7.9

v0.7.8

07 Jun 18:26
1ff2bb6

Profiler

  • DateTimeColumn: Handle Datetime day suffixes #458

Bug fixes

  • Applying isort and resolving circular imports #453
  • Fixes overflow bug if moments are large #481

Other Changes

  • Github issues default assignees list #457
  • README updates #475
  • Add Python 3.10 to GHA #476
  • Configure Dependabot #477
  • Updating the Version to v0.7.8 #482
  • Documentation / Github Pages updated #483

Full Changelog: 0.7.7...0.7.8

v0.7.7

05 Apr 20:41
4cdf3f4

Bug fixes

  • Address int64 issue in numerical profiler diff #446

Other Changes

  • Add Python 3.9 to GHA #440
  • Updating the Version to v0.7.7 #450
  • Documentation / Github Pages updated #449 #451

Full Changelog: 0.7.6...0.7.7