Feature/uniqueness row level results #471
…e columns with count() result in false
Note: Tests for
Resolved this issue by not serializing the full column field in metrics for now.
@@ -69,6 +71,10 @@ case class DoubleMetric(
  extends Metric[Double] with FullColumn {

  override def flatten(): Seq[DoubleMetric] = Seq(this)

  override def getMetricWithoutFullColumn: DoubleMetric = {
Is this still required?
Nope, removed.
@@ -513,7 +513,7 @@ private[deequ] object MetricSerializer extends JsonSerializer[Metric[_]] {
  result.addProperty("instance", doubleMetric.instance)
  result.addProperty("name", doubleMetric.name)
  result.addProperty("value", doubleMetric.value.getOrElse(null).asInstanceOf[Double])
  doubleMetric.fullColumn.foreach(c => result.addProperty("fullColumn", c.expr.sql))
  // doubleMetric.fullColumn.foreach(c => result.addProperty("fullColumn", c.expr.sql))
Maybe we can create an issue on GitHub as well and link it here?
Done.
src/main/scala/com/amazon/deequ/repository/fs/FileSystemMetricsRepository.scala
- Added row-level results for Uniqueness
- Modified tests to compare strings, because comparisons involving the same columns with count() result in false
- Removed serialization of fullColumn for now
Issue #, if available:
N/A
Description of changes:
Building on this PR to add more analyzers that support row level results.
This PR adds row-level results for the `Uniqueness` analyzer, which is used by several checks (`isUnique`, `isPrimaryKey`, `hasUniqueness`, `uniqueValueRatio`).
Row-level results for other `ScanShareableFrequencyBasedAnalyzers` will be added in a future PR.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
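For readers unfamiliar with the idea, here is a minimal, Spark-free sketch of what a row-level uniqueness result could look like. It is not the code in this PR: the object and method names (`RowLevelUniquenessSketch`, `rowLevelUniqueness`, `uniqueness`) are hypothetical, and it assumes a row counts as unique when its key occurs exactly once in the data set.

```scala
// Illustrative sketch only; names are hypothetical and not part of deequ.
object RowLevelUniquenessSketch {

  /** One flag per input row: true if that row's key occurs exactly once. */
  def rowLevelUniqueness[K](keys: Seq[K]): Seq[Boolean] = {
    // Frequency of each key across the whole data set.
    val counts: Map[K, Int] =
      keys.groupBy(identity).map { case (k, vs) => k -> vs.size }
    // Keep the original row order so flags line up with the input rows.
    keys.map(k => counts(k) == 1)
  }

  /** Aggregate metric under the assumption above: share of rows flagged unique. */
  def uniqueness(flags: Seq[Boolean]): Double =
    if (flags.isEmpty) Double.NaN
    else flags.count(identity).toDouble / flags.size

  def main(args: Array[String]): Unit = {
    val ids = Seq("a", "b", "b", "c")
    val flags = rowLevelUniqueness(ids)
    println(flags)             // List(true, false, false, true)
    println(uniqueness(flags)) // 0.5
  }
}
```

The per-row flags keep the input order, so they can be attached back to the source rows, while the aggregate value is derived from the same flags.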