A dummy/marker writer component for gathering outcome of validation rules #942
Comments
We're considering different names instead of "Mark records as" (specifically the 'mark' word - discussed with @LosD) and we could consider:
More suggestions are very welcome.
Implemented a first draft of this extension in https://github.com/kaspersorensen/extension_annotate. Here's a screenshot:
What do you think, @LosD? I'm thinking of adding this to the ExtensionSwap (also to prepare for the workshop later this month, where we might need it).
Seems nice and simple... I guess we can continue discussing the name forever (my biggest worry with using any kind of "mark", "tag", "label" or "annotate" is that it could easily clash with some future feature where we actually DO mark or tag a record for inspection later in the chain, rather than more or less just count it). What's the plan for the result output?
I'm not sure if it would be feasible, or even desirable for your use case, but I imagine it could be nice to have a common result screen for all "mark" results, so that they could be compared directly; i.e. "20 invalid, 13 valid and 7 special", or whatever counts the user had decided would be interesting. They don't seem quite as interesting in a vacuum (unless the annotation part is interesting in itself for your use).
I think we should close this story, since the functionality (except for the part described in the last comment by @LosD) is delivered via the "Mark rows as..." component. As for the last remark: I would suggest that this is more of a concern for filters, and that there could be a kind of overview somewhere of all categorizations made throughout all components (especially filters, since they "direct" the flow somewhere).
Should we consider moving Mark Rows into main DC, then? We'll probably need to document it better if we do, though. For a component that is just an extension, it's pretty weird to have gotten several "but... what does it do?" reactions ☺
I believe it IS included in the default install of DC5! :-)
Ah, yes, you are right. But that is just in the commercial distribution. I'm thinking we should put it into the actual DC project. But that explains the many "huh?" reactions :)
OK with me to change the way it is bundled if you prefer. But from an end user's point of view, I think this issue has long been fixed.
I don't think it is part of the community edition at all. Of course, they can always just fetch it from the ExtensionSwap.
It's also quite an issue that no one seems to have any idea what to use it for (let alone why they should fill in its sole non-input-column property).
I would like to revisit this issue by contributing the extension to the community edition. Consider it bumped :-)
That sounds like a great idea! 👍
In a number of scenarios I have seen that users/customers want to build validation rules in DataCleaner and monitor the success/failure rate of each rule. Example rules could be:
We do already have various filters etc. which enable the user to build filtering rules. But we don't have a simple "writer"/consumer or analyzer which just considers all records passed to it as "counted" and pertaining to some category.
I'm having trouble figuring out whether such an analyzer would be nice to have. For beginners its purpose would be quite unclear, but for the use cases above it would make sense (I think). Feedback welcome.
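To make the idea more concrete, here is a minimal sketch of what such a counting consumer could look like. It is written in plain Python and does not use DataCleaner's API; the names `CategoryCounter`, `consume` and `summarize`, as well as the example rule on an `age` field, are hypothetical and only illustrate the concept. A filter decides where a record goes, and the consumer simply counts whatever reaches it under its category.

```python
from collections import Counter


class CategoryCounter:
    """Hypothetical stand-in for the proposed analyzer (not DataCleaner API)."""

    def __init__(self, category):
        self.category = category
        self.count = 0

    def consume(self, record):
        # The record is not inspected: reaching this component is what
        # places it in the category, so we only count it.
        self.count += 1


def summarize(counters):
    """Combine several counters into one {category: count} overview."""
    totals = Counter()
    for counter in counters:
        totals[counter.category] += counter.count
    return dict(totals)


# Assumed example rule: "age must be a non-negative integer".
records = [{"age": 34}, {"age": -1}, {"age": 52}, {"age": None}]
valid = CategoryCounter("valid")
invalid = CategoryCounter("invalid")

for record in records:
    age = record["age"]
    # The filtering step routes each record to exactly one consumer.
    target = valid if isinstance(age, int) and age >= 0 else invalid
    target.consume(record)

summary = summarize([valid, invalid])
print(summary)
```

The `summarize` helper hints at the combined overview suggested in the comments, where the counts of all categories are shown side by side instead of in isolation.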