Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add object marker to enable return of empty docs #800

Merged
merged 4 commits into from
Sep 15, 2022

Conversation

fredcarle
Copy link
Collaborator

Relevant issue(s)

Resolves #610

Description

This PR add an object marker to stored documents. This helps the iterator return an empty documents.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

integration tests

Specify the platform(s) on which this was tested:

  • MacOS

@fredcarle fredcarle added bug Something isn't working area/query Related to the query component action/no-benchmark Skips the action that runs the benchmark. labels Sep 15, 2022
@fredcarle fredcarle added this to the DefraDB v0.3.1 milestone Sep 15, 2022
@fredcarle fredcarle requested a review from a team September 15, 2022 08:14
@fredcarle fredcarle self-assigned this Sep 15, 2022
core/key.go Outdated
Comment on lines 114 to 122
if len(elements) == 6 {
return DataStoreKey{
CollectionId: elements[3],
InstanceType: InstanceType(elements[4]),
DocKey: elements[5],
}
}

numberOfElements := len(elements)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndrewSisley you may not like this so let me know if you think I should do something different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would suggest adding a new Key type for this - it doesn't use all the props and you'll be able to give it a clearer name/docs.

EDIT: Although conceptually this can perhaps be seen as nil field, and breaking it out to a new type would make the fetcher code more complex? Suggest some docs in here if you keep like this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were using a new key type, it would end up being a duplicate of the primary key in a sense. This means that we could instead use the primary key to accomplish this but that would make the fetcher more complex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fussed about the duplication (is a coincidence), but whether it is worth the extra fetcher hassle is quite tenuous

Comment on lines 230 to 234
// skip object markers
if bytes.Equal(df.kv.Value, []byte{base.ObjectMarker}) {
continue
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: You don't need the for loop anymore, as all branches now return according to logic. Is that desired? if so then remove the outer for { }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed. I had removed it initially and then reverted while trying to find a problem and never re-removed it.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really surprised how little code you managed to do this in :) And I was wondering if we'd have to go this route :( Got a todo for you regarding extra scan rows for docs with fields - that one is fairly significant, and a minor skipable comment RE to dockey lines.

core/key.go Outdated
Comment on lines 114 to 122
if len(elements) == 6 {
return DataStoreKey{
CollectionId: elements[3],
InstanceType: InstanceType(elements[4]),
DocKey: elements[5],
}
}

numberOfElements := len(elements)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would suggest adding a new Key type for this - it doesn't use all the props and you'll be able to give it a clearer name/docs.

EDIT: Although conceptually this can perhaps be seen as nil field, and breaking it out to a new type would make the fetcher code more complex? Suggest some docs in here if you keep like this

db/collection.go Outdated

// write value object marker
valueKey := c.getDatastoreFromDocKey(dockey)
err = txn.Datastore().Put(ctx, valueKey.ToDS(), []byte{base.ObjectMarker})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: Would really suggest only adding these if the doc is empty/nil, otherwise you are adding an extra scan row to every doc in the db. We got a notable performance gain breaking out the PKs.

@codecov
Copy link

codecov bot commented Sep 15, 2022

Codecov Report

Merging #800 (ce9fd09) into develop (57a9394) will increase coverage by 0.00%.
The diff coverage is 69.23%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #800   +/-   ##
========================================
  Coverage    59.53%   59.54%           
========================================
  Files          154      154           
  Lines        17181    17219   +38     
========================================
+ Hits         10229    10253   +24     
- Misses        6030     6040   +10     
- Partials       922      926    +4     
Impacted Files Coverage Δ
core/key.go 85.24% <62.50%> (-2.54%) ⬇️
db/collection.go 65.92% <66.66%> (+0.02%) ⬆️
db/fetcher/fetcher.go 61.06% <78.26%> (+0.36%) ⬆️
connor/lt.go 72.72% <0.00%> (-4.55%) ⬇️

core/key.go Outdated
if isDataObjectMarker(elements) {
return DataStoreKey{
CollectionId: elements[3],
InstanceType: InstanceType(elements[4]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion(non-bocking): IMO it would be clearer to just hardcode InstanceType here, as the current suggests that it can be stuff other than Value

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job :) Hope you had fun figuring this out :D

@fredcarle fredcarle merged commit 39033f1 into develop Sep 15, 2022
@fredcarle fredcarle deleted the fredcarle/fix/empty-doc branch September 15, 2022 20:01
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
)

This PR add an object marker to stored documents. This helps the iterator return an empty documents.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/query Related to the query component bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fetcher does not return documents with no populated fields
3 participants