Fix format of the keys to support startUid. #3310

martinmr · 2019-04-23T01:08:21Z

Currently only reverse keys and data keys correctly understand the
concept of a startUid. This change adds an extra byte of metadata that
declares whether the last eight bytes of the key correspond to a start
uid so that all the required types of keys can understand startUids.

Added more unit tests, including tests for count keys, which previously
were missing.

manishrjain

Got a couple of comments. Get @mangalaman93 or @gitlw to review this as well.

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @martinmr)

x/keys.go, line 366 at r1 (raw file):

// GetSplitKey takes a key baseKey and generates the key of the list split that starts at startUid.
func GetSplitKey(baseKey []byte, startUid uint64) []byte {

Add tests to test for schema key, type key, count key and all others.

x/keys.go, line 370 at r1 (raw file):

	copy(keyCopy, baseKey)

	p := Parse(baseKey)

if len(keyCopy) <= index { panic("...") }

x/keys.go, line 444 at r1 (raw file):

	case ByteCount, ByteCountRev:
		if len(k) < 4 {
			if Config.DebugMode {

We need to get rid of these Config.DebugMode things. We can make fmt.Printfs glog.V(2) or something. Maybe another change where you can go over all these DebugMode things.

mangalaman93 · 2019-05-15T19:55:14Z

x/keys.go

 func GetSplitKey(baseKey []byte, startUid uint64) []byte {
 	keyCopy := make([]byte, len(baseKey)+8)
 	copy(keyCopy, baseKey)
+
+	p := Parse(baseKey)
+	index := 1 + 2 + len(p.Attr) + 1


This assumes that all the keys have similar structure, i.e. -

// byte 0: key type prefix (set to DefaultPrefix)
// byte 1-2: length of attr
// next len(attr) bytes: value of attr
// next byte: data type prefix
// next byte: byte to determine if this key corresponds to a list that has been split
//   into multiple parts

If that is the case, it will be useful to extract the logic of building the keys from functions such as IndexKey, DataKey, CountKey into a single function, instead of repeating it in each function.

mangalaman93 · 2019-05-15T19:57:37Z

x/keys.go

-		k = k[8:]
-		if len(k) < 8 {
+
+		if len(k) < 16 {


The printed message does not match the condition.

mangalaman93 · 2019-05-15T20:01:00Z

x/keys.go

+
+		if len(k) < 12 {
+			if Config.DebugMode {
+				fmt.Printf("Error: StartUid length < 8 for key: %q, parsed key: %+v\n", key, p)

same here


mangalaman93 · 2019-05-15T20:04:36Z

Seems to have a conflict too, please check

gitlw

Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @martinmr)

posting/index.go, line 440 at r1 (raw file):

	// Also delete all the parts of any list that has been split into multiple parts.
	// Such keys have a different prefix (the last byte is set to 1).
	prefix = pk.IndexPrefix()

I feel it's more modular to change the IndexPrefix signature so that it takes an argument to specify the bytesplit flag. And the same argument for the ReversePrefix and CountPrefix methods.

x/keys.go, line 183 at r1 (raw file):

//   into multiple parts
// next four bytes: value of count.
// next eight bytes (optional): if the key corresponds to a split list, the startUid of

Does it make sense for the count key to have multiple split lists? I think it just contains a single number.

x/keys.go, line 371 at r1 (raw file):

	p := Parse(baseKey)
	index := 1 + 2 + len(p.Attr) + 1

Instead of calculating the index on the fly, I feel it's better to have a constant to represent offset of the ByteSplit.

x/keys_test.go, line 59 at r1 (raw file):

	startUid := uint64(math.MaxUint64)
	for uid = 0; uid < 1001; uid++ {
		sattr := fmt.Sprintf("attr:%d", uid)

It's confusing to see the attr have a number encoded in it that keeps changing. Maybe remove this fmt.Sprintf and use "attr" directly as the attribute?

x/keys_test.go, line 91 at r1 (raw file):

	startUid := uint64(math.MaxUint64)
	for uid = 0; uid < 1001; uid++ {
		sattr := fmt.Sprintf("attr:%d", uid)

ditto

martinmr

Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @gitlw, @mangalaman93, and @manishrjain)

posting/index.go, line 440 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

I feel it's more modular to change the IndexPrefix signature so that it takes an argument to specify the bytesplit flag. And the same argument for the ReversePrefix and CountPrefix methods.

Won't do. I made IndexPrefix and all the others return the key with byteSplit set to zero for safety (usually these methods are called in contexts where only the main part of the posting list needs to be accessed, not any of the parts). Parts of the code that need to consider split keys should explicitly call this method.

I could have something like IndexKey() and IndexKey(split bool) but Go doesn't support method overloading.
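
A hedged sketch of what "explicitly" handling the split parts might look like at a call site, based on the comment in posting/index.go that split keys use the same prefix with the last byte set to 1. The type `parsedKey`, the helper `splitVariant` and the constant values are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative values only; the real constants are defined in x/keys.go.
const (
	defaultPrefix byte = 0x00
	byteIndex     byte = 0x02
	byteUnsplit   byte = 0x00
	byteSplit     byte = 0x01
)

type parsedKey struct{ Attr string }

// IndexPrefix returns the prefix of the main (unsplit) index keys for Attr.
func (p parsedKey) IndexPrefix() []byte {
	buf := make([]byte, 1+2+len(p.Attr)+1+1)
	buf[0] = defaultPrefix
	binary.BigEndian.PutUint16(buf[1:3], uint16(len(p.Attr)))
	copy(buf[3:], p.Attr)
	buf[len(buf)-2] = byteIndex
	buf[len(buf)-1] = byteUnsplit
	return buf
}

// splitVariant copies prefix and sets its last byte to byteSplit, so the
// caller opts in to the split parts explicitly.
func splitVariant(prefix []byte) []byte {
	out := append([]byte(nil), prefix...)
	out[len(out)-1] = byteSplit
	return out
}

func main() {
	pk := parsedKey{Attr: "name"}
	prefix := pk.IndexPrefix()
	// A deletion pass would have to visit both prefixes.
	for _, p := range [][]byte{prefix, splitVariant(prefix)} {
		fmt.Printf("delete keys with prefix %x\n", p)
	}
}
```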

x/keys.go, line 183 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Does it make sense for the count key to have multiple split lists? I think it just contains a single number.

Good catch. I'll still keep the split byte for consistency but I'll explain that for count keys it will always be zero.

x/keys.go, line 366 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add tests to test for schema key, type key, count key and all others.

I forgot at the time, but there are already methods for this in keys_test.go.

x/keys.go, line 370 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if len(keyCopy) <= index { panic("...") }

Done.

x/keys.go, line 371 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Instead of calculating the index on the fly, I feel it's better to have a constant to represent offset of the ByteSplit.

The offset is not a constant since it depends on the length of attr. This would require having another field in ParsedKey, which I'd prefer not to do.

x/keys.go, line 371 at r1 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

This assumes that all the keys have similar structure, i.e. -
// byte 0: key type prefix (set to DefaultPrefix)
// byte 1-2: length of attr
// next len(attr) bytes: value of attr
// next byte: data type prefix
// next byte: byte to determine if this key corresponds to a list that has been split
//   into multiple parts
If that is the case, it will be useful to extract the logic of building the keys from functions such as IndexKey, DataKey, CountKey into a single function, instead of repeating it in each function.

Done. I have put the logic to generate the first part of the list into a separate method.

x/keys.go, line 416 at r1 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

The printed message does not match the condition.

It does. The condition is that whatever is left has at least 8 bytes for the uid and 8 bytes for the startUid.

x/keys.go, line 444 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

We need to get rid of these Config.DebugMode things. We can make fmt.Printfs glog.V(2) or something. Maybe another change where you can go over all these DebugMode things.

Done. I've added a todo to handle this in another PR.

x/keys.go, line 457 at r1 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

same here

Same as above. This is checking the total size of whatever is left of the key to process. In this case we need four bytes for the count and eight for the start uid.

x/keys_test.go, line 59 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

It's confusing to see the attr have a number encoded in it that keeps changing. Maybe remove this fmt.Sprintf and use "attr" directly as the attribute?

I kind of copied this from other tests but I think the original intent was to make attr have variable length in order to test that the length is properly encoded.
That test is more thorough than having the attribute have the same length all the time. I'll add a comment here to explain.

x/keys_test.go, line 91 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

ditto

See above.

martinmr

Resolved the conflicts

Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @gitlw, @mangalaman93, and @manishrjain)

golangcibot · 2019-05-15T23:38:02Z

x/keys.go

-	bytePrefix byte
+	byteType    byte
+	Attr        string
+	Uid         uint64


struct field Uid should be UID (from golint)

golangcibot · 2019-05-15T23:38:02Z

x/keys.go

+	byteType    byte
+	Attr        string
+	Uid         uint64
+	HasStartUid bool


struct field HasStartUid should be HasStartUID (from golint)

golangcibot · 2019-05-15T23:38:02Z

x/keys.go

+	Attr        string
+	Uid         uint64
+	HasStartUid bool
+	StartUid    uint64


struct field StartUid should be StartUID (from golint)

golangcibot · 2019-05-15T23:38:03Z

x/keys.go

-		p.Term = string(k)
+
+		term := k[:len(k)-8]
+		startUid := k[len(k)-8:]


var startUid should be startUID (from golint)

mangalaman93 · 2019-05-16T22:18:04Z

This is already merged? I still see comments from golangcibot unaddressed, and I didn't even have a chance to look at it again.

martinmr · 2019-05-16T23:51:24Z

The lint comments are about renaming Uid to UID. We ignore those errors.

martinmr requested a review from a team April 23, 2019 01:08

martinmr added 5 commits April 23, 2019 10:30

Change startUid to math.MaxUint64

9b617f0

Fix comments.

3afaff0

Fix methods to delete indices.

d4287db

Update comments.

c827ba3

Add line.

6c7711d

martinmr requested a review from manishrjain as a code owner April 23, 2019 22:09

manishrjain approved these changes May 15, 2019

martinmr requested a review from mangalaman93 May 15, 2019 01:18

mangalaman93 requested changes May 15, 2019

Merge remote-tracking branch 'origin/master' into martinmr/fix-keys

dbec1f5

gitlw suggested changes May 15, 2019

martinmr commented May 15, 2019

Address comments.

2b8b48b

golangcibot reviewed May 15, 2019

go fmt

3158aef

martinmr merged commit c0a78ec into master May 16, 2019

martinmr deleted the martinmr/fix-keys branch May 16, 2019 00:04
