Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-column indexes #87

Open
baryluk opened this issue May 21, 2022 · 1 comment
Open

Multi-column indexes #87

baryluk opened this issue May 21, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@baryluk
Copy link

baryluk commented May 21, 2022

Hi,

I have many own indexing layer on top of badger, but found about badgerhold few days ago, and I am considering migrating to using it. badgerhold looks pretty nice.

The blocker issue is, how to do multi-column indexes, with proper iteration order on them.

I was thinking maybe something like this:

type Item struct {
	ID       int
	Created  time.Time
	XKey string
	Updated  time.Time
}

func (i *Item) Type() string {
	return "Item"
}

var itemIndex map[string]badgerhold.Index
func init() {
	itemIndex = map[string]badgerhold.Index{
		"XKey_Updated": {
			IndexFunc: func(_ string, value interface{}) ([]byte, error) {
				i := value.(*Item)
				v1 := i.XKey
				v2 := i.Updated
				return append(badgerhold.DefaultEncode(v1), badgerhold.DefaultEncode(v2)...)
			},
			Unique: false,
		},
	}
}
func (i *Item) Indexes() map[string]badgerhold.Index {
	return itemIndex
}

In fact I am not even sure badgerhold supports inequeality test on indexed keys (i.e. to do range scans): https://github.com/timshannon/badgerhold/blob/master/query.go#L1012

I would like to this for example:

key := "foobar"
oldestAllowed := time.Now().Add(-1*time.Hour)
badgerhold.Where("XKey").Eq(key).And("Updated").Ge(oldestAllowed).Index("XKey_Updated")

(I would also like to be able to do Lt/Le, not just Ge, and of course Ge+Lt/Le, to do range scans using an index.

Even if I make the index value ([]byte) be properly sorted (i.e. make everything fixed size, and pad any values so it is correctly lexicographical order), I still do not think badgerhold will use that.

@timshannon timshannon added the enhancement New feature or request label May 21, 2022
@timshannon
Copy link
Owner

Yeah, that's currently not supported. The index system is very simple in that the index is just array of byte / keys. If the underlying storage structure of the indexes were the same as the datastore itself, then we'd have a lot more options for range scans, and we'd have much better performance as well, because we could take advantage of page splits, and all of the other nice things the underlying datastore uses to keep reads and writes consistently fast as the data increases.

It's a similar issue to this one (timshannon/bolthold#106). The solution is a complete refactor of how indexes are handled, and I have yet to find a way to do it simply, without basically reinventing all of things that boltdb / badgerdb are already doing in their underlying datastores.

Until then, you could use a compound field to accomplish what you want, at the cost of duplicating some of your data, which is the same tradeoff you get from indexes, but with more manual work:

type Item struct {
	ID       int
	Created  time.Time
	XKey string
	Updated  time.Time
        XKey_Updated []byte `badgerhold:"index"` // could use string or whatever type you want
}


badgerhold.Where("XKey_Updated").Eq(key).And("Updated").Ge(oldestAllowed).Index("XKey_Updated")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants