Use string version of regexp methods to reduce allocs #1614

Merged 1 commit on Aug 9, 2024

@Juneezee Juneezee commented Aug 9, 2024

Both (*Regexp).Match and (*Regexp).FindAllSubmatchIndex have string-based equivalents: (*Regexp).MatchString and (*Regexp).FindAllStringSubmatchIndex. We should use the string version to avoid unnecessary []byte conversions.
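
To illustrate the change at a call site, here is a minimal sketch, assuming the input is already a `string`. The helper names (`matchViaBytes`, `matchViaString`, `indicesViaBytes`, `indicesViaString`) and the `lineRegex` variable are hypothetical and are not taken from the Miller code base; only the `regexp` methods are real standard-library calls.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineRegex is a hypothetical pattern used only for this illustration.
var lineRegex = regexp.MustCompile(`foo.*`)

// matchViaBytes converts the string to []byte first; that conversion may allocate.
func matchViaBytes(re *regexp.Regexp, s string) bool {
	return re.Match([]byte(s))
}

// matchViaString hands the string to the regexp engine directly.
func matchViaString(re *regexp.Regexp, s string) bool {
	return re.MatchString(s)
}

// indicesViaBytes is the []byte-based lookup of match and submatch offsets.
func indicesViaBytes(re *regexp.Regexp, s string) [][]int {
	return re.FindAllSubmatchIndex([]byte(s), -1)
}

// indicesViaString returns the same offsets without the []byte conversion.
func indicesViaString(re *regexp.Regexp, s string) [][]int {
	return re.FindAllStringSubmatchIndex(s, -1)
}

func main() {
	input := "foo bar baz"
	fmt.Println(matchViaBytes(lineRegex, input), matchViaString(lineRegex, input))
	// prints: true true
	fmt.Println(indicesViaBytes(lineRegex, input), indicesViaString(lineRegex, input))
	// prints: [[0 11]] [[0 11]]
}
```

Both variants return the same results for the same input, so the swap should be behavior-preserving wherever the caller already holds a string.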

Benchmark:

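// Package name taken from the pkg path in the benchmark output.
package lib

import (
	"regexp"
	"testing"
)

var regex = regexp.MustCompile("foo.*")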

func BenchmarkMatch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.Match([]byte("foo bar baz")); !match {
			b.Fail()
		}
	}
}

func BenchmarkMatchString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.MatchString("foo bar baz"); !match {
			b.Fail()
		}
	}
}

func BenchmarkFindAllSubmatchIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.FindAllSubmatchIndex([]byte("foo bar baz"), -1); len(match) == 0 {
			b.Fail()
		}
	}
}

func BenchmarkFindAllStringSubmatchIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.FindAllStringSubmatchIndex("foo bar baz", -1); len(match) == 0 {
			b.Fail()
		}
	}
}

Result:

goos: linux
goarch: amd64
pkg: github.com/johnkerl/miller/pkg/lib
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkMatch-16                         	 2198350	       517.5 ns/op	      16 B/op	       1 allocs/op
BenchmarkMatchString-16                   	 3143605	       371.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkFindAllSubmatchIndex-16          	  921711	      1199 ns/op	     273 B/op	       3 allocs/op
BenchmarkFindAllStringSubmatchIndex-16    	 1212321	       981.0 ns/op	     257 B/op	       2 allocs/op
PASS
coverage: 0.0% of statements
ok  	github.com/johnkerl/miller/pkg/lib	6.576s
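
For a rough sense of scale, the ns/op figures above can be turned into relative reductions. This is only a back-of-the-envelope sketch over the single run reported here; `percentReduction` is a hypothetical helper, and the numbers will vary by machine and Go version.

```go
package main

import "fmt"

// percentReduction returns how much smaller after is than before, in percent.
func percentReduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// ns/op values copied from the benchmark result above.
	fmt.Printf("Match -> MatchString: %.1f%% less time per op\n",
		percentReduction(517.5, 371.5))
	// prints: Match -> MatchString: 28.2% less time per op
	fmt.Printf("FindAllSubmatchIndex -> FindAllStringSubmatchIndex: %.1f%% less time per op\n",
		percentReduction(1199, 981.0))
	// prints: FindAllSubmatchIndex -> FindAllStringSubmatchIndex: 18.2% less time per op
}
```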

Signed-off-by: Eng Zer Jun <[email protected]>

@johnkerl johnkerl left a comment

This is fabulous -- thank you @Juneezee !! :)

@johnkerl johnkerl merged commit 3966a6a into johnkerl:main Aug 9, 2024
6 checks passed
@johnkerl johnkerl changed the title lib/regex: use string version of regexp methods to reduce allocs Use string version of regexp methods to reduce allocs Oct 5, 2024