Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strings,bytes: tune inliner for Contains or add ContainsByte #32257

Open
martisch opened this issue May 26, 2019 · 3 comments
Open

strings,bytes: tune inliner for Contains or add ContainsByte #32257

martisch opened this issue May 26, 2019 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@martisch
Copy link
Contributor

martisch commented May 26, 2019

For strings.Index there already exists a specialized version strings.IndexByte besides strings.IndexRune.
There currently is no Byte variant of strings.Contains but strings.ContainsRune exists.

Sampling from a large go code corpus:

  • ~1/3 of calls to strings.Contains use a string of length 1 with an ASCII character
  • strings.Contains appears ~100 times more frequently than strings.ContainsRune
  • strings.Contains appears ~6 times more frequently then strings.Index and strings.IndexByte together.

There is overhead for both the binary code size, loading the string from memory and generally performance loss when using Contains or ContainsRune instead of ContainsByte (that uses IndexByte).

Make the inliner sufficiently aware of Contains and ContainsRune (+possible code layout change) to inline IndexByte calls directly.

Alternative

Add a specialized ContainsByte function to strings and bytes package.

Downside: current code does not immediately get the benefits without change.

Data

name                  time/op
Contains              5.88ns ± 1%
ContainsByte          3.30ns ± 0%
ContainsRune          5.40ns ± 1%
ContainsRuneFastPath  5.33ns ± 1%

Benchmark (I have seen 2 clock cycle variances due to branch addresses depending on where benchmark loops are, these here favor Contains on my Workstation)

var global bool
var data = "golang:contains"

func ContainsByte(s string, b byte) bool {
	return strings.IndexByte(s, b) >= 0
}

func ContainsRuneFastPath(s string, r rune) bool {
	if r <= utf8.RuneSelf {
		return strings.IndexByte(s, byte(r)) >= 0
	}
	return strings.IndexRune(s, r) >= 0
}

func BenchmarkContains(b *testing.B) {
	var sink bool
	for i := 0; i < b.N; i++ {
                // ...
                // LEAQ 0x4c655(IP), AX  // = 7 bytes			
                // MOVQ AX, 0x10(SP)	 // = 5 bytes	
                // MOVQ $0x1, 0x18(SP)   // = 9 bytes
		sink = strings.Contains(data, ":")
	}
	global = sink
}

func BenchmarkContainsByte(b *testing.B) {
	var sink bool
	for i := 0; i < b.N; i++ {
                // ...
                // MOVB $0x3a, 0x10(SP) // = 5 Bytes
                // CALL internal/bytealg.IndexByteString(SB)						
		sink = ContainsByte(data, ':')
	}
	global = sink
}

func BenchmarkContainsRune(b *testing.B) {
	var sink bool
	for i := 0; i < b.N; i++ {
                // ...
                // MOVL $0x3a, 0x10(SP) // = 8 bytes	
		sink = strings.ContainsRune(data, ':')
	}
	global = sink
}

func BenchmarkContainsRuneFastPath(b *testing.B) {
	var sink bool
	for i := 0; i < b.N; i++ {
		sink = ContainsRuneFastPath(data, ':')
	}
	global = sink
}
@martisch martisch added this to the Unplanned milestone May 26, 2019
@julieqiu
Copy link
Member

/cc @bradfitz @ianlancetaylor

@julieqiu julieqiu added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 28, 2019
@zx2c4
Copy link
Contributor

zx2c4 commented Dec 23, 2021

After reading @rsc's comment in #46336:

I recently searched for uses of strings.Index, strings.IndexByte, or strings.IndexRune in the main repo [...] 20 should have been Contains [...] 2 should have been 1 call to ContainsAny

, I did the same thing. One use I found was:

func (e *Endpoint) String() string {
	if strings.IndexByte(e.Host, ':') != -1 {
		return fmt.Sprintf("[%s]:%d", e.Host, e.Port)
	}
	return fmt.Sprintf("%s:%d", e.Host, e.Port)
}

It seems like converting this IndexByte into Contains, ContainsAny, or ContainsRune would incur a performance penalty. I went searching for a ContainsByte function and found this issue.

@josharian
Copy link
Contributor

@zx2c4 for this particular case, it looks like you could use net.JoinHostPort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants