Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

utils: parse using byteslice in parseDateTime #1113

Merged
merged 11 commits into from
May 31, 2020
Merged

utils: parse using byteslice in parseDateTime #1113

merged 11 commits into from
May 31, 2020

Conversation

Code-Hex
Copy link
Contributor

@Code-Hex Code-Hex commented May 26, 2020

Description

When the time information is passed in byteslice, it takes a long time to parse it. The order in which the current parsing is done is

  1. cast dest[i] to byteslice
  2. cast byteslice to string
  3. parseDateTime

And since the format for returning the time information to the client has been decided, I think it's obviously faster to do the parsing ownself right away.

スクリーンショット 2020-05-26 11 47 23

The benchmark results

$ go test -benchmem . -bench "(DateTime|Cast)$"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkParseDateTime-4                         5039470               250 ns/op               0 B/op          0 allocs/op
BenchmarkParseByteDateTime-4                    11188809                99.2 ns/op             0 B/op          0 allocs/op
BenchmarkParseByteDateTimeStringCast-4           3342747               356 ns/op              32 B/op          1 allocs/op
Code
func deprecatedParseDateTime(str string, loc *time.Location) (t time.Time, err error) {
	const base = "0000-00-00 00:00:00.000000"
	switch len(str) {
	case 10, 19, 21, 22, 23, 24, 25, 26: // up to "YYYY-MM-DD HH:MM:SS.MMMMMM"
		if str == base[:len(str)] {
			return
		}
		if loc == time.UTC {
			return time.Parse(timeFormat[:len(str)], str)
		}
		return time.ParseInLocation(timeFormat[:len(str)], str, loc)
	default:
		err = fmt.Errorf("invalid time string: %s", str)
		return
	}
}

func BenchmarkParseDateTime(b *testing.B) {
	str := "2020-05-13 21:30:45"
	loc := time.FixedZone("test", 8*60*60)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = deprecatedParseDateTime(str, loc)
	}
}

func BenchmarkParseByteDateTime(b *testing.B) {
	bStr := []byte("2020-05-25 23:22:01.159491")
	loc := time.FixedZone("test", 8*60*60)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = parseDateTime(bStr, loc)
	}
}

func BenchmarkParseByteDateTimeStringCast(b *testing.B) {
	bStr := []byte("2020-05-25 23:22:01.159491")
	loc := time.FixedZone("test", 8*60*60)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = deprecatedParseDateTime(string(bStr), loc)
	}
}

MySQL datetime format
https://dev.mysql.com/doc/refman/8.0/en/datetime.html

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

The benchmark results

$ go test -benchmem . -bench "^BenchmarkParseByte"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkParseByteDateTime-4                    12023173               104 ns/op               0 B/op          0 allocs/op
BenchmarkParseByteDateTimeStringCast-4           3394239               355 ns/op              32 B/op          1 allocs/op
@methane
Copy link
Member

methane commented May 26, 2020

Can we just change the argument type of parseDateTime, instead of adding parseByteDateTime?

@Code-Hex
Copy link
Contributor Author

Code-Hex commented May 26, 2020

@methane Honestly, it bothers me.
I added a new one because there is already a case for handling strings. (I think there's a cost to casting to string to byte slice)

mysql/nulltime.go

Lines 34 to 38 in 8c3a2d9

case string:
nt.Time, err = parseDateTime(v, time.UTC)
nt.Valid = (err == nil)
return
}

@methane
Copy link
Member

methane commented May 26, 2020

@Code-Hex It is deprecated function (See #960).
I think type conversion (allocation + copy) in the function is reasonable than two same implementation of parseDateTime.

@Code-Hex
Copy link
Contributor Author

@methane Thanks for your reply.
Sounds good to me. I fix it.

utils.go Outdated Show resolved Hide resolved
@Code-Hex Code-Hex requested a review from methane May 27, 2020 02:43
utils.go Outdated Show resolved Hide resolved
utils.go Outdated Show resolved Hide resolved
utils.go Outdated Show resolved Hide resolved
utils_test.go Outdated
loc := time.FixedZone("test", 8*60*60)
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, _ = deprecatedParseDateTime(string(bStr), loc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please post the result of this benchmark and remove deprecatedParseDateTime.
We do not keep old implementations only for benchmark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed 2c6aa27

@Code-Hex Code-Hex requested a review from methane May 28, 2020 08:40
@chanxuehong
Copy link
Contributor

chanxuehong commented May 30, 2020

nice job!!!

I just thought that the parseDateTime can be optimized in the past few days, so I mentioned a PR #1117, and then I saw this implementation, it's interesting.

Here I have a few questions:

  1. Is it better to use bytes.Equal here???

    mysql/utils.go

    Line 113 in 2c6aa27

    if string(b) == base[:len(b)] {

  2. Is it really necessary to check that year(month, day) is equal to 0(in fact they will not be less than 0 here), we have not checked here if month is greater than 12, nor day is greater than 31

mysql/utils.go

Line 121 in 2c6aa27

if year <= 0 {

mysql/utils.go

Line 133 in 2c6aa27

if m <= 0 {

mysql/utils.go

Line 146 in 2c6aa27

if day <= 0 {

@shogo82148
Copy link
Contributor

I have same question with chanxuehong about the range of year(month, day).
Why month 0 is treated as January?
In the current implementation, parseDateTime returns "month out of range" error with month 0.

https://play.golang.org/p/HgQkcLCOGMU

package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println(time.Parse("2006-01-02 15:04:05.999999", "0000-00-00 00:00:00.000000"))
	fmt.Println(time.Parse("2006-01-02 15:04:05.999999", "9999-99-99 99:99:99.999999"))
}
0001-01-01 00:00:00 +0000 UTC parsing time "0000-00-00 00:00:00.000000": month out of range
0001-01-01 00:00:00 +0000 UTC parsing time "9999-99-99 99:99:99.999999": month out of range

I think the range checks are not needed, because MySQL returns correct datetime and we don't need to take care of incorrect datetime.
In addition, it is a little complex to calculate the correct range of the day. It varies depending on month and year (there are leap years!).

https://play.golang.org/p/0Z4QX-QtOxn

@Code-Hex
Copy link
Contributor Author

Code-Hex commented May 31, 2020

@chanxuehong @shogo82148 thanks for the mention to me.

  1. In my understanding, the maintainer wanted to take readability (I had implemented at utils: parse using byteslice in parseDateTime #1113 (comment)).

When I actually measured it, it wasn't much of a difference, so I thought it was good.

Code & Results
const base = "0000-00-00 00:00:00.000000"

var bbase = []byte(base)

func BenchmarkByteStringEq(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b := []byte("0000-00-00 00:00:00.000000")
		_ = string(b) == base[:len(b)]
	}
}

func BenchmarkByteEq(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b := []byte("0000-00-00 00:00:00.000000")
		_ = bytes.Equal(b, bbase[:len(b)])
	}
}

Result (two times)

$ go test . -bench "BenchmarkByte"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkByteStringEq-4         171111117                6.86 ns/op
BenchmarkByteEq-4               177546639                6.86 ns/op
PASS
ok      github.com/go-sql-driver/mysql      3.837s

$ go test . -bench "BenchmarkByte"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkByteStringEq-4         174836755                6.76 ns/op
BenchmarkByteEq-4               168449960                6.94 ns/op
PASS
ok      github.com/go-sql-driver/mysql      3.823s
  1. I've been thinking a lot.

Assumptions, I basically believe that we can trust the values that MySQL passes. So I don't think it's really necessary to do a range check here.

The reason for checking here is so that time.Time can be treated as zero value even if the parsed result is less than or equal to zero value. I thought that if we could treat it as a zero value, it would be possible to handle it on the user side at worst.

https://play.golang.org/p/4iQJgoOGYIQ

But in this case, time.Parse can parse and not return an error.

https://play.golang.org/p/wRD2ylQiF_9


Anyway, I think it's okay to leave out year, day, month range confirmation logic. I'll fix it if the maintainer is agree.

@shogo82148
Copy link
Contributor

shogo82148 commented May 31, 2020

bytes.Equal(a, b []byte) is the same as string(a) == string(b). (I didn't know that!)

https://github.com/golang/go/blob/f1f8f9af9a55d73dfc6603a93bee0559fdc9024d/src/bytes/bytes.go#L18-L21

As the comment says, the Go compiler optimize this comparison.
So no need to use bytes.Equal for the performance.

time.Date(1, 1, 1, 0, 0, 0, 0, loc) is not the zero value if the loc is not time.UTC.

https://play.golang.org/p/T9-R0Bdpfzr

So it is not safe that users treat the zero value.


Anyway, my question doesn't block merging.

@shogo82148 shogo82148 closed this May 31, 2020
@shogo82148 shogo82148 reopened this May 31, 2020
@shogo82148
Copy link
Contributor

Sorry...
It's my miss operation.

@julienschmidt julienschmidt changed the title fixed the way of parsing datetime when byte slice string utils: parse using byteslice in parseDateTime May 31, 2020
@julienschmidt julienschmidt merged commit 12508c8 into go-sql-driver:master May 31, 2020
@Code-Hex Code-Hex deleted the fix/parse-byte-time branch May 31, 2020 23:44
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
* fixed the way of parsing datetime when byte slice string

The benchmark results

$ go test -benchmem . -bench "^BenchmarkParseByte"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkParseByteDateTime-4                    12023173               104 ns/op               0 B/op          0 allocs/op
BenchmarkParseByteDateTimeStringCast-4           3394239               355 ns/op              32 B/op          1 allocs/op

* added line to AUTHORS file

* fixed error handling

* fixed nanosec digits

* added more tests for error

* renamed parseByteDateTime to parseDateTime

* reverted base null time

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* removed deprecatedParseDateTime from test

Co-authored-by: Inada Naoki <[email protected]>
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
* fixed the way of parsing datetime when byte slice string

The benchmark results

$ go test -benchmem . -bench "^BenchmarkParseByte"
goos: darwin
goarch: amd64
pkg: github.com/go-sql-driver/mysql
BenchmarkParseByteDateTime-4                    12023173               104 ns/op               0 B/op          0 allocs/op
BenchmarkParseByteDateTimeStringCast-4           3394239               355 ns/op              32 B/op          1 allocs/op

* added line to AUTHORS file

* fixed error handling

* fixed nanosec digits

* added more tests for error

* renamed parseByteDateTime to parseDateTime

* reverted base null time

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* Update utils.go

Co-authored-by: Inada Naoki <[email protected]>

* removed deprecatedParseDateTime from test

Co-authored-by: Inada Naoki <[email protected]>
@methane
Copy link
Member

methane commented Aug 24, 2021

I think the range checks are not needed, because MySQL returns correct datetime and we don't need to take care of incorrect datetime.
In addition, it is a little complex to calculate the correct range of the day. It varies depending on month and year (there are leap years!).

You are right. These checks caused regression: #1252.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants