Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure SQL. long query - driver: bad connection #166

Closed
sunnyque opened this issue Aug 31, 2016 · 43 comments
Closed

Azure SQL. long query - driver: bad connection #166

sunnyque opened this issue Aug 31, 2016 · 43 comments

Comments

@sunnyque
Copy link

got a headache with a strange behavior when working with SQL Azure.

this code causes "driver: bad connection" error when executing second select

var dbConnString = "server=ttt.database.windows.net;port=1433;user id=user;password=pass;database=db;log=63"

func connect() (*sql.DB, error) {
    db, err := sql.Open("mssql", dbConnString)
    if err != nil {
        return nil, err
    }
    err = db.Ping()
    if err != nil {
        return nil, err
    }
    return db, nil
}

func main() {
    // open db in this function
    db, err := connect()
    if err != nil {
        return
    }
    defer db.Close() // close db in this function

    fmt.Println("SEL 1")
    _, err = db.Exec("SELECT 1")
    if err != nil {
                fmt.Println("Sel error: " + err.Error())
        return
    }

    fmt.Println("SEL bad")

    _, err = db.Exec(`SELECT 2 -- long query FFD8FFE000104A46494600010100000100010000FFDB0043000302020302020303030304030304050805050404050A070706080C0A0C0C0B0A0B0B0D0E12100D0E110E0B0B1016101113141515150C0F171816141812141514FFDB00430103040405040509050509140D0B0D141414141414141414141414141414141414141414141414141414141414141414141414141414140419D47E2AFC78F89DA0F80FC2BAA412EB9ACCCD1406EA28228A3554692491D8A70A91A3B9C02C4290AACC403D67C50D4E0F0D780C78ABC07F1BEDBC656F16BF36853E99A86896DA56A2EAB10923BE82DBCD99DED1C6E5F3241132B6D5299276C1FB05F8A74CF82BFB6C781EFF00C7773FF08A5AE99797D61A83EAB1B43F629DED2E2DC24C08CC5895D558BE027258A8048F45FD0419D47E2AFC78F89DA0F80FC2BAA412EB9ACCCD1406EA28228A3554692491D8A70A91A3B9C02C4290AACC403D67C50D4E0F0D780C78ABC07F1BEDBC656F16BF36853E99A86896DA56A2EAB10923BE82DBCD99DED1C6E5F3241132B6D5299276C1FB05F8A74CF82BFB6C781EFF00C7773FF08A5AE99797D61A83EAB1B43F629DED2E2DC24C08CC5895D558BE027258A8048F45FDAE0F8907ECE7A38F8EF83FB409F18DF7F677DBB67F687FC23BB5FCCDDE4FCBF66FB7F9DE479BC6CDFF0067FDD66803E60FF85E9E38FF00A0E7FE4A41FF00C45765F06F52F8ADF1F3E27683E03F0C6AB04DAEEB3334507DAA2B78A28D551A49247629C2A468EE700B10B85566201E03C3FF0013758F0CF85AFF0040B4B3F0F4D617BE679B2EA1E1BD3AF2ED77A046F2     
    141414141414141414FFC00011080438078003012200021101031101FFC4001F0000010501010101010100000000000000000102030405060708090A0BFFC400B5100002010303020403050504040000017D01020300041105122131410613516107227114328191A1082342B1C11552D1F02433627282090A161718191A25262728292A3435363738393A434445464748494A530F8907ECE7A38F8EF83FB409F18DF7F677DBB67F687FC23BB5FCCDDE4FCBF66FB7F9DE479BC6CDFF0067FDD66803E60FF85E9E38FF00A0E7FE4A41FF00C45765F06F52F8ADF1F3E27683E03F0C6AB04DAEEB3334507DAA2B78A28D551A49247629C2A468EE700B10B85566201E03C3FF0013758F0CF85AFF0040B4B3F0F4D617BE679B2EA1E1BD3AF2ED77A046F2
        `)
    if err != nil {
        fmt.Println("error: " + err.Error())
    }
}

log:

2016/08/31 16:46:36 (1 row(s) affected)
2016/08/31 16:46:36 (1 row(s) affected)
2016/08/31 16:46:36 SELECT 1
2016/08/31 16:46:36 (1 row(s) affected)
2016/08/31 16:46:36 SELECT 2 -- long query FFD8FFE000104A46494600010100000100010000FFDB004300030202030202030303030403030
... cut ... ... cut ... ... cut ... ... cut ...
78A28D551A49247629C2A468EE700B10B85566201E03C3FF0013758F0CF85AFF0040B4B3F0F4D617BE679B2EA1E1BD3AF2ED77A046F2

2016/08/31 16:46:36 got error: read tcp 172.17.0.35:49981->191.235.193.75:1433: wsarecv: An existing connection was forcibly closed by the remote host.
2016/08/31 16:46:36 (1 row(s) affected)
2016/08/31 16:46:36 (1 row(s) affected)
2016/08/31 16:46:36 SELECT 2 -- long query FFD8FFE000104A46494600010100000100010000FFDB004300030202030202030303030403030
4050805050404050A070706080C0A0C0C0B0A0B0B0D0E12100D0E110E0B0B1016101113141515150C0F171816141812141514FFDB004301030404050
... cut ... ... cut ... ... cut ... ... cut ...
78A28D551A49247629C2A468EE700B10B85566201E03C3FF0013758F0CF85AFF0040B4B3F0F4D617BE679B2EA1E1BD3AF2ED77A046F2

2016/08/31 16:46:36 got error: read tcp 172.17.0.35:49982->191.235.193.75:1433: wsarecv: An existing connection was forcibly closed by the remote host.
2016/08/31 16:46:37 (1 row(s) affected)
2016/08/31 16:46:37 (1 row(s) affected)
2016/08/31 16:46:37 SELECT 2 -- long query FFD8FFE000104A46494600010100000100010000FFDB004300030202030202030303030403030
4050805050404050A070706080C0A0C0C0B0A0B0B0D0E12100D0E110E0B0B1016101113141515150C0F171816141812141514FFDB004301030404050
... cut ... ... cut ... ... cut ... ... cut ...
78A28D551A49247629C2A468EE700B10B85566201E03C3FF0013758F0CF85AFF0040B4B3F0F4D617BE679B2EA1E1BD3AF2ED77A046F2

2016/08/31 16:46:37 got error: read tcp 172.17.0.35:49983->191.235.193.75:1433: wsarecv: An existing connection was forcibly closed by the remote host.

error: driver: bad connection

so as we see there is 3 attempt to execute long query (why???) and we have "wsarecv error by unknown reason" for each of them

tried to connect and run with SQL2005 - no problems at all.
wireshark doesn't hep because connection with sql azure is encrypted

got problem only with long queries (I'm inserting images into database like 0x123456....), all the rest of db routine working well

@kylescottmcgill
Copy link

We also see this with any query with over 1000 bytes. We only just got this error, so going to debug a little bit and see what we can find, this only happens with Encrypted connections too.

@kylescottmcgill
Copy link

curiously, when developing on a test application using similar techniques as in the *_test.go files, a 1kb query works fine - will continue hunting, but we are seeing these issues in production, so for now we have disabled Encryption until we can figure out what is going on. Will provide any info if we can.

@RoarkeRandall
Copy link

RoarkeRandall commented Sep 14, 2016

We're seeing this in production as well, on a large merge statement.

Edit: Not on SQL Azure, on SQL Server 2012 (12.0.4422)

@asw
Copy link

asw commented Sep 23, 2016

I'm seeing this issue (the "error: driver: bad connection", not the wsarecv...) with longer running queries (encrypted, I haven't tried turning it off). I increased the connection timeout (connection timeout=300; in the connection string) and it solved my problem for now.

@denisenkom
Copy link
Owner

Try again on latest version, it now sets TCP keep alive to 30 seconds by default

@rbucker
Copy link

rbucker commented May 30, 2017

turning encryption off is not an option. My keep alive and connection timeout are30 and 600 and they still fail. This needs to be reopened

@kardianos
Copy link
Collaborator

Can you provide a test program that connects to an azure database with encryption and fails? That would be super useful.

@kardianos kardianos reopened this May 30, 2017
@rbucker
Copy link

rbucker commented May 30, 2017

    ;with
    config_cte (config) as (
            select *
                    from ( values
                    ('_partition:{\"Fill\":{\"PatternType\":\"solid\",\"FgColor\":\"99ff99\"}}')
                    , ('_separation:{\"Fill\":{\"PatternType\":\"solid\",\"FgColor\":\"99ffff\"}}')
                    , ('Monthly Earnings:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Weekly Earnings:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Total Earnings:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Average Earnings:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Last Month Earning:#,##0.00 ;(#,##0.00)')
                    , ('Award:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Amount:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Grand Total:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Total:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Price Each:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Hyperwallet:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Credit/Debit:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Earning:#,##0.00 ;(#,##0.00)')
                    , ('Change Earning:#,##0.00 ;(#,##0.00)')
                    , ('CheckAmount:#,##0.00 ;(#,##0.00)')
                    , ('Residual:#,##0.00 ;(#,##0.00)')
                    , ('Prev Residual:#,##0.00 ;(#,##0.00)')
                    , ('Team Bonuses:#,##0.00 ;(#,##0.00)')
                    , ('Change:#,##0.00 ;(#,##0.00)')
                    , ('Shipping Total:#,##0.00 ;(#,##0.00)')
                    , ('SubTotal:\$#,##0.00 ;(\$#,##0.00)')
                    , ('Total Diff:#,##0.00 ;(#,##0.00)')
                    , ('SubTotal Diff:#,##0.00 ;(#,##0.00)')
                    , ('Return Total:#,##0.00 ;(#,##0.00)')
                    , ('Return SubTotal:#,##0.00 ;(#,##0.00)')
                    , ('Return Total Diff:#,##0.00 ;(#,##0.00)')
                    , ('Return SubTotal Diff:#,##0.00 ;(#,##0.00)')
                    , ('Cancel Total:#,##0.00 ;(#,##0.00)')
                    , ('Cancel SubTotal:#,##0.00 ;(#,##0.00)')
                    , ('Cancel Total Diff:#,##0.00 ;(#,##0.00)')
                    , ('Cancel SubTotal Diff:#,##0.00 ;(#,##0.00)')
                    , ('Replacement Total:#,##0.00 ;(#,##0.00)')
                    , ('Replacement SubTotal:#,##0.00 ;(#,##0.00)')
                    , ('Replacement Total Diff:#,##0.00 ;(#,##0.00)')
                    , ('Replacement SubTotal Diff:#,##0.00 ;(#,##0.00)')
                    , ('Jan Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jan Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jan Total:#,##0.00 ;(#,##0.00)')
                    , ('January Residual:#,##0.00 ;(#,##0.00)')
                    , ('Feb Residual:#,##0.00 ;(#,##0.00)')
                    , ('Feb Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Feb Total:#,##0.00 ;(#,##0.00)')
                    , ('February Residual:#,##0.00 ;(#,##0.00)')
                    , ('Mar Residual:#,##0.00 ;(#,##0.00)')
                    , ('Mar Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Mar Total:#,##0.00 ;(#,##0.00)')
                    , ('March Residual:#,##0.00 ;(#,##0.00)')
                    , ('Apr Residual:#,##0.00 ;(#,##0.00)')
                    , ('Apr Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Apr Total:#,##0.00 ;(#,##0.00)')
                    , ('April Residual:#,##0.00 ;(#,##0.00)')
                    , ('May Residual:#,##0.00 ;(#,##0.00)')
                    , ('May Bonus:#,##0.00 ;(#,##0.00)')
                    , ('May Total:#,##0.00 ;(#,##0.00)')
                    , ('Jun Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jun Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jun Total:#,##0.00 ;(#,##0.00)')
                    , ('June Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jul Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jul Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jul Total:#,##0.00 ;(#,##0.00)')
                    , ('July Residual:#,##0.00 ;(#,##0.00)')
                    , ('Aug Residual:#,##0.00 ;(#,##0.00)')
                    , ('Aug Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Aug Total:#,##0.00 ;(#,##0.00)')
                    , ('August Residual:#,##0.00 ;(#,##0.00)')
                    , ('Sep Residual:#,##0.00 ;(#,##0.00)')
                    , ('Sep Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Sep Total:#,##0.00 ;(#,##0.00)')
                    , ('September Residual:#,##0.00 ;(#,##0.00)')
                    , ('Oct Residual:#,##0.00 ;(#,##0.00)')
                    , ('Oct Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Oct Total:#,##0.00 ;(#,##0.00)')
                    , ('October Residual:#,##0.00 ;(#,##0.00)')
                    , ('Nov Residual:#,##0.00 ;(#,##0.00)')
                    , ('Nov Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Nov Total:#,##0.00 ;(#,##0.00)')
                    , ('November Residual:#,##0.00 ;(#,##0.00)')
                    , ('Dec Residual:#,##0.00 ;(#,##0.00)')
                    , ('Dec Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Dec Total:#,##0.00 ;(#,##0.00)')
                    , ('December Residual:#,##0.00 ;(#,##0.00)')
                    , ('January Bonus:#,##0.00 ;(#,##0.00)')
                    , ('February Bonus:#,##0.00 ;(#,##0.00)')
                    , ('March Bonus:#,##0.00 ;(#,##0.00)')
                    , ('April Bonus:#,##0.00 ;(#,##0.00)')
                    , ('May Bonus:#,##0.00 ;(#,##0.00)')
                    , ('June Bonus:#,##0.00 ;(#,##0.00)')
                    , ('July Bonus:#,##0.00 ;(#,##0.00)')
                    , ('August Bonus:#,##0.00 ;(#,##0.00)')
                    , ('September Bonus:#,##0.00 ;(#,##0.00)')
                    , ('October Bonus:#,##0.00 ;(#,##0.00)')
                    , ('November Bonus:#,##0.00 ;(#,##0.00)')
                    , ('December Bonus:#,##0.00 ;(#,##0.00)')
                    , ('January Adj:#,##0.00 ;(#,##0.00)')
                    , ('February Adj:#,##0.00 ;(#,##0.00)')
                    , ('March Adj:#,##0.00 ;(#,##0.00)')
                    , ('April Adj:#,##0.00 ;(#,##0.00)')
                    , ('May Adj:#,##0.00 ;(#,##0.00)')
                    , ('June Adj:#,##0.00 ;(#,##0.00)')
                    , ('July Adj:#,##0.00 ;(#,##0.00)')
                    , ('August Adj:#,##0.00 ;(#,##0.00)')
                    , ('September Adj:#,##0.00 ;(#,##0.00)')
                    , ('October Adj:#,##0.00 ;(#,##0.00)')
                    , ('November Adj:#,##0.00 ;(#,##0.00)')
                    , ('December Adj:#,##0.00 ;(#,##0.00)')
                    , ('2016- 2015 YTD Dif:#,##0.00 ;(#,##0.00)')
                    , ('2017- 2016 YTD Dif:#,##0.00 ;(#,##0.00)')
                    , ('2018- 2017 YTD Dif:#,##0.00 ;(#,##0.00)')
                    , ('Dec to Jan Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jan to Feb Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Feb to Mar Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Mar to Apr Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Apr to May Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('May to Jun Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jun to Jul Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Jul to Aug Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Aug to Sep Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Sep to Oct Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Oct to Nov Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Nov to Dec Dif Residual:#,##0.00 ;(#,##0.00)')
                    , ('Dec to Jan Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jan to Feb Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Feb to Mar Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Mar to Apr Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Apr to May Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('May to Jun Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jun to Jul Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Jul to Aug Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Aug to Sep Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Sep to Oct Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Oct to Nov Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Nov to Dec Dif Bonus:#,##0.00 ;(#,##0.00)')
                    , ('Dec to Jan Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Jan to Feb Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Feb to Mar Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Mar to Apr Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Apr to May Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('May to Jun Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Jun to Jul Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Jul to Aug Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Aug to Sep Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Sep to Oct Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Oct to Nov Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Nov to Dec Dif Total:#,##0.00 ;(#,##0.00)')
                    , ('Jan Refund Cnt:#,##0 ;(#,##0)')
                    , ('Feb Refund Cnt:#,##0 ;(#,##0)')
                    , ('Mar Refund Cnt:#,##0 ;(#,##0)')
                    , ('Apr Refund Cnt:#,##0 ;(#,##0)')
                    , ('May Refund Cnt:#,##0 ;(#,##0)')
                    , ('Jun Refund Cnt:#,##0 ;(#,##0)')
                    , ('Jul Refund Cnt:#,##0 ;(#,##0)')
                    , ('Aug Refund Cnt:#,##0 ;(#,##0)')
                    , ('Sep Refund Cnt:#,##0 ;(#,##0)')
                    , ('Oct Refund Cnt:#,##0 ;(#,##0)')
                    , ('Nov Refund Cnt:#,##0 ;(#,##0)')
                    , ('Dec Refund Cnt:#,##0 ;(#,##0)')
                    , ('Jan Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Feb Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Mar Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Apr Purchase Cnt:#,##0 ;(#,##0)')
                    , ('May Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Jun Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Jul Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Aug Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Sep Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Oct Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Nov Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Dec Purchase Cnt:#,##0 ;(#,##0)')
                    , ('Jan Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Feb Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Mar Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Apr Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('May Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Jun Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Jul Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Aug Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Sep Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Oct Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Nov Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Dec Refund Amt:#,##0.00 ;(#,##0.00)')
                    , ('Jan Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Feb Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Mar Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Apr Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('May Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Jun Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Jul Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Aug Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Sep Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Oct Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Nov Purchase Amt:#,##0.00 ;(#,##0.00)')
                    , ('Dec Purchase Amt:#,##0.00 ;(#,##0.00)')
                    ) X(a))
    select * from config_cte

@kardianos
Copy link
Collaborator

Great! I've got a confirmed failing test case. I'll be able to diagnose and fix now. Thanks!

@rbucker
Copy link

rbucker commented May 31, 2017

There was an ISSUE that suggested that setting the keepalive=30 and connection timeout=300 would work. It worked for a while on different queries but then crashed today. FOO!

@rbucker
Copy link

rbucker commented Jun 2, 2017

I'm debugging the code and I noticed one thing.... Since I'm using pooled connections.... immediately before this SQL was executed the driver opened a new connection to the driver. I inserted a small SQL before the LONG SQL and it still failed on the long.

@rbucker
Copy link

rbucker commented Jun 2, 2017

This link to the MS TDS documentation suggests that maybe TDS 7.4 is required for SQL Server 2016: https://msdn.microsoft.com/en-us/library/dd339982.aspx

@rbucker
Copy link

rbucker commented Jun 2, 2017

Here is some interesting information.... There is clearly a problem with the server in that it must be performing some sort of sanity check on the SQL and if it looks suspicious that is resets the connection. I suppose this must be some sort of DOS or PEN protection. For example I had a small SQL that looks like: select getdate()*/}} d
and it resulted in a connection reset.

@rbucker
Copy link

rbucker commented Jun 2, 2017

And another note:

select protocol_type, MASTER.dbo.fn_varbintohexstr(CONVERT(varBINARY(4), protocol_version)) protocol_version from sys.dm_exec_connections

produced these results. I would have expected one row or one row per supported TDS version

protocol_type	protocol_version
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x00000000
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x00000000
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x00000000
TSQL	0x00000000
TSQL	0x74000004
TSQL	0x00000000
TSQL	0x00000000
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x00000000
TSQL	0x00000000
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004
TSQL	0x74000004

@kardianos
Copy link
Collaborator

So far I have tested a few things. No fixes yet. Issue is probably in this driver's handling of encrypted connection in the netlib TDS layer. I can replicate the issue locally on both a linux and windows installed SQL Server if I turn encryption on and run TestAzureDatabase.

I'll continue to look into it as I get a chance. If someone else wants to look into it, I would recommend comparing the TDS spec for how to do encrypted connections with the implementation.

@rbucker
Copy link

rbucker commented Jun 2, 2017

well... I have a SQL String that is 1923 bytes in length. It works on SQL Server 2008 but not on SQL Server 2016.

@rbucker
Copy link

rbucker commented Jun 9, 2017

The Azure documentation talks about sql server firewalls. Since I'm using pooling could that be the issue? Also... I execute a query, read the first row, then generate some SQL and open a second connection, execute that query... which may or may not reset... close that result set, read another record from the the first result set and repeat until exhausted.

@audgster
Copy link

Hello! Do you have an ETA on this issue?

@kardianos
Copy link
Collaborator

No. I'm currently working on some database/sql issues for go1.9 that need to get done before the first cut. I'll see if I can work on it tomorrow morning.

@kardianos
Copy link
Collaborator

database/sql changes are merged and docs are sent. I'll work on this next.

@rbucker
Copy link

rbucker commented Jun 14, 2017

I know you have not started in this yet... but wanted to point out that the error has changed per changes made in a different issue (#269):

BeginRead failed driver: bad connection

@rbucker
Copy link

rbucker commented Jun 21, 2017

still no progress here.

@sunnyque
Copy link
Author

this issue got me again so we found a workaround (not clearly understand why its working)

while reproducing this issue with local MSSQL 2016 with "encrypt=true" when using "packet size=2048" problem goes away (with 4096 issue reproduces just like with Azure SQL)

then I switched to test this with Azure and got "Invalid packet size, it is longer than buffer size" error when using smaller packet size. It seems Azure ignores packet size while handshaking and sends 4kb packet anyway. So I slightly modified source code to get bigger buffer size while using in login packet size=2048 and then it worked

need to test it more

@kardianos
Copy link
Collaborator

Thanks. That is in line with my suspicion that this is in the netlib layer of TDS. We are probably clobbering part of the packet sometimes, and it only becomes readily apparent when we turn on encryption.

@rbucker
Copy link

rbucker commented Jun 24, 2017

this continues to be a problem for me.... but I keep working around the issue by refactoring my SQL but it's getting harder with each failure.

@Tanz0rz
Copy link

Tanz0rz commented Sep 22, 2017

This issue hasn't been closed and I am running into a very similar error:

read: connection reset by peer

I am seeing this only when I have encryption enabled, but have not dug into the issue further than that.

@rbucker
Copy link

rbucker commented Sep 22, 2017

it seems to be a buffer problem... change your buffer size to 2K. The solution is not very happy but at least I can work again. I'm anxious for a proper fix. I tried a code review with no luck.

@Tanz0rz
Copy link

Tanz0rz commented Sep 22, 2017

I am pretty new to sql land. I was reading that the buffer solution may require refactoring of some of the (bigger?) sql queries. Is that right? Is there an easy explanation of why?

Btw the buffer fix has definitely gotten the errors to go away at least so far

@bjg2
Copy link

bjg2 commented Oct 5, 2017

How to change buffer size to 2k? Can't seem to do it through connection string.

@rbucker
Copy link

rbucker commented Oct 5, 2017

add packet size=2048; to the connection string. the space between packet and size is required.

@bjg2
Copy link

bjg2 commented Oct 5, 2017

Yes, I did that already, then it says 'invalid packet size, it ias longer than buffer size'. To explain, I use SQL Azure database, I was having issues with big query/big number of parameters, I changed packet size, and now I don't know how to change abovementioned buffer size.

@rbucker
Copy link

rbucker commented Oct 5, 2017

pretty sure you cannot change the buffersize and that it's static. There was mention about the packet size needed to be an even number to handle double byte characters some place. So check that. Also, when I saw a similar error I was trying a packet size < 1024 (I think). You'll have to read the code to kind the conditions where that error occurred.

One last thing to check... I remember something about timeouts or keepalive. Here are the other params I use. I'm certain the LOG setting is meaningless to the problem.

packet size=2048;log=7;keepAlive=15;connection timeout=300`

@sunnyque
Copy link
Author

sunnyque commented Oct 5, 2017

as I mention before there is an issue with Azure when using smaller packets size
It seems Azure ignores packet size while handshaking and sends 4kb packet anyway.
so you need allocate more space in input buffer, not exact packet size (make it 4k minimum please)

@kardianos
Copy link
Collaborator

I've added a test for this in the master branch. I can reproduce it on Azure and local databases.
https://github.com/denisenkom/go-mssqldb/blob/master/queries_go19_test.go#L128
@sunnyque I'll look into decoupling the packet size from the internal buffer size and retest.

@bjg2
Copy link

bjg2 commented Oct 6, 2017

So, my initial issue is #300, but I thought it might be related to this one.

If I change packet size to 2048 and SQL Azure ignores it and use 4096 what's the point? In case I missed something, I changed packet size to 2048, changed buffer size in driver code to 4096 manually and still I have the same issue I had before changing packet size at all - connection reset by peer.

So, if I understand correctly, it is impossible to use golang with SQL Azure with this driver at the moment?

@kylescottmcgill
Copy link

Just chiming in to give our user experience. We are currently using MsSQL in AWS in production, but where our implementation differs is we were unable to use full encryption. There are 3 levels, none, communication encryption, and full encryption - this includes Authentication. In our experience when using full encryption and passing across a SQL query which is larger than the buffer the decryption on MsSQL's side fails and then closes the connection.

I have Go debugged my way through this a lot and not been able to figure out under what circumstances the query and the connection fails (i.e.: where the failure is). Go's encryption is a dark and scary place (at least for me) so we have opted to just secure the communication for now. We would much rather have full encryption, but we had to choose stability over security (it was not taken lightly).

In future, I would love to jump back into seeing if I can fix this, but the time to do so has been hard to find. So, I wouldn't say it’s impossible, and this is a big issue for us as well, however we have managed to get by.

kardianos added a commit that referenced this issue Oct 6, 2017
If the packet size is set too small, the read buffer size will
overflow. Prevent this by setting the buffer size independent of the
packet size.

For #166
@kardianos
Copy link
Collaborator

@sunnyque In be9747b (branch kardianos-buf-size) I've de-coupled the packet size from the buffer size. While it does allow you to set the packet size to 512 (before it would overflow), it doesn't solve this issue.

@kylescottmcgill It is possible that if you use the above branch/commit (not in master as of this writing) and set "packet size=512" in the DSN, you may have better luck connecting with full encryption.

I don't think this is a Go TLS issue, at least not directly. I'm reaching out to the SQL Server Drivers team for assistance, but I haven't heard back as of yet from the team themselves.

@sunnyque
Copy link
Author

sunnyque commented Oct 6, 2017

thanks, will test it with our environment

kardianos added a commit that referenced this issue Oct 9, 2017
If the packet size is set too small, the read buffer size will
overflow. Prevent this by setting the buffer size independent of the
packet size.

For #166
@kardianos
Copy link
Collaborator

I was finally able to use Microsoft Message Analyzer on it.
Notes for future self:

  • Run on port 1433
  • Force client to use fixed certs (no forward security suites)
  • Create self signed cert and add to MMA and SQL Server

According to MMA, it isn't seeing a EOM message type at the end of the SQL Batch packet. Sometimes this isn't a problem, but maybe sometimes it is? Unsure. I'm also seeing odd NULL bytes inserted into the SQL Batch data. I'm unsure if that is a problem with MMA or if there really is an issue with the driver writing NULL values.

I don't think I've seen any issue when SQL can fit in a single TDS packet.

@kardianos
Copy link
Collaborator

It looks like the above issues of not marking the EOM byte is just MMA being buggy. I bet the internal NULL bytes are also a product of buggy MMA.

kardianos added a commit that referenced this issue Dec 4, 2017
If the packet size is set too small, the read buffer size will
overflow. Prevent this by setting the buffer size independent of the
packet size.

For #166
@glebteterin
Copy link
Contributor

Hello! I stumbled upon this issue several months ago and now I just wondering if you have an ETA on this?

@kardianos
Copy link
Collaborator

Nope. I tried to solve it, but came up short. I haven't heard back from the MS SQL Server drivers team either. There is a test you can un-skip and work with to try to debug.

@glebteterin
Copy link
Contributor

So I followed @kardianos steps of debugging this issue with MMA and compared the network traffic with .Net implementation of the same sql query. The investigation resulted in a pull request: #327

odeke-em pushed a commit to orijtech/go-mssqldb that referenced this issue Jul 26, 2020
gabrielcorado pushed a commit to gravitational/go-mssqldb that referenced this issue Oct 4, 2024
* lazy initialization of charset maps

* initialize each charsetmap separately

* switch from init to get
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests