Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opamp.Connection needs a Mutex to avoid concurrent write panic #90

Closed
andykellr opened this issue May 27, 2022 · 4 comments
Closed

opamp.Connection needs a Mutex to avoid concurrent write panic #90

andykellr opened this issue May 27, 2022 · 4 comments

Comments

@andykellr
Copy link
Contributor

In a busy server, we encountered a concurrent write panic. I added protection in the server, but ideally the library would add a Mutex to prevent this from happening.

Here is the relevant stack fragment:

panic: concurrent write to websocket connection
goroutine 115 [running]:
github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc001dc95a8, 0x1, {0xc001d88160?, 0x2?, 0xc001dc95d8?})
        .../go/pkg/mod/github.com/gorilla/[email protected]/conn.go:617 +0x52b
github.com/gorilla/websocket.(*Conn).WriteMessage(0xc001da8f20, 0xc0003d2930?, {0xc001d88160, 0x142, 0x142})
        .../go/pkg/mod/github.com/gorilla/[email protected]/conn.go:770 +0x139
github.com/open-telemetry/opamp-go/server.wsConnection.Send({0xc001dc9750?}, {0xc001d950e0?, 0xc001dba050?}, 0x270c?)
        .../go/pkg/mod/github.com/open-telemetry/[email protected]/server/wsconnection.go:30 +0x55
...
@andykellr
Copy link
Contributor Author

This is working as intended per the Connection interface:

// Send a message. Should not be called concurrently for the same Connection instance.
// Can be called only for WebSocket connections. Will return an error for plain HTTP
// connections.
// Blocks until the message is sent.
// Should return as soon as possible if the ctx is cancelled.
Send(ctx context.Context, message *protobufs.ServerToAgent) error

I will do some benchmarking with a large number of connections to determine the overhead of adding a mutex per connection.

@tigrannajaryan
Copy link
Member

FYI, we protect the connection with a mutex in our example. So it seems reasonable to just move this mutex inside the connection struct and have proper encapsulation.

One argument against having the mutex in the connection is that in the server implementation one may have a mutex per agent anyway (e.g. to protect agent's data fields) and that same mutex can be used to protect the connection. However, we don't do it in our example (we have a separate mutex for data fields) so perhaps the combined mutex is not such a good option (otherwise why don't we use it?).

@evan-bradley
Copy link
Contributor

I missed this issue while working on #200. In my opinion having separate mutexes for the WS connection and the agent data fields makes sense because the cost of the mutexes is low and it would cause contention between updating the agent and sending it messages if we had a shared mutex.

@andykellr @tigrannajaryan Are we satisfied that this is resolved?

@tigrannajaryan
Copy link
Member

Yes, I think we can close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants