Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ship websockets in core #1010

Closed
piscisaureus opened this issue Feb 28, 2015 · 49 comments
Closed

Ship websockets in core #1010

piscisaureus opened this issue Feb 28, 2015 · 49 comments
Labels
discuss Issues opened for discussions and feedbacks.

Comments

@piscisaureus
Copy link
Contributor

Websockets is a standardized protocol that is core to the web. I think it should ship in core.

Discussion taking place here: nodejs/node-v0.x-archive#8005

@mscdex
Copy link
Contributor

mscdex commented Feb 28, 2015

-1 WebSockets is still something better suited for userland IMHO. Sure it's something that many people do use, but there are also many other widely used standardized protocols/APIs that could be argued as "core" to the "web" that are not currently included in node/io.js core. For example: multipart parsing/generation, Server-sent events, HTTP/2, ICMP, SSH/SFTP, FTP, SOCKS, VNC/RFB, SMTP/IMAP/POP3, SOAP, Web Workers (as an API), XHR/XHR2 (as an API), etc.

The main complaint in nodejs/node-v0.x-archive#8005 had to do with an issue in node-gyp and even then the node-gyp issue was not something that affected the installation and use of the third-party module in question.

There was also a different discussion in there about the need for easily distributing compiled addons for Windows users, but as someone already noted there, node-pre-gyp already provides a mechanism for solving that problem. Maybe npm should incorporate something like that, but that's a separate topic.

@jbergstroem
Copy link
Member

My problem with issue nodejs/node-v0.x-archive#8005 is that it grew from ws to bundling compilers. Would the crowd in that issue be happy with "just" supporting ws in core? Also, supporting ws but for instance not http2 is a pretty tough decision that probably is better to avoid unless there's a clear path forward every time the decision about what belongs in core decides to show up.

@Qard
Copy link
Member

Qard commented Mar 1, 2015

FYI: The ws module is pure JS now. The last bits of native stuff have been pulled out to optional dependencies. So that Windows issue isn't really relevant anymore, if people just update their dependencies.

@benjamingr
Copy link
Member

What advantages would putting this in core offer?

@yosuke-furukawa
Copy link
Member

+1 : websocket is implemented by core members.
-1 : websocket is in core API.

I think core module would better to keep simple. ws are a little complicated.

IMO, websocket is not core API, but websocket should be implemented by core members like readable-stream.

Golang team chooses similar strategy. They has official but not core API like these. websocket is listed in the link. spdy is also listed.

And I have a tiny concerns about einaros/ws module, the module does not have main active committers....
So, I have highly recommend core members implemented websocket as a SUB-STANDARD module.

@Qard
Copy link
Member

Qard commented Mar 1, 2015

I've seen the topic of "official" userland modules come up a couple times now. Perhaps it could be a topic for a future TC meeting?

@benjamingr
Copy link
Member

I'm definitely in favour of userland official supported modules - decoupling shipments from core and modularizing code sounds like an excellent idea for io and will allow better interop and more modular code and deployment.

@tellnes
Copy link
Contributor

tellnes commented Mar 1, 2015

When you are asking for official supported userland modules, then what you really is asking for is a repo under the iojs org and a WG for that project? Maybe the possibility to run project tests on jenkins?

The problem with calling those officially is that it implies some guaranty. If a official module is abandoned by its WG, what happens then?

@cjihrig
Copy link
Contributor

cjihrig commented Mar 1, 2015

+1 for a blessed module outside of core.

@benjamingr
Copy link
Member

@tellnes yes - this is exactly what the point is - under the iojs org and WG. The whole point is indeed that guarantee - that it will not be abandoned by the WG and that "grown up"s give it the same LTS support as shipments get.

@Fishrock123
Copy link
Contributor

I'm not sure about adding this to core. http is different. Since we already have http and have to maintain it for the long run, adding http/2 support is more reasonable as it is only maintaining an existing core module.

@piscisaureus
Copy link
Contributor Author

What advantages would putting this in core offer?

There would be no need for compiling a native addon in order to use websockets.

@benjamingr
Copy link
Member

@piscisaureus Does this have to be written as a native addon?

Sorry for the native question - but why can't this be implemented in (low level, closure free) JavaScript?

@piscisaureus
Copy link
Contributor Author

BTW I'm not saying that we should put "ws" into core, although it'd be an obvious starting point for the exploration. But from an API perspective, the websockets implementation should integrate well with the built-in http(s) server, and be as simple as possible.

@piscisaureus
Copy link
Contributor Author

@piscisaureus Does this have to be written as a native addon?

Sorry for the native question - but why can't this be implemented in (low level, closure free) JavaScript?

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons. There is a much slower fallback implementation in pure javascript, used only when the c++ parser fails to build.

Imagine what a novice node user has to deal with when installing socket.io.
On 'npm install' they'll see a bunch of errors and obscure warnings, which in most cases will leave the application in a nonfunctional state.
Except that when it's ws that fails to compile, they'll see a bunch of errors, but the app will still work.
Websockets however have become slow, which the user may not realize at all, and if they do they may not realize it has anything to do with that failed compilation step.
It's really a mess.

@mscdex
Copy link
Contributor

mscdex commented Mar 1, 2015

You do not need to compile a native addon to use WebSockets though. The ws addon dependencies were merely optional helper functions, but as of websockets/ws#c84c880cbc they are explicitly set as optionalDependencies so it's even less of a problem now.

@mscdex
Copy link
Contributor

mscdex commented Mar 1, 2015

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons.

socket.io uses ws which uses a js parser. The C++ stuff was just for performing miscellaneous tasks like masking and such.

@piscisaureus
Copy link
Contributor Author

You do not need to compile a native addon to use WebSockets though. The ws addon dependencies were merely optional helper functions, but as of websockets/ws#c84c880cbc they are explicitly set as optionalDependencies so it's even less of a problem now.

I know - the point I was trying to make is that the UX of using an optional dependency that fails to compile is abysmal.

socket.io uses ws which uses a js parser. The C++ stuff was just for performing miscellaneous tasks like masking and such.

Ok, so parts of the parser are written in c++ but not entirely. That's beside the point; clearly there are good reasons to have c++ bits. It's not that different from how http works in core (the parser is written in c, but a large proportion of logic is written in javascript).

@benjamingr
Copy link
Member

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons. There is a much slower fallback implementation in pure javascript, used only when the c++ parser fails to build.

Well:

  • Is there evidence that a pure JS implementation is actually slower? People have been successful in beating native implementations in a lot of problems like sorting and parsing.
  • With v8 now supporting asmjs directly - perhaps it should be considered if "regular" JS isn't fast enough - although VM people always insist that regular JS carefully written can be fast enough.

Obviously, you've spent a lot more time than me researching this - I'm just playing 10th man here to make sure we're not ruling a pure JS implementation that would be a portability win.

@vkurchatkin
Copy link
Contributor

With v8 now supporting asmjs directly

Why do you think it does?

@benjamingr
Copy link
Member

@vkurchatkin my bad, it's assigned - it appears like it's waiting for TF to get actual support. Still shouldn't change the first point. That said earlier issues in node-forward don't look too promising on that front node-forward/discussions#16 .

@llafuente
Copy link

websockets should be in the core only if are implemented at c level.

Performance matter that's why ws has a compiled version, also compatibility that why ws has a JS counterpart.

but: soon we will start about discussing BSON and other modules that gain a lot to be compiled... at the what we need it's NPM to compile those module for us in a decent /easy to use way.

@chrisdickinson
Copy link
Contributor

With regards to bufferutil and utf-8-validate, would it be generally useful to add {un,}mask, merge, and utf8validate to the Buffer API? If so, we can avoid vendoring a fairly complex chunk of functionality.

@rvagg
Copy link
Member

rvagg commented Mar 11, 2015

FTR, from the TC meeting last week:

  • We should start an issue in NG about the core modules/modernity stuff.
  • We should publicly state that we’re interested in supporting HTTP/2 in the related issue thread, although we’re not sure about implementation strategy and want to see experiments.

See #1123, removing tc-agenda label

@rvagg rvagg removed the tc-agenda label Mar 11, 2015
@mscdex mscdex added future discuss Issues opened for discussions and feedbacks. labels Mar 12, 2015
@rauchg
Copy link
Contributor

rauchg commented Mar 23, 2015

as the current eco has shown, there are many ways of implementing a WebSocket API.

This is especially true for HTTP as well, and HTTP is in core.

That said, I wouldn't assign bundling a WebSocket server a high priority. I don't think there are any problems right now with userland implementations that would be solved by supporting it in core. We have cross platform compatibility, ease of use and good performance in ws already. We also don't have issues with people finding it.

@Fishrock123
Copy link
Contributor

For posterity: The current TC suggestion was to implement the required buffer methods, ala #1202

@lpinca
Copy link
Member

lpinca commented Mar 23, 2015

The current TC suggestion was to implement the required buffer methods, ala #1202

This is very good IMHO, it's a win for everyone.
If I'm not wrong the TC also talked about an "official userland module" for web sockets and I basically agree with what @domenic has written here nodejs/NG#10 (comment).

@andrewrk
Copy link

Incomplete list of things that belong outside of core:

  • http
  • https
  • websockets
  • assert
  • query strings
  • TLS/SSL
  • URL parsing
  • zlib

Some things that are not in core but should be:

  • Threading and shared memory buffers

@theturtle32
Copy link

I agree with @3rd-Eden that we shouldn't just put a particular WebSocket implementation in core. But I'm not really too excited about the idea of just putting pieces of it into core either. Why should core have and maintain (and document as part of the public API) only the frame parser and/or xor masking components? It seems odd to me for those elements to live on their own, elevated to the level of a core platform API. And to what end? To alleviate the frustration with building native binary modules on Windows, but only solve it for WebSocket libraries?

It seems to me the focus be directed toward finding a holistic way to build/bundle/package/distribute _all_ compiled modules for Windows users. It's tough because there are more iojs/node users on *nix-like platforms where we're used to having extremely easy access to C++ compilers, so it's easy to forget about the Windows users and end up leaving them behind.

But at the end of the day, if we just put the pieces we need for optimizing WebSockets directly into core, then there will be substantially less demand for fixing the _real_ underlying platform issue.

On the other hand, this is not a problem that is unique to the npm ecosystem. Ruby, Python, Perl, etc have all had to deal with this as well. I don't know how or even whether they have solved it, since I'm not a Windows user, but it can't be unique to io.js/node.

@chrisdickinson
Copy link
Contributor

@3rd-Eden I'd love to get your thoughts on the PR that adds mask. Also, is unmask necessary in the face of buf.mask(mask, buf)?

@theturtle32 Masking, unmasking, and utf-8 validation are all useful operations on their own, outside of websockets. Bringing them into the core API is net win, and is fairly low-effort for the amount of utility it provides to existing users.

It seems to me [that] the focus [should] be directed toward finding[...]

While there is a larger problem to be solved, tackling this smaller, easier-to-agree-on problem benefits Windows users in the short term and in no way distracts from the efforts of others to tackle the larger issue.

@theturtle32
Copy link

@chrisdickinson Yes, UTF-8 validation is broadly applicable. One might be able to make a case for the masking/unmasking, though IMO the masking algorithm used for WebSockets still relatively niche, using a 32-bit integer as a repeating mask key. Personally, think it seems kind of weird to put a purpose-built, specialized, buffer-based XOR masking function into core.

But I definitely don't think we should put a WebSocket protocol parser/encoder into core, as was mentioned by others, without adding a full WebSocket implementation to go with it, and I don't think we should put WebSockets in core.

I still think that doing something cheap and easy here to benefit Windows users in the short term is a double-edged sword. Yes it will help for now, it will absolutely alleviate frustration for Windows users that install WebSocket-based modules, but that very reduction in frustration will, I believe, dramatically reduce the demand for creating a real solution to the problem of distributing native modules, which will lead to less interest and fewer resources being focused on solving what is essentially the real problem.

Also, both ws and my websocket module will work fine even without the native modules, just somewhat more slowly and less strictly, in that it will accept malformed UTF-8. So to me, it doesn't seem like dire enough of a situation to warrant putting all these things into core. Especially if we can make some improvements to the module installation experience in situations when a C++ compiler isn't available. Novice users can still tinker with WebSockets, and advanced users who are prepared to work with MSVC++ can still deploy the optimized native-code versions.

@piscisaureus
Copy link
Contributor Author

@andrewrk

Incomplete of the list of things that belong outside of core:
«snip»

That's nice and all that, but your list lacks any form of motivation. To demonstrate.

Worst artists of all time (incomplete list):

  • Michael Jackson
  • Steve Aioki

Best artists of all time:

  • Justin Bieber
  • Johannes Sebastian Bach
  • Madonna

See what I mean? :)

@piscisaureus
Copy link
Contributor Author

During the TC meeting at March 11th (iirc) it was decided that we'd just add the buffer operations that are required to build a fast websocket implementation to core. So we won't be adding full websocket support to core.

Therefore I'm closing this issue. The discussion about the implementation continues in #1202.

@3rd-Eden
Copy link
Contributor

If the decision is made to not add WebSocket support in core then you might as well remove the buffer mask as well as I don't really see a broad enough use case here that would justify the addition of it in to the core as there isn't even a demand expressed by the community for it. Adding API's that is only going to be implemented by 10~ people is just stupid. Either ship something workable or ship nothing.

@petkaantonov
Copy link
Contributor

Sorry for off-topic but I am dying to know what problem there is for implementing UTF-8 validation in javascript?

@jcoglan
Copy link

jcoglan commented Mar 25, 2015

@petkaantonov I'm not sure; https://github.com/faye/websocket-driver-node has always done UTF-8 validation in JavaScript and it's worked since Node 0.8 -- see http://faye.jcoglan.com/autobahn/clients/

I think that in 0.8 some fixes were made so that new Buffer(buffer.toString('utf8'), 'utf8') would correctly round-trip astral codepoints, which were failing on 0.6. I've had no issues with UTF-8 since.

@theturtle32
Copy link

@petkaantonov there's nothing preventing anyone from doing UTF-8 validation in JavaScript. It would just be slower than doing it in C++ and the only reason we do it is to be able to strictly follow the specification and immediately terminate the connection with an error if invalid UTF-8 data is received. If we don't have the native extensions built, that level of strictness probably isn't worth the speed tradeoff. Because the worst that could happen if invalid UTF-8 data is received is that some characters could get mangled.

@jcoglan
Copy link

jcoglan commented Mar 25, 2015

@theturtle32 I would consider accepting invalid UTF-8 to be a serious problem. Mangling human-written text is not a trivial problem, and admitting byte sequences you cannot validate into your backend can trigger security problems.

@theturtle32
Copy link

@jcoglan - The reason I'm not terribly worried about it is that none of the major browsers transmit invalid UTF-8 data in the first place. While there are ways to get browsers and http servers to disagree about character encodings and end up with, for example, utf-8 data being interpreted as windows-1252 or iso-8859-1, there's literally nothing you can do in JavaScript to get a browser to transmit a utf-8 websocket message that contains invalid utf-8 data, and websocket servers will always interpret all received text messages as utf-8, because no other character encoding is allowed. So the same possibility of getting incorrectly interpreted characters from your users doesn't exist in practical situations with WebSockets, regardless of whether or not we validate the utf-8 data.

The only time you would ever receive malformed utf-8 data is if a malicious or buggy non-browser client connected to your service. If that's happening to you, then malformed utf-8 still isn't your primary issue.

@theturtle32
Copy link

@piscisaureus @3rd-Eden - While I'm not crazy about adding WebSocket-specific masking to core, I _would_ be in favor of adding something like this for UTF-8 validation:

buffer.toString('utf8', 0, buffer.length, { strict: true }); // Throws on invalid UTF-8

@jcoglan
Copy link

jcoglan commented Mar 26, 2015

@theturtle32 This presupposes that browsers are free of bugs both in their WebSocket code, and in all other network interaction code such that JavaScript cannot spoof headers and fake a WebSocket connection. Such things are rare but they have happened. And just this week I have dealt with our server receiving invalid unicode (application/x-www-form-urlencoded) via a form on our site, for reasons unknown. Clients and proxies can and do mess this stuff up.

Also, since browsers allow cross-origin WebSocket connections, you not only have to trust code running on your site (i.e. assume it is free of XSS holes), but also trust code running on any other site the user has open.

Beyond that, I'd be far more concerned about non-browser clients. If it's possible to sneak invalid UTF-8 past the edge of your stack, that can be used to exploit any system that data is forwarded to. The server must protect internal systems against malformed input.

@jcoglan
Copy link

jcoglan commented Mar 26, 2015

I should also have mentioned that if you're using ws:, or even if you're using wss: and your SSL isn't well-configured, then you can't assume the input actually came from a browser.

@eljefedelrodeodeljefe
Copy link
Contributor

Is this issue closed indefinitely? I was playing around with the ws repo as an experiment and refactored large parts to be compatible w/ core.

Also I see a strong use case for websockets in core, since this is a primary reason why companies not using node at all are making long-term commitments to it anyhow (worked for several). In order to unburden userland contributors and for the greate sake of the ecosystem, I'd give it a +1.

Cons are a multitude: All though the API being really minimal, there currently is a lot of code to digest. Large part of it could use refactoring especially because of Buffer and style issues, imo.

Sorry in advance to add to the noise @piscisaureus :)

@d3x0r
Copy link
Contributor

d3x0r commented Feb 28, 2016

I also would vote that it be native, and linked to inimately with the HTTP socket that receives it. I mean, sure this looks like HTTP protocol upgrade, but there's no event triggered for an onupgrade... so I guess the weboscket is doomed to be its own port instead of sharing the request on the actual server port?

I mean ya, there is a lot on the browser side to open a websocket to a server... but not so much on what you actually need for a server.
I didn't see a recent issue about this and a quick search I didn't find this... but I started another thread
#5467 (comment) ... but that is just my C reference.... somehow I had to coordinate registry of "GET /some/page.html" and "GET /some/websock/name" ... so I'm not even sure what the standard IS? I guess it's what ws socket has for opening...
When I fist started searching I did find that there are 5 websocket libraries (link in other thread) and there's actually a dozen or so more.

About the only thing this does is add packet-length encoding and packet fragmentation logics. Ya it has opcodes and has keep-alive functions... but primarliy it serves to add a length-data protocol instead of once again rolling my own. (which in javascript maybe I'll just use 4 decimal digits since it's apparently less work than using a javascript plugin, which is a shame.)

I'd imagine that the HTTP server object would have a emit( "upgrade", http_head_upgrade), which could be used to make a WS object from that... and a way to take a standard HTTP reqeuest connection and enable WS on that... (which triggers the upgrade request).

Something different than opening a new socket for the communication.

-- and the masking is just to get it through proxies, and is only a client-side requirement, all the existing implementations puke when the server sends a masked stream... and it's for 'smart' proxies that may do filtering like 'already sent' or 'already requested' on HTTP embedded in the stream or something? It's not meant for secure transport... but it should be reasonably efficient.

I suppose the ping emit and on-ping events could also be exposed?

@eugeneo
Copy link
Contributor

eugeneo commented Jun 10, 2016

Node.js now includes a partial WebSocket implementation (inspector_socket.{h,cpp}) that is used for inspector protocol.

The implementation is entirely native but only supports features used by inspector (no compressed frames, no binary frames).

@3rd-Eden
Copy link
Contributor

3rd-Eden commented Dec 5, 2016

As time goes by our opinions and points of view change with us based on past and new experiences.

I personally feel it's time to re-evaluate this topic now that a lot of time has passed since the final WebSocket specification got released. It is not a technology that has died off since the introduction of P2P technologies like WebRTC. It actually got adopted by that as well for handshaking purposes and what not. WebSocket and real-time technologies are still being adopted by new projects and with browsers moving to automatic upgrades a new era has broken were just using WebSockets directly in your projects is now made possible without significantly hurting your user base.

I still would love to see a WebSocket server and client being shipped in Node.js and there is no need to having this be ws or any other existing project either. I don't care if it's JavaScript based or fully written in C++. As long as it works for every single platform that Node.js supports it's fine by me.

This means there no more need for 10's of competing projects that all attempt achieve the same goal. A developer friendly, fast WebSocket server and client implementation for Node.js. It's just mind blowing to think about the amount time we as authors have wasted by writing (or inheriting), maintaining and hand tuning our WebSockets implementations. I would have rather seen us invest all our energy (and that of our amazing contributors) in to a single implementation that would have been the defacto standard for Node.js.

@lpinca
Copy link
Member

lpinca commented Dec 5, 2016

Refs: nodejs/node-eps#6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issues opened for discussions and feedbacks.
Projects
None yet
Development

No branches or pull requests