Fixed crashing bug when parsing invalid messages #8

Closed
wants to merge 4 commits

Conversation

psorowka
Contributor

Fixes a crash that was observed in Mosca (moscajs/mosca#393) when the parser starts to parse a TLS-encrypted stream while assuming it is plain MQTT data.

So what happened? The TLS connect message that arrived at mosca looked like this:

```
<Buffer 16 03 01 01 01 01 00 00 fd 03 03 00 09 9a 85 fc 19 19 fa 06 a4
8c a5 53 5d 6e c1 01 23 be 68 00 85 b9 0c 29 8b 24 87 30 09 e6 e4 00 00
80 c0 2f c0 2b ... >
```

Note the `16` at the start: this suggests that we have a regular CONNECT packet. The next byte, `03`, suggests a remaining length of 3 bytes, which is obviously not true, but the parser implementation only checks that at least 3 bytes remain in the buffer before it starts processing the payload.
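For illustration, here is a minimal sketch of reading those first two bytes as an MQTT fixed header (the variable names are mine, not the parser's):

```
// Reading the first two TLS bytes as if they were an MQTT fixed header
var buf = new Buffer([0x16, 0x03])

var packetType = buf[0] >> 4       // 0x16 >> 4 === 1, i.e. CONNECT
var flags = buf[0] & 0x0f          // 0x16 & 0x0f === 6
var remainingLength = buf[1]       // 0x03, i.e. "3 bytes remaining"

console.log(packetType, flags, remainingLength) // 1 6 3
```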

The parseConnect function of the parser then reads the next two bytes, `01 01`, which describe the length of the protocol identifier string. Usually this would be 4 or 6 bytes, but here it claims 257 bytes. So the next 257 bytes are consumed into a string and the parsing pointer ends up at buffer position 259. Unfortunately, the buffer only has 260 bytes in total, which is why the subsequent parsing steps produce an out-of-range exception.

We obviously have various problems here:

  • the parser does not check for plausibility when a string claims to be longer than the packet (see the sketch below)
  • the parser does not check for a valid protocol identifier string at all
  • the parser does not check (at some points) whether enough bytes remain in the buffer before parsing the next fields
  • the parser does not catch any (further, unexpected) exceptions in order to save the upper layers from running into bad behavior

All of these were fixed and respective tests were added.
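As a rough sketch of the kind of plausibility check described in the first bullet above (the helper name and shape are hypothetical, not the actual patch):

```
// Hypothetical helper: refuse to read a string whose declared length
// would run past the end of the packet, instead of reading blindly.
function parseString (buf, pos) {
  if (pos + 2 > buf.length) return null      // not enough bytes for the length field itself
  var length = buf.readUInt16BE(pos)
  var end = pos + 2 + length
  if (end > buf.length) return null          // length claims more bytes than the packet has
  return { value: buf.toString('utf8', pos + 2, end), next: end }
}
```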

Peter Sorowka added 2 commits January 14, 2016 21:57
Fixes a crash that was observed when starting to parse a TLS encrypted stream while assuming to parse plain MQTT data.
When a string length within a variable header would exceed the overall
packet length, the string parsing should return an error
```
catch (e) {
  return this.emit('error', e);
}
```
Member

try-catch is slow, and this will really slow down everything.

Is this needed in the first place to fix this issue?

Contributor Author

No, it's not needed. All potential pitfalls I have spotted are checked for explicitly now.

Let's talk about that in another place; we should really introduce a catch somewhere. Any bug within the processing should only kill the specific client connection, not the whole broker.

Member

The problem is that any try/catch will make things slow. Moreover, for any unexpected exception in Node it is better to crash the process anyway, as it is hard to know whether you are leaking memory/file descriptors/whatever.

@psorowka
Contributor Author

will push requested changes tomorrow

```
@@ -163,14 +171,27 @@ Parser.prototype._parseConnect = function () {
  if (protocolId === null)
    return this.emit('error', new Error('cannot parse protocol id'))

  if (['MQTT', 'MQIsdp'].indexOf(protocolId) < 0) {
```
Member

A nice and old && is faster.
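Presumably something along these lines (a sketch of the suggested alternative, not the committed code):

```
// Two direct comparisons instead of building an array and calling indexOf
if (protocolId !== 'MQTT' && protocolId !== 'MQIsdp') {
  return this.emit('error', new Error('invalid protocol id'))
}
```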

@mcollina
Member

Also, would you mind splitting the long explanation among the various tests, or completing it for the ones that are missing an explanation?

@mcollina
Member

Anyway, this is really a GOOD catch.

As soon as it lands, I'll backport it to the v3 series, so it will be picked up by MQTT.js and Mosca.

@psorowka
Contributor Author

Okay, I have pushed everything.

@mcollina
Member

I did that myself :P. This is released as v4.0.4.

@mcollina mcollina closed this Jan 15, 2016
@mcollina
Member

Also as v3.4.5.

@psorowka
Contributor Author

By the way, out of curiosity I have added a try/catch block to the benchmark:

```
var mqtt    = require('../')
  , parser  = mqtt.parser()
  , max     = 10000000
  , i
  , start   = Date.now() / 1000
  , time

for (i = 0; i < max; i++) {
  parser.parse(new Buffer([
    48, 10, // Header
    0, 4, // Topic length
    116, 101, 115, 116, // Topic (test)
    116, 101, 115, 116 // Payload (test)
  ]))
}

console.log('No try/catch');
time = Date.now() / 1000 - start
console.log('Total packets', max)
console.log('Total time', Math.round(time * 100) / 100)
console.log('Packet/s', max / time)

start = Date.now() / 1000

for (i = 0; i < max; i++) {
  try {
    parser.parse(new Buffer([
      48, 10, // Header
      0, 4, // Topic length
      116, 101, 115, 116, // Topic (test)
      116, 101, 115, 116 // Payload (test)
    ]))
  }
  catch (e) {
    console.log(e);
  }
}

console.log('-----------');
console.log('With try/catch');
time = Date.now() / 1000 - start
console.log('Total packets', max)
console.log('Total time', Math.round(time * 100) / 100)
console.log('Packet/s', max / time)
```

The result does not suggest a performance dip. The only notable thing is a 15% performance difference between Node 0.10 and Node 4:

NODE 4

```
$ node benchmarks/parse.js
No try/catch
Total packets 10000000
Total time 31.13
Packet/s 321192.26504089916
-----------
With try/catch
Total packets 10000000
Total time 31.28
Packet/s 319662.436545051
```

NODE 0.10

```
$ node benchmarks/parse.js
No try/catch
Total packets 10000000
Total time 36.96
Packet/s 270533.4921440323
-----------
With try/catch
Total packets 10000000
Total time 36.57
Packet/s 273433.2274061726
```

@mcollina
Member

This is actually different from putting it in the same function as the while loop.
Performance-wise, putting it in a small function that just wraps another one is OK.
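A sketch of the pattern being suggested (the inner function name is hypothetical):

```
// Keep the hot loop in its own function so V8 can still optimize it,
// and confine the try/catch to a small wrapper around the call.
Parser.prototype.parse = function (buf) {
  try {
    return this._doParse(buf)   // hypothetical inner function containing the while loop
  } catch (e) {
    this.emit('error', e)
  }
}
```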

@psorowka
Contributor Author

I see. In this case, parseStream in mqtt-connection could be a candidate.

@psorowka
Contributor Author

OK, just to confirm: when putting the try/catch at the original position I get 32.79s for Node 4 and 42.75s for Node 0.10. So Node 4 is obviously a bit better at optimizing in this respect as well, but still slower compared to the first benchmark.
