Skip to content
This repository has been archived by the owner on Sep 13, 2022. It is now read-only.

Uncaught exception from finish() #137

Merged
merged 14 commits into from
Jul 20, 2017

Conversation

frankgreco
Copy link
Contributor

@frankgreco frankgreco commented Jul 19, 2017

Resolves #136

@CLAassistant
Copy link

CLAassistant commented Jul 19, 2017

CLA assistant check
All committers have signed the CLA.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 98.709% when pulling 5361db9 on frankgreco:finish-bug into 3a33bd1 on uber:master.

src/tracer.js Outdated
this._reporter.report(span);
} catch (err) {
this._logger.error(`could not report span ${span.operationName}: ${err.message}`)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while it is tempting to protect all tracer methods with try-catch, doing so at the top level methods leads to Node engine marking them as non-optimizable. The report method is called a lot more often than the actual place that was causing your exception, which was around the networking code, so I would rather move try-catch down the stack to the place that actually can throw an exception.

@yurishkuro
Copy link
Member

Also, let's add a unit test for the exception.

@frankgreco
Copy link
Contributor Author

frankgreco commented Jul 19, 2017

@yurishkuro alright working on it now

@coveralls
Copy link

Coverage Status

Coverage remained the same at 98.806% when pulling 5d1ed2b on frankgreco:finish-bug into 3a33bd1 on uber:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 98.709% when pulling 5d1ed2b on frankgreco:finish-bug into 3a33bd1 on uber:master.

@frankgreco
Copy link
Contributor Author

frankgreco commented Jul 19, 2017

@yurishkuro concerning code coverage, is it required that the coverage not drop even if the committed code is fully covered? (see image below):

This commit only contains changes in upd_sender.js which is 100% covered.

screen shot 2017-07-19 at 5 32 28 pm


it ('flush gracefully handles errors emitted by socket.send', () => {
sender._client.on('error', err => {
assert.equal(err.message, 'getaddrinfo ENOTFOUND foo.bar.com')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this isn't really testing the new error handler you're adding. The test should simply try to send a span when the sender has the wrong agent address, and without the new handler it should throw uncaught error, but with the handler just log it to console and succeed.

Also, I am not sure you would actually get uncaught error here since the default reporter buffers the spans.

The ideal test would be to validate that console.log was called with expected error string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I am not sure you would actually get uncaught error here since the default reporter buffers the spans.

I can confirm that my fix indeed solves the problem. Here is a simple reproducible example:

var http = require('http')
var tracer = require('jaeger-client').initTracer({
    'serviceName': 'myService',
    'reporter': {
      logSpans: false,
      agentHost: 'foo.bar.com',
      flushIntervalMs: 10
    },
    'sampler': {
      host: 'foo.bar.com',
      param: 1,
      type: 'const'
    }
  })

http.createServer((req, res) => {
  span = tracer.startSpan('mySpan')
  res.end('response string')
  span.finish()
}).listen('8080')
{
  "name": "example",
  "version": "1.0.0",
  "dependencies": {
    "jaeger-client": "3.5.3"
  }
}
$ npm install
$ node index.js
events.js:160
      throw er; // Unhandled 'error' event
      ^

Error: getaddrinfo ENOTFOUND foo.bar.com
    at errnoException (dns.js:28:10)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:76:26)

Now to see the fix in action, navigate to:

node_modules/jaeger-client/dist/src/reporters/udp_sender.js

and add my fix to line 141:

this._client.on('error', err => {
    console.log(`error sending span: ${err}`)
})

now restart the node server and hit localhost:8080 and you will see the error is handled gracefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this isn't really testing the new error handler you're adding. The test should simply try to send a span when the sender has the wrong agent address, and without the new handler it should throw uncaught error, but with the handler just log it to console and succeed.

sender._host = 'foo.bar.com';
sender._port = 1234;
let reporter = new RemoteReporter(sender);
tracer = new Tracer(
    'test-service-name',
     reporter,
     new ConstSampler(true)
);

the above code (in my test case) is creating as sender that will attempt to send the span to a host that will throw an error on DNS lookup. now if I commented out the event handler in that test case, the test would fail and mirror the error that the issue stemmed from. How the same event emitter that is in the code is in my test and we expect that event emitter to be called which it is.

The above reasoning is why I believe it is a valid test case: Here are some resources to back up my method for testing event emitters:
https://stackoverflow.com/questions/37232574/node-js-testing-if-event-was-emitted-using-sinon-js-without-the-use-of-timeouts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not saying that the event is not emitted, rather that the test is not testing the actual event handler. Can you use sinon to mock console.log? If so, you don't have to add another event in the test, and you will validate that the event handler in the sender is actually invoked, by validating that console.log was invoked.

new ConstSampler(true)
).startSpan('testSpan').finish();
sender.flush();
setTimeout(() => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: setTimeout is needed due to Node's asynchronous nature. Because I am spying on console.log, if setTimeout is not used, the error message we expect will be printed to the console but not before mocha prints other messages to the console.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 1sec wait is unfortunate. What if you patch console.log directly?

var oldLog = console.log;
console.log = function (message) {
    assert(message == ...);
    console.log = oldLog;
    done();
};
// rest of the test

@coveralls
Copy link

Coverage Status

Coverage increased (+0.002%) to 98.808% when pulling 7bb14fd on frankgreco:finish-bug into 3a33bd1 on uber:master.

setTimeout(() => {
assert(spy.calledWith('error sending span: Error: getaddrinfo ENOTFOUND foo.bar.com', 'error sending span: Error: getaddrinfo ENOTFOUND'));
let oldLog = console.log;
console.log = message => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro how's that

).startSpan('testSpan').finish();
sender.flush();
let oldLog = console.log;
console.log = message => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be done before sender.flush()?

also, since you're resetting the function back to oldLog, I don't think you need sinon anymore.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.002%) to 98.808% when pulling e7a4fc1 on frankgreco:finish-bug into 3a33bd1 on uber:master.

).startSpan('testSpan').finish();
let oldLog = console.log;
sender.flush();
console.log = message => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still a race condition, is there not? The flush might call the log function before you have a chance to replace it.

Copy link
Contributor Author

@frankgreco frankgreco Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so. In the context of replacing console.log, the code in question is:

this._client.on('error', err => {
    console.log(`error sending span: ${err}`)
})

Now, there is no way this event code will be called before socket.send has a chance to execute. Now, according to the Node dgram docs, Note that DNS lookups delay the time to send for at least one tick of the Node.js event loop. During that time, the next line of code, the code that replaces the logger to our mock one, will have a chance to execute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regardless, moving sender.flush() after the console.log assignment on line 264 should remove any doubt.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.002%) to 98.808% when pulling f094799 on frankgreco:finish-bug into 3a33bd1 on uber:master.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@yurishkuro yurishkuro merged commit f5c2570 into jaegertracing:master Jul 20, 2017
@frankgreco frankgreco deleted the finish-bug branch July 20, 2017 20:13
@frankgreco
Copy link
Contributor Author

@yurishkuro what is Uber's process with cutting new releases. Does a new release get cut every time master is updated?

@yurishkuro
Copy link
Member

No, the release process is manual. I assume this fix is not too critical to go out? It will be a new minor release because of some new features since the last release.

@frankgreco
Copy link
Contributor Author

okay thanks for the explanation. No, it's not critical.

@djeeg djeeg mentioned this pull request Dec 8, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants