-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove access to process.env from hot code paths #5657
Conversation
if (process.env.NODE_ENV === 'development') { | ||
if (NODE_ENV === 'development') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear in the diff, but this is for the trace
method. A production environment would still pay the cost for this check, even though it would eventually evaluate to false
and skip the tracing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is fine. RESTDataSource hasn't been actively maintained in a little while (the current Apollo Server team has not yet ramped up on it; it's basically an entirely unrelated project that is very loosely coupled to Apollo Server). I don't know if anyone depends on the ability to set process.env.NODE_ENV at runtime (eg, potentially after this file is imported) and have it take effect. But if it breaks people we can revert it.
if (options.debug === undefined) { | ||
const nodeEnv = | ||
'__testing_nodeEnv__' in options | ||
? options.__testing_nodeEnv__ | ||
: process.env.NODE_ENV; | ||
'__testing_nodeEnv__' in options ? options.__testing_nodeEnv__ : NODE_ENV; | ||
options.debug = debugFromNodeEnv(nodeEnv); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one also looks like it would happen in production environments, where debug
would likely be undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I would like to be sensitive to runtime NODE_ENV changes. We actually already capture NODE_ENV once in the ApolloServer constructor and pass that through to various ways. Unfortunately for historical reasons much of the core processing in this package happens technically outside of the ApolloServer class even though it is only usable when called from ApolloServer, so this file doesn't have access to that.
That said, I basically implemented this already but just as a "testing" hack. What I'd suggest is taking the __testing_nodeEnv__
fields of Config
and GraphQLServerOptions
and renaming them to nodeEnv
. Then in the place in packages/apollo-server-core/src/ApolloServer.ts
where we do const nodeEnv =
we should set that on this.requestOptions.nodeEnv
instead. That should lead to options.nodeEnv
on this line in runHttpQuery being equal to the env var's value (or a manual override) already, so no need to read the env var in this file.
Is this something you can try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've spent a few days trying to do this, and I'm not sure what to do.
The logic differs on the existence of nodeEnv
, not necessarily its value. More specifically, providing nodeEnv: undefined
is not the same thing as omitting it altogether.
What I would expect to happen is that if you don't provide nodeEnv
, ApolloServer would default to the NODE_ENV
environment variable. In other words, if nodeEnv
is undefined
, then default to NODE_ENV
.
This doesn't work with the current implementation, at least as per the tests. The current implementation will simply check if you attempt to set it, even if it's value is undefined
; as long as it's there, it's considered an override.
So what is happening is that the tests fail, because when nodeEnv
is not provided, it defaults to NODE_ENV
, and when NODE_ENV
is test
, debug
is false, and you don't get stacktraces back in errors
.
The old implementation would avoid defaulting to test
because __testing_nodeEnv__
was simply there, even though it was undefined
. The test cases look like they are attempting to validate behaviour when no node environment is set.
I'm not sure how to accommodate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said, I don't think it's possible to get this to not refer NODE_ENV
in some way anyways, because resolveGraphqlOptions
is what is supposed to give you the nodeEnv
option, but that can error out and put you in a state where you need to know the node environment, but have no ability to get nodeEnv
, and thus have to default to NODE_ENV
anyways.
This would also introduce a quirk that even if you provided nodeEnv
, it will be ignored depending on how the thing errors out, which sounds like a foot gun to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When __testing_nodeEnv__
was only being used in a few tests, making undefined
distinct from "unset" seemed like a reasonable choice. It sounds like this is more of a burden when it's an actual part of the API.
How about we just change that? So we could make nodeEnv: undefined
and "nodeEnv
unset" be the same thing (use the actual environment variable if it's set), and we can use nodeEnv: ''
to mean "seriously, don't be sensitive to the actual process environment, act as if the node env is unspecified". (Technically in the Unix model, there is a distinction between unset environment variables and environment variables that are set to the empty string, but it's not great to treat them differently and our code that reads this doesn't, so this should be fine.)
Does that resolve all the concerns you have here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it!
This seems like a good idea! I look forward to reviewing it soon. |
if (process.env.NODE_ENV === 'development') { | ||
if (NODE_ENV === 'development') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is fine. RESTDataSource hasn't been actively maintained in a little while (the current Apollo Server team has not yet ramped up on it; it's basically an entirely unrelated project that is very loosely coupled to Apollo Server). I don't know if anyone depends on the ability to set process.env.NODE_ENV at runtime (eg, potentially after this file is imported) and have it take effect. But if it breaks people we can revert it.
if (options.debug === undefined) { | ||
const nodeEnv = | ||
'__testing_nodeEnv__' in options | ||
? options.__testing_nodeEnv__ | ||
: process.env.NODE_ENV; | ||
'__testing_nodeEnv__' in options ? options.__testing_nodeEnv__ : NODE_ENV; | ||
options.debug = debugFromNodeEnv(nodeEnv); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I would like to be sensitive to runtime NODE_ENV changes. We actually already capture NODE_ENV once in the ApolloServer constructor and pass that through to various ways. Unfortunately for historical reasons much of the core processing in this package happens technically outside of the ApolloServer class even though it is only usable when called from ApolloServer, so this file doesn't have access to that.
That said, I basically implemented this already but just as a "testing" hack. What I'd suggest is taking the __testing_nodeEnv__
fields of Config
and GraphQLServerOptions
and renaming them to nodeEnv
. Then in the place in packages/apollo-server-core/src/ApolloServer.ts
where we do const nodeEnv =
we should set that on this.requestOptions.nodeEnv
instead. That should lead to options.nodeEnv
on this line in runHttpQuery being equal to the env var's value (or a manual override) already, so no need to read the env var in this file.
Is this something you can try?
Ok, I took a first attempt. Still getting my feet wet with Typescript, so I'm sure I missed something. I'll come back to this later and correct any mistakes. |
// Apollo Server only uses process.env.NODE_ENV to determine defaults for | ||
// other behavior which have other mechanisms of setting explicitly. Sometimes | ||
// our tests want to test the exact logic of how NODE_ENV affects defaults; | ||
// they can set this parameter, but there's no reason to do so other than for | ||
// tests. Note that an explicit `__testing_nodeEnv__: undefined` means "act as | ||
// if the environment variable is not set", whereas the absence of | ||
// `__testing_nodeEnv__` means to honor the environment variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't think that this comment was necessary anymore, since I suppose you can consider nodeEnv
a first class option now.
@@ -29,6 +29,7 @@ describe('runHttpQuery', () => { | |||
query: '{ testString }', | |||
}, | |||
options: { | |||
debug: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was required to avoid returning a stack trace, which the test didn't expect. Without running through ApolloServer first, the nodeEnv
stuff never runs, so it doesn't get setup correctly for this test. This avoids that.
Sorry for the delay — hope to get to this tomorrow. |
@@ -131,16 +133,12 @@ export async function runHttpQuery( | |||
// debug. Therefore, we need to do some unnatural things, such | |||
// as use NODE_ENV to determine the debug settings | |||
return throwHttpGraphQLError(500, [e as Error], { | |||
debug: debugFromNodeEnv(process.env.NODE_ENV), | |||
debug: debugFromNodeEnv(NODE_ENV), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably at least extend the comment above to note that unlike other aspects of NODE_ENV in this package, this one isn't sensitive to runtime changes to the environment variable.
(Alternatively, perhaps it's OK performance-wise to read the environment variable in this 500 situation?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, perhaps it's OK performance-wise to read the environment variable in this 500 situation
Yeah, I suppose this isn't a hot code path. I imagine an error here would be something the would only happen intermittently... and if it's happening more than that, you likely got bigger problems than a runtime cost!
Either way, in my new revision I have debugFromNodeEnv
default to the "cached" NODE_ENV
, so I added the comment you requested. I think that this keeps it consistent with the default behaviour (ie. nodeEnv
defaults to NODE_ENV
and is also not sensitive to runtime changes).
I'm not sure if you want me to respond to the various comments that you've "resolved". The PR looks largely good, though I think you were right to point out that the current API (where undefined and unset are distinct) isn't great. Do you want to make that change? |
Looks good! I tightened up a few types by dropping |
nodeEnv?: string | undefined; | ||
nodeEnv?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah! I made a note to do this, too! Totally missed it.
Awesome! Thank you @glasser ! |
Removes access to
process.env
from hot code paths. My understanding on this is admittedly somewhat limited, but there is a performance penalty that occurs when accessingprocess.env
(or perhaps justprocess
) because Node needs to drop down to C which is (apparently) quite expensive. The penalty wouldn't be noticeable as a one time thing, but for things that are running often at scale, it can really add up.I vaguely recall this being an issue in React SSR early on... remember reading something about this being discovered and remedied, leading to immediate perf improvements. Can't find the article about it... and now I'm wondering if my memory is failing me... but I recall this being the first thing that made me aware of the process.env penalty.
(Edit: I think it was this: https://reactjs.org/blog/2017/09/26/react-v16.0.html#better-server-side-rendering)
Offhand, I know express recommends this, too.
From what I could tell, these aren't transpiled out (I looked in the
dist
directory that gets installed innode_modules
). Perhaps this is already being handled differently, making this PR moot, but figured I'd throw this up and get your thoughts.There are other spots I saw doing this, but they looked like they were only being done when the server started, so the penalty wouldn't be noticeable at all. Suppose it could be done across the board for consistency.