Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENV variable gets overridden in an unexpected way #1

Open
victorblaga opened this issue Oct 7, 2015 · 3 comments
Open

ENV variable gets overridden in an unexpected way #1

victorblaga opened this issue Oct 7, 2015 · 3 comments

Comments

@victorblaga
Copy link

If I provide an environment variable called "INPUT_QUEUE", the daemons library will override this and set it to the -i value, which, if not specified, gets automatically set to the name of the component (e.g. opener-language-identifier).

I'm not sure if this behaviour is intentional or not, however I find it unexpected. If I provide a default value through for some parameter through an environment variable, I expect this value to override whatever values I pass in through argv.

Fix is underway, I'll create a pull request.

@giannismelidis
Copy link
Member

Hello Victor,

thanks for using OpeNER and providing feedback! I don't see a right or
wrong in both cases to be honest, both solutions might give trouble.
With your solution, INPUT_QUEUE now overrides the preferred way of
providing a queue name (-i or --input) which is also causing unexpected
behavior.

OpeNER components are meant to be executed on the same machine as well,
sometimes having a different queue name for each component. if
INPUT_QUEUE was to override the --input option, then there is no way to
specify different queues for different components. So this is somewhat
intentional behavior.

May I ask why you do not use the preferred way of setting the queue name
through --input option?

Best,
Giannis Melidis

Victor Blaga wrote:

If I provide an environment variable called "INPUT_QUEUE", the daemons
library will override this and set it to the -i value, which, if not
specified, gets automatically set to the name of the component (e.g.
opener-language-identifier).

I'm not sure if this behaviour is intentional or not, however I find
it unexpected. If I provide a default value through for some parameter
through an environment variable, I expect this value to override
whatever values I pass in through argv.

Fix is underway, I'll create a pull request.


Reply to this email directly or view it on GitHub
#1.

@victorblaga
Copy link
Author

Hi Giannis,

I agree that there is no right and wrong, however, reading through the code, I would expect the ENV['INPUT_QUEUE'] to have priority over the -i option, or at least be taken into account when there is no -i option provided.

For example, when I don't specify a -i option, even though I do have a ENV['INPUT_QUEUE'] defined, the env is not taken into account and the queue name will always be the the options parser default, which is "opener-component-name".

In daemon.rb, line 20:

set :queue_name, proc { Daemons.input_queue }

In daemons.rb, line 26:

def self.input_queue
  return ENV['INPUT_QUEUE']
end

This method suggests that ENV has some kind of priority.

I understand that the component is started as a daemon, and the daemon has its ENV injected by the daemon starter (controller), which means that the daemon ENV and the initial ENV can be different, which I find reasonable.

In controller.rb, line 141:

env = ENV.to_hash.merge(
          'INPUT_QUEUE'    => options[:input].to_s,
          'DAEMON_THREADS' => options[:threads].to_s,
          'OUTPUT_BUCKET'  => options[:bucket].to_s,
          'NRCONFIG'       => newrelic_config,
          'APP_ROOT'       => File.expand_path('../../../../', __FILE__),
          'APP_NAME'       => name
)

However, I believe that the intent behind this ENV.to_hash.merge call is to state something like this: by default take the INPUT_QUEUE from ENV, and, if it is not present, take the options[:input] value.

Unfortunately, since the OptionsParsers -i always defaults to something (even when the option is not provided), the ENV['INPUT_QUEUE'] will always be ignored.

The reason I'm using environment variables and not command line options is that during my deployment setup, I have all the input parameters injected by my deployment script: AWS_CREDENTIALS, AWS_REGION, etc. and also INPUT_QUEUE.

I admit that I don't have more than one opener-component running per machine, so providing an environment variable works for me. I do understand the problems that could arise if there are more than one component running in the same environment.

I could change my scripts to provide the queue name as a command line parameter, but in this case I would have environment setup logic in 2 places (part through ENV and part through command line parameters), which I think is a bad design (since I want only one point of defining this).

@giannismelidis
Copy link
Member

Hello Victor,

Yes, I understand your reasoning, but still --input should have priority since giving INPUT_QUEUE priority, wouldn't allow someone to use multiple queues for different components. So the solution should cover all these cases:

  • If no --input is set and no INPUT_QUEUE env is set, use the default --input.
  • If no --input is set and INPUT_QUEUE env is set, use INPUT_QUEUE (and not the default --input).
  • If --input is set and INPUT_QUEUE is not set, use --input.
  • If --input is set and INPUT_QUEUE is set, use --input.

Will look into it when I have some spare time, unless you have it solved already :-)

Best,
Giannis Melidis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants