Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Command Argument STDOUT/capistrano.log Hiding #430

Merged
merged 11 commits into from
Jun 30, 2018
Merged

[Feature] Command Argument STDOUT/capistrano.log Hiding #430

merged 11 commits into from
Jun 30, 2018

Conversation

NorseGaud
Copy link
Contributor

@NorseGaud NorseGaud commented Jun 8, 2018

Take the following task in capistrano-postgresql for example:

desc 'Create pg_username in database'
  task :create_database_user do
    on roles :db do
      unless database_user_exists?
        # If you use CREATE USER instead of CREATE ROLE the LOGIN right is granted automatically; otherwise you must specify it in the WITH clause of the CREATE statement.
        execute [:psql, "-d #{fetch(:pg_system_db)}", '-c', %Q{"CREATE USER \\"#{fetch(:pg_username)}\\" PASSWORD '}, ["#{fetch(:pg_password)}"], %Q{';"}]
      end
    end
  end

The wrapping of "#{fetch(:pg_password)}" with [ ] will allow us to hide the password the :pretty log STDOUT and capistrano.log entry:

      01 sudo -i -u postgres psql -d postgres -c "CREATE USER \"growtrader\" PASSWORD ' *HIDDEN* ';"
      01 CREATE ROLE
    ✔ 01 [email protected] 0.096s
 DEBUG [600d7e15] Command: ( export PATH="$HOME/.gem/ruby/2.5.0/bin:$PATH" ; /usr/bin/env sudo -i -u postgres psql -d postgres -c "CREATE USER \"growtrader\" PASSWORD ' *HIDDEN* ';" )
 DEBUG [c4a860dc] 	CREATE ROLE
  INFO [c4a860dc] Finished in 0.096 seconds with exit status 0 (successful).

Yet, server side, I see

development:
  adapter: postgresql
  encoding: UTF-8
  database: growtrader
  pool: 100
  username: growtrader
  password: eaf38e7bec5d860c44d5
  host: growtrader.dev
  socket:
  port: 5432
  timeout: 5000
  1. It supports multiple strings in the args array you pass in.
  2. It removed the args array (using join, just in case the user has multiple things in the hidden array) before executing it (current tests, except for the vagrant stuff, pass)
  3. I will create tests and also document it once I get sign off on the current code. It's 12:11AM right now, so I need to sleep. :)
  4. RuboCop only has one offense, and it's minor, and I can include that with the readme and tests I add if this is approved.
  5. I'll add the Changelog entry once approved.

Look forward to your feedback!

Array wrap for args so we can hide passwords (or anything else) from the capistrano.log and also STDOUT
@NorseGaud
Copy link
Contributor Author

Travis failures look to be from RuboCop. I'll fix those after the code is approved.

@leehambley
Copy link
Member

Really nice approach, of course as you alluded to it needs tests/etc but it's workable.

I wonder if another approach might be cleaner and/or give a better developer experience. Namely that we don't abuse [] arrays but that we do something like using Ruby's Refinements feature to mark something as .redactable or similar.

I don't want to bog you down with more research, and I really really appreciate your contribution, the array idea is really elegant to single out elements for special treatment, I just don't know how well it works in the API as a whole.

@leehambley
Copy link
Member

I just tried (and failed) to make something with refinements work, but I think it's because I'm shooting from the hip instead of doing it properly.

It just came to my mind though that Ruby has Object.trust, Object.taint and Object.trusted?, Object.tainted? etc - these depend a bit on env vars, but we might have success with something like:

unless database_user_exists?
        # If you use CREATE USER instead of CREATE ROLE the LOGIN right is granted automatically; otherwise you must specify it in the WITH clause of the CREATE statement.
        execute [:psql, "-d #{fetch(:pg_system_db)}", '-c', %Q{"CREATE USER \\"#{fetch(:pg_username)}\\" PASSWORD '}, "#{fetch(:pg_password)}".taint, %Q{';"}]
      end

and then simply use .tainted? in the logger/formatter.

@NorseGaud
Copy link
Contributor Author

Oooo, yeah! I like that direction a lot. I'll play around with it in my developer environment and see if I can get it to work. Thanks for the suggestion!

@leehambley
Copy link
Member

leehambley commented Jun 8, 2018 via email

@mattbrictson
Copy link
Member

Capistrano currently supports Ruby 2.0 and higher, and I think refinements didn't fully settle until 2.1. I'm not sure I'd want to bump the Ruby requirement just for this feature, so it might be best to avoid using refinements in the implementation.

@NorseGaud
Copy link
Contributor Author

Yep, experimental in 2.0.0. Would the code I’ve written be ok for now, until we eventually bump the ruby version requirement? I can put a note in the code about it.

@leehambley
Copy link
Member

Good catch @mattbrictson I always forget how much work you put into the backwards compatibility.

Can you live with the "misuse" of trusted/tainted or similar to avoid the version bump, or do you have a better idea?

@will-in-wi
Copy link
Contributor

Do we have a more or less official backwards compatibility policy? As of April of last year, 2.1 is fully unsupported. We may want to take this opportunity to drop support for 2.0 if it means cleaner code.

@leehambley
Copy link
Member

Do we have a more or less official backwards compatibility policy? As of April of last year, 2.1 is fully unsupported. We may want to take this opportunity to drop support for 2.0 if it means cleaner code.

Either time is accelerating, or I'm getting old. I remember looking forward to Ruby 2.x like it was the bright new future.

I'd throw a vote for our policy being "anything that is not supported/EOL'ed upstream" being our target.

@NorseGaud
Copy link
Contributor Author

I'll go ahead and make the necessary remaining changes and also use tainted/etc like you suggested Lee.

@NorseGaud
Copy link
Contributor Author

NorseGaud commented Jun 12, 2018

Alright, just the readme and finalizing the tests. @leehambley, do you guys test the capitrano.log contents at all, or do we know it will be ok if I see the STDOUT show the redaction?

Also, testing of deploy in my dev environment is showing some weird interactions using taint:

cap development deploy               
00:00 git:wrapper
      01 mkdir -p *HIDDEN*
    ✔ 01 [email protected] 0.266s
      Uploading /tmp/git-ssh-growtrader-development-norsegaud.sh 100.0%
      02 chmod 700 *HIDDEN*
    ✔ 02 [email protected] 0.059s

Those definitely don't need to be hidden. I checked git:wrapper in https://github.com/capistrano/capistrano/blob/5ccddea330c206bd6f4431f677858719e69cd4aa/lib/capistrano/scm/tasks/git.rake and it seems like .shellescape is tainting stuff.

Tomorrow I'm going to work on a .redact method instead. That should prevent the issue.

@leehambley
Copy link
Member

Those definitely don't need to be hidden. I checked git:wrapper in https://github.com/capistrano/capistrano/blob/5ccddea330c206bd6f4431f677858719e69cd4aa/lib/capistrano/scm/tasks/git.rake and it seems like .shellescape is tainting stuff.

That sounds reasonable. The taninted stuff is supposed to be used for "untrusted" strings. In a Rails context for example anything that comes from a form post or is read from a file on disk (?) is considered tainted. I think the reasoning is that you can then refuse to exec anything that is tainted to protect yourself from injection attacks.

I don't know whether it's seldom used, or so well integrated into rails that it appears to be invisible, but it rarely comes up in conversation about data hygiene.

@NorseGaud
Copy link
Contributor Author

NorseGaud commented Jun 13, 2018

Ok, I think the new changes are user-friendly enough and I've also confirmed the tests in vagrant are passing. I tested it in my own project and it's working well.

What do you think @leehambley?

@leehambley
Copy link
Member

The user-facing API is super elegant, and a wrapper method avoids all the issues with Ruby versions, and how Refinements, etc all work, so I applaud you for that.

The slight abuse or arrays still makes me a little bit uncomfortable, but now that it's hidden behind a method call I'm ok with it. I'd like to see some const like REDACTION_STRING = "*REDACTED*" and a check in the logger around arg.is_a?(Array) ? arg[0] : arg also to check that arg[0] actually is the redaction string, just to avoid that someone accidentally passes an array and gets really hard to debug behaviour.

With all that said, from my side, this is good to go if we get that extra level of safety in. I hope the others would agree.

I want to thank you for the best FOSS contributorship experience I've had in a long time, this has been fun.

@NorseGaud
Copy link
Contributor Author

Glad to hear it Lee! My pleasure man. I really love capistrano and appreciate the warm welcome by the community.

I totally agree with the elimination of the [redacted,arg] array in the redact method. I'm sort of out of ideas since we can't use .taint. If you have any suggestions on how to accomplish it, let me know!

In the meantime, I've added a check to raise if the user tries to pass in redact([]) or redact({}). I've added to the test_redaction a simple check for the exception as well.

CHANGELOG.md Outdated
@@ -7,6 +7,10 @@ appear at the top.

* Your contribution here!

## [1.17.0][] (2018-06-08)

* [#430](https://github.com/capistrano/sshkit/pull/430): [Feature] Command Argument STDOUT/capistrano.log Hiding - [@NorseGaud](https://github.com/NorseGaud)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this to the Unreleased section (next to the "Your contribution here!" bullet).

@@ -1,3 +1,3 @@
module SSHKit
VERSION = "1.16.1".freeze
VERSION = "1.17.0".freeze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, only maintainers are allowed to change the version 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problemo!

@@ -42,13 +42,31 @@ def test_simple_netssh
], command_lines
end

def test_redaction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test! 💯


# Redaction
cmd_w_redaction = cmd.dup
cmd.dup.args.map!{|arg| arg.is_a?(Array) ? arg[0] : arg }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does redact the arguments from the logs (as verified by the unit test), but I suspect it is also redacting (and thus breaking) the actual command that is being executed.

The issue is that dup only makes a shallow copy. Therefore cmd.dup.args is the same reference as cmd.args, and so args.map! is mutating both copies of the command. Furthermore, don't we want to log arg[0] (i.e. the *REDACTED* string), but execute arg[1]? I don't see where the latter is taking place in the code.

I tested this branch using a Capistrano task:

task :test_redact do
  on roles(:all) do
    execute :echo, redact("secret!"), "> test_redact"
    puts capture(:cat, "test_redact") #  this should print `secret!`
  end
end

And confirmed that secret! is being redacted from the log, but the original secret! is never actually passed to the command being executed:

00:00 test_redact
      01 echo *REDACTED* > test_redact
    ✔ 01 [email protected] 2.344s
*REDACTED*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok, I’ll run more tests. I had manually confirmed it was working but could have just confused myself into thinking it was fine

.gitignore Outdated
@@ -6,3 +6,4 @@ bin/rake
.vagrant*
test/tmp
Gemfile.lock
.idea
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not put IDE-specific patterns in our gitignore. Those are more appropriate for a global gitignore (see https://thomashunter.name/blog/global-gitignore-vs-repository-gitignore/)

@@ -443,6 +443,33 @@ SSHKit.config.output = SSHKit::Formatter::Pretty.new(output)
SSHKit.config.output = SSHKit::Formatter::SimpleText.new(File.open('log/deploy.log', 'wb'))
```

#### Output & Log Redaction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, documentation!! ❤️

@leehambley
Copy link
Member

Anything standing in the way of merging this @mattbrictson , @will-in-wi

@NorseGaud
Copy link
Contributor Author

Yep, Matt is right about it not working fully. I’m testing a new way of doing it. Should be done by next week.

@NorseGaud
Copy link
Contributor Author

NorseGaud commented Jun 20, 2018

@leehambley you or @mattbrictson getting these weird vagrant/test errors?

test_upload_large_file ERROR (0.04s)
Net::SSH::HostKeyUnknown: Net::SSH::HostKeyUnknown: fingerprint SHA256:I2efvQ+daDNq+OLVw9uThbTBKiu63nB0Qf+geS7cggo is unknown for "[localhost]:3001,[127.0.0.1]:3001"
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/verifiers/always.rb:46:in process_cache_miss' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/verifiers/always.rb:21:in verify'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/kex/diffie_hellman_group1_sha1.rb:188:in verify_server_key' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/kex/diffie_hellman_group1_sha1.rb:72:in exchange_keys'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/algorithms.rb:389:in exchange_keys' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/algorithms.rb:211:in proceed!'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/algorithms.rb:150:in accept_kexinit' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:210:in block in poll_message'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:190:in loop' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:190:in poll_message'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:225:in block in wait' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:223:in loop'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:223:in wait' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/transport/session.rb:90:in initialize'
/usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh.rb:242:in new' /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh.rb:242:in start'
/Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/connection_pool.rb:59:in call' /Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/connection_pool.rb:59:in with'
/Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/netssh.rb:184:in with_ssh' /Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/netssh.rb:67:in upload!'
/Users/norsegaud/RubymineProjects/sshkit/test/functional/backends/test_netssh.rb:193:in block in test_upload_large_file' /Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/abstract.rb:29:in instance_exec'
/Users/norsegaud/RubymineProjects/sshkit/lib/sshkit/backends/abstract.rb:29:in run' /Users/norsegaud/RubymineProjects/sshkit/test/functional/backends/test_netssh.rb:195:in test_upload_large_file'

[5] pry(#SSHKit::Backend::TestNetssh)> Netssh.new(a_host) do |host|
[5] pry(#SSHKit::Backend::TestNetssh)* capture(:uname)
[5] pry(#SSHKit::Backend::TestNetssh)* host_ssh_options = host.ssh_options
[5] pry(#SSHKit::Backend::TestNetssh)* end.run
Net::SSH::HostKeyUnknown: fingerprint SHA256:I2efvQ+daDNq+OLVw9uThbTBKiu63nB0Qf+geS7cggo is unknown for "[localhost]:3001,[127.0.0.1]:3001"
from /usr/local/lib/ruby/gems/2.5.0/gems/net-ssh-5.0.1/lib/net/ssh/verifiers/always.rb:46:in `process_cache_miss'

[1] pry(#SSHKit::Backend::TestNetssh)> a_host
=> #<SSHKit::Host:0x00007f87c10a6e28 @hostname="localhost", @keys=[], @Local=false, @password="vagrant", @PORT=3001, @ssh_options={:verify_host_key=>:always}, @user="vagrant">

…n't execute *REDACTED* (just log it), new execute_command logic for redaction
test/functional/backends/test_netssh.rb:61:9: W: Parenthesize the param @output.lines.select { |line| line.start_with?("\ttest.file") } to make sure that the block will be associated with the @output.lines.select method call.
        assert_equal [ ...
        ^^^^^^^^^^^^^^
68 files inspected, 1 offense detected
RuboCop failed!
@NorseGaud
Copy link
Contributor Author

Ok, passing all test in Travis and confirmed it works (after fixing my postgresql issue locally) in capistrano-postgresql.

Let me know if that covers every request.

@mattbrictson
Copy link
Member

Thanks! I'll look at it soon, probably over the weekend.

@NorseGaud
Copy link
Contributor Author

No rush :)


def redact(arg) # Used in execute_command to hide redact() values user passes in
raise ArgumentError, 'redact() does not support Array or Hash' if arg.is_a?(Array) || arg.is_a?(Hash)
['*REDACTED*',arg]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works and I am happy to accept this PR as-is, but there might be a more elegant solution if you want to explore it. I am thinking of using a module as a way to "mark" the arg being redacted. In other words:

module Redacted
  # intentionally empty
end

Then the redact method becomes:

def redact(arg)
  arg.extend(Redacted)
end

Finally, testing whether something is redacted is:

arg.is_a?(Redacted)

The advantage to this approach is that the arguments are modified in a way that should have no side-effects. Whereas replacing a string with a specially-formed array inherently modifies the arguments in a way that has to explicitly reversed later.


def redact(arg) # Used in execute_command to hide redact() values user passes in
raise ArgumentError, 'redact() does not support Array or Hash' if arg.is_a?(Array) || arg.is_a?(Hash)
['*REDACTED*',arg]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I anticipate that some SSHKit users will complain that redact works for SSH backends but not for local backends. Generally speaking we try to keep the two in parity. Is there an easy way to add the redact functionality to the local backend as well?

output.log_command_start(cmd)
cmd.options[:redacted] = false
Copy link
Member

@mattbrictson mattbrictson Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code strikes me as being a bit brittle. We are setting an option purely to get a desirable side-effect, when a more direct approach would be more self-explanatory.

What if the Command object had a method to get a redacted version of itself, and that is what gets logged?

I am open to suggestions on the name, but one possibility is:

output.log_command_start(cmd.with_sensitive_args_redacted)

The with_sensitive_args_redacted method would be responsible for returning a copy of the command with a different set of args.

def with_sensitive_args_redacted
  new_args = args.map( ... ) # logic to replace redacted args goes here
  redacted_cmd = dup
  redacted_cmd.instance_variable_set(:@args, new_args)
  redacted_cmd
end

@@ -206,7 +206,14 @@ def to_command

def to_s
if should_map?
[SSHKit.config.command_map[command.to_sym], *Array(args)].join(' ')
if options[:redacted]
new_args = args.map{|arg| arg.is_a?(Array) && arg[0] == '*REDACTED*' ? arg[0] : arg }.join(' ')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the join(' ') here is necessary

- Module Redaction under SSHKit, so it's available everywhere we need it to be
- Simplified redact and moved it to abstact.rb so we can use it where we need
- with_redaction method available for SSHKit::Command
- Added .with_redaction to netssh backends, log_command_start
- Fixed tests, added Array, Hash, etc
- Rewrote README
@NorseGaud
Copy link
Contributor Author

Total rewrite using suggestions from @mattbrictson. I LOVE it. It's so simple now. Thank you sir!

  • Module Redaction under SSHKit, so it's available everywhere we need it to be
  • Simplified redact and moved it to abstact.rb so we can use it where we need
  • with_redaction method available for SSHKit::Command
  • Added .with_redaction to netssh backends, log_command_start
  • Fixed tests, added Array, Hash, etc
  • Rewrote README

Copy link
Member

@mattbrictson mattbrictson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit biased, but this looks great! 😁

My only request would be to undo the change to the .gitignore. Thanks for your patience in the review process!

Removed .idea
@NorseGaud
Copy link
Contributor Author

Weird, thought I did that already. Good to go. Thanks a ton guys!

@mattbrictson
Copy link
Member

@leehambley @will-in-wi I'll merge this if there are no further suggestions?

@leehambley
Copy link
Member

I'm very happy with this, the empty is_a is elegant in ways that give me the heebie jeebies, but I'm good with it, good shout with that approach.

@mattbrictson mattbrictson merged commit 6af6cad into capistrano:master Jun 30, 2018
@mattbrictson
Copy link
Member

Congrats on your first contribution to the project, @NorseGaud 🎉

@NorseGaud
Copy link
Contributor Author

Thanks again guys! What an enjoyable learning experience for me!

@NorseGaud
Copy link
Contributor Author

Quick question: what’s the next steps for getting the new gem version released? Anything I have to do or request?

@mattbrictson
Copy link
Member

We're on roughly a six-week release cycle. The next version is due out in the upcoming week and will include this PR. Nothing you need to do 🙂

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Sep 23, 2018
## [1.17.0][] (2018-07-07)

  * [#430](capistrano/sshkit#430): [Feature] Command Argument STDOUT/capistrano.log Hiding - [@NorseGaud](https://github.com/NorseGaud)

## [1.16.1][] (2018-05-20)

  * [#425](capistrano/sshkit#425): Command#group incorrectly escapes double quotes, resulting in a a syntax error when specifying the group execution using `as`. This issue manifested when user command quotes changed from double quotes to single quotes. This fix removes the double quote escaping - [@pblesi](https://github.com/pblesi).
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Dec 17, 2018
## [1.18.0][] (2018-10-21)

  * [#435](capistrano/sshkit#435): Consistent verbosity configuration #capture and #test methods - [@NikolayRys](https://github.com/NikolayRys)

## [1.17.0][] (2018-07-07)

  * [#430](capistrano/sshkit#430): [Feature] Command Argument STDOUT/capistrano.log Hiding - [@NorseGaud](https://github.com/NorseGaud)

## [1.16.1][] (2018-05-20)

  * [#425](capistrano/sshkit#425): Command#group incorrectly escapes double quotes, resulting in a a syntax error when specifying the group execution using `as`. This issue manifested when user command quotes changed from double quotes to single quotes. This fix removes the double quote escaping - [@pblesi](https://github.com/pblesi).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants