Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support non-ASCII characters on Windows #130

Merged
merged 3 commits into from
Jan 12, 2019

Conversation

james-stocks
Copy link

Use CreateProcessW instead of CreateProcessA to create processes on Windows; allowing for non-ASCII characters in the command line and the environment.

This PR causes one spec test to be skipped for Windows when Ruby < 2.3; and it leaves a TODO: comment in the code of that test. For this case, the Ruby ENV hash cannot be used to check the child environment because it will not be reliably encoded/decoded by Ruby.
The TODO item is to re-write the test in a way that does not require ruby's ENV hash.

James Stocks added 2 commits October 24, 2017 15:00
Support wide characters on Windows. This allows for non-ASCII characters
in the command line.
@glennsarti
Copy link

@coveralls
Copy link

coveralls commented Oct 30, 2017

Coverage Status

Coverage increased (+0.1%) to 94.993% when pulling c4f8e49 on james-stocks:CreateProcessW_skiptest into 0bce27d on enkessler:master.

@james-stocks
Copy link
Author

@glennsarti The limitation affects only the spec_helper code and not childprocess itself.

Following what Puppet does is one solution; an easier approach may be to have the test case avoid needing to check the environment e.g. have the test spawn a process that signals success only if the required environment value is set; and pass/fail the test on that signal.

In either case; I would prefer this PR to merge as-is if possible


it 'allows unicode characters in the environment' do
# This test does not work on Windows for Ruby < 2.3 because ENV will not be properly decoded
# This was fixed in Ruby 2.3 here: https://github.com/ruby/ruby/commit/5e3467c4414df815b3b00d2b0372026b069e7f7d
Copy link

@glennsarti glennsarti Oct 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that ruby only fixed using UTF8 characters in ENV VALUES not keys

If you change the test the code to

    Tempfile.open("env-spec") do |file|
      process = write_env(file.path)
      process.environment['baör'] = 'baör'
      process.start
      process.wait
      
      child_env = eval rewind_and_read(file)
      
      expect(child_env['baör'].force_encoding('UTF-8')).to eql 'baör'
    end

The test fails even on modern rubys

Inspecting child_env in pry you can see the corruption

[1] pry(#<RSpec::ExampleGroups::ChildProcess>)> child_env
=> {"ALLUSERSPROFILE"=>"C:\\ProgramData",
 "ANSICON"=>"185x2000 (185x65)",
 "ANSICON_DEF"=>"7",
 "APPDATA"=>"C:\\Users\\glenn.sarti\\AppData\\Roaming",
 "BUNDLER_VERSION"=>"1.15.4",
 "BUNDLE_BIN_PATH"=>"C:/tools/ruby2.3.1x64/lib/ruby/gems/2.3.0/gems/bundler-1.15.4/exe/bundle",
 "BUNDLE_GEMFILE"=>"C:/Source/tmp/childprocess/Gemfile",
 "ChocolateyInstall"=>"C:\\ProgramData\\chocolatey",
 "ChocolateyLastPathUpdate"=>"Sun Aug 27 19:31:25 2017",
 "ChocolateyToolsLocation"=>"C:\\tools",
...
"ba\xC3\xB6r"=>"ba\xC3\xB6r"}

Also note that this testing method mangles any existing UTF8 env vars
e.g. in PowerShell create a UTF8 envvar

PS> $ENV:UTF8ᚠᛇᚻ = 'UTF8ᚠᛇᚻ'
PS> dir ENV:

Name                           Value
----                           -----
ALLUSERSPROFILE                C:\ProgramData
ANSICON                        100x2000 (100x33)
ANSICON_DEF                    7
APPDATA                        C:\Users\glenn.sarti\AppData\Roaming
ChocolateyInstall              C:\ProgramData\chocolatey
...
UTF8ᚠᛇᚻ                        UTF8ᚠᛇᚻ

Then run the spec test and pry into child_env

...
 "USERNAME"=>"glenn.sarti",
 "USERPROFILE"=>"C:\\Users\\glenn.sarti",
 "UTF8???"=>"UTF8???",
 "windir"=>"C:\\WINDOWS",
 "ba\xC3\xB6r"=>"ba\xC3\xB6r"}

Note the UTF8 characters are now ? and the UTF8 from childprocess is now bytes. This has not been fixed in Ruby 2.4.x yet either.

This test will not currently work as the Ruby ENV hash has encoding limitations.
This is fixed in Ruby 2.3
(ruby/ruby@5e3467c)
@james-stocks
Copy link
Author

@glennsarti I pushed a minor change to the comment on the test, which makes your last comment "outdated" (thanks Github).

I believe the limitations you describe do not affect ChildProcess itself; but only this spec testing method for checking the environment.

Any objection to merging this PR, and I can file a ticket on this project to get the spec tests moved away from their reliance on the ENV hash?

@glennsarti
Copy link

I'm not a maintainer so I have no authority :-)

Only reason I would object is; You're changing the gem so it accepts non-ASCII (i.e. non US English text) but the testing fixtures aren't capable enough. Seems odd to add this without tests to prove it works and nothing regresses.

@glennsarti
Copy link

Perhaps a quick method would be to use a batch file or powershell to output the environment variables on Windows instead of using ruby ENV hash. If I get time I'll try something, but I have no rights on your branch or this repo.

@coveralls
Copy link

coveralls commented Nov 1, 2017

Coverage Status

Coverage decreased (-4.08%) to 90.795% when pulling 83c2a88 on james-stocks:CreateProcessW_skiptest into 0bce27d on enkessler:master.

@da-ar
Copy link
Contributor

da-ar commented Jan 8, 2018

@enkessler, I work for the same company as @james-stocks and I'm taking over responsibility for this PR from him.

On picking this up, I can confirm the following:

Is there anything you need / want to see in order for the merge to go ahead?

@enkessler
Copy link
Owner

Sorry about the delay in my addressing this PR.

@da-ar My main concern in switching from ANSI to Unicode is not breaking things and backward compatibility. So,

  1. Do Unicode chars now work?
  2. Do ANSI chars still work?
  3. Can we prove it?

Looking at the code changes in the PR and reading up a bit on CreateProcessA and CreateProcessW, things look reasonable enough. It's not my expertise and I'm not 100% on the number of terminating characters being slapped onto the strings, but there are tests in the project that use 'regular' chars and this PR has a test the uses 'fancy' chars and they both seem to be passing.

I'd say the only holdup is point 3. From the conversation in this PR, it sounds like there is a workaround that would allow the functionality to be tested on older Rubies as well. I'd rather that happen now rather than later.

@da-ar
Copy link
Contributor

da-ar commented Jan 10, 2018

I'll have a look at what needs to happen with the tests to prove this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants