Test unicode use and encoding awareness #104

Closed
svanoort opened this issue Nov 3, 2015 · 3 comments

Comments


svanoort commented Nov 3, 2015

We need better test coverage of Unicode functionality. This ties into Python 3 compatibility, because without that coverage the Python 3 work has high potential to break things. This would be a very easy item for a new contributor to pick up!

Places to add:

  • test_tests.py -- test of test parsing here
  • contenthandler.py -- test_contenthandling.py (use a tempfile for testing handling of file read)
  • functionaltest.py -- actual running test with a local DB (CRUD operations)
  • content-test.yaml -- do a basic CRUD test of content. Created unicode-test.yaml, adding in a section at a time
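
A tempfile-based check of file reading, as suggested for test_contenthandling.py, might look roughly like the sketch below. The helper `read_text_file` is hypothetical and stands in for whatever file-reading path contenthandler.py actually uses; the sketch assumes Python 3.

```python
import os
import tempfile


def read_text_file(path, encoding="utf-8"):
    """Hypothetical helper: read raw bytes from disk and decode them to text."""
    with open(path, "rb") as handle:
        return handle.read().decode(encoding)


def roundtrip_through_tempfile(text, encoding="utf-8"):
    """Write text to a temp file in the given encoding, then read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(text.encode(encoding))
        return read_text_file(path, encoding)
    finally:
        # Always clean up the temp file, even if the read fails
        os.remove(path)


sample = u"caf\u00e9 \u2603 \U0001f40d"
assert roundtrip_through_tempfile(sample) == sample
assert roundtrip_through_tempfile(u"caf\u00e9", "latin-1") == u"caf\u00e9"
```

Writing and reading in binary mode keeps the encoding explicit, so the test does not depend on the platform's default locale.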

Components:

  • Fix to url handling to better concatenate/encode unicode characters
  • Test that request bodies work right with different encodings when: (test_contenthandling.py)
  • Test that URLs correctly do unicode & special character URL encoding, with and without templating -- this turns out to be amazingly complex/painful; left some tests in place and opened "Support internationalized URIs (unicode URLs) - RFC3986" (#123) for the URI/IRI mapping bits
  • Test that validators/extractors/etc can work with unicode content - done, json parser is thankfully unicode-smart, whee.

  • Internally ensure that string data is stored as unicode and encoded into raw bytes at the LAST moment
  • Functional tests for unicode use - tests are actually passing correctly
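
As a rough illustration of the "json parser is unicode-smart" point, a check along these lines could exercise extraction from a UTF-8 response body. It uses only the standard-library `json` module; the function name and sample payload are made up for the example and are not taken from the PyRestTest code.

```python
import json


def extract_field(raw_body, field, encoding="utf-8"):
    """Decode a raw response body to text, parse it as JSON, return one field."""
    parsed = json.loads(raw_body.decode(encoding))
    return parsed[field]


# Non-ASCII characters in both a key and a value
raw = u'{"name": "J\u00fcrgen", "\u952e": "\u503c"}'.encode("utf-8")
assert extract_field(raw, "name") == u"J\u00fcrgen"
assert extract_field(raw, u"\u952e") == u"\u503c"

# \uXXXX escapes inside the JSON text decode to the same characters
escaped = b'{"name": "J\\u00fcrgen"}'
assert extract_field(escaped, "name") == u"J\u00fcrgen"
```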

@svanoort svanoort added this to the 1.7.0 - Python 3 + Parsing/Configuration Internals milestone Nov 3, 2015
@svanoort svanoort changed the title Verify encoding awareness and unicode handling Test unicode use and encoding awareness Nov 3, 2015

svanoort commented Nov 3, 2015

@alexeyknyshev This would be a great place to contribute, since I know you're already looking at how unicode works with PyRestTest!

@svanoort
Owner Author

Unicode handling policy:

  • Internally all string values are converted to Unicode for consistency at the first opportunity
    • For YAML reading, they're unicode anyway (UTF-8 is the YAML encoding AFAICT)
  • Binary content is kept as bytes where necessary (example: request/response bodies)
  • PyCurl will accept unicode containing only ASCII code points; any escaping/encoding to bytes is done at the last possible second, when configuring PyCurl itself
  • All operations should be Unicode-safe (templating included, via helper methods to do un-encode and re-encode operations since native string.Template doesn't allow Unicode)
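
The policy above (decode early, work in Unicode, encode late) can be sketched as follows. This is a minimal illustration assuming Python 3, where `string.Template` accepts text directly; the helper names `to_text`, `render` and `to_wire` are hypothetical and not the project's actual helpers.

```python
from string import Template


def to_text(value, encoding="utf-8"):
    """Decode bytes to text at the first opportunity; pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def render(template_text, context):
    """Template on text only, after normalizing the template and its values."""
    clean_context = {key: to_text(val) for key, val in context.items()}
    return Template(to_text(template_text)).safe_substitute(clean_context)


def to_wire(text, encoding="utf-8"):
    """Encode to raw bytes at the last moment, right before sending."""
    return text.encode(encoding)


body = render(b"Hello, $name!", {"name": u"J\u00fcrgen"})
assert body == u"Hello, J\u00fcrgen!"
assert to_wire(body) == b"Hello, J\xc3\xbcrgen!"
```

Keeping the encode step in one place makes it easier to test request bodies against several encodings without touching the templating code.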

EDIT:
URL handling is tricky; the pipeline needs to go like this:
URL base + url --> templating (yay) --> URL encoding if contains non-ASCII characters?
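
A naive sketch of that pipeline, assuming Python 3 and the standard-library `urllib.parse.quote`, is shown below. The function `build_url` and the choice of "safe" characters are assumptions for illustration; it percent-encodes non-ASCII characters as UTF-8 and does not attempt the full URI/IRI mapping (for example, internationalized host names) tracked in #123.

```python
from string import Template
from urllib.parse import quote

# Characters left untouched when percent-encoding a whole URL (an assumption;
# a stricter implementation would encode each URL component separately)
URL_SAFE_CHARS = ":/?&=#%+@"


def build_url(base, url, context):
    """URL base + url --> templating --> percent-encode if non-ASCII."""
    joined = base + url
    templated = Template(joined).safe_substitute(context)
    try:
        # Pure-ASCII URLs are passed through unchanged
        templated.encode("ascii")
        return templated
    except UnicodeEncodeError:
        # quote() encodes non-ASCII characters as UTF-8 percent-escapes
        return quote(templated, safe=URL_SAFE_CHARS)


assert build_url("http://example.com/", "items/$id", {"id": "42"}) == \
    "http://example.com/items/42"
assert build_url("http://example.com/", "users/$name", {"name": u"J\u00fcrgen"}) == \
    "http://example.com/users/J%C3%BCrgen"
```

Templating runs before encoding so that substituted values are escaped too; encoding first would leave any non-ASCII template values unescaped.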

@svanoort
Owner Author

Completed as of 2899269 (plus previous commits in branch)
