This repository has been archived by the owner on Mar 13, 2022. It is now read-only.

decode add a replace option #104

Merged
merged 1 commit into from
Jun 21, 2019

Conversation

saberuster
Contributor

As mentioned in #88.

We can determine whether the data is UTF-8 encoded in another way, instead of raising an exception while reading the stream.

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please email the CNCF helpdesk: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 26, 2018
@codecov-io

codecov-io commented Nov 26, 2018

Codecov Report

Merging #104 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #104   +/-   ##
=======================================
  Coverage   92.04%   92.04%           
=======================================
  Files          13       13           
  Lines        1182     1182           
=======================================
  Hits         1088     1088           
  Misses         94       94

Powered by Codecov. Last update 879ab01...15474ef. Read the comment docs.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Nov 26, 2018
@dbazhal

dbazhal commented Dec 25, 2018

can anyone approve this?

@saberuster
Contributor Author

/assign @roycaihw

@dbazhal

dbazhal commented Jan 9, 2019

Btw, I don't know if this is the right place to discuss it. I know that 'replace' is a kind of solution, but it seems incorrect to me to decode chunks of data without concatenating them first. A chunk can genuinely be broken: since it is raw bytes, it does not necessarily hold every byte of an encoded character, and it may end with just the first byte of a two-byte sequence. In that case adding 'replace' could corrupt the data received by the client.
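To illustrate the concern, here is a minimal sketch using only the standard library. The sample string and the split point are assumptions chosen for the demonstration; they are not taken from the client code. It decodes two chunks separately with 'replace' and compares the result with decoding after concatenation.

```python
# Hypothetical payload: "é" is encoded as the two bytes 0xC3 0xA9 in UTF-8.
text = "héllo"
data = text.encode("utf-8")

# Split the bytes in the middle of the two-byte character.
chunks = [data[:2], data[2:]]

# Decoding each chunk on its own with 'replace' turns both halves of the
# character into U+FFFD, so the original character is lost.
per_chunk = "".join(c.decode("utf-8", errors="replace") for c in chunks)

# Concatenating first keeps the character intact.
joined = b"".join(chunks).decode("utf-8", errors="replace")

print(repr(per_chunk))
print(repr(joined))
```

Whether chunks can actually arrive split like this depends on how the transport delivers them, which the later comments in this thread address for the ws client.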

@saberuster
Contributor Author

saberuster commented Jan 10, 2019

I agree with you, but in my scenario:
if I run kubectl exec mypod cat /bin/my_binary, I receive data like this:

\����̋���܍��,  

If I use python-base, I get an exception when reading the data. I want it to behave the same way kubectl does.
What do you think about this?
@dbazhal

@dbazhal

dbazhal commented Jan 10, 2019

if I run kubectl exec mypod cat /bin/my_binary, I receive data like this:

\����̋���܍��,  

What do you think about this?
@dbazhal

I believe kubectl writes raw binary data to the output, but I'm not sure it tries to decode the data partially. I assume the correct solution would be to merge the chunks and then decode the result with 'replace'. But I'm not sure, that's just a guess; I need to dig deeper -)

@dbazhal dbazhal mentioned this pull request Jan 21, 2019
@roycaihw
Member

roycaihw commented Feb 4, 2019

Based on the RFC, I think we should decode as UTF-8 if the opcode is Text and return the raw data if the opcode is Binary.

The previous implementation seems to be modeled on WebSocketApp, but I don't see why we assumed UTF-8 for OPCODE_BINARY: kubernetes-client/python#125 (comment). cc @mbohlool
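A minimal sketch of the behavior suggested here, for illustration only. The function name decode_message is hypothetical, and the constants are defined locally with the opcode values from RFC 6455 (0x1 for text, 0x2 for binary) rather than imported from websocket-client. This is not the change that was merged in this PR.

```python
# Opcode values as defined by the WebSocket protocol (RFC 6455).
OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2


def decode_message(op_code, payload):
    """Return str for text messages and raw bytes for binary messages.

    Hypothetical helper: it only dispatches on the opcode and does not
    touch the network.
    """
    if op_code == OPCODE_TEXT:
        # Text messages are UTF-8 by specification, so strict decoding is safe.
        return payload.decode("utf-8")
    if op_code == OPCODE_BINARY:
        # Binary messages carry arbitrary bytes; hand them back untouched.
        return payload
    raise ValueError("unsupported opcode: %r" % (op_code,))


print(decode_message(OPCODE_TEXT, "ok".encode("utf-8")))
print(decode_message(OPCODE_BINARY, b"\x7fELF\xd0"))
```

The trade-off is that callers receive either str or bytes depending on the message type, so code that assumes str everywhere would need to handle both.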

@dbazhal

dbazhal commented Feb 11, 2019

from the rfc I think we should decode utf8 if opcode is Text and return raw data if opcode is Binary

the previous implementation seems to be modeled from WebSocketApp but I don't see why we assumed utf8 for OPCODE_BINARY kubernetes-client/python#125 (comment). cc @mbohlool

OK. As I understand it, we should tag the whole channel with the opcode of the first frame so that all its parts are treated the same way? The opcode isn't supposed to change from frame to frame, right?

@leavest leavest mentioned this pull request Apr 20, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 12, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 11, 2019
@dbazhal

dbazhal commented Jun 11, 2019

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 11, 2019
@roycaihw
Member

I checked the websocket-client module that we depend on. It looks like the module has handled fragment concatenation for us already

our ws client invokes recv_data_frame

op_code, frame = self.sock.recv_data_frame(True)

websocket-client extracts the data type (text or binary) from the first frame and accumulates the following continuation frames into the payload. The client only returns the ws message once the last fragment (FIN) has been received. Note that we've disabled fire_cont_frame and skip_utf8_validation already.

I don't think we need one more layer to concatenate multiple ws messages. websocket-client guarantees that each text-type ws message can be decoded as UTF-8. We just shouldn't assume that binary-type ws messages can be decoded the same way.
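The accumulation described above can be sketched as a small simulation. This is not websocket-client's code: assemble_message and the (fin, opcode, payload) tuples are hypothetical stand-ins, control frames are ignored, and the opcode values are the ones from RFC 6455.

```python
OPCODE_CONT = 0x0    # continuation frame
OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2


def assemble_message(frames):
    """Combine fragmented frames into one (opcode, payload) message.

    `frames` is an iterable of hypothetical (fin, opcode, payload) tuples.
    The message type comes from the first frame; later frames must be
    continuation frames, and the message is returned at the FIN frame.
    """
    message_opcode = None
    buffer = bytearray()
    for fin, opcode, payload in frames:
        if message_opcode is None:
            if opcode == OPCODE_CONT:
                raise ValueError("continuation frame without a starting frame")
            message_opcode = opcode
        elif opcode != OPCODE_CONT:
            raise ValueError("expected a continuation frame")
        buffer.extend(payload)
        if fin:
            return message_opcode, bytes(buffer)
    raise ValueError("stream ended before the final (FIN) frame")


# A text message whose two-byte character is split across two fragments.
frames = [(False, OPCODE_TEXT, b"h\xc3"), (True, OPCODE_CONT, b"\xa9llo")]
op_code, payload = assemble_message(frames)
print(op_code, payload.decode("utf-8"))
```

Because decoding happens only after the final fragment, a character split across fragments is no longer a problem at this layer.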

@dbazhal

dbazhal commented Jun 20, 2019

@roycaihw ok, that's great! Seems like there is no trouble with ws anymore.
But I think there is still trouble with watch. I tried to switch from my fork to kubernetes-9.0.0 and again got that same error

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 2047: unexpected end of data
<...>
  File "kubernetes/watch/watch.py", line 134, in stream
    for line in iter_resp_lines(resp):
  File "kubernetes/watch/watch.py", line 49, in iter_resp_lines
    seg = seg.decode('utf8')

So it looks like this particular PR is outdated, but could you please take a look at #112, especially https://github.com/kubernetes-client/python-base/pull/112/files#diff-7768487132d1edef2f3cb4e8d4101890 ?
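For reference, the error in the traceback can be reproduced with a standalone sketch. It assumes the response is read in 2048-byte segments, which position 2047 suggests but the thread does not confirm; the payload is made up. It also shows one possible way to avoid the error with the standard library's incremental decoder, which is not necessarily the approach taken in #112 or #138.

```python
import codecs

# Hypothetical payload: 2047 ASCII bytes followed by "д" (bytes 0xD0 0xB4),
# so a 2048-byte segment ends in the middle of a two-byte character.
payload = ("a" * 2047 + "д").encode("utf-8")
segments = [payload[:2048], payload[2048:]]

# Decoding the first segment on its own fails on the dangling lead byte.
try:
    segments[0].decode("utf8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
print(error)

# An incremental decoder buffers the incomplete sequence until the next
# segment arrives, so no bytes are lost or replaced.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(seg) for seg in segments]
pieces.append(decoder.decode(b"", final=True))
text = "".join(pieces)
print(len(text), text[-1])
```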

@dbazhal

dbazhal commented Jun 20, 2019

@roycaihw I created a separate PR for that: #138

@roycaihw
Member

This PR fixes a bug (not honoring the opcode when decoding) in our ws client. I haven't convinced myself that we want to decode binary messages as UTF-8, but this change is still better than failing.

/lgtm
/approve

@dbazhal Watch doesn't use ws so that error should be unrelated. I will review #138

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 21, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: roycaihw, saberuster

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 21, 2019
@k8s-ci-robot k8s-ci-robot merged commit 474e9fb into kubernetes-client:master Jun 21, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

6 participants