Azure DevOps flaky network: Error waiting on socket #760

Closed
frank-bee opened this issue Jan 22, 2021 · 10 comments

frank-bee commented Jan 22, 2021

I sometimes see these error logs in my alert notifications:

unable to clone 'ssh://[email protected]/v3/...', error: SSH could not read data: Error waiting on socket
flux-system.flux-system
@frank-bee frank-bee changed the title Azure Devops unable to clone 'ssh://[email protected]/v3/...', error: SSH could not read data: Error waiting on socket Azure Devops git repo: unable to clone 'ssh://[email protected]/v3/...', error: SSH could not read data: Error waiting on socket Jan 22, 2021
@frank-bee
Author

This is my repository definition:

---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  gitImplementation: libgit2
  interval: 1m0s
  ref:
    branch: master
  secretRef:
    name: flux-system
  timeout: 20s
  url: ssh://[email protected]/v3/...

In general, it works fine.

@stefanprodan
Member

Please reach out to the Azure folks. If the error is not persistent, then it is caused by Azure's flaky network rather than by Flux.

@hiddeco
Member

hiddeco commented Jan 22, 2021

You may want to increase your interval and see if this improves the situation. As libgit2 does not support shallow cloning, I can see how this could result in more flakiness with larger repositories.
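
As a minimal sketch of this suggestion, the definition posted above could be changed to poll less often. The `5m0s` interval is an assumed value, not one recommended in this thread. The longer `timeout` is likewise an assumption that nobody here suggested; it is included only because a full clone of a large repository may need more than 20 seconds. The URL is a placeholder for the existing one.

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  gitImplementation: libgit2
  # Assumed value: reconcile every 5 minutes instead of every minute
  interval: 5m0s
  ref:
    branch: master
  secretRef:
    name: flux-system
  # Assumed value: allow more time for a full (non-shallow) clone
  timeout: 60s
  # Placeholder: keep the SSH URL from the existing definition
  url: <existing ssh:// repository URL>
```

A longer interval means fewer clone attempts per hour, so transient network failures should surface less often, at the cost of slower pickup of new commits.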

@stefanprodan stefanprodan changed the title Azure Devops git repo: unable to clone 'ssh://[email protected]/v3/...', error: SSH could not read data: Error waiting on socket Azure DevOps flaky network: SSH could not read data: Error waiting on socket Jan 22, 2021
@stefanprodan stefanprodan changed the title Azure DevOps flaky network: SSH could not read data: Error waiting on socket Azure DevOps flaky network: Error waiting on socket Jan 22, 2021
@stefanprodan
Member

Besides increasing the interval, there is nothing we can do in Flux, so I'm closing this.

@frank-bee
Author

@stefanprodan I'm in contact with Azure now, but it seems to be difficult to find the root cause.
Is there any way to filter a message like this in alarms (besides filtering out all Git source-controller messages completely)?

@stefanprodan
Member

Is there any way to filter a message like this in alarms?

It will be in the release, see fluxcd/notification-controller#138
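
A hedged sketch of what such filtering might look like, assuming the linked change corresponds to the `exclusionList` field of the notification-controller `Alert` resource (a list of regular expressions; events whose message matches are not forwarded). The alert name `git-alerts` and the provider name `slack` are hypothetical and not taken from this thread.

```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: git-alerts          # hypothetical name
  namespace: flux-system
spec:
  providerRef:
    name: slack             # hypothetical Provider in the same namespace
  eventSeverity: error
  eventSources:
    - kind: GitRepository
      name: flux-system
  # Drop events whose message matches any of these regular expressions
  exclusionList:
    - "Error waiting on socket"
```

Other errors from the same GitRepository would still be delivered, which avoids having to silence the source entirely.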

@frank-bee
Author

frank-bee commented Feb 8, 2021

Cool :-)

I also found out today that HTTPS seems to cause no problems with Azure DevOps Git repos (so far).
This is how I added an Azure DevOps Git source. Maybe you could add an example to the getting started guide, since Azure DevOps is already mentioned there, but only with SSH:

flux create source git flux-system \
  --git-implementation=libgit2 \
  --url=https://dev.azure.com/<org>/<project>/_git/<repo> \
  --branch=master \
  --interval=1m \
  --username=git \
  --password=abc

The password is a generated personal access token (PAT) with read access to Git repos (Code --> Read).
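
For a declarative setup, the command above corresponds roughly to the following manifests. This is a sketch under assumptions: it relies on source-controller reading HTTPS basic-auth credentials from the `username` and `password` keys of the referenced Secret, and `<personal-access-token>` is a placeholder for the PAT described above.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: flux-system
  namespace: flux-system
type: Opaque
stringData:
  username: git
  # Placeholder: a PAT with Code (Read) scope
  password: <personal-access-token>
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  gitImplementation: libgit2
  interval: 1m0s
  ref:
    branch: master
  secretRef:
    name: flux-system
  url: https://dev.azure.com/<org>/<project>/_git/<repo>
```

A Secret that contains a real token should not be committed to Git in plain text.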

@ahojukka5

image-automation-controller is unable to clone from a private GitLab server because of this error:

error: SSH could not read data: Error waiting on socket

However, source-controller can use the same Git repository without any problem. Any tips on how to debug what might be causing this?

@rjhenry

rjhenry commented Jul 16, 2021

I'm seeing a very similar situation, where the image-automation-controller can't clone a repository from GitHub.com and fails with the same message:

unable to clone 'ssh://git@github.com/<redacted>/<redacted>.git', error: SSH could not read data: Error waiting on socket

The source-controller however uses this exact same repository with no issues. Any advice on debugging this would be much appreciated.

@rjhenry

rjhenry commented Jul 16, 2021

@ahojukka5 I've found a solution that worked for me, at least: the GitRepository definition had a URL ending in .git, and removing that suffix fixed the problem.
Before: .spec.url: ssh://git@github.com/orgname/repo.git
After: .spec.url: ssh://git@github.com/orgname/repo

Why it worked, I'm not sure - but I happened to stumble across the difference between a working and a non-working cluster.
