Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix user names in tokenless authentications script #53

Merged
merged 2 commits into from
Dec 4, 2017

Conversation

pluehne
Copy link
Contributor

@pluehne pluehne commented Dec 4, 2017

In this script, user names were read from LDAP, where underscores and periods are allowed characters. However, these characters aren’t allowed in GitHub user names, which is why they are replaced with dashes.

Because of this, user names containing underscores and periods weren’t linked correctly in the tokenless authentications list. This pull request fixes this by performing the substitution directly within the affected script.

In this script, user names were read from LDAP, where underscores and
periods are allowed characters. However, these characters aren’t allowed
in GitHub user names, which is why they are replaced with dashes.

Because of this, user names containing underscores and periods weren’t
linked correctly in the tokenless authentications list. This commit
fixes this by performing the substitution directly within the affected
script.
@pluehne pluehne added the bug label Dec 4, 2017
@pluehne pluehne self-assigned this Dec 4, 2017
@@ -10,4 +10,5 @@ zcat -f /var/log/github/gitauth.log.1* |
sort |
uniq -c |
sort -rn |
awk '{printf("%s\t%s\t%s\n",$2,$3,$1)}'
awk '{printf("%s\t%s\t%s\n",$2,$3,$1)}' |
awk '{gsub(/[_.]/, "-", $1)}1'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be easier/more clear to make this in Python? I think it makes sense to use bash etc. to reduce the amount of data (to avoid transporting lots of data over SSH). But massaging the data could be done in Python: https://github.com/Autodesk/hubble/blob/master/updater/reports/ReportTokenlessAuth.py

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I’d prefer doing the heavy lifting in the scripts and queries and to use Python mostly for controlling. That’s because the scripts might also be useful on their own, and I think they should have the same output as Hubble.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. as a compromise, can't we roll the gsub call into the first awk statement?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works:

awk '{gsub(/[_.]/, "-", $1); printf("%s\t%s\t%s\n",$2,$3,$1)}'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, just keep in mind that the $1 in gsub should be a $2, because the order of the two commands is inverted now.

@larsxschneider larsxschneider merged commit dbdfccf into master Dec 4, 2017
@larsxschneider larsxschneider deleted the patrick/fix-user-names branch December 4, 2017 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants