Inconsistent output across runs (on OS X) #1

Closed
adsouza opened this issue Apr 19, 2018 · 20 comments
Labels
question Further information is requested

Comments

@adsouza

adsouza commented Apr 19, 2018

Every time I run scc on the same codebase on my Macbook, I get completely different results. Even the file count changes every time I run it!

@boyter
Owner

boyter commented Apr 19, 2018

Need more information in order to reproduce this.

  • What are you trying to count?
  • Is there some code you can put somewhere that I can use to replicate this?
  • Did you download the binary or compile from source?
  • Is it possible that some other background process is modifying what you are trying to count?
  • Can you create a checksum of the directory you are trying to count, run scc, then run the checksum and scc again, to prove that nothing is changing and that scc is the culprit?

@boyter boyter added the question Further information is requested label Apr 19, 2018
@adsouza
Author

adsouza commented Apr 19, 2018

All I did was run scc ~/go/src several times in a row. For comparison I also ran gocloc and tokei the exact same way and they gave me consistent numbers each time.
I built all of them from source: tokei via Brew and the others via go get.
I'm not doing anything else in the background that would be altering these files and the fact that the other 2 tools get consistent results suggests that the issue is only found in scc.
My guess is that whatever the issue is must be particular to OS X though, since you haven't seen it on either Linux or Windows.

@boyter
Owner

boyter commented Apr 19, 2018

That is very odd. Can you try running on a single directory with one file in it and see if it's reproducible there? Then add the --trace option:

scc --trace singledirectory

That should allow me to at least guess what might be going on.

@adsouza
Author

adsouza commented Apr 19, 2018

OK, so when I try it on a single dir it's consistent. Even on a dir that contains other dirs that contain code it is fine. But I think I have a reproducible test case for you:
go get github.com/davidrjenni/reftools
Then run scc on the davidrjenni dir a few times and the results are not consistent.
Strangely, if I run it on the only subdir in there (i.e. reftools) then I get consistent results!

@boyter
Owner

boyter commented Apr 19, 2018

Are you seeing different counts on the totals?

You probably will see flipping between the following

HTML                        15       998      898        41       59          0
JavaScript                  15     63202    54101      8816      285      11205

That's because the sorting is done by Files rather than by name, which is what Tokei does. Since both rows have the same value of 15, it's down to whatever the runtime decides to do when the sort is invoked. You can make it behave like tokei by running scc -s name . which will make the sorting the same.

I might move the default over to sort by lines, however it will have the same issue if the line counts match. I made it based on files because that's what I wanted to see.
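
To illustrate the tie problem described above: Go's `sort.Slice` makes no promise about the relative order of equal elements, so rows with the same file count can swap between runs unless the comparison breaks ties itself. The sketch below shows one way to do that; `languageSummary` and `sortByFilesThenName` are made-up names for illustration and are not taken from the scc source.

```go
package main

import (
	"fmt"
	"sort"
)

// languageSummary is a hypothetical stand-in for one row of the report.
type languageSummary struct {
	Name  string
	Files int
}

// sortByFilesThenName orders rows by file count, largest first, and falls
// back to the name when counts are equal, so ties always resolve the same way.
func sortByFilesThenName(rows []languageSummary) {
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].Files != rows[j].Files {
			return rows[i].Files > rows[j].Files
		}
		return rows[i].Name < rows[j].Name
	})
}

func main() {
	rows := []languageSummary{
		{"JavaScript", 15},
		{"Go", 522},
		{"HTML", 15},
	}
	sortByFilesThenName(rows)
	for _, r := range rows {
		fmt.Printf("%-12s %d\n", r.Name, r.Files)
	}
}
```

With the secondary key in place, HTML always sorts ahead of JavaScript when both have 15 files, whichever column is the primary sort key.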

@adsouza
Author

adsouza commented Apr 19, 2018

I'm seeing the file count for Go flip between 519 (the correct value) and something between 235 and 240 (it varies each time).

@boyter
Owner

boyter commented Apr 19, 2018

Must be something to do with the Go runtime on macOS. That is frustrating. I don't have access to a macOS machine.

My guess at the moment is that it's something to do with the parallel walking of the file tree. If you drop down a single directory and run it, does it work? That is, drop down so that there is only a single directory where you run scc, which will force it to run the walk in a single goroutine. I know you did that with the reftools directory, but this should make sure.

If that turns out to be the case I will turn off the parallel file walk for Darwin until I can determine what the root cause is.

@adsouza
Author

adsouza commented Apr 19, 2018

I believe that's what I already tried with the parent dir of reftools, which only contains reftools. Here's the console output:

amdsouza-macbookair:davidrjenni amdsouza$ ls
reftools
amdsouza-macbookair:davidrjenni amdsouza$ scc .

Language                 Files     Lines     Code  Comments   Blanks Complexity

Go                         217     31285    21179      6720     3386       4849
Markdown                     7       308      232         0       76          0
JavaScript                   6       832      629        78      125        119
YAML                         6       110       99         0       11          4
CSS                          4       904      785         4      115          0
JSON                         3        91       83         0        8          0
BASH                         3       166      110        27       29         10
Dockerfile                   2       152      112        19       21         36
ASP.NET                      2         2        2         0        0          0
TypeScript                   2       215      149        34       32          2
TOML                         1         7        6         0        1          0
Happy                        1       202      175         0       27          0
Plain Text                   1         2        2         0        0          0
Makefile                     1        19       12         3        4          0

Total                      256     34295    23575      6885     3835       5020

Estimated Cost to Develop $745,881
Estimated Schedule Effort 13.724775 months
Estimated People Required 6.437527

amdsouza-macbookair:davidrjenni amdsouza$ scc .

Language                 Files     Lines     Code  Comments   Blanks Complexity

Go                         240     33478    22693      7121     3664       5242
Markdown                     7       308      232         0       76          0
JavaScript                   6       832      629        78      125        119
YAML                         6       110       99         0       11          4
CSS                          4       904      785         4      115          0
BASH                         3       166      110        27       29         10
JSON                         3        91       83         0        8          0
TypeScript                   2       215      149        34       32          2
Dockerfile                   2       152      112        19       21         36
ASP.NET                      2         2        2         0        0          0
Makefile                     1        19       12         3        4          0
TOML                         1         7        6         0        1          0
Happy                        1       202      175         0       27          0
Plain Text                   1         2        2         0        0          0

Total                      279     36488    25089      7286     4113       5413

Estimated Cost to Develop $796,256
Estimated Schedule Effort 14.069895 months
Estimated People Required 6.703732

@boyter
Owner

boyter commented Apr 19, 2018

Odd that it only affects the Go counts.

Can you please try checking out and installing from this branch https://github.com/boyter/scc/tree/Issue1

Then see if the new option scc --npw . resolves the issue for you. It disables the parallel walk. If it does I will have a look at what impact this has in general and either disable it just for macOS or possibly for everything depending on the results.

@adsouza
Author

adsouza commented Apr 20, 2018

Even with the --npw flag I still get inconsistent results for the number of Go files in the davidrjenni dir.

@boyter
Owner

boyter commented Apr 22, 2018

Have been trying this out on a 2013 Retina Macbook Pro. I am totally unable to replicate the issue. I have included the results I get. Tried running it a dozen times and the results are always 100% the same.

# boyter @ Bens-MacBook-Pro in ~/go/src/github.com/davidrjenni/reftools on git:master o [15:50:33]
$ scc .
-------------------------------------------------------------------------------
Language                 Files     Lines     Code  Comments   Blanks Complexity
-------------------------------------------------------------------------------
Go                         522    115819    83117     20749    11953      15415
JavaScript                  15     63202    54101      8816      285      11205
HTML                        15       998      898        41       59          0
Markdown                    10       376      277         0       99          0
YAML                         7       136      121         0       15          2
CSS                          6      1773     1572         5      196          0
JSON                         3        91       83         0        8          0
Plain Text                   3       172      113         0       59          0
BASH                         3       166      110        27       29         10
TypeScript                   2       215      149        34       32          2
Dockerfile                   2       152      112        19       21         36
Happy                        1       202      175         0       27          0
XML                          1        11       11         0        0          0
C                            1        19       10         4        5          0
Assembly                     1        30       26         0        4          0
Makefile                     1        19       12         3        4          0
TOML                         1         7        6         0        1          0
-------------------------------------------------------------------------------
Total                      594    183388   140893     29698    12797      26670
-------------------------------------------------------------------------------
Estimated Cost to Develop $4874490
Estimated Schedule Effort 28.009405 months
Estimated People Required 20.614841
-------------------------------------------------------------------------------

However this is running macOS Sierra 10.12.6 and not High Sierra which may explain the difference. Are you using High Sierra?

@adsouza
Author

adsouza commented Apr 22, 2018

I am using High Sierra but also you are doing one key thing differently.
Can you try it from the parent dir (i.e. inside davidrjenni not reftools)?

@boyter
Owner

boyter commented Apr 22, 2018

I copied the above from the last run when I was trying to replicate. No matter where I ran it I had the same result. For the record I tried the following all inside ~/go/src/github.com/davidrjenni/ and then tried again inside reftools

scc .
scc -wl go .

I am starting to suspect that it may be something to do with High Sierra. The only thing I can think of is that somehow when it traverses the file-tree it does so in a way that nothing else does and possibly causes the files to be excluded.

When you run this with the --files option you should be seeing that some files are missing, however if you add --verbose or -v you should see output at the top similar to the below

 WARN 2018-04-22T21:58:55Z: skipping directory due to being in blacklist: .git
 WARN 2018-04-22T21:58:55Z: skipping file unknown extension: Jenkinsfile
 WARN 2018-04-22T21:58:55Z: skipping file unknown extension: quartz.properties

The order of the above is not deterministic, but the count should be 100% the same and the lines should match if you, for instance, sort the output. I have a feeling that in your case they would not. You can verify this using scc -v . | grep "WARN" | sort and capturing the output.

I will see if I can get the person whom I borrowed the Macbook from to upgrade to High Sierra but I suspect that may be unlikely.

@adsouza
Author

adsouza commented Apr 22, 2018

Ah, using -v & looking at the warning provided a very useful clue; many lines like this:
error reading: reftools/vendor/golang.org/x/tools/go/buildutil/allpackages.go open reftools/vendor/golang.org/x/tools/go/buildutil/allpackages.go: too many open files

That probably explains what's happening: scc is bumping up against some limit on the number of files that can be open concurrently.

@boyter
Owner

boyter commented Apr 22, 2018

Ah yes. It is designed to pull as much from the disk as possible. Sounds like you are hitting the ulimit issues.

https://superuser.com/questions/261023/how-to-change-default-ulimit-values-in-mac-os-x-10-6#306555
https://unix.stackexchange.com/questions/108174/how-to-persistently-control-maximum-system-resource-consumption-on-mac/221988#221988

I will update the documentation to mention this I think. Interesting that tokei does not hit the limit. I might have to look into how to control or at least change the limit when the application runs.

https://stackoverflow.com/questions/17817204/how-to-set-ulimit-n-from-a-golang-program#17818022

Looks promising.
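
As a rough idea of what raising the limit from inside a Go program can look like, here is a minimal sketch using `syscall.Getrlimit` and `syscall.Setrlimit` from the standard library. It assumes a Unix-like system (it will not build on Windows); `raiseOpenFileLimit` and the target of 4096 are illustrative choices, not anything scc ships. Raising the soft limit up to the hard limit normally needs no extra privileges, while raising the hard limit does, and the OS may still reject values above its own per-process maximum.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// raiseOpenFileLimit (hypothetical helper) tries to lift the soft
// RLIMIT_NOFILE to want, capped at the hard limit, and returns the soft
// limit in effect afterwards.
func raiseOpenFileLimit(want uint64) (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, err
	}
	if lim.Cur >= want {
		return lim.Cur, nil // already high enough, nothing to do
	}
	old := lim.Cur
	lim.Cur = want
	if lim.Cur > lim.Max {
		lim.Cur = lim.Max // an unprivileged process cannot exceed the hard limit
	}
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return old, err
	}
	return lim.Cur, nil
}

func main() {
	const want = 4096 // illustrative target
	got, err := raiseOpenFileLimit(want)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not raise open file limit:", err)
	}
	if got < want {
		fmt.Fprintf(os.Stderr, "open file limit is %d, below the requested %d\n", got, want)
	}
	fmt.Println("soft open file limit:", got)
}
```

If the call fails or the limit stays low, reporting that clearly is probably more useful than continuing quietly.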

@adsouza
Author

adsouza commented Apr 23, 2018

Dang, that warning was a red herring. I fixed the limits and now I no longer see those warnings but I still get inconsistent numbers :(

@boyter
Owner

boyter commented Apr 23, 2018

That is rather annoying. Are the verbose responses different at all when the numbers are different?

@adsouza
Author

adsouza commented Apr 23, 2018

N/m: I had made a wee mistake earlier; raising the open file limit to 4096 seems to have done the trick after all.

@boyter
Owner

boyter commented Apr 23, 2018

Ah great. It seemed such a perfect fit. I'll have a look at implementing the above then.

Oh, annoyingly it's one of those that requires sudo, apparently. I might just add it to the documentation. Most devs raise the ulimit anyway. It might be possible to add it as better reporting output.

boyter added a commit that referenced this issue Apr 23, 2018
@boyter
Owner

boyter commented Apr 23, 2018

Going to close this down now. Thanks so much for working through it with me.

@boyter boyter closed this as completed Apr 23, 2018
boyter pushed a commit that referenced this issue Aug 30, 2018
Pull from original boyter repo
boyter pushed a commit that referenced this issue Apr 25, 2021