"Expecting Value: line 1 column 1 (char 0)" Error encountered when adding large number of subdomains through cat #96

Open
contaminatedesert opened this issue Mar 12, 2022 · 5 comments

@contaminatedesert

Running the command cat file.txt | bbrf domain add - with a large file (> 40K lines) causes an "Expecting Value: line 1 column 1 (Char 0)" error, as seen in the screenshot. The file contains only subdomains, one per line, but there are tens of thousands of them.

[screenshot of the error message]

@honoki
Owner

honoki commented Mar 29, 2022

Hi @contaminatedesert - thanks for flagging this. This issue has popped up from time to time, so I decided to take a proper look at fixing it. I was able to reproduce it locally: the bbrf server times out with a 504, but the documents are still added. Could you verify whether this is also the case for you?

If not, could you please enable debugging mode by adding "debug":true to ~/.bbrf/config.json and add the output here?

@honoki
Owner

honoki commented Mar 29, 2022

Another possible issue is that the request size is too large, see this comment: #78 (comment)

I am debating what is the most graceful way to handle either of these errors.
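
For context, the message in the title is what Python's json module raises when it is asked to decode a body that is not JSON, such as an empty string or an HTML error page. The sketch below shows one possible way to check the status code before decoding, so that a 504 or 413 produces a readable hint instead. It is only an illustration: `BulkRequestError`, `parse_bulk_response`, and the message texts are hypothetical names, not the bbrf client's actual code.

```python
import json


class BulkRequestError(Exception):
    """Hypothetical error type that carries a readable explanation."""


def parse_bulk_response(status_code, body):
    """Return the decoded JSON body, or raise BulkRequestError with a hint.

    status_code and body stand in for whatever the HTTP client returned;
    nothing here talks to a real server.
    """
    if status_code == 504:
        raise BulkRequestError(
            "server timed out (504); the documents may still have been added"
        )
    if status_code == 413:
        raise BulkRequestError(
            "request body too large (413); try sending smaller batches"
        )
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # This is where "Expecting value: line 1 column 1 (char 0)" comes from
        # when the body is empty or is not JSON at all.
        raise BulkRequestError(f"response was not JSON: {exc}") from exc
```

Whether to raise, warn, or retry with a smaller batch is a design choice this sketch leaves open.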

@contaminatedesert
Author

Hello @honoki,

Thank you for your response. I had not seen the original thread on this issue, so thank you for that. One interesting, though probably unrelated, detail is that I too am importing .mil domains.

I have been using my own workaround of breaking my file up into pieces with head and tail, and that works. One thing I noticed is that it does not always fail at the same size: sometimes I can import chunks of more than 10,000 domains, and other times only chunks of fewer than 10,000 go through.
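
The head and tail workaround can also be scripted. The following is a minimal sketch, assuming the bbrf client is on the PATH and accepts input on stdin as in the command above. The helper names (`chunked`, `run_bbrf`, `add_in_chunks`) and the default batch size of 5,000 are my own assumptions, not part of bbrf.

```python
import subprocess


def chunked(lines, size):
    """Yield successive lists of at most `size` non-empty, stripped lines."""
    batch = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        batch.append(line)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_bbrf(batch):
    """Pipe one batch to `bbrf domain add -`, the command used in this issue."""
    subprocess.run(
        ["bbrf", "domain", "add", "-"],
        input="\n".join(batch) + "\n",
        text=True,
        check=True,
    )


def add_in_chunks(lines, size=5000, run=run_bbrf):
    """Send `lines` in batches of `size`; return the number of batches sent."""
    count = 0
    for batch in chunked(lines, size):
        run(batch)
        count += 1
    return count
```

For example, calling `add_in_chunks(open("file.txt"))` would send file.txt in batches of 5,000 lines. Passing a different `run` callable makes it possible to try the batching logic without touching the database.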

I'm not quite sure what you mean by request size being too large. What request? The request from BBRF to validate the domain? Some other request?

To answer your question: even though I received the error, the domains did seem to be added to the database, or at least most of them. There did seem to be some dropouts, but that may have been due to the domain validation.

I am going to add more domains right now and turn debugging on and add everything here (I may obfuscate some data to maintain my privacy).

I will post what I find.

@contaminatedesert
Author

@honoki

Alright, so here's what I've done. I added the debug item and restarted bbrf-server.

I then counted the number of navy.mil domains already existing in my database and the result was 4,112.

I then counted the number of entries in the file I was about to import, which was 50,298. I then imported the file in the usual way, with cat file.txt | bbrf domain add - > debug.txt. As you can see, I redirected the debug output to a file, which is attached.

I too encountered the 504 error. Interestingly, the debug data I saw was not the same as in the debug.txt file, so I will add it below.

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): 127.0.0.1:443
DEBUG:urllib3.connectionpool:https://127.0.0.1:443 "GET /bbrf/dod HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:https://127.0.0.1:443 "POST /bbrf/_bulk_docs HTTP/1.1" 504 167

The debug.txt file contains the steps taken by BBRF as well as the error we've been encountering.
debug.txt

I then redid the count of navy.mil domains and got the result 52,712. As you can see, most of my domains were properly added, though I am down roughly 1,700 for some reason.
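
To spell out the arithmetic: 4,112 existing plus 50,298 imported would give 54,410 if every line were new and unique, and the observed 52,712 is 1,698 short of that. Duplicate lines in the file, or entries that were already in the database, would produce this kind of gap without anything being lost, though I have not confirmed that this is what happened here. Below is a minimal sketch for checking it, assuming both lists are available as plain lines of text. `explain_shortfall` is a hypothetical helper, and the case-insensitive comparison is my assumption.

```python
def explain_shortfall(existing, imported):
    """Break an import down into duplicate, already-known, and new entries.

    existing: iterable of domains already in the database.
    imported: iterable of lines from the file being imported.
    Names are compared case-insensitively, which is an assumption.
    """
    existing_set = {d.strip().lower() for d in existing if d.strip()}
    cleaned = [d.strip().lower() for d in imported if d.strip()]
    unique = set(cleaned)
    return {
        "lines_in_file": len(cleaned),
        "duplicates_in_file": len(cleaned) - len(unique),
        "already_in_database": len(unique & existing_set),
        "expected_new": len(unique - existing_set),
    }
```

If `expected_new` plus the earlier count matches the new total, the gap is explained by duplicates and overlap. If it does not, the remainder would point to rejected or dropped entries.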

I did not encounter the 413 error, although I think this is likely because I am using the docker image.

Another bit of information that may be helpful is that when I encounter this issue, most of the time (but not always) I experience a significant decrease in system performance. This is corrected when I stop the docker images and restart bbrf-server.

Please let me know if there's anything else I can do to assist.

@pdelteil
Contributor

In my experience, when adding a large amount of data you need to check that the data is what it's supposed to be. Use grep to find garbage in your domain list (symbols, spaces, etc.). Then it is better to add the domains in chunks, dividing the input into several parts. My default chunk size is 1,000, but if you have more than 2 vCPUs and more than 4 GB of RAM, you can go higher.
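
The garbage check can be sketched in Python as well as with grep. This is a rough filter, assuming a simplified hostname pattern (dot-separated labels of letters, digits, and hyphens). The pattern and the `split_valid` helper are my own assumptions and do not reproduce whatever validation bbrf itself applies. Names containing underscores or wildcards are rejected by this pattern.

```python
import re

# Simplified hostname label: letters, digits and hyphens, 1 to 63 characters,
# not starting or ending with a hyphen. An assumption, not bbrf's own rule.
LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
HOSTNAME = re.compile(rf"(?:{LABEL}\.)+{LABEL}")


def split_valid(lines):
    """Return (valid, rejected) lists; blank lines are skipped entirely."""
    valid, rejected = [], []
    for raw in lines:
        name = raw.strip().lower()
        if not name:
            continue
        if len(name) <= 253 and HOSTNAME.fullmatch(name):
            valid.append(name)
        else:
            # Keep the original text so it can be inspected by hand.
            rejected.append(raw.rstrip("\n"))
    return valid, rejected
```

Reviewing the rejected list before importing shows what would otherwise be sent to the server, and the valid list can then be added in chunks of 1,000 or more as described above.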

3 participants