
JSON parsing error only shows up with a particular slice size #290

Open
ciorg opened this issue Jun 12, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@ciorg
Member

ciorg commented Jun 12, 2020

I'm getting the following error, but only when I change the slice size to 25,000:
TSError: Failure to parse buffer, SyntaxError: Unexpected token b in JSON at position 3 at pRetry

When I run the same job with size 100,000 I don't get the error.

I ran both test jobs 2x and saw the same thing both times.

Job settings for the job that gets the slice error:

{
            "_op": "kafka_reader",
            "connection": "CONNECTION,
            "topic": "TOPIC",
            "group": "GROUP",
            "size": 25000,
            "wait": 30000
        },
        {
            "_op": "noop"
        }

Grafana:
[screenshot: Screen Shot 2020-06-12 at 2 33 56 PM]

Job settings for the job that gets no errors:

{
            "_op": "kafka_reader",
            "connection": "CONNECTION",
            "topic": "TOPIC",
            "group": "GROUP",
            "size": 100000,
            "wait": 30000
        },
        {
            "_op": "noop"
        }

Grafana:
[screenshot: Screen Shot 2020-06-12 at 2 33 14 PM]

If I run it with a size of 1,000 I don't get any slice errors either. Not sure if the error is being swallowed or if something is breaking the JSON at certain slice sizes.

It was just random luck that I found this, but I thought I should document it.

@ciorg ciorg added the bug Something isn't working label Jun 12, 2020
@peterdemartini
Contributor

It would be useful to see the record that is failing to parse (it might help us figure out why it is happening). Maybe set _dead_letter_action to log, or use the Kafka dead letter queue.
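For reference, a minimal sketch of what that might look like on the failing job's reader op. This assumes _dead_letter_action is accepted as an op-level setting and that "log" is a valid value, as the suggestion above implies; it has not been verified against this job:

{
            "_op": "kafka_reader",
            "connection": "CONNECTION",
            "topic": "TOPIC",
            "group": "GROUP",
            "size": 25000,
            "wait": 30000,
            "_dead_letter_action": "log"
        }

That way the offending record would get logged (or routed elsewhere) instead of only failing the slice.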

@ciorg
Member Author

ciorg commented Jun 16, 2020

I added the dead_letter_queue to the job and ran it again and was able to send the bad records to another topic.

The buffer in the records in the dead_letter_queue converts to hex, and the hex converts to a parsable JSON doc.

The dead_letter_queue also adds the partition and offset, so I looked the record up in the original topic with kafkacat and saw the same thing: a long hex string.

Almost all the records show up as JSON with kafkacat, so it makes me think these bad records are actually coming in as hex strings.
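For anyone following along, here is a small Node.js sketch of the round trip being described; the hex payload is a made-up example rather than one of the actual records:

    // Hypothetical hex payload for illustration: '{"a":1}' encoded as hex.
    const hexPayload = '7b2261223a317d';

    // Parsing the raw hex directly would throw a SyntaxError much like the one
    // in this issue, since a hex string is not valid JSON.

    // Decoding the hex back to UTF-8 first yields a parsable document.
    const decoded = Buffer.from(hexPayload, 'hex').toString('utf8');
    const doc = JSON.parse(decoded);

    console.log(doc); // { a: 1 }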

But that doesn't explain why I only see these errors with certain slice sizes - seems like they would throw errors every time?

@peterdemartini
Contributor

I can't imagine why the slice size would affect this. This seems like just a data problem; maybe it is just chance that it happens when you increase the slice size?
