Marketo source fails during sync #7286
Comments
@lazebnyi I spent a little more time yesterday digging into this error. One thing I noted is that it appears to happen when a bulk export job doesn't have the expected structure. According to the Marketo bulk API documentation there are daily volume limits and limits on the number of queued jobs. If those limits are hit during the sync, the returned JSON will not be in the expected format (though I would expect that error to be caught elsewhere). This could be one cause of these errors, and would explain why it feels like it works sometimes. Operating under the hypothesis that this is what was happening, I waited 24 hours (for the daily limits to reset) and limited the scope of the sync to a single table (leads). The sync ran successfully for a while until I hit the error below (full logs attached). I'm not sure this error is related to the one this issue is about, but I'm posting it here in case you have any insight into what could be causing it.
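For context, here is a minimal sketch of what the connector has to guard against when creating a bulk export job. The endpoint and field names follow the public Marketo bulk extract docs, not the connector's internal code, and the helper name is illustrative: when a daily-quota or queued-job limit is hit, the API still answers HTTP 200, but the JSON carries an "errors" list instead of the usual "result" entry.

```python
import requests

def create_leads_export_job(base_url: str, access_token: str, fields: list) -> dict:
    """Create a bulk leads export job and fail loudly on an unexpected payload.

    Illustrative sketch only; shapes are assumed from the Marketo bulk
    extract documentation, not taken from the connector's implementation.
    """
    resp = requests.post(
        f"{base_url}/bulk/v1/leads/export/create.json",
        params={"access_token": access_token},
        json={"fields": fields, "format": "CSV"},
    )
    body = resp.json()

    # On quota/queue-limit errors Marketo returns success=false plus an
    # "errors" list, and there is no "result" key to index into -- exactly
    # the "unexpected structure" case described above.
    if not body.get("success") or not body.get("result"):
        raise RuntimeError(f"Export job not created: {body.get('errors', body)}")

    return body["result"][0]  # e.g. {"exportId": "...", "status": "Created"}
```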
@chriestensonb Hi, thanks for your work. Yes, you are right: the reason for the issue is that the export response doesn't have the expected structure. Regarding the first log, what value did you set for window_in_days for the first sync (log-2-2.log)? About limits: yes, you are right that there is a limit on queued jobs, but the Marketo connector runs bulk extracts one at a time, so at any moment there is only one queued job.
@lazebnyi It is good to know that the connector will not exceed the queued-jobs limit. However, I am curious about the 500 MB per day limit. It seems that if the connector hits or exceeds that limit during a sync, it could cause an error, and any subsequent runs would also get this error until the limit resets at the end of the 24-hour period.
I think we can add handling for the 500 MB daily limit error. If we hit the daily rate limit during a sync, we save the data collected by the other bulk extracts before the error. Subsequent syncs for streams that use bulk extract will be empty until the limit is restored. But if we hit a single bulk extract larger than 500 MB, we need to ask the customer to use a smaller day window. @sherifnada Do you have any suggestions about that situation or about my proposal? 🙂
from a UX POV I'd suggest the following:
@chriestensonb @lazebnyi wdyt?
@sherifnada I like this and agree that it is a good path forward, i.e. the user should neither need to know nor care that the API has limits.
@lazebnyi Re: the data type error that I posted in the comments: one thing that I think might be happening is that there is a boolean field that is mostly null, and the parser attempts to infer its data type by looking at samples of data from that field. Because it is mostly null, it sees only null values and determines it is a float. It then errors when it hits the few actual non-null values in that field, which are strings or booleans. I don't know how to verify whether this is the case, though. Any suggestions?
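As a quick illustration of the sampling behaviour described above (using pandas purely to show the inference pattern; this is not the connector's actual parser): a column that is empty in all sampled rows is typed as float, while the full file turns out to contain a string/boolean value later on.

```python
import io
import pandas as pd

# 999 rows with an empty "flag" value, then one row with a real value.
csv_data = "id,flag\n" + "".join(f"{i},\n" for i in range(1, 1000)) + "1000,true\n"

# Inference from a sample of the first rows only sees NaN and picks float64.
sample = pd.read_csv(io.StringIO(csv_data), nrows=100)
print(sample["flag"].dtype)            # float64

# Reading the whole file keeps the late non-null value, as an object column.
full = pd.read_csv(io.StringIO(csv_data))
print(full["flag"].dtype)              # object
print(full["flag"].dropna().iloc[0])   # 'true'
```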
@chriestensonb So, if I understand right. For fields
@sherifnada We can only count the size of the responses we have already received, so we cannot predict the size of the next response. So we can only sync until we get the "1029, Export daily quota exceeded" error, report it in the log, and on the next sync start from the state saved before the error. About allowing the user to configure the max limit: I don't think that is necessary, because every user would use the maximum value (500 MB) already provided by the API. I see no reason for a user to reduce their limit.
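A minimal sketch of that behaviour (the helper names, `export_jobs`, and `state` are hypothetical stand-ins, not the connector's real classes): stop the stream when error 1029 appears, keep whatever state was already checkpointed, and let the next sync resume once the quota resets.

```python
DAILY_QUOTA_ERROR_CODE = "1029"  # "Export daily quota exceeded"

class DailyQuotaExceeded(Exception):
    pass

def raise_on_quota_error(body: dict) -> None:
    """Raise if a Marketo bulk-export response reports the daily quota error."""
    for error in body.get("errors", []):
        if str(error.get("code")) == DAILY_QUOTA_ERROR_CODE:
            raise DailyQuotaExceeded(error.get("message", "Export daily quota exceeded"))

def read_bulk_stream(export_jobs, state: dict):
    """Yield records job by job; on the quota error, stop instead of failing."""
    for job in export_jobs:
        try:
            body = job.create()          # assumed to return the parsed JSON body
            raise_on_quota_error(body)
        except DailyQuotaExceeded as exc:
            # Log and stop: already-emitted records keep their checkpointed
            # state, so the next sync resumes here after the 24h quota resets.
            print(f"Stopping stream, will resume on the next sync: {exc}")
            break
        yield from job.read_records(body)
        state = job.advance_state(state)
```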
Yes. I have verified that only nulls or number data types are in that data.
@chriestensonb Ok, for which data fields do you have a 'false' value in the leads export from 2020-11-13T00:00:00Z to 2020-12-13T00:00:00Z? Or, if possible, could you send me the leads data exported for that range? And if possible, could we switch communication to Slack? It can be faster there :)
@lazebnyi I am double-checking the data now. It is unlikely that I will be able to export the leads data for you, but I can run a number of diagnostic tests on it and report back. Good idea, moving communication to Slack.
@lazebnyi any updates on this issue?
@sherifnada As we discussed, I added logs and published them to Docker Hub as a release-candidate version (0.1.1-rc). @chriestensonb Could you check your Slack DMs? I sent you info three days ago about the next steps with the connector release-candidate version.
ah of course. Thanks for the reminder! It's probably helpful to include such updates on tickets just to make sure they are not lost in Slack DMs ;)
@lazebnyi Apologies for the delay. I have upgraded to and used 0.1.1-rc for the Marketo connector. The additional logs are very helpful. This issue is about two things, so I have addressed both in separate sets of logs.
@chriestensonb Thanks for your work! Your logs can help a lot!
The new version is 0.1.1-rc.1
I have updated and run again. It does look like the problem is with the parsing of the CSV. The logs print out a lot of the data, which contains some PII. Here are the last 100 lines, where I have redacted some of the values.
Thanks for this log! I think this PR should help with the CSV parsing issue.
Fixed in #8483
Environment
Current Behavior
I'm attempting to sync for the first time and the process errors out after syncing the first table. It also failed for versions 0.30.15 and 0.29.22.
Expected Behavior
Sync should complete without error.
Logs
Relevant lines here (full logs attached): logs-2-2.txt
Steps to Reproduce
Are you willing to submit a PR?
Not at the moment.