🐛 Source Salesforce: fix bug with pagination for BULK API #6209

yevhenii-ldv · 2021-09-17T12:17:04Z

What

closes #6122.

How

Describe the solution

Pre-merge Checklist

Community member or Airbyter

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

Create a non-forked branch based on this PR and test the below items on it
Build is successful
Credentials added to Github CI. Instructions.
/test connector=connectors/<name> command is passing.
New Connector version released on Dockerhub by running the /publish command described here

yevhenii-ldv · 2021-09-17T12:19:00Z

/test connector=connectors/source-salesforce

🕑 connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1245431701
✅ connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1245431701
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        74      8    89%
	 source_acceptance_test/conftest.py                     108    108     0%
	 source_acceptance_test/plugin.py                        45     45     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              158    109    31%
	 source_acceptance_test/tests/test_full_refresh.py       18     11    39%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  41     25    39%
	 source_acceptance_test/utils/compare.py                 47     20    57%
	 source_acceptance_test/utils/connector_runner.py        82     49    40%
	 source_acceptance_test/utils/json_schema_helper.py      75     11    85%
	 ------------------------------------------------------------------------
	 TOTAL                                                  776    430    45%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                 Stmts   Miss  Cover
	 --------------------------------------------------------
	 unit_tests/unit_test.py::test_example_method PASSED
	 Coverage.py warning: No data was collected. (no-data-collected)
	 source_salesforce/__init__.py            2      2     0%
	 source_salesforce/api.py               114    114     0%
	 source_salesforce/exceptions.py          1      1     0%
	 source_salesforce/rate_limiting.py      22     22     0%
	 source_salesforce/source.py             51     51     0%
	 source_salesforce/streams.py           201    201     0%
	 --------------------------------------------------------
	 TOTAL                                  391    391     0%

sherifnada · 2021-09-17T15:19:29Z

airbyte-integrations/connectors/source-salesforce/source_salesforce/streams.py

@@ -186,7 +186,7 @@ def delete_job(self, url: str):

    def next_page_token(self, last_record: dict) -> str:
        if self.primary_key and self.name not in UNSUPPORTED_FILTERING_STREAMS:
-            return f"WHERE {self.primary_key} > '{last_record[self.primary_key]}' "
+            return f"WHERE {self.primary_key} >= '{last_record[self.primary_key]}' "


Are PKs numeric and incremental? Why are we not using the cursor field here?

This is used for Full Refresh streams. It turned out that OFFSET cannot exceed 2000: the Salesforce API rejects such queries. So I had to use the same approach as for Incremental streams, but filtering on the ID instead of the cursor_field.
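
For readers unfamiliar with the approach described in this reply, here is a minimal sketch of primary-key (keyset) pagination, assuming an in-memory list in place of the Salesforce API. `RECORDS`, `fetch_page`, `read_all`, and the tiny `PAGE_SIZE` are hypothetical names made up for illustration; only the shape of `next_page_token` loosely follows the diff. The sketch uses a strict `>` comparison for simplicity, whereas the diff in this PR switches to `>=`, which would re-read the boundary record; how the connector handles that is not shown in this thread.

```python
from typing import Dict, Iterator, List, Optional

PAGE_SIZE = 3  # hypothetical; deliberately tiny so the demo spans several pages

# Hypothetical in-memory table standing in for a Salesforce object.
RECORDS: List[Dict[str, str]] = [{"Id": f"001{n:03d}"} for n in range(1, 8)]


def fetch_page(last_id: Optional[str], page_size: int) -> List[Dict[str, str]]:
    """Simulate `SELECT ... WHERE Id > last_id ORDER BY Id LIMIT page_size`."""
    rows = sorted(RECORDS, key=lambda r: r["Id"])
    if last_id is not None:
        # Filter on the primary key instead of skipping rows with OFFSET.
        rows = [r for r in rows if r["Id"] > last_id]
    return rows[:page_size]


def next_page_token(
    page: List[Dict[str, str]], page_size: int, primary_key: str = "Id"
) -> Optional[str]:
    """Return the last primary key of a full page, or None when paging is done."""
    if len(page) == page_size:
        return page[-1][primary_key]
    return None


def read_all(page_size: int = PAGE_SIZE) -> Iterator[Dict[str, str]]:
    """Yield every record by repeatedly asking for rows after the last seen key."""
    last_id: Optional[str] = None
    while True:
        page = fetch_page(last_id, page_size)
        yield from page
        last_id = next_page_token(page, page_size)
        if last_id is None:
            break
```

Because each request filters on the last seen key, the query never needs an offset, so the 2000-row OFFSET limit mentioned above does not come into play. The approach relies on the results being ordered by the same key used in the filter.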

sherifnada · 2021-09-17T15:19:55Z

airbyte-integrations/connectors/source-salesforce/source_salesforce/streams.py

@@ -272,11 +272,12 @@ def read_records(

            if job_status in ["JobComplete", "Aborted", "Failed"]:
                self.delete_job(url=job_full_url)
-                pagination_complete = True
+                if job_status in ["Aborted", "Failed"]:


Should we raise an exception if the job failed?
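
One way the suggestion in this comment could look, as a rough sketch rather than the code that was merged: clean up the job in every terminal state, then raise if the status indicates failure. The status strings are taken from the diff above; `BulkJobError`, `handle_job_status`, and the `delete_job` callback are hypothetical names introduced for illustration.

```python
from typing import Callable


class BulkJobError(Exception):
    """Hypothetical exception for a bulk job that ended unsuccessfully."""


TERMINAL_STATUSES = {"JobComplete", "Aborted", "Failed"}
FAILED_STATUSES = {"Aborted", "Failed"}


def handle_job_status(
    job_status: str, job_url: str, delete_job: Callable[[str], None]
) -> bool:
    """Return True if the job completed, False if it is still running.

    Raises BulkJobError when the job was aborted or failed.
    """
    if job_status not in TERMINAL_STATUSES:
        return False  # still running; the caller should keep polling

    # Delete the job in every terminal state, as the diff does.
    delete_job(job_url)

    if job_status in FAILED_STATUSES:
        raise BulkJobError(f"Bulk job {job_url} ended with status {job_status}")
    return True
```

Raising here makes a failed job visible to the caller instead of letting it look like an empty result set; whether that is the right behavior for this connector is a design decision for the authors.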

sherifnada · 2021-09-17T15:20:23Z

airbyte-integrations/connectors/source-salesforce/source_salesforce/streams.py

-        if len(response_data["records"]) == self.limit and self.primary_key and self.name not in UNSUPPORTED_FILTERING_STREAMS:
-            return f"WHERE {self.primary_key} > '{response_data['records'][-1][self.primary_key]}' "
+        if len(response_data["records"]) == self.page_size and self.primary_key and self.name not in UNSUPPORTED_FILTERING_STREAMS:
+            return f"WHERE {self.primary_key} >= '{response_data['records'][-1][self.primary_key]}' "


Why are we comparing the PK and not the cursor field?

Answered in the comment above

yevhenii-ldv · 2021-09-20T13:51:08Z

/test connector=connectors/source-salesforce

🕑 connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1253790524
✅ connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1253790524
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        74      8    89%
	 source_acceptance_test/conftest.py                     108    108     0%
	 source_acceptance_test/plugin.py                        45     45     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              158    109    31%
	 source_acceptance_test/tests/test_full_refresh.py       18     11    39%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  41     25    39%
	 source_acceptance_test/utils/compare.py                 47     20    57%
	 source_acceptance_test/utils/connector_runner.py        82     49    40%
	 source_acceptance_test/utils/json_schema_helper.py      75     11    85%
	 ------------------------------------------------------------------------
	 TOTAL                                                  776    430    45%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                 Stmts   Miss  Cover
	 --------------------------------------------------------
	 source_salesforce/__init__.py            2      2     0%
	 source_salesforce/api.py               114    114     0%
	 source_salesforce/exceptions.py          1      1     0%
	 source_salesforce/rate_limiting.py      22     22     0%
	 source_salesforce/source.py             51     51     0%
	 source_salesforce/streams.py           201    201     0%
	 --------------------------------------------------------
	 TOTAL                                  391    391     0%

yevhenii-ldv · 2021-09-21T11:33:51Z

/publish connector=connectors/source-salesforce

🕑 connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1257403541
✅ connectors/source-salesforce https://github.com/airbytehq/airbyte/actions/runs/1257403541

Source Salesforce: fix bug with pagination for BULK API

4ea5de7

github-actions bot added the area/connectors and area/documentation labels Sep 17, 2021

jrhizor temporarily deployed to more-secrets September 17, 2021 12:21 Inactive

yevhenii-ldv requested review from Zirochkaa, avida and htrueman September 17, 2021 12:32

htrueman approved these changes Sep 17, 2021

Zirochkaa approved these changes Sep 17, 2021

yevhenii-ldv requested review from davinchia, sherifnada and tuliren September 17, 2021 14:45

sherifnada approved these changes Sep 17, 2021

update after review

0f8d197

jrhizor temporarily deployed to more-secrets September 20, 2021 13:53 Inactive

yevhenii-ldv added 2 commits September 21, 2021 14:32

bump version, update changelog

7adb502

Merge branch 'master' of https://github.com/airbytehq/airbyte into yk…

54778d6

…urochkin/salesforce-bug-with-bulk-pagination

yevhenii-ldv temporarily deployed to more-secrets September 21, 2021 11:34 Inactive

jrhizor temporarily deployed to more-secrets September 21, 2021 11:35 Inactive

yevhenii-ldv merged commit c114ec8 into master Sep 21, 2021

yevhenii-ldv deleted the ykurochkin/salesforce-bug-with-bulk-pagination branch September 21, 2021 11:48

jrhizor mentioned this pull request Sep 25, 2021

Bump Airbyte version from 0.29.21-alpha to 0.29.22-alpha #6450

Merged
