migrate code from googleapis/python-documentai #8450

Closed
wants to merge 227 commits into from
227 commits
2547a0a
feat: add v1beta3 (#34)
yoshi-automation Sep 30, 2020
958d735
Chore: Add requirements.txt and noxfile.py for new samples (#45)
aribray Oct 19, 2020
d1e4a53
docs(samples): new Doc AI samples for v1beta3 (#44)
aribray Oct 21, 2020
71987a6
chores: fixed small issue with start index problem (#56)
munkhuushmgl Nov 11, 2020
59d811d
chore: update samples noxfile
yoshi-automation Nov 18, 2020
898aee1
fix: removes C-style semicolons and slash comments (#59)
telpirion Nov 20, 2020
5b5558e
chore(deps): update dependency google-cloud-storage to v1.33.0 (#61)
renovate-bot Nov 25, 2020
c669546
fix: added if statement to filter out dir blob files (#63)
munkhuushmgl Dec 2, 2020
86646b7
samples(fix): change comments to match function signature (#68)
telpirion Dec 4, 2020
108f47b
fix: moves import statment inside region tags (#71)
telpirion Dec 9, 2020
7d0b369
samples: added test that covers the wrong file type case (#69)
munkhuushmgl Dec 11, 2020
984627e
chore: update templates (#74)
yoshi-automation Dec 29, 2020
deeba8e
samples: migrate v1beta2 doc AI samples (#79)
munkhuushmgl Jan 12, 2021
4cf805b
chore(deps): update dependency google-cloud-storage to v1.35.0 (#78)
renovate-bot Jan 13, 2021
16016fa
chore: added increased timeout on flaky batch request (#84)
munkhuushmgl Jan 22, 2021
8ee5e99
chore: exclude `.nox` directories from linting (#87)
yoshi-automation Jan 28, 2021
d07e423
chore(deps): update dependency google-cloud-storage to v1.36.0 (#91)
renovate-bot Feb 12, 2021
a71eb69
chore(deps): update dependency google-cloud-storage to v1.36.1 (#92)
renovate-bot Feb 24, 2021
a8959c7
fix(samples): swaps 'continue' for 'return' (#93)
telpirion Mar 5, 2021
5b15d2d
fix: adds comment with explicit hostname change (#94)
telpirion Mar 11, 2021
a762210
chore(deps): update dependency google-cloud-storage to v1.36.2 (#95)
renovate-bot Mar 12, 2021
722b954
chore: update templates (#97)
yoshi-automation Mar 23, 2021
b6065a1
chore(deps): update dependency google-cloud-storage to v1.37.0 (#104)
renovate-bot Mar 26, 2021
6eef6ae
chore(deps): update dependency google-cloud-documentai to v0.4.0 (#103)
renovate-bot Mar 30, 2021
c152dc7
samples: updates Document AI samples to v1 version of service (#108)
telpirion Apr 12, 2021
a205555
samples: more updates for v1 (#121)
telpirion Apr 14, 2021
972ca4e
chore: template updates (#120)
yoshi-automation Apr 19, 2021
a8f0e1e
chore(deps): update dependency google-cloud-storage to v1.37.1 (#114)
renovate-bot Apr 19, 2021
9df81d5
chore: migrate to owl bot (#130)
parthea May 10, 2021
83feb0c
chore(deps): update dependency pytest to v6.2.4 (#124)
renovate-bot May 16, 2021
587e8bd
chore: new owl bot post processor docker image (#152)
gcf-owl-bot[bot] May 22, 2021
47a81e2
fix: Parsing pages, but should be paragraphs (#147)
dgallegos May 25, 2021
6dd239d
chore(deps): update dependency google-cloud-documentai to v0.5.0 (#155)
renovate-bot Jun 2, 2021
a43e901
chore(deps): update dependency google-cloud-storage to v1.38.0 (#133)
renovate-bot Jun 2, 2021
0890cd4
chore(deps): update dependency google-cloud-storage to v1.39.0 (#169)
renovate-bot Jun 22, 2021
b034e29
chore(deps): update dependency google-cloud-storage to v1.40.0 (#173)
renovate-bot Jul 1, 2021
4e326a0
chore(deps): update dependency google-cloud-storage to v1.41.0 (#177)
renovate-bot Jul 15, 2021
4998f05
feat: add Samples section to CONTRIBUTING.rst (#181)
gcf-owl-bot[bot] Jul 22, 2021
efe62f9
chore(deps): update dependency google-cloud-storage to v1.41.1 (#182)
renovate-bot Jul 26, 2021
503978f
chore(deps): update dependency google-cloud-documentai to v1 (#185)
renovate-bot Jul 27, 2021
9d1bba9
samples: moves region tag to include import statement (#186)
telpirion Jul 28, 2021
3b074d0
chore: fix INSTALL_LIBRARY_FROM_SOURCE in noxfile.py (#192)
gcf-owl-bot[bot] Aug 11, 2021
377ba35
chore(deps): update dependency google-cloud-storage to v1.42.0 (#194)
renovate-bot Aug 12, 2021
45acae0
chore: drop mention of Python 2.7 from templates (#197)
gcf-owl-bot[bot] Aug 13, 2021
8c83118
samples: moves import statement within region tags (#190)
telpirion Aug 13, 2021
551e358
chore(deps): update dependency pytest to v6.2.5 (#204)
renovate-bot Aug 31, 2021
a5aca52
chore(deps): update dependency google-cloud-storage to v1.42.1 (#209)
renovate-bot Sep 9, 2021
d6a98e4
chore: blacken samples noxfile template (#212)
gcf-owl-bot[bot] Sep 17, 2021
6d8146a
chore(deps): update dependency google-cloud-storage to v1.42.2 (#213)
renovate-bot Sep 20, 2021
8b44e06
chore: fail samples nox session if python version is missing (#218)
gcf-owl-bot[bot] Sep 30, 2021
7f0985a
chore(deps): update dependency google-cloud-storage to v1.42.3 (#219)
renovate-bot Sep 30, 2021
0df9c4a
chore(python): Add kokoro configs for python 3.10 samples testing (#225)
gcf-owl-bot[bot] Oct 8, 2021
07f6465
chore(deps): update dependency google-cloud-documentai to v1.1.0 (#227)
renovate-bot Oct 11, 2021
2ec503b
chore(deps): update dependency google-cloud-documentai to v1.2.0 (#232)
renovate-bot Oct 25, 2021
1d60a72
docs(samples): add OCR, form, quality, splitter and specialized proce…
Nov 10, 2021
0c6dd35
chore(python): run blacken session for all directories with a noxfile…
gcf-owl-bot[bot] Dec 12, 2021
7fb9192
chore(samples): Add check for tests in directory (#257)
gcf-owl-bot[bot] Jan 11, 2022
0025bee
chore(deps): update dependency google-cloud-storage to v2 (#247)
renovate-bot Jan 16, 2022
d816662
chore(python): Noxfile recognizes that tests can live in a folder (#262)
gcf-owl-bot[bot] Jan 19, 2022
cdf1d20
chore(deps): update dependency google-cloud-documentai to v1.2.1 (#263)
renovate-bot Jan 19, 2022
c0e07a6
test: strip quotes and newlines from output (#279)
busunkim96 Feb 26, 2022
eb4c2ff
chore(deps): update dependency google-cloud-storage to v2.1.0 (#264)
renovate-bot Feb 28, 2022
d5d6d9d
chore: Adding support for pytest-xdist and pytest-parallel (#286)
gcf-owl-bot[bot] Mar 4, 2022
3d2d595
chore(deps): update all dependencies (#281)
renovate-bot Mar 5, 2022
e695b3e
chore(deps): update dependency google-cloud-documentai to v1.3.0 (#290)
renovate-bot Mar 8, 2022
9a3e17d
chore(deps): update dependency pytest to v7.1.0 (#291)
renovate-bot Mar 13, 2022
86c3740
chore(deps): update dependency google-cloud-storage to v2.2.0 (#292)
renovate-bot Mar 14, 2022
22b275f
chore(deps): update dependency google-cloud-storage to v2.2.1 (#293)
renovate-bot Mar 16, 2022
cf055ea
chore(deps): update dependency pytest to v7.1.1 (#296)
renovate-bot Mar 19, 2022
938fe7c
chore(deps): update dependency google-cloud-documentai to v1.4.0 (#297)
renovate-bot Mar 23, 2022
3f6153b
chore(python): use black==22.3.0 (#301)
gcf-owl-bot[bot] Mar 28, 2022
89db439
chore(deps): update dependency google-cloud-storage to v2.3.0 (#310)
renovate-bot Apr 14, 2022
8d86326
chore(python): add nox session to sort python imports (#312)
gcf-owl-bot[bot] Apr 21, 2022
12423ce
chore(deps): update dependency pytest to v7.1.2 (#316)
renovate-bot Apr 25, 2022
8cb7c21
chore: removed v1beta2 samples (#315)
galz10 Apr 26, 2022
e06fe31
chore(deps): update dependency google-cloud-documentai to v1.4.1 (#319)
renovate-bot Apr 28, 2022
667edcd
fix: require python 3.7+ (#348)
gcf-owl-bot[bot] Jul 10, 2022
35b59e6
chore(deps): update all dependencies (#338)
renovate-bot Jul 14, 2022
bfe4ffc
refactor: Updates to Document AI Python Samples (#323)
holtskinner Jul 28, 2022
1cbbbe9
chore(deps): update all dependencies (#355)
renovate-bot Aug 5, 2022
8695324
chore(deps): update dependency google-cloud-documentai to v1.5.1 (#362)
renovate-bot Aug 17, 2022
3fc6791
docs(samples): Added Human Review Request Sample (#357)
holtskinner Aug 17, 2022
be8a692
chore(deps): update dependency google-cloud-documentai to v2 (#364)
renovate-bot Aug 19, 2022
ee80dee
chore(deps): update dependency pytest to v7.1.3 (#374)
renovate-bot Sep 6, 2022
bb71e29
docs(samples): Updated Samples for v2.0.0 Client Library (#365)
holtskinner Sep 13, 2022
f1ce969
chore(main): release 2.0.1 (#378)
release-please[bot] Sep 13, 2022
d548678
chore: detect samples tests in nested directories (#379)
gcf-owl-bot[bot] Sep 13, 2022
ce57573
chore(deps): update dependency google-cloud-documentai to v2.0.1 (#380)
renovate-bot Sep 14, 2022
8edec37
docs(samples): Added Processor Version Samples (#382)
holtskinner Sep 26, 2022
24a0627
chore(deps): update dependency google-cloud-documentai to v2.0.2 (#386)
renovate-bot Oct 4, 2022
2bdd856
chore(deps): update dependency google-cloud-documentai to v2.0.3 (#390)
renovate-bot Oct 18, 2022
347b2d4
chore(deps): update dependency pytest to v7.2.0 (#392)
renovate-bot Oct 26, 2022
44f7f92
docs(samples): Added extra exception handling to operation samples (#…
holtskinner Nov 2, 2022
da12685
chore:Remove Sample Inputs/Outputs from Repo (#391)
holtskinner Nov 2, 2022
de63d3c
Merge remote-tracking branch 'migration/main' into python-documentai-…
nicain Nov 2, 2022
16a38f5
Remove unused previous document ai samples folder. Naming is inconsis…
nicain Nov 2, 2022
a7914ac
Update documentai folder
nicain Nov 2, 2022
21b58d9
update blunderbuss.yml
nicain Nov 2, 2022
5d7df35
Merge branch 'main' into python-documentai-migration
nicain Nov 2, 2022
2da4812
Merge branch 'main' into python-documentai-migration
nicain Nov 4, 2022
3fe2e1d
Update documentai/AUTHORING_GUIDE.md
nicain Nov 4, 2022
9916963
Update CODEOWNERS
nicain Nov 4, 2022
7e8e694
Update .github/CODEOWNERS
dandhlee Nov 14, 2022
249c62e
Update .github/blunderbuss.yml
dandhlee Nov 14, 2022
f86c95f
Update .github/blunderbuss.yml
dandhlee Nov 14, 2022
da19172
Update documentai/CONTRIBUTING.md
dandhlee Nov 14, 2022
f9bf8cd
Merge branch 'main' into python-documentai-migration
holtskinner Dec 13, 2022
1557474
fix(samples): Fixed import issues in tests
holtskinner Dec 20, 2022
02921e7
Merge branch 'main' into python-documentai-migration
holtskinner Dec 20, 2022
fa30eb5
fix(samples): Changes snippets import to include documentai module
holtskinner Dec 20, 2022
4b3f800
Merge branch 'python-documentai-migration' of https://github.com/Goog…
holtskinner Dec 20, 2022
15adcfa
Merge branch 'main' into python-documentai-migration
dandhlee Jan 3, 2023
b21e18c
Chore: Add requirements.txt and noxfile.py for new samples (#45)
aribray Oct 19, 2020
5f94ec5
docs(samples): new Doc AI samples for v1beta3 (#44)
aribray Oct 21, 2020
d9428e6
chores: fixed small issue with start index problem (#56)
munkhuushmgl Nov 11, 2020
a702ed2
chore: update samples noxfile
yoshi-automation Nov 18, 2020
b4d03a8
fix: removes C-style semicolons and slash comments (#59)
telpirion Nov 20, 2020
7cd4615
chore(deps): update dependency google-cloud-storage to v1.33.0 (#61)
renovate-bot Nov 25, 2020
666a7ff
fix: added if statement to filter out dir blob files (#63)
munkhuushmgl Dec 2, 2020
c755ff9
samples(fix): change comments to match function signature (#68)
telpirion Dec 4, 2020
85ecf86
fix: moves import statment inside region tags (#71)
telpirion Dec 9, 2020
b1b0f92
samples: added test that covers the wrong file type case (#69)
munkhuushmgl Dec 11, 2020
efb2acc
chore: update templates (#74)
yoshi-automation Dec 29, 2020
7b2f8c9
samples: migrate v1beta2 doc AI samples (#79)
munkhuushmgl Jan 12, 2021
2774619
chore(deps): update dependency google-cloud-storage to v1.35.0 (#78)
renovate-bot Jan 13, 2021
87b9220
chore: added increased timeout on flaky batch request (#84)
munkhuushmgl Jan 22, 2021
70d51cf
chore: exclude `.nox` directories from linting (#87)
yoshi-automation Jan 28, 2021
ab86c32
chore(deps): update dependency google-cloud-storage to v1.36.0 (#91)
renovate-bot Feb 12, 2021
f533e42
chore(deps): update dependency google-cloud-storage to v1.36.1 (#92)
renovate-bot Feb 24, 2021
48b2add
fix(samples): swaps 'continue' for 'return' (#93)
telpirion Mar 5, 2021
5f5b76c
fix: adds comment with explicit hostname change (#94)
telpirion Mar 11, 2021
28051db
chore(deps): update dependency google-cloud-storage to v1.36.2 (#95)
renovate-bot Mar 12, 2021
37ea57c
chore: update templates (#97)
yoshi-automation Mar 23, 2021
ecd0e99
chore(deps): update dependency google-cloud-storage to v1.37.0 (#104)
renovate-bot Mar 26, 2021
d174bf2
chore(deps): update dependency google-cloud-documentai to v0.4.0 (#103)
renovate-bot Mar 30, 2021
513026f
samples: updates Document AI samples to v1 version of service (#108)
telpirion Apr 12, 2021
43f5dd9
samples: more updates for v1 (#121)
telpirion Apr 14, 2021
03536a1
chore: template updates (#120)
yoshi-automation Apr 19, 2021
97a834c
chore(deps): update dependency google-cloud-storage to v1.37.1 (#114)
renovate-bot Apr 19, 2021
43c22f5
chore: migrate to owl bot (#130)
parthea May 10, 2021
0ffdf66
chore(deps): update dependency pytest to v6.2.4 (#124)
renovate-bot May 16, 2021
6629f9c
chore: new owl bot post processor docker image (#152)
gcf-owl-bot[bot] May 22, 2021
321d6fc
fix: Parsing pages, but should be paragraphs (#147)
dgallegos May 25, 2021
728e01f
chore(deps): update dependency google-cloud-documentai to v0.5.0 (#155)
renovate-bot Jun 2, 2021
89214d5
chore(deps): update dependency google-cloud-storage to v1.38.0 (#133)
renovate-bot Jun 2, 2021
ad51a30
chore(deps): update dependency google-cloud-storage to v1.39.0 (#169)
renovate-bot Jun 22, 2021
fe11474
chore(deps): update dependency google-cloud-storage to v1.40.0 (#173)
renovate-bot Jul 1, 2021
167cb1d
chore(deps): update dependency google-cloud-storage to v1.41.0 (#177)
renovate-bot Jul 15, 2021
fe23cdd
feat: add Samples section to CONTRIBUTING.rst (#181)
gcf-owl-bot[bot] Jul 22, 2021
ebc5d5d
chore(deps): update dependency google-cloud-storage to v1.41.1 (#182)
renovate-bot Jul 26, 2021
3879484
chore(deps): update dependency google-cloud-documentai to v1 (#185)
renovate-bot Jul 27, 2021
4bafbc2
samples: moves region tag to include import statement (#186)
telpirion Jul 28, 2021
aa63362
chore: fix INSTALL_LIBRARY_FROM_SOURCE in noxfile.py (#192)
gcf-owl-bot[bot] Aug 11, 2021
efefb26
chore(deps): update dependency google-cloud-storage to v1.42.0 (#194)
renovate-bot Aug 12, 2021
6abc37f
chore: drop mention of Python 2.7 from templates (#197)
gcf-owl-bot[bot] Aug 13, 2021
f9098a5
samples: moves import statement within region tags (#190)
telpirion Aug 13, 2021
9035553
chore(deps): update dependency pytest to v6.2.5 (#204)
renovate-bot Aug 31, 2021
dc6fb2c
chore(deps): update dependency google-cloud-storage to v1.42.1 (#209)
renovate-bot Sep 9, 2021
a6171a9
chore: blacken samples noxfile template (#212)
gcf-owl-bot[bot] Sep 17, 2021
d7bbf09
chore(deps): update dependency google-cloud-storage to v1.42.2 (#213)
renovate-bot Sep 20, 2021
2ed85dd
chore: fail samples nox session if python version is missing (#218)
gcf-owl-bot[bot] Sep 30, 2021
9b6e2fa
chore(deps): update dependency google-cloud-storage to v1.42.3 (#219)
renovate-bot Sep 30, 2021
e8710d3
chore(python): Add kokoro configs for python 3.10 samples testing (#225)
gcf-owl-bot[bot] Oct 8, 2021
dcbccf3
chore(deps): update dependency google-cloud-documentai to v1.1.0 (#227)
renovate-bot Oct 11, 2021
c6feba3
chore(deps): update dependency google-cloud-documentai to v1.2.0 (#232)
renovate-bot Oct 25, 2021
5134855
docs(samples): add OCR, form, quality, splitter and specialized proce…
Nov 10, 2021
0852e35
chore(python): run blacken session for all directories with a noxfile…
gcf-owl-bot[bot] Dec 12, 2021
a07ba8e
chore(samples): Add check for tests in directory (#257)
gcf-owl-bot[bot] Jan 11, 2022
00deacb
chore(deps): update dependency google-cloud-storage to v2 (#247)
renovate-bot Jan 16, 2022
ba7a494
chore(python): Noxfile recognizes that tests can live in a folder (#262)
gcf-owl-bot[bot] Jan 19, 2022
b4e80b4
chore(deps): update dependency google-cloud-documentai to v1.2.1 (#263)
renovate-bot Jan 19, 2022
cdbaadc
test: strip quotes and newlines from output (#279)
busunkim96 Feb 26, 2022
2353f68
chore(deps): update dependency google-cloud-storage to v2.1.0 (#264)
renovate-bot Feb 28, 2022
6755a2e
chore: Adding support for pytest-xdist and pytest-parallel (#286)
gcf-owl-bot[bot] Mar 4, 2022
f00d657
chore(deps): update all dependencies (#281)
renovate-bot Mar 5, 2022
4e513c8
chore(deps): update dependency google-cloud-documentai to v1.3.0 (#290)
renovate-bot Mar 8, 2022
3fab8ee
chore(deps): update dependency pytest to v7.1.0 (#291)
renovate-bot Mar 13, 2022
5a293f1
chore(deps): update dependency google-cloud-storage to v2.2.0 (#292)
renovate-bot Mar 14, 2022
334bb42
chore(deps): update dependency google-cloud-storage to v2.2.1 (#293)
renovate-bot Mar 16, 2022
ed173ea
chore(deps): update dependency pytest to v7.1.1 (#296)
renovate-bot Mar 19, 2022
a8a45ee
chore(deps): update dependency google-cloud-documentai to v1.4.0 (#297)
renovate-bot Mar 23, 2022
ac38098
chore(python): use black==22.3.0 (#301)
gcf-owl-bot[bot] Mar 28, 2022
638a923
chore(deps): update dependency google-cloud-storage to v2.3.0 (#310)
renovate-bot Apr 14, 2022
38ecbbf
chore(python): add nox session to sort python imports (#312)
gcf-owl-bot[bot] Apr 21, 2022
a0a729d
chore(deps): update dependency pytest to v7.1.2 (#316)
renovate-bot Apr 25, 2022
f1339b7
chore: removed v1beta2 samples (#315)
galz10 Apr 26, 2022
82d5bb0
chore(deps): update dependency google-cloud-documentai to v1.4.1 (#319)
renovate-bot Apr 28, 2022
cc59c78
fix: require python 3.7+ (#348)
gcf-owl-bot[bot] Jul 10, 2022
851c4e0
chore(deps): update all dependencies (#338)
renovate-bot Jul 14, 2022
2f7da58
refactor: Updates to Document AI Python Samples (#323)
holtskinner Jul 28, 2022
6119ada
chore(deps): update all dependencies (#355)
renovate-bot Aug 5, 2022
c00fd29
chore(deps): update dependency google-cloud-documentai to v1.5.1 (#362)
renovate-bot Aug 17, 2022
89e7ce1
docs(samples): Added Human Review Request Sample (#357)
holtskinner Aug 17, 2022
dd4bc19
chore(deps): update dependency google-cloud-documentai to v2 (#364)
renovate-bot Aug 19, 2022
7e52ab1
chore(deps): update dependency pytest to v7.1.3 (#374)
renovate-bot Sep 6, 2022
7f4d82d
docs(samples): Updated Samples for v2.0.0 Client Library (#365)
holtskinner Sep 13, 2022
c35078e
chore(main): release 2.0.1 (#378)
release-please[bot] Sep 13, 2022
aca1634
chore: detect samples tests in nested directories (#379)
gcf-owl-bot[bot] Sep 13, 2022
9384f5a
chore(deps): update dependency google-cloud-documentai to v2.0.1 (#380)
renovate-bot Sep 14, 2022
f1f3c37
docs(samples): Added Processor Version Samples (#382)
holtskinner Sep 26, 2022
21fe303
chore(deps): update dependency google-cloud-documentai to v2.0.2 (#386)
renovate-bot Oct 4, 2022
74b3a44
chore(deps): update dependency google-cloud-documentai to v2.0.3 (#390)
renovate-bot Oct 18, 2022
6dc4b94
chore(deps): update dependency pytest to v7.2.0 (#392)
renovate-bot Oct 26, 2022
34c0e3f
docs(samples): Added extra exception handling to operation samples (#…
holtskinner Nov 2, 2022
fbdcfe1
chore:Remove Sample Inputs/Outputs from Repo (#391)
holtskinner Nov 2, 2022
4356d33
chore(deps): update dependency google-cloud-storage to v2.6.0 (#399)
renovate-bot Nov 8, 2022
3d21322
chore(deps): update dependency google-cloud-documentai to v2.1.0 (#407)
renovate-bot Nov 9, 2022
1e68334
docs(samples): Updated code samples for 2.1.0 release (#406)
holtskinner Nov 11, 2022
7600e28
chore(deps): update dependency google-cloud-documentai to v2.2.0 (#411)
renovate-bot Nov 14, 2022
702f709
chore(deps): update dependency google-cloud-documentai to v2.3.0 (#414)
renovate-bot Nov 15, 2022
7593fb2
chore(python): drop flake8-import-order in samples noxfile (#421)
gcf-owl-bot[bot] Nov 27, 2022
42deddf
fix(samples): Fix Typos in Batch process & get processor Samples (#420)
holtskinner Nov 27, 2022
5a2459c
chore(deps): update dependency google-cloud-documentai to v2.4.0 (#423)
renovate-bot Dec 2, 2022
822488b
chore(deps): update dependency google-cloud-storage to v2.7.0 (#426)
holtskinner Dec 7, 2022
58d487b
chore(deps): update dependency google-cloud-documentai to v2.4.1 (#428)
renovate-bot Dec 12, 2022
804dddc
chore(deps): update dependency google-cloud-documentai to v2.5.0 (#432)
renovate-bot Dec 14, 2022
3043731
chore(deps): update dependency google-cloud-documentai to v2.6.0 (#435)
renovate-bot Dec 15, 2022
a798a17
Moved Python Files to new_directory
holtskinner Jan 3, 2023
32f2b63
Pulled in updates from python-documentai repository
holtskinner Jan 3, 2023
1889479
Deleted temporary directory
holtskinner Jan 3, 2023
1328a19
Merge branch 'main' into python-documentai-migration
holtskinner Jan 3, 2023
e3d58e2
Addressed Test Failures
holtskinner Jan 4, 2023
c3f455a
Merge branch 'python-documentai-migration' of https://github.com/Goog…
holtskinner Jan 4, 2023
77dd5d9
Merge branch 'main' into python-documentai-migration
holtskinner Jan 4, 2023
781c196
Merge branch 'main' into python-documentai-migration
kweinmeister Jan 4, 2023
de30aa2
Updated Document AI/Storage Client Library Versions in requirements.txt
holtskinner Jan 4, 2023
f453358
Addressed flake8 import linter errors
holtskinner Jan 4, 2023
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -43,6 +43,7 @@
/dataproc/**/* @GoogleCloudPlatform/python-samples-reviewers
/datastore/**/* @GoogleCloudPlatform/cloud-native-db-dpes @GoogleCloudPlatform/python-samples-reviewers
/dns/**/* @GoogleCloudPlatform/python-samples-reviewers
/documentai/**/* @GoogleCloudPlatform/dee-data-ai @GoogleCloudPlatform/python-samples-reviewers
/endpoints/**/* @GoogleCloudPlatform/python-samples-reviewers
/eventarc/**/* @GoogleCloudPlatform/aap-dpes @GoogleCloudPlatform/python-samples-reviewers
/error_reporting/**/* @GoogleCloudPlatform/python-samples-reviewers
5 changes: 5 additions & 0 deletions .github/blunderbuss.yml
@@ -13,6 +13,10 @@
# limitations under the License.

assign_issues_by:
  - labels:
      - 'api: documentai'
    to:
      - GoogleCloudPlatform/dee-data-ai
  - labels:
      - 'api: appengine'
      - 'api: eventarc'
@@ -188,6 +192,7 @@ assign_prs_by:
    to:
      - GoogleCloudPlatform/infra-db-dpes
  - labels:
      - 'api: documentai'
      - 'api: retail'
    to:
      - GoogleCloudPlatform/dee-data-ai
3 changes: 0 additions & 3 deletions document/README.rst

This file was deleted.

1 change: 1 addition & 0 deletions documentai/AUTHORING_GUIDE.md
@@ -0,0 +1 @@
See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/AUTHORING_GUIDE.md
1 change: 1 addition & 0 deletions documentai/CONTRIBUTING.md
@@ -0,0 +1 @@
See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/CONTRIBUTING.md
Empty file added documentai/__init__.py
Empty file.
Empty file.
@@ -0,0 +1,163 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# [START documentai_batch_process_documents_processor_version]
import re
from typing import Optional

from google.api_core.client_options import ClientOptions
from google.api_core.exceptions import InternalServerError
from google.api_core.exceptions import RetryError
from google.cloud import documentai
from google.cloud import storage

# TODO(developer): Uncomment these variables before running the sample.
# project_id = 'YOUR_PROJECT_ID'
# location = 'YOUR_PROCESSOR_LOCATION' # Format is 'us' or 'eu'
# processor_id = 'YOUR_PROCESSOR_ID' # Example: aeb8cea219b7c272
# processor_version_id = "YOUR_PROCESSOR_VERSION_ID" # Example: pretrained-ocr-v1.0-2020-09-23
# gcs_input_uri = "YOUR_INPUT_URI" # Format: gs://bucket/directory/file.pdf
# input_mime_type = "application/pdf"
# gcs_output_bucket = "YOUR_OUTPUT_BUCKET_NAME" # Format: gs://bucket
# gcs_output_uri_prefix = "YOUR_OUTPUT_URI_PREFIX" # Format: directory/subdirectory/
# field_mask = "text,entities,pages.pageNumber" # Optional. The fields to return in the Document object.


def batch_process_documents_processor_version(
    project_id: str,
    location: str,
    processor_id: str,
    processor_version_id: str,
    gcs_input_uri: str,
    input_mime_type: str,
    gcs_output_bucket: str,
    gcs_output_uri_prefix: str,
    field_mask: Optional[str] = None,
    timeout: int = 400,
):
    # You must set the api_endpoint if you use a location other than 'us'.
    opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

    client = documentai.DocumentProcessorServiceClient(client_options=opts)

    gcs_document = documentai.GcsDocument(
        gcs_uri=gcs_input_uri, mime_type=input_mime_type
    )

    # Load GCS Input URI into a List of document files
    gcs_documents = documentai.GcsDocuments(documents=[gcs_document])
    input_config = documentai.BatchDocumentsInputConfig(gcs_documents=gcs_documents)

    # NOTE: Alternatively, specify a GCS URI Prefix to process an entire directory
    #
    # gcs_input_uri = "gs://bucket/directory/"
    # gcs_prefix = documentai.GcsPrefix(gcs_uri_prefix=gcs_input_uri)
    # input_config = documentai.BatchDocumentsInputConfig(gcs_prefix=gcs_prefix)
    #

    # Cloud Storage URI for the Output Directory
    # This must end with a trailing forward slash `/`
    destination_uri = f"{gcs_output_bucket}/{gcs_output_uri_prefix}"

    gcs_output_config = documentai.DocumentOutputConfig.GcsOutputConfig(
        gcs_uri=destination_uri, field_mask=field_mask
    )

    # Where to write results
    output_config = documentai.DocumentOutputConfig(gcs_output_config=gcs_output_config)

    # The full resource name of the processor version
    # e.g. projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}
    name = client.processor_version_path(
        project_id, location, processor_id, processor_version_id
    )

    request = documentai.BatchProcessRequest(
        name=name,
        input_documents=input_config,
        document_output_config=output_config,
    )

    # BatchProcess returns a Long Running Operation (LRO)
    operation = client.batch_process_documents(request)

    # Continually polls the operation until it is complete.
    # This could take some time for larger files
    # Operation name format: projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID
    try:
        print(f"Waiting for operation {operation.operation.name} to complete...")
        operation.result(timeout=timeout)
    # Catch exception when operation doesn't finish before timeout
    except (RetryError, InternalServerError) as e:
        print(e.message)

    # NOTE: Can also use callbacks for asynchronous processing
    #
    # def my_callback(future):
    #     result = future.result()
    #
    # operation.add_done_callback(my_callback)

    # Once the operation is complete,
    # get output document information from operation metadata
    metadata = documentai.BatchProcessMetadata(operation.metadata)

    if metadata.state != documentai.BatchProcessMetadata.State.SUCCEEDED:
        raise ValueError(f"Batch Process Failed: {metadata.state_message}")

    storage_client = storage.Client()

    print("Output files:")
    # One process per Input Document
    for process in metadata.individual_process_statuses:
        # output_gcs_destination format: gs://BUCKET/PREFIX/OPERATION_NUMBER/INPUT_FILE_NUMBER/
        # The Cloud Storage API requires the bucket name and URI prefix separately
        matches = re.match(r"gs://(.*?)/(.*)", process.output_gcs_destination)
        if not matches:
            print(
                "Could not parse output GCS destination:",
                process.output_gcs_destination,
            )
            continue

        output_bucket, output_prefix = matches.groups()

        # Get List of Document Objects from the Output Bucket
        output_blobs = storage_client.list_blobs(output_bucket, prefix=output_prefix)

        # Document AI may output multiple JSON files per source file
        for blob in output_blobs:
            # Document AI should only output JSON files to GCS
            if ".json" not in blob.name:
                print(
                    f"Skipping non-supported file: {blob.name} - Mimetype: {blob.content_type}"
                )
                continue

            # Download JSON File as bytes object and convert to Document Object
            print(f"Fetching {blob.name}")
            document = documentai.Document.from_json(
                blob.download_as_bytes(), ignore_unknown_fields=True
            )

            # For a full list of Document object attributes, please reference this page:
            # https://cloud.google.com/python/docs/reference/documentai/latest/google.cloud.documentai_v1.types.Document

            # Read the text recognition output from the processor
            print("The document contains the following text:")
            print(document.text)


# [END documentai_batch_process_documents_processor_version]
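The sample splits each `output_gcs_destination` into a bucket name and an object prefix with a regular expression before listing blobs. That step can be sketched on its own, without any Google Cloud dependency. The helper name `split_gcs_uri` is hypothetical (it is not part of this PR or of any Google library); the pattern is the one the sample uses.

```python
import re
from typing import Optional, Tuple


def split_gcs_uri(uri: str) -> Optional[Tuple[str, str]]:
    """Split 'gs://bucket/prefix' into (bucket, prefix).

    Hypothetical helper for illustration only. Returns None when the
    URI does not match, mirroring the sample's "could not parse" branch.
    """
    matches = re.match(r"gs://(.*?)/(.*)", uri)
    if not matches:
        return None
    bucket, prefix = matches.groups()
    return bucket, prefix


print(split_gcs_uri("gs://my-bucket/out/123/0/"))
```

A bucket-only URI with no slash after the bucket name, such as `gs://my-bucket`, does not match this pattern, which is why the sample checks for a failed match before unpacking the groups.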
@@ -0,0 +1,49 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import os
from uuid import uuid4

from documentai.snippets import (
    batch_process_documents_processor_version_sample,
)

location = "us"
project_id = os.environ["GOOGLE_CLOUD_PROJECT"]
processor_id = "90484cfdedb024f6"
processor_version_id = "pretrained-form-parser-v1.0-2020-09-23"
gcs_input_uri = "gs://cloud-samples-data/documentai/invoice.pdf"
input_mime_type = "application/pdf"
gcs_output_bucket = "gs://document-ai-python"
gcs_output_uri_prefix = f"{uuid4()}/"
field_mask = "text,pages.pageNumber"


def test_batch_process_documents_processor_version(capsys):
    batch_process_documents_processor_version_sample.batch_process_documents_processor_version(
        project_id=project_id,
        location=location,
        processor_id=processor_id,
        processor_version_id=processor_version_id,
        gcs_input_uri=gcs_input_uri,
        input_mime_type=input_mime_type,
        gcs_output_bucket=gcs_output_bucket,
        gcs_output_uri_prefix=gcs_output_uri_prefix,
        field_mask=field_mask,
    )
    out, _ = capsys.readouterr()

    assert "operation" in out
    assert "Fetching" in out
    assert "text:" in out
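The test gives every run its own output prefix via `uuid4()`, and the sample joins the bucket and prefix into a destination URI that must end with `/`. A minimal sketch of that joining step follows, assuming a hypothetical helper `build_destination_uri` and a placeholder bucket name; neither appears in the PR, and the sample itself simply formats the two strings together without normalizing slashes.

```python
from uuid import uuid4


def build_destination_uri(gcs_output_bucket: str, gcs_output_uri_prefix: str) -> str:
    """Join bucket and prefix and guarantee a trailing slash.

    Hypothetical helper for illustration; not part of the sample.
    """
    uri = f"{gcs_output_bucket.rstrip('/')}/{gcs_output_uri_prefix.lstrip('/')}"
    return uri if uri.endswith("/") else uri + "/"


# A fresh prefix per run keeps concurrent test runs from reading each other's output.
run_prefix = f"{uuid4()}/"
print(build_destination_uri("gs://example-bucket", run_prefix))
```

Normalizing the slashes in one place avoids a doubled or missing `/` when callers pass `gs://bucket/` or a prefix without a trailing slash.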