Update TinkerPop 3.6.1 [tp-tests] #3208

Merged: 1 commit, Oct 24, 2022

Conversation

farodin91
Contributor

@farodin91 farodin91 commented Sep 15, 2022

Fixes #3069


Thank you for contributing to JanusGraph!

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there an issue associated with this PR? Is it referenced in the commit message?
  • Does your PR body contain #xyz where xyz is the issue number you are trying to resolve?
  • Has your PR been rebased against the latest commit within the target branch (typically master)?
  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you written and/or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE.txt file, including the main LICENSE.txt file in the root of this repository?
  • If applicable, have you updated the NOTICE.txt file, including the main NOTICE.txt file found in the root of this repository?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

pom.xml Outdated
  <slf4j.version>1.7.36</slf4j.version>
  <logback.version>1.2.11</logback.version>
  <httpcomponents.httpclient.version>4.5.13</httpcomponents.httpclient.version>
  <httpcomponents.httpcore.version>4.4.15</httpcomponents.httpcore.version>
- <hadoop2.version>2.8.5</hadoop2.version>
+ <hadoop2.version>3.3.4</hadoop2.version>
Member

(nitpick) I think it makes sense to call this version hadoop3 now instead of hadoop2. I.e.:

<hadoop3.version>3.3.4</hadoop3.version>

Member

That's a nitpick, so it's not necessary to resolve it now; we can refactor it afterwards.

Contributor Author

I will clean this up before marking the PR ready.

@farodin91 farodin91 force-pushed the tinkerpop branch 2 times, most recently from abdd217 to 74a2cd3 Compare September 15, 2022 20:35
@farodin91 farodin91 added this to the Release v1.0.0 milestone Sep 15, 2022
@farodin91 farodin91 force-pushed the tinkerpop branch 2 times, most recently from 9387232 to a83351d Compare September 16, 2022 08:39
@farodin91
Contributor Author

farodin91 commented Sep 19, 2022

I will split this update into multiple PRs.

@farodin91
Contributor Author

@porunov The HBase upgrade is a pain.

@porunov
Member

porunov commented Oct 1, 2022

@porunov The HBase upgrade is a pain.

I imagine it is ...

@farodin91
Contributor Author

@porunov @FlorianHockmann @li-boxuan @rngcntr Does anyone have a bit of time to look into the HBase Hadoop tests?

@li-boxuan li-boxuan linked an issue Oct 4, 2022 that may be closed by this pull request
@farodin91
Contributor Author

It looks like this is one of the bugs stopping me from getting this to run properly: https://github.com/apache/hbase/pull/4819/files

Other Apache projects have deactivated their tests for the combination of HBase 2 with Hadoop 3, e.g. https://github.com/apache/ranger

@mad
Contributor

mad commented Oct 12, 2022

@farodin91

The TP tests are skipped.

It would be nice to run them by adding [tp-tests] to the commit message.

@farodin91 farodin91 changed the title Update TinkerPop 3.6.1 Update TinkerPop 3.6.1 [tp-tests] Oct 12, 2022
@farodin91
Contributor Author

@mad I would like to find a way to fix the HBase test beforehand.

@mad
Contributor

mad commented Oct 13, 2022

@farodin91

Some info about the HBase issue:

A scan of the janusgraph table in HBase returns data, but the metadata says no data exists.

Scan response

hbase:003:0> scan 'janusgraph', {LIMIT => 10}
ROW                                                 COLUMN+CELL                                                                                                                                            
 \x00\x00\x00\x00\x00\x00\x00\x03                   column=i:\xFF\xFF\xFF\xFF\xFF\xFE\xC7\x7F\x00\x00\x01\x83\xD0\xEB\xE0\x807f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.913, value=             
 \x00\x00\x00\x00\x00\x00\x00\x04                   column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x9B\x00\x00\x01\x83\xD0\xEB\xDF(7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.569, value=                
 \x00\x00\x00\x00\x00\x00\x00\x04                   column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xCD\x00\x00\x01\x83\xD0\xEB\xDD\xCA7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.219, value=             
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x02, timestamp=2022-10-13T10:37:44.127, value=\x00\x01\x08\x80                                                                               
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x10\xC0, timestamp=2022-10-13T10:37:44.127, value=\xA0vl\x1EvertexKe\xF9\x04\x80                                                             
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x10\xC2\x80\x14\x00, timestamp=2022-10-13T10:37:44.127, value=\x8F\x00\x01\x8E\x00\x8F\x80                                                   
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x10\xC2\x80\x18\x00, timestamp=2022-10-13T10:37:44.127, value=\x8F\x00\x01\x8E\x00\x90\x80                                                   
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x10\xC4, timestamp=2022-10-13T10:37:44.127, value=\x00\x82\x0C\x80                                                                           
 \x00\x00\x00\x00\x00\x00\x02\x0D                   column=e:\x10\xC8, timestamp=2022-10-13T10:37:44.127, value=\x00\x80\x00\x01\x83\xD0\xEB\xE1\xFF\x10\x80                                               
 \x18\xD4{\x96\x10\xA5\xA0vl\x1EvertexKe\xF9        column=g:\x00, timestamp=2022-10-13T10:37:44.127, value=\x04\x8D                                                                                       
 configuration                                      column=s:graph.janusgraph-version, timestamp=2022-10-13T10:37:42.043, value=\x92\xA01.0.0-SNAPSHO\xD4                                                  
 configuration                                      column=s:graph.storage-version, timestamp=2022-10-13T10:37:42.047, value=\x92\xA0\xB2                                                                  
 configuration                                      column=s:graph.timestamps, timestamp=2022-10-13T10:37:42.033, value=\xB6\x82                                                                           
 configuration                                      column=s:hidden.frozen, timestamp=2022-10-13T10:37:42.049, value=\x8F\x01                                                                              
 configuration                                      column=s:ids.num-partitions, timestamp=2022-10-13T10:37:42.035, value=\x8C\x82                                                                         
 configuration                                      column=s:storage.drop-on-clear, timestamp=2022-10-13T10:37:42.039, value=\x8F\x00                                                                      
 configuration                                      column=s:system-registration.7f000101691244-bic-pc1.startup-time, timestamp=2022-10-13T10:37:42.153, value=\xC1\x80\x00\x00\x00cG\xEAv\x01\x11ta\x80   
 \x88\x00\x00\x00\x00\x00\x00\x00                   column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xD8\xEF\x00\x00\x01\x83\xD0\xEB\xE2\x077f000101691244-bic-pc1, timestamp=2022-10-13T10:37:43.303, value=             
 \x88\x00\x00\x00\x00\x00\x00\x03                   column=i:\xFF\xFF\xFF\xFF\xFF\xFE\xC7\x7F\x00\x00\x01\x83\xD0\xEB\xE3c7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:43.651, value=                
 \x88\x00\x00\x00\x00\x00\x00\x80                   column=e:\x02, timestamp=2022-10-13T10:37:44.001, value=\x00\x01\x04\x91                                                                               
 \x88\x00\x00\x00\x00\x00\x00\x80                   column=e:$, timestamp=2022-10-13T10:37:44.001, value=\x04\x8D\x08\x91\xFF                                                                              
 \xFA<[T\x11\xA5\x82                                column=g:\x00\x04\x8D\x0C\x80, timestamp=2022-10-13T10:37:44.127, value=\x04\x8D                                                                       
9 row(s)
Took 0.0498 seconds                                    

Metadata response

hbase:004:0> scan 'hbase:meta', {FILTER=>"PrefixFilter('janusgraph')", COLUMNS=>['info:regioninfo']}
ROW                                                 COLUMN+CELL                                                                                                                                            
 janusgraph,,1665657444679.32f32af13dd119840a3d0b3c column=info:regioninfo, timestamp=2022-10-13T10:37:41.600, value={ENCODED => 32f32af13dd119840a3d0b3c75e01e99, NAME => 'janusgraph,,1665657444679.32f32
 75e01e99.                                          af13dd119840a3d0b3c75e01e99.', STARTKEY => '', ENDKEY => ''}                                                                                           
1 row(s)

STARTKEY and ENDKEY are empty.

That leads to empty input splits in org.janusgraph.hadoop.formats.hbase.HBaseBinaryInputFormat#getSplits
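
For context on the metadata output above: by HBase convention, an empty STARTKEY or ENDKEY marks an unbounded side of a region, so a single region with both keys empty spans the whole table rather than no data. A minimal Java sketch of that containment rule follows. The class and method names are illustrative only and are not HBase or JanusGraph APIs, and the sketch needs Java 9+ for Arrays.compareUnsigned.

```java
import java.util.Arrays;

class RegionRange {
    // True if rowKey lies in [startKey, endKey), comparing bytes as unsigned values.
    // An empty start or end key is treated as "unbounded" on that side.
    static boolean contains(byte[] startKey, byte[] endKey, byte[] rowKey) {
        boolean afterStart = startKey.length == 0
                || Arrays.compareUnsigned(rowKey, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0
                || Arrays.compareUnsigned(rowKey, endKey) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        // Row key taken from the scan output: \x00\x00\x00\x00\x00\x00\x02\x0D
        byte[] row = {0, 0, 0, 0, 0, 0, 2, 0x0D};
        System.out.println(contains(empty, empty, row)); // prints true
    }
}
```

Under this rule every row in the scan falls inside the one region listed in hbase:meta, which suggests the empty keys alone do not explain the missing data.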

@farodin91
Contributor Author

@mad Any idea why the metadata is empty?

@mad
Contributor

mad commented Oct 13, 2022

@farodin91

Actually, the root cause is Spark: https://spark.apache.org/docs/3.2.0/core-migration-guide.html#upgrading-from-core-31-to-32

Since Spark 3.2, spark.hadoopRDD.ignoreEmptySplits is set to true by default which means Spark will not create empty partitions for empty input splits. To restore the behavior before Spark 3.2, you can set spark.hadoopRDD.ignoreEmptySplits to false.
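
The effect of the quoted setting can be illustrated with a simplified Java model. This is not Spark's code. The Split class, the plan method, and the split name are hypothetical, and the sketch assumes the split for the janusgraph region reports a length of zero, which this thread does not confirm.

```java
import java.util.List;
import java.util.stream.Collectors;

class EmptySplitFilter {
    // Hypothetical stand-in for an input split: a name plus the length it reports.
    static final class Split {
        final String name;
        final long length;

        Split(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    // When ignoreEmptySplits is true, splits reporting length 0 get no partition.
    static List<Split> plan(List<Split> splits, boolean ignoreEmptySplits) {
        if (!ignoreEmptySplits) {
            return splits;
        }
        return splits.stream()
                .filter(s -> s.length > 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumption: one split for the single region, reporting length 0.
        List<Split> splits = List.of(new Split("janusgraph-region", 0L));
        System.out.println(plan(splits, true).size());  // prints 0
        System.out.println(plan(splits, false).size()); // prints 1
    }
}
```

With the filter enabled, a table whose only split reports zero length yields no partitions at all, so a job would read nothing even though the table holds rows.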

So, just add spark.hadoopRDD.ignoreEmptySplits=false to hbase-read.properties and hbase-read-snapshot.properties
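
As a rough sketch of the resulting configuration, the Java snippet below loads properties text and forces the key to false. The contents of hbase-read.properties are not shown in this thread, so the sample text, the class name, and the helper method are placeholders.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class IgnoreEmptySplitsConfig {
    static final String KEY = "spark.hadoopRDD.ignoreEmptySplits";

    // Parses properties text and forces the pre-Spark-3.2 behavior for empty splits.
    static Properties withEmptySplitsKept(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        props.setProperty(KEY, "false");
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder content; the real file contents are an assumption here.
        String text = "spark.master=local[4]\n";
        Properties props = withEmptySplitsKept(text);
        System.out.println(props.getProperty(KEY)); // prints false
    }
}
```

In the properties files themselves this amounts to the single line spark.hadoopRDD.ignoreEmptySplits=false.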

@farodin91 farodin91 marked this pull request as ready for review October 13, 2022 21:20
@farodin91 farodin91 requested a review from a team October 14, 2022 04:29
@farodin91
Contributor Author

@mad Thank you.

docs/changelog.md (review thread resolved)
@farodin91
Contributor Author

@mad Updated

Fixes JanusGraph#3069

* Upgrade hadoop 3.x
* Remove Jackson 1

Signed-off-by: Jan Jansen <[email protected]>
@mad
Contributor

mad commented Oct 17, 2022

@farodin91

TinkerPop 3.6 has some breaking changes, marked "(breaking)" in https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc

Do we need to take these into account?

@farodin91
Contributor Author

@mad I've handled multiple breaking changes, including https://issues.apache.org/jira/browse/TINKERPOP-2507 and the Gryo removal.

@farodin91
Contributor Author

@porunov Would you like to review it again?

Member

@porunov porunov left a comment

LGTM. Thank you @farodin91 !

Labels
cla: external Externally-managed CLA
Development

Successfully merging this pull request may close these issues.

Update to TinkerPop 3.6.x
5 participants