update

wayneguow · Jun 12, 2024 · ab82eab
1 parent b5e1b79
commit ab82eab

Showing 3 changed files with 6 additions and 6 deletions.
diff --git a/docs/sql-data-sources-load-save-functions.md b/docs/sql-data-sources-load-save-functions.md
@@ -109,7 +109,7 @@ For example, you can control bloom filters and dictionary encodings for ORC data
 The following ORC example will create bloom filter and use dictionary encoding only for `favorite_color`.
 For Parquet, there exists `parquet.bloom.filter.enabled` and `parquet.enable.dictionary`, too.
 To find more detailed information about the extra ORC/Parquet options,
-visit the official Apache [ORC](https://orc.apache.org/docs/spark-config.html) / [Parquet](https://github.com/apache/parquet-mr/tree/master/parquet-hadoop) websites.
+visit the official Apache [ORC](https://orc.apache.org/docs/spark-config.html) / [Parquet](https://github.com/apache/parquet-java/tree/master/parquet-hadoop) websites.
 
 ORC data source:
 

diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
@@ -350,7 +350,7 @@ Dataset<Row> df2 = spark.read().parquet("/path/to/table.parquet.encrypted");
 
 #### KMS Client
 
-The InMemoryKMS class is provided only for illustration and simple demonstration of Parquet encryption functionality. **It should not be used in a real deployment**. The master encryption keys must be kept and managed in a production-grade KMS system, deployed in user's organization. Rollout of Spark with Parquet encryption requires implementation of a client class for the KMS server. Parquet provides a plug-in [interface](https://github.com/apache/parquet-mr/blob/apache-parquet-1.13.1/parquet-hadoop/src/main/java/org/apache/parquet/crypto/keytools/KmsClient.java) for development of such classes,
+The InMemoryKMS class is provided only for illustration and simple demonstration of Parquet encryption functionality. **It should not be used in a real deployment**. The master encryption keys must be kept and managed in a production-grade KMS system, deployed in user's organization. Rollout of Spark with Parquet encryption requires implementation of a client class for the KMS server. Parquet provides a plug-in [interface](https://github.com/apache/parquet-java/blob/apache-parquet-1.13.1/parquet-hadoop/src/main/java/org/apache/parquet/crypto/keytools/KmsClient.java) for development of such classes,
 
 <div data-lang="java"  markdown="1">
 {% highlight java %}
@@ -371,9 +371,9 @@ public interface KmsClient {
 
 </div>
 
-An [example](https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/crypto/keytools/samples/VaultClient.java) of such class for an open source [KMS](https://www.vaultproject.io/api/secret/transit) can be found in the parquet-mr repository. The production KMS client should be designed in cooperation with organization's security administrators, and built by developers with an experience in access control management. Once such class is created, it can be passed to applications via the `parquet.encryption.kms.client.class` parameter and leveraged by general Spark users as shown in the encrypted dataframe write/read sample above.
+An [example](https://github.com/apache/parquet-java/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/crypto/keytools/samples/VaultClient.java) of such class for an open source [KMS](https://www.vaultproject.io/api/secret/transit) can be found in the parquet-java repository. The production KMS client should be designed in cooperation with organization's security administrators, and built by developers with an experience in access control management. Once such class is created, it can be passed to applications via the `parquet.encryption.kms.client.class` parameter and leveraged by general Spark users as shown in the encrypted dataframe write/read sample above.
 
-Note: By default, Parquet implements a "double envelope encryption" mode, that minimizes the interaction of Spark executors with a KMS server. In this mode, the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the KEK itself are cached in Spark executor memory. Users interested in regular envelope encryption, can switch to it by setting the `parquet.encryption.double.wrapping` parameter to `false`. For more details on Parquet encryption parameters, visit the parquet-hadoop configuration [page](https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/README.md#class-propertiesdrivencryptofactory).
+Note: By default, Parquet implements a "double envelope encryption" mode, that minimizes the interaction of Spark executors with a KMS server. In this mode, the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the KEK itself are cached in Spark executor memory. Users interested in regular envelope encryption, can switch to it by setting the `parquet.encryption.double.wrapping` parameter to `false`. For more details on Parquet encryption parameters, visit the parquet-hadoop configuration [page](https://github.com/apache/parquet-java/blob/master/parquet-hadoop/README.md#class-propertiesdrivencryptofactory).
 
 
 ## Data Source Option

diff --git a/...ala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala b/...ala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
@@ -213,8 +213,8 @@ class ParquetInteroperabilitySuite extends ParquetCompatibilityTest with SharedS
               // predicates because (a) in ParquetFilters, we ignore TimestampType and (b) parquet
               // does not read statistics from int96 fields, as they are unsigned.  See
               // scalastyle:off line.size.limit
-              // https://github.com/apache/parquet-mr/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L419
-              // https://github.com/apache/parquet-mr/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L348
+              // https://github.com/apache/parquet-java/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L419
+              // https://github.com/apache/parquet-java/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L348
               // scalastyle:on line.size.limit
               //
               // Just to be defensive in case anything ever changes in parquet, this test checks