
Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value. #292

Open
akshayjain3450 opened this issue Jun 15, 2023 · 5 comments
Labels
bug Something isn't working

Comments

akshayjain3450 commented Jun 15, 2023

What is the bug?

Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value. However, this field actually exists in the data being read.

How can one reproduce the bug?

This is my sample Scala program using Spark:

def read(): Unit = {
  val df = spark.read.format("opensearch")
    .option("opensearch.nodes", "host")
    .option("opensearch.port", "9200")
    .option("opensearch.nodes.wan.only", "true")
    .option("opensearch.resource", "index")
    .option("opensearch.net.http.auth.user", "admin")
    .option("opensearch.net.http.auth.pass", "admin")
    .load()
  df.printSchema()
  df.show(10)
}

What is your host/environment?

Spark: 3.3.1, Opensearch-Hadoop 1.1.0

Do you have any additional context?

The Spark Schema:

root
|-- __eventTime: long (nullable = true)
|-- __messageId: string (nullable = true)
|-- __messageProperties: struct (nullable = true)
|-- __metadata: struct (nullable = true)
| |-- _dataos_run_mapper_id: string (nullable = true)
| |-- id: string (nullable = true)
|-- __publishTime: long (nullable = true)
|-- __topic: string (nullable = true)
|-- age: string (nullable = true)
|-- city: string (nullable = true)
|-- country: string (nullable = true)
|-- email: string (nullable = true)
|-- first_name: string (nullable = true)
|-- gender: string (nullable = true)
|-- id: string (nullable = true)
|-- last_name: string (nullable = true)
|-- phone: string (nullable = true)
|-- postcode: string (nullable = true)
|-- state: string (nullable = true)
|-- title: string (nullable = true)

Note that the schema has two struct fields: __metadata is mapped properly, while __messageProperties triggers this error. I am looking for a solution to this. Do let me know if you need any more details.

@akshayjain3450 akshayjain3450 added bug Something isn't working untriaged labels Jun 15, 2023
@akshayjain3450 akshayjain3450 changed the title [BUG]: Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value. Jun 15, 2023
@wbeckler

Would you be up for trying to add a breaking test to the client?

@akshayjain3450

Sure, I would love to contribute to that. I would need some direction and guidance on where to start.

@akshayjain3450

Hi, @wbeckler any update on this?

@harshavamsi

Hi @akshayjain3450, you would need to set the option opensearch.read.field.as.array.include. This tells OpenSearch Hadoop how to map array-like data. Can you set .option("opensearch.read.field.as.array.include", "__messageProperties") and see what happens?
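For reference, the suggestion above applied to the reader from the original report would look like this (a sketch; the host, port, index, and credential values are the placeholders from the reporter's snippet):

```scala
// Same reader as in the report, with the array-include hint added.
// The extra option tells OpenSearch Hadoop to parse __messageProperties
// as an array rather than a single value during scroll reading.
val df = spark.read.format("opensearch")
  .option("opensearch.nodes", "host")
  .option("opensearch.port", "9200")
  .option("opensearch.nodes.wan.only", "true")
  .option("opensearch.resource", "index")
  .option("opensearch.net.http.auth.user", "admin")
  .option("opensearch.net.http.auth.pass", "admin")
  .option("opensearch.read.field.as.array.include", "__messageProperties")
  .load()
```

The option takes a comma-separated list, so several fields can be marked as arrays at once if needed.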

@akshayjain3450

Hi @harshavamsi, I did try this option and still face the same issue.

Complete Stack Trace:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 3) (192.168.68.68 executor driver): org.opensearch.hadoop.rest.OpenSearchHadoopParsingException: org.opensearch.hadoop.OpenSearchHadoopIllegalStateException: Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value
at org.opensearch.hadoop.serialization.ScrollReader.readHit(ScrollReader.java:528)
at org.opensearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:306)
at org.opensearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:270)
at org.opensearch.hadoop.rest.RestRepository.scroll(RestRepository.java:326)
at org.opensearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:104)
at org.opensearch.spark.rdd.AbstractOpenSearchRDDIterator.hasNext(AbstractOpenSearchRDDIterator.scala:75)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.opensearch.hadoop.OpenSearchHadoopIllegalStateException: Field '__messageProperties' not found; typically this occurs with arrays which are not mapped as single value
at org.opensearch.spark.sql.RowValueReader.rowColumns(RowValueReader.scala:60)
at org.opensearch.spark.sql.RowValueReader.rowColumns$(RowValueReader.scala:57)
at org.opensearch.spark.sql.ScalaRowValueReader.rowColumns(ScalaOpenSearchRowValueReader.scala:41)
at org.opensearch.spark.sql.ScalaRowValueReader.createMap(ScalaOpenSearchRowValueReader.scala:78)
at org.opensearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:1030)
at org.opensearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:901)
at org.opensearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:1066)
at org.opensearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:903)
at org.opensearch.hadoop.serialization.ScrollReader.readHitAsMap(ScrollReader.java:616)
at org.opensearch.hadoop.serialization.ScrollReader.readHit(ScrollReader.java:440)
... 23 more
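If mapping the field correctly keeps failing, one possible workaround is to exclude the problematic field at read time. This is a sketch under the assumption that opensearch-hadoop carries over elasticsearch-hadoop's es.read.field.exclude behaviour under the opensearch.read.field.exclude name; verify the option name against the connector's configuration reference before relying on it:

```scala
// Hypothetical workaround: skip the problematic field entirely.
// opensearch.read.field.exclude is assumed here to mirror
// elasticsearch-hadoop's es.read.field.exclude option.
val df = spark.read.format("opensearch")
  .option("opensearch.nodes", "host")
  .option("opensearch.port", "9200")
  .option("opensearch.nodes.wan.only", "true")
  .option("opensearch.resource", "index")
  .option("opensearch.net.http.auth.user", "admin")
  .option("opensearch.net.http.auth.pass", "admin")
  .option("opensearch.read.field.exclude", "__messageProperties")
  .load()
```

This only sidesteps the error; the field's contents would no longer appear in the DataFrame.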
