Adds new blog post announcing opensearch hadoop #1650

Status: Open (16 commits to merge into base branch `main`)
50 changes: 50 additions & 0 deletions _posts/2023-06-05-opensearch-hadoop-launch.markdown
@@ -0,0 +1,50 @@
---
layout: post
title: "Announcing the general availability of the OpenSearch Hadoop client"
authors:
- hvamsi
- nknize
date: 2023-06-05 12:15:00 -0700
categories:
- releases
meta_keywords: opensearch hadoop, apache spark, apache hive, apache hadoop, opensearch, mapreduce, hdfs
meta_description: OpenSearch Hadoop is now generally available with support for multiple versions of OpenSearch to run on Spark and Hive.

Please update the meta with the following:

Meta_keywords: OpenSearch Hadoop, Apache Hadoop, OpenSearch Hadoop client
Meta_description: The OpenSearch Hadoop connector is now generally available with support for multiple versions of OpenSearch running on Spark and Hive.

twittercard:
description: OpenSearch Hadoop is now generally available with support for multiple versions of OpenSearch to run on Spark and Hive.
excerpt: We are excited to announce the release of the new OpenSearch Hadoop connector. This tool enables efficient interaction between your Hadoop-based Big Data operations and OpenSearch clusters, supporting all versions of OpenSearch.
---

We are excited to announce the release of the new OpenSearch-Hadoop connector. This tool enables efficient interaction between your Hadoop-based Big Data operations and OpenSearch clusters, supporting all versions of OpenSearch.

## OpenSearch Hadoop connector features:

Can we add an opening paragraph here? For example, "The OpenSearch-Hadoop connector includes the following features:" (and remove the colon from the heading)

- **Versatility**: Compatible with Scala up to version 2.13.x and Spark up to version 3.2.x, the connector facilitates data processing and analysis operations across different environments.
- **Memory and Input/Output (I/O) Efficient**: The connector is designed for performance. It uses pull-based parsing and supports bulk updates and direct conversion to and from native types, which keeps memory and network I/O usage low.
- **Adaptive I/O**: The connector detects transport errors and retries automatically. If a node fails, it reroutes requests to the remaining available nodes. If OpenSearch is overloaded, the connector detects rejected data and resends it.
- **Data Co-location Integration**: The connector integrates with Hadoop to expose network access information, enabling co-located OpenSearch and Hadoop clusters to be aware of each other, thus reducing network I/O.
- **Secure Access**: The connector supports AWS Identity and Access Management (IAM) authentication for AWS-managed OpenSearch, securing access to your AWS resources.
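
The adaptive I/O behavior in the list above can be pictured with a small Python sketch. This is not the connector's implementation: the `send_with_failover` function, the `TransportError` and `Rejected` exception types, and the node names are all hypothetical and exist only for illustration. The sketch assumes a caller-supplied `send(node, batch)` function and shows the three ideas together: skip a node that cannot be reached, reroute to the next one, and back off before resending when the cluster rejects data.

```python
import time


class TransportError(Exception):
    """Hypothetical error: the node could not be reached."""


class Rejected(Exception):
    """Hypothetical error: the cluster was overloaded and rejected the data."""


def send_with_failover(nodes, send, batch, max_attempts=3, backoff_s=0.0):
    """Send a batch, rerouting on node failure and resending on rejection.

    `send(node, batch)` is an assumption of this sketch: a caller-supplied
    function that raises TransportError or Rejected when it fails.
    """
    for attempt in range(max_attempts):
        for node in nodes:
            try:
                return send(node, batch)
            except TransportError:
                # Node failure: reroute the request to the next node.
                continue
            except Rejected:
                # Overload: stop this round and wait before resending.
                break
        # Exponential backoff between rounds (no wait when backoff_s is 0).
        time.sleep(backoff_s * (2 ** attempt))
    raise RuntimeError("all attempts failed")


def fake_send(node, batch):
    """Stand-in sender for the example: node-a is down, node-b works."""
    if node == "node-a":
        raise TransportError("node-a is down")
    return f"{len(batch)} docs sent via {node}"


print(send_with_failover(["node-a", "node-b"], fake_send, [{"id": 1}, {"id": 2}]))
# prints "2 docs sent via node-b"
```

In this sketch a rejection ends the current round rather than moving to another node, on the assumption that an overloaded cluster needs time more than it needs a different entry point. A real client may choose differently.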

## Compatibility with OpenSearch

The following matrix shows the compatibility of [`opensearch-hadoop`](https://central.sonatype.com/artifact/org.opensearch.client/opensearch-hadoop) with versions of [`OpenSearch`](https://opensearch.org/downloads.html#opensearch).

| Client version | OpenSearch version | Elasticsearch version |
| -------------- | ------------------ | --------------------- |
| 1.0.1 | 1.0.0-2.8.0 | 7.10 |
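
As a rough illustration of reading this matrix, the following Python sketch checks whether an OpenSearch version falls inside the 1.0.0 to 2.8.0 range listed for client 1.0.1. The `is_supported` and `parse_version` helpers are assumptions made for this example and are not part of the connector. The sketch also assumes plain `major.minor.patch` version strings.

```python
# Range copied from the matrix above for client 1.0.1.
SUPPORTED_OPENSEARCH = {"1.0.1": ("1.0.0", "2.8.0")}


def parse_version(text):
    """Turn '2.8.0' into (2, 8, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_supported(client_version, opensearch_version):
    """Return True if the matrix lists this OpenSearch version for the client."""
    if client_version not in SUPPORTED_OPENSEARCH:
        return False
    low, high = SUPPORTED_OPENSEARCH[client_version]
    return parse_version(low) <= parse_version(opensearch_version) <= parse_version(high)


print(is_supported("1.0.1", "2.7.0"))   # True
print(is_supported("1.0.1", "2.10.0"))  # False
```

Comparing integer tuples rather than raw strings matters here: as strings, `"2.10.0"` sorts before `"2.8.0"`, which would wrongly report it as inside the range.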

## Compatibility with Spark and Scala

| Client version | Spark version | Scala version(s) |
| -------------- | ------------- | ---------------- |
| 1.0.1 | 2.2.3 | 2.10 |
| 1.0.1 | 2.4.4 | 2.11/2.12 |
| 1.0.1 | 3.2.4 | 2.12/2.13 |
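
The same table can be written as a small lookup, which may be handy in a build or preflight script. This is a sketch only: the `SPARK_TO_SCALA` mapping and the helper names are hypothetical, and the data is simply the rows above for client 1.0.1. The lookup matches the listed Spark versions exactly, because the table does not say whether other patch releases are supported.

```python
# Rows copied from the table above for client 1.0.1:
# Spark version -> Scala versions listed for it.
SPARK_TO_SCALA = {
    "2.2.3": ["2.10"],
    "2.4.4": ["2.11", "2.12"],
    "3.2.4": ["2.12", "2.13"],
}


def scala_versions_for(spark_version):
    """Return the Scala versions listed for a Spark version, or an empty list."""
    return SPARK_TO_SCALA.get(spark_version, [])


def is_listed(spark_version, scala_version):
    """Return True only if the exact pair appears in the table."""
    return scala_version in scala_versions_for(spark_version)


print(scala_versions_for("3.2.4"))  # ['2.12', '2.13']
print(is_listed("2.4.4", "2.13"))   # False
```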

## Compatibility with AWS Glue

| Client version | Spark version | Glue version(s) |
| -------------- | ------------- | --------------- |
| 1.0.1 | 2.4.4 | 2 |
| 1.0.1 | 3.2.4 | 3/4 |

This connector is a crucial tool for anyone looking to leverage the full power of OpenSearch alongside their Hadoop ecosystem. Download and start using it today.