Add AWS documentation (DynamoDB, Redshift) #1733

Merged
merged 1 commit into from
Jul 27, 2021
16 changes: 16 additions & 0 deletions docs/getting-started/create-a-feature-repository.md
@@ -20,6 +20,22 @@ feast init -t gcp
Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% tab title="AWS template" %}
```
feast init -t aws
[?] AWS Region (e.g. us-west-2): ...
[?] Redshift Cluster ID: ...
[?] Redshift Database Name: ...
[?] Redshift User Name: ...
[?] Redshift S3 Staging Location (s3://*): ...
[?] Redshift IAM Role for S3 (arn:aws:iam::*:role/*): ...
[?] Should I upload example data to Redshift (overwriting 'feast_driver_hourly_stats' table)? (Y/n):

Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% endtabs %}

The `init` command creates a Python file with feature definitions, sample data, and a Feast configuration file for local development:
6 changes: 6 additions & 0 deletions docs/getting-started/install-feast.md
@@ -12,3 +12,9 @@ Install Feast with GCP dependencies \(required when using BigQuery or Firestore\
pip install 'feast[gcp]'
```

Install Feast with AWS dependencies \(required when using Redshift or DynamoDB\):

```text
pip install 'feast[aws]'
```
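
After installing, a quick way to confirm that the environment is ready is to check that the relevant packages can be found. The snippet below is a minimal sketch: `has_module` is a hypothetical helper, and the assumption that the `aws` extra pulls in `boto3` is not stated in this PR, so adjust the module list if your install differs.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: the AWS extra installs boto3 alongside feast.
for module in ("feast", "boto3"):
    status = "found" if has_module(module) else "missing"
    print(f"{module}: {status}")
```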

5 changes: 2 additions & 3 deletions docs/reference/data-sources/README.md
@@ -2,9 +2,8 @@

Please see [Data Source](../../concepts/feature-view.md#data-source) for an explanation of data sources.

{% page-ref page="bigquery.md" %}

{% page-ref page="file.md" %}

{% page-ref page="bigquery.md" %}


{% page-ref page="redshift.md" %}
33 changes: 33 additions & 0 deletions docs/reference/data-sources/redshift.md
@@ -0,0 +1,33 @@
# Redshift

### Description

Redshift data sources allow for the retrieval of historical feature values from Redshift for building training datasets as well as materializing features into an online store.

* Either a table name or a SQL query can be provided.
* No performance guarantees can be provided over SQL query-based sources. Please use table references where possible.

### Examples

Using a table name

```python
from feast import RedshiftSource

my_redshift_source = RedshiftSource(
    table="redshift_table",
)
```

Using a query

```python
from feast import RedshiftSource

my_redshift_source = RedshiftSource(
    query="SELECT timestamp as ts, created, f1, f2 "
          "FROM redshift_table",
)
```

Configuration options are available [here](https://rtd.feast.dev/en/master/feast.html?#feast.RedshiftSource).
1 change: 1 addition & 0 deletions docs/reference/offline-stores/README.md
@@ -6,3 +6,4 @@ Please see [Offline Store](../../concepts/offline-store.md) for an explanation o

{% page-ref page="bigquery.md" %}

{% page-ref page="redshift.md" %}
31 changes: 31 additions & 0 deletions docs/reference/offline-stores/redshift.md
@@ -0,0 +1,31 @@
# Redshift

### Description

The Redshift offline store provides support for reading [RedshiftSources](../data-sources/redshift.md).

* Redshift tables and views are allowed as sources.
* All joins happen within Redshift.
* Entity dataframes can be provided either as a SQL query or as a Pandas dataframe. Pandas dataframes are uploaded to Redshift so that the join can run there.
* A [RedshiftRetrievalJob](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/redshift.py#L161) is returned when calling `get_historical_features()`.

### Example

{% code title="feature\_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: aws
offline_store:
  type: redshift
  region: us-west-2
  cluster_id: feast-cluster
  database: feast-database
  user: redshift-user
  s3_staging_location: s3://feast-bucket/redshift
  iam_role: arn:aws:iam::123456789012:role/redshift_s3_access_role
```
{% endcode %}

Configuration options are available [here](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/redshift.py#L22).
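
Before running Feast against a real cluster, it can help to sanity-check the `offline_store` values. The following is a minimal sketch, not part of Feast: `check_redshift_offline_store` is a hypothetical helper, the two patterns are taken from the hints printed by `feast init -t aws` (`s3://*` and `arn:aws:iam::*:role/*`), and it assumes every key shown in the example above is required.

```python
import re

# Loose checks based on the `feast init -t aws` prompt hints;
# these are not AWS's full naming rules.
S3_URI = re.compile(r"^s3://.+")
IAM_ROLE_ARN = re.compile(r"^arn:aws:iam::[^:]*:role/.+")

# Assumption: all keys from the example feature_store.yaml are required.
REQUIRED_KEYS = (
    "type",
    "region",
    "cluster_id",
    "database",
    "user",
    "s3_staging_location",
    "iam_role",
)


def check_redshift_offline_store(cfg: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not cfg.get(key)]

    if cfg.get("type") and cfg["type"] != "redshift":
        problems.append("type must be 'redshift'")

    s3_location = cfg.get("s3_staging_location", "")
    if s3_location and not S3_URI.match(s3_location):
        problems.append("s3_staging_location must look like s3://*")

    iam_role = cfg.get("iam_role", "")
    if iam_role and not IAM_ROLE_ARN.match(iam_role):
        problems.append("iam_role must look like arn:aws:iam::*:role/*")

    return problems
```

Passing in the `offline_store` mapping from the example above should return an empty list; a value such as `gs://bucket` for `s3_staging_location` would be reported as a problem.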

1 change: 1 addition & 0 deletions docs/reference/online-stores/README.md
@@ -8,3 +8,4 @@ Please see [Online Store](../../concepts/online-store.md) for an explanation of

{% page-ref page="datastore.md" %}

{% page-ref page="dynamodb.md" %}
27 changes: 27 additions & 0 deletions docs/reference/online-stores/dynamodb.md
@@ -0,0 +1,27 @@
# DynamoDB

### Description

The [DynamoDB](https://aws.amazon.com/dynamodb/) online store provides support for materializing feature values into AWS DynamoDB.
<!---

TODO: Add DynamoDB to online store format document and point to it.

The data model used to store feature values in DynamoDB is described in more detail [here](https://github.com/feast-dev/feast/blob/master/docs/specs/online_store_format.md#google-datastore-online-store-format).

-->

### Example

{% code title="feature\_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: aws
online_store:
  type: dynamodb
  region: us-west-2
```
{% endcode %}

Configuration options are available [here](https://github.com/feast-dev/feast/blob/17bfa6118d6658d2bff53d7de8e2ccef5681714d/sdk/python/feast/infra/online_stores/dynamodb.py#L36).
1 change: 1 addition & 0 deletions docs/reference/providers/README.md
@@ -6,3 +6,4 @@ Please see [Provider](../../concepts/provider.md) for an explanation of provider

{% page-ref page="google-cloud-platform.md" %}

{% page-ref page="amazon-web-services.md" %}
117 changes: 117 additions & 0 deletions docs/reference/providers/amazon-web-services.md
@@ -0,0 +1,117 @@
# Amazon Web Services

### Description

* Offline Store: Uses the **Redshift** offline store by default. Also supports File as the offline store.
* Online Store: Uses the **DynamoDB** online store by default. Also supports SQLite as an online store.

### Example

{% code title="feature\_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: aws
online_store:
  type: dynamodb
  region: us-west-2
offline_store:
  type: redshift
  region: us-west-2
  cluster_id: feast-cluster
  database: feast-database
  user: redshift-user
  s3_staging_location: s3://feast-bucket/redshift
  iam_role: arn:aws:iam::123456789012:role/redshift_s3_access_role
```
{% endcode %}
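
The values in this file correspond to the answers given to the `feast init -t aws` prompts. As a rough illustration of that mapping, the sketch below renders the same layout from those answers. `render_feature_store_yaml` is a hypothetical helper written for this example, not something Feast provides, and reusing one region for both stores is an assumption.

```python
def render_feature_store_yaml(
    project: str,
    region: str,
    cluster_id: str,
    database: str,
    user: str,
    s3_staging_location: str,
    iam_role: str,
) -> str:
    """Build feature_store.yaml text for the AWS provider (illustrative only)."""
    lines = [
        f"project: {project}",
        "registry: data/registry.db",
        "provider: aws",
        "online_store:",
        "  type: dynamodb",
        f"  region: {region}",  # assumption: same region for both stores
        "offline_store:",
        "  type: redshift",
        f"  region: {region}",
        f"  cluster_id: {cluster_id}",
        f"  database: {database}",
        f"  user: {user}",
        f"  s3_staging_location: {s3_staging_location}",
        f"  iam_role: {iam_role}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_feature_store_yaml(
            project="my_feature_repo",
            region="us-west-2",
            cluster_id="feast-cluster",
            database="feast-database",
            user="redshift-user",
            s3_staging_location="s3://feast-bucket/redshift",
            iam_role="arn:aws:iam::123456789012:role/redshift_s3_access_role",
        )
    )
```

Note that the nested keys are indented under `online_store` and `offline_store`; without that indentation they would be parsed as top-level keys.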

<!--
TODO: figure out the permissions

### **Permissions**

<table>
<thead>
<tr>
<th style="text-align:left"><b>Command</b>
</th>
<th style="text-align:left">Component</th>
<th style="text-align:left">Permissions</th>
<th style="text-align:left">Recommended Role</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left"><b>Apply</b>
</td>
<td style="text-align:left">BigQuery (source)</td>
<td style="text-align:left">
<p>bigquery.jobs.create</p>
<p>bigquery.readsessions.create</p>
<p>bigquery.readsessions.getData</p>
</td>
<td style="text-align:left">roles/bigquery.user</td>
</tr>
<tr>
<td style="text-align:left"><b>Apply</b>
</td>
<td style="text-align:left">Datastore (destination)</td>
<td style="text-align:left">
<p>datastore.entities.allocateIds</p>
<p>datastore.entities.create</p>
<p>datastore.entities.delete</p>
<p>datastore.entities.get</p>
<p>datastore.entities.list</p>
<p>datastore.entities.update</p>
</td>
<td style="text-align:left">roles/datastore.owner</td>
</tr>
<tr>
<td style="text-align:left"><b>Materialize</b>
</td>
<td style="text-align:left">BigQuery (source)</td>
<td style="text-align:left">bigquery.jobs.create</td>
<td style="text-align:left">roles/bigquery.user</td>
</tr>
<tr>
<td style="text-align:left"><b>Materialize</b>
</td>
<td style="text-align:left">Datastore (destination)</td>
<td style="text-align:left">
<p>datastore.entities.allocateIds</p>
<p>datastore.entities.create</p>
<p>datastore.entities.delete</p>
<p>datastore.entities.get</p>
<p>datastore.entities.list</p>
<p>datastore.entities.update</p>
<p>datastore.databases.get</p>
</td>
<td style="text-align:left">roles/datastore.owner</td>
</tr>
<tr>
<td style="text-align:left"><b>Get Online Features</b>
</td>
<td style="text-align:left">Datastore</td>
<td style="text-align:left">datastore.entities.get</td>
<td style="text-align:left">roles/datastore.user</td>
</tr>
<tr>
<td style="text-align:left"><b>Get Historical Features</b>
</td>
<td style="text-align:left">BigQuery (source)</td>
<td style="text-align:left">
<p>bigquery.datasets.get</p>
<p>bigquery.tables.get</p>
<p>bigquery.tables.create</p>
<p>bigquery.tables.updateData</p>
<p>bigquery.tables.update</p>
<p>bigquery.tables.delete</p>
<p>bigquery.tables.getData</p>
</td>
<td style="text-align:left">roles/bigquery.dataEditor</td>
</tr>
</tbody>
</table>

-->