diff --git a/docs/en/app_ecosystem/feat_insight/faq.md b/docs/en/app_ecosystem/feat_insight/faq.md new file mode 100644 index 00000000000..48b03b53e6d --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/faq.md @@ -0,0 +1,38 @@ +# Frequently Asked Questions + +## What are the differences between FeatInsight and mainstream Feature Stores? + +Mainstream Feature Stores, such as Feast, Tecton, Feathr, provide feature management and computation capabilities, with online storage mainly using pre-aggregated key-value stores like Redis. FeatInsight provides real-time feature computation capabilities, and feature extraction solutions can be directly deployed with a single click without the need to re-deploy and synchronize online data. The main feature comparisons are as follows. + +| Feature Store System | Feast | Tecton | Feathr | FeatInsight | +| --------------------------| ------------------ | ----------------- | ----------------- | ----------------- | +| Data Source Support | Multiple data sources | Multiple data sources | Multiple data sources | Multiple data sources | +| Scalability | High | High | Medium to High | High | +| Real-time Feature Service | Supported | Supported | Supported | Supported | +| Batch Feature Service | Supported | Supported | Supported | Supported | +| Feature Transformation | Basic transformations supported | Complex transformations and SQL supported | Complex transformations supported | Complex transformations and SQL supported | +| Data Storage | Multiple storage options supported | Mainly supports cloud storage | Multiple storage options supported | Built-in high-performance time-series database, supports multiple storage options | +| Community and Support | Open-source community | Commercial support | Open-source community | Open-source community | +| Real-time Feature Computation | Not supported | Not supported | Not supported | Supported | + +## Is it necessary to have OpenMLDB for deploying FeatInsight? 
+ +Yes. FeatInsight's metadata storage and feature computation rely on the OpenMLDB cluster, so deploying FeatInsight requires a deployed OpenMLDB cluster. You can also use the [All-in-One Docker image](./install/docker.md), which integrates both, for one-click deployment. + +Once FeatInsight is deployed, users can develop and deploy features without relying on the OpenMLDB CLI or SDK. All feature engineering needs can be completed through the web interface. + +## How can I implement MLOps workflows using FeatInsight? + +With FeatInsight, you can create databases and tables in the frontend, then submit import tasks for online and offline data. Use OpenMLDB SQL syntax for data exploration and feature creation. You can then export offline features and deploy online features with just one click. No additional development work is needed to move from offline to online in the MLOps process. For detailed steps, refer to the [Quickstart](./quickstart.md). + +## How does FeatInsight support ecosystem integration? + +FeatInsight is built on the OpenMLDB ecosystem and supports integration with its other components. + +For example, for data integration it supports [Kafka](../../integration/online_datasources/kafka_connector_demo.md), [Pulsar](../../integration/online_datasources/pulsar_connector_demo.md), [RocketMQ](../../integration/online_datasources/rocketmq_connector.md), [Hive](../../integration/offline_data_sources/hive.md), and [Amazon S3](../../integration/offline_data_sources/s3.md). For scheduling systems, it supports [Airflow](../../integration/deploy_integration/airflow_provider_demo.md), [DolphinScheduler](../../integration/deploy_integration/dolphinscheduler_task_demo.md), [Byzer](../../integration/deploy_integration/OpenMLDB_Byzer_taxi.md), etc.
It also provides a degree of support for storage reachable through Spark connectors, such as HDFS and Iceberg, and for cloud-related technologies such as Kubernetes and Alibaba Cloud MaxCompute. + +## What is the business value and technical complexity of FeatInsight? + +Compared to simple Feature Stores that use HDFS to store offline data and Redis to store online data, FeatInsight's value lies in OpenMLDB SQL, a feature extraction language that is consistent between online and offline. Scientists who develop features only need to write SQL logic to define them. In offline scenarios, this SQL is translated into a distributed Spark application for execution. In online scenarios, the same SQL is translated into query statements for an online time-series database, which keeps online and offline feature computations consistent. + +Currently, the SQL compiler, online storage engine, and offline computing engine are all implemented in programming languages such as C++ and Scala. For scientists without an engineering background, defining the feature development process in SQL can reduce learning costs and improve development efficiency. All the code is open source, with the OpenMLDB project at https://github.com/4paradigm/openmldb and the FeatInsight project at https://github.com/4paradigm/FeatInsight.
\ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/images/bigscreen.png b/docs/en/app_ecosystem/feat_insight/images/bigscreen.png new file mode 100644 index 00000000000..0672bceb49e Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/bigscreen.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/create_test_feature_service.png b/docs/en/app_ecosystem/feat_insight/images/create_test_feature_service.png new file mode 100644 index 00000000000..54af594581b Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/create_test_feature_service.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/create_test_featureview.png b/docs/en/app_ecosystem/feat_insight/images/create_test_featureview.png new file mode 100644 index 00000000000..a6438427832 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/create_test_featureview.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/create_test_table.png b/docs/en/app_ecosystem/feat_insight/images/create_test_table.png new file mode 100644 index 00000000000..6d0c06377db Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/create_test_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/csv_import_test_table.png b/docs/en/app_ecosystem/feat_insight/images/csv_import_test_table.png new file mode 100644 index 00000000000..9a17fa274c3 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/csv_import_test_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/export_test_offline_samples.png b/docs/en/app_ecosystem/feat_insight/images/export_test_offline_samples.png new file mode 100644 index 00000000000..3b026346b34 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/export_test_offline_samples.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/ide_develop_featuer_platform.png 
b/docs/en/app_ecosystem/feat_insight/images/ide_develop_featuer_platform.png new file mode 100644 index 00000000000..4e506ccf867 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/ide_develop_featuer_platform.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/import_job_result.png b/docs/en/app_ecosystem/feat_insight/images/import_job_result.png new file mode 100644 index 00000000000..3062101e883 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/import_job_result.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/local_test_offline_samples.png b/docs/en/app_ecosystem/feat_insight/images/local_test_offline_samples.png new file mode 100644 index 00000000000..dcb9aeff76b Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/local_test_offline_samples.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/online_csv_import_test_table.png b/docs/en/app_ecosystem/feat_insight/images/online_csv_import_test_table.png new file mode 100644 index 00000000000..d5fb72f2c9a Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/online_csv_import_test_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/preview_test_features.png b/docs/en/app_ecosystem/feat_insight/images/preview_test_features.png new file mode 100644 index 00000000000..2d149564bda Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/preview_test_features.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/preview_test_table.png b/docs/en/app_ecosystem/feat_insight/images/preview_test_table.png new file mode 100644 index 00000000000..3987efe09e2 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/preview_test_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/request_test_feature_service.png b/docs/en/app_ecosystem/feat_insight/images/request_test_feature_service.png new file mode 100644 index 
00000000000..cf696370027 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/request_test_feature_service.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/test_feature_service_detail.png b/docs/en/app_ecosystem/feat_insight/images/test_feature_service_detail.png new file mode 100644 index 00000000000..ddbba8529b9 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/test_feature_service_detail.png differ diff --git a/docs/en/app_ecosystem/feat_insight/images/test_offline_sample_detail.png b/docs/en/app_ecosystem/feat_insight/images/test_offline_sample_detail.png new file mode 100644 index 00000000000..bd1522ac22a Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/images/test_offline_sample_detail.png differ diff --git a/docs/en/app_ecosystem/feat_insight/index.rst b/docs/en/app_ecosystem/feat_insight/index.rst new file mode 100644 index 00000000000..83d17a03f7e --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/index.rst @@ -0,0 +1,12 @@ +============================= +FeatInsight +============================= + +.. toctree:: + :maxdepth: 1 + + introduction + quickstart + install/index + use_cases/index + faq \ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/install/config_file.md b/docs/en/app_ecosystem/feat_insight/install/config_file.md new file mode 100644 index 00000000000..60773fa8a74 --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/config_file.md @@ -0,0 +1,30 @@ +# FeatInsight Configuration File + +## Introduction + +FeatInsight is developed based on Spring Boot. It uses the standard `application.yml` as configuration file. 
+ +## Example Configuration + +A simplified configuration file example is as follows: + +``` +server: + port: 8888 + +openmldb: + zk_cluster: 127.0.0.1:2181 + zk_path: /openmldb + apiserver: 127.0.0.1:9080 +``` + +## Configuration Items + + +| Item | Definition | Type | Example | +| --------------------------| --------------------------- | ------- | -------------- | +| server.port | port for service | int | 8888 | +| openmldb.zk_cluster | ZooKeeper address | string | 127.0.0.1:2181 | +| openmldb.zk_path | OpenMLDB root path | string | /openmldb | +| openmldb.apiserver | OpenMLDB APIServer address | string | 127.0.0.1:9080 | +| openmldb.skip_index_check | whether to skip index check | boolean | false | \ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/install/docker.md b/docs/en/app_ecosystem/feat_insight/install/docker.md new file mode 100644 index 00000000000..86c682714bf --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/docker.md @@ -0,0 +1,47 @@ +# Docker + +## Introduction + +Use the official Docker image for quick deployment of FeatInsight feature services. + +## All-in-One Image + +With the All-in-One image, which contains an automatic OpenMLDB deployment, you can start both the OpenMLDB cluster and FeatInsight at the same time. No additional actions are required. + +``` +docker run -d -p 8888:8888 registry.cn-shenzhen.aliyuncs.com/tobe43/portable-openmldb +``` + +It takes around one minute to start. You can check the logs through `docker logs`. + +After a successful start-up, you can access the FeatInsight service with any web browser at `http://127.0.0.1:8888`. + +## Docker Image without OpenMLDB + +With this image, you need to deploy an OpenMLDB cluster in advance, and then start the FeatInsight Docker container. There are more steps, but it offers higher flexibility. + +Please refer to [OpenMLDB Deployment](../../../deploy/index.rst) to deploy an OpenMLDB cluster.
+ +Then, refer to [FeatInsight Configuration File](./config_file.md) to create an `application.yml` configuration file. + +``` +server: + port: 8888 + +openmldb: + zk_cluster: 127.0.0.1:2181 + zk_path: /openmldb + apiserver: 127.0.0.1:9080 +``` + +On Linux, use the following command to start the container. + +``` +docker run -d -p 8888:8888 --net=host -v `pwd`/application.yml:/app/application.yml registry.cn-shenzhen.aliyuncs.com/tobe43/featinsight +``` + +On macOS, Docker containers run inside a virtual machine, so `--net=host` does not work properly. Omit it and configure `application.yml` so that it points to the correct OpenMLDB service addresses. + +``` +docker run -d -p 8888:8888 -v `pwd`/application.yml:/app/application.yml registry.cn-shenzhen.aliyuncs.com/tobe43/featinsight +``` diff --git a/docs/en/app_ecosystem/feat_insight/install/index.rst b/docs/en/app_ecosystem/feat_insight/install/index.rst new file mode 100644 index 00000000000..b1b2ab5918d --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/index.rst @@ -0,0 +1,13 @@ +============================= +Installation and Deployment +============================= + +.. toctree:: + :maxdepth: 1 + + docker + package + source + config_file + upgrade + diff --git a/docs/en/app_ecosystem/feat_insight/install/package.md b/docs/en/app_ecosystem/feat_insight/install/package.md new file mode 100644 index 00000000000..66e7fb66616 --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/package.md @@ -0,0 +1,37 @@ +# Installation Package + +## Introduction +You can deploy FeatInsight quickly with the official pre-built installation package and a Java environment. + +Note that you need to deploy an OpenMLDB cluster first; refer to [OpenMLDB Deployment](../../../deploy/index.rst). + +## Download + +Download the JAR file.
+ +``` +wget https://openmldb.ai/download/featinsight/featinsight-0.1.0-SNAPSHOT.jar +``` + +## Configuration + +Refer to [FeatInsight Configuration](./config_file.md) to create an `application.yml` configuration file. + +``` +server: + port: 8888 + +openmldb: + zk_cluster: 127.0.0.1:2181 + zk_path: /openmldb + apiserver: 127.0.0.1:9080 +``` + +## Start + +Start the FeatInsight service. + +``` +java -jar ./featinsight-0.1.0-SNAPSHOT.jar +``` + diff --git a/docs/en/app_ecosystem/feat_insight/install/source.md b/docs/en/app_ecosystem/feat_insight/install/source.md new file mode 100644 index 00000000000..2ca51d93141 --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/source.md @@ -0,0 +1,40 @@ +# Build from Source + +## Introduction + +You can build FeatInsight from source code as required. + +## Download + +Download the project source code. + +``` +git clone https://github.com/4paradigm/FeatInsight +``` + +## Compile from Source + +From the directory that contains the cloned project, run the following commands to build the frontend and then the backend. + +``` +cd ./FeatInsight/frontend/ +npm run build + +cd ../ +mvn clean package +``` + +## Start + +After deploying an OpenMLDB cluster and creating the configuration file, start the service with the following command. + + +``` +./start_server.sh +``` + +## IDE + +If you are developing with an IDE, you can modify the `application.yml` configuration file and start `HtttpServer.java` directly. + +![](../images/ide_develop_featuer_platform.png) diff --git a/docs/en/app_ecosystem/feat_insight/install/upgrade.md b/docs/en/app_ecosystem/feat_insight/install/upgrade.md new file mode 100644 index 00000000000..55431732b42 --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/install/upgrade.md @@ -0,0 +1,10 @@ +# Version Upgrade + +## Introduction + +FeatInsight exposes an HTTP interface to external users and stores its metadata in the OpenMLDB database. Because the metadata lives in OpenMLDB rather than in the FeatInsight process, version upgrades can be carried out with approaches such as running multiple instances and rolling updates.
+ +## Single Instance Upgrade Steps +1. Download the new installation package or Docker image. +2. Stop the currently running instance of FeatInsight. +3. Start a new instance with the new FeatInsight package. \ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/introduction.md b/docs/en/app_ecosystem/feat_insight/introduction.md new file mode 100644 index 00000000000..262a2ebeea7 --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/introduction.md @@ -0,0 +1,41 @@ +# Introduction + +FeatInsight is a sophisticated feature store service, leveraging [OpenMLDB](https://github.com/4paradigm/OpenMLDB) for efficient feature computation, management, and orchestration. + +FeatInsight provides a user-friendly interface that lets users perform the entire feature engineering process for machine learning: data import, viewing, and updating; feature generation and storage; and online deployment. For offline scenarios, users can choose features to generate training samples for ML training; for online scenarios, users can deploy online feature services for real-time feature computations. + +![](./images/bigscreen.png) + +## Main Functionalities + +FeatInsight includes the following major functionalities: + +- [Data Management](./functions/import_data.md): To import and manage datasets and online data sources for feature engineering. +- [Feature Management](./functions/manage_feature.md): To store original features and generated features. +- [Online Scenario](./functions/online_scenario.md): To deploy feature services online, which provide hard real-time online feature extraction APIs using online data. +- [Offline Scenario](./functions/offline_scenario.md): To generate training datasets from offline data and the corresponding feature calculations. It also provides management functions for offline datasets and offline tasks. +- [SQL Playground](./functions/sql_playground.md): To execute any OpenMLDB SQL statements.
It can be used in both online and offline mode for feature calculations. +- [Computed Features](./functions/computed_features.md): To store pre-computed features directly into OpenMLDB online tables, where they can then be read and written. + +## Key Features + +The main objective of FeatInsight is to address common challenges in machine learning development: facilitating easy and quick feature extraction, transformation, combination, and selection; managing feature lineage; enabling feature reuse and sharing; providing version control for feature services; and ensuring consistency and reliability of feature data used in both training and inference processes. Application scenarios include the following: + +* Online Feature Service Deployment: Provides high-performance feature storage and online feature computation functions for localized deployment. +* MLOps Platform: Establishes an MLOps workflow with OpenMLDB online-offline consistent computations. +* FeatureStore Platform: Provides comprehensive feature extraction, deletion, online deployment, and lineage management functionality to achieve low-cost local FeatureStore services. +* Open-Source Feature Solution Reuse: Supports reusing solutions locally for feature reuse and sharing. +* Business Component for Machine Learning: Provides a one-stop feature engineering solution for machine learning models in recommendation systems, natural language processing, finance, healthcare, and other areas of machine learning implementation. + + +## Core Concepts + +The following terms are used in FeatInsight: + +* Feature: Data obtained through feature extraction from raw data that can be directly used for model training and inference. +* Pre-computed Feature: Feature values stored after external batch computation or streaming processing, available for direct online use. +* Feature View: A set of features defined by a single SQL computation statement.
+* Feature Service: Combines one or more features into a feature service, provided for use in online scenarios. +* Online Scenario: Provides hard real-time online feature extraction interfaces over online data by deploying feature services. +* Offline Scenario: Performs feature computation on offline data with distributed computing and exports training datasets for machine learning. +* Online-Offline Consistency: Feature results are kept consistent between online and offline scenarios by using the same SQL statement for both. diff --git a/docs/en/app_ecosystem/feat_insight/quickstart.md b/docs/en/app_ecosystem/feat_insight/quickstart.md new file mode 100644 index 00000000000..b918adb6d6c --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/quickstart.md @@ -0,0 +1,112 @@ +# Quickstart + +We will use a simple example to show how to use FeatInsight to perform feature engineering. + +For installation and deployment, refer to [OpenMLDB Deployment](../../../deploy/index.rst) and [FeatInsight Deployment](./install/index.rst). + +## Usage + +The major steps for using FeatInsight are as follows: + +1. Data Import: Use SQL or the frontend forms to create the database and data table, then import online and offline data. +2. Feature Creation: Use SQL to define a feature view, and FeatInsight will use the SQL compiler to analyze it and create the corresponding features. +3. Offline Scenarios: Choose the features to export (features from different feature views can be chosen), and export the training dataset through distributed computing into local or distributed storage. +4. Online Scenarios: Choose features for deployment, and deploy them as online feature extraction services. The service can then be accessed through an HTTP client to retrieve online feature extraction results. + +### 1. Data Import + +First, create the database `test_db` and the data table `test_table`. You can create them with SQL.
+ +``` +CREATE DATABASE test_db; + +CREATE TABLE test_db.test_table (id STRING, trx_time DATE); +``` + +Or you can use the UI and create it under "Data Import". + +![](./images/create_test_table.png) + +For easier testing, we prepare a CSV file and save it to `/tmp/test_table.csv`. Note that this path is local to the machine that runs the OpenMLDB TaskManager, which is usually also the machine running FeatInsight. You need access to that machine to create or edit the file. + +``` +id,trx_time +user1,2024-01-01 +user2,2024-01-02 +user3,2024-01-03 +user4,2024-01-04 +user5,2024-01-05 +user6,2024-01-06 +user7,2024-01-07 +``` + +For online scenarios, you can use the command `LOAD DATA` or `INSERT`. Here we use "Import from CSV". + +![](./images/online_csv_import_test_table.png) + +The imported data can be previewed. + +![](./images/preview_test_table.png) + +For offline scenarios, you can also use `LOAD DATA` or "Import from CSV". + +![](./images/csv_import_test_table.png) + +Wait about half a minute for the task to finish. You can also check its status and log. + +![](./images/import_job_result.png) + +### 2. Feature Creation + +After the data is imported, we can create features. Here we use SQL to create two basic features. + +``` +SELECT id, dayofweek(trx_time) as trx_day FROM test_table +``` + +On the "Features" page, use the button beside "All Features" to create new features. Fill in the form accordingly. + +![](./images/create_test_featureview.png) + +After successful creation, you can check the features. Click on the name to open the details, where you can check the basic information and preview feature values. + +![](./images/preview_test_features.png) + + +### 3. Offline Samples Export + +In "Offline Scenario", you can choose to export offline samples. You can choose the features to export and specify the export path. Under "More Options" you can specify the file format and other advanced parameters.
+ +![](./images/export_test_offline_samples.png) + +Wait about half a minute, then check the status under "Offline Samples". + +![](./images/test_offline_sample_detail.png) + +You can check the content of the exported samples. To verify the online-offline consistency provided by FeatInsight, you can record the result and compare it with online feature computation results. + +![](./images/local_test_offline_samples.png) + +### 4. Online Feature Service + +On the "Feature Services" page, use the button beside "All Feature Services" to create a new feature service. You can choose the features to deploy, and fill in the service name and version accordingly. + +![](./images/create_test_feature_service.png) + +After successful creation, you can check the service details, including the feature list, dependent tables, and lineage. + +![](./images/test_feature_service_detail.png) + +Lastly, on the "Request Feature Service" page, we can enter test data to perform online feature calculation, and compare it with the offline computation results. + +![](./images/request_test_feature_service.png) + +## Summary + +This example demonstrates the complete process of using FeatInsight. By writing simple SQL statements, users can define features for both online and offline scenarios. By selecting different features or combining feature sets, users can quickly reuse and deploy feature services. Lastly, the consistency of feature computation can be validated by comparing offline and online calculation results. + +## Appendix: Advanced Functions +In addition to the basic functionalities of feature engineering, FeatInsight also provides advanced functionalities to facilitate feature development for users: + +* SQL Playground: Offers debugging and execution capabilities for OpenMLDB SQL statements, allowing users to execute arbitrary SQL operations and debug SQL statements for feature extraction.
+* Computed Features: Enables the direct storage of feature values obtained through external batch computation or stream processing into OpenMLDB online tables. Users can then access and manipulate feature data in online tables. \ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature.png new file mode 100644 index 00000000000..a00f914763c Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature_service.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature_service.png new file mode 100644 index 00000000000..3183000fc26 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_create_feature_service.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_feature_view_detail.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_feature_view_detail.png new file mode 100644 index 00000000000..8b58c242c2d Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/recommend_feature_view_detail.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature.png new file mode 100644 index 00000000000..ef04186f20b Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature_service.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature_service.png new file mode 100644 index 00000000000..9966ddf94c0 Binary files /dev/null and 
b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_feature_service.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_table.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_table.png new file mode 100644 index 00000000000..e84688ec95c Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_create_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_export_offline_samples.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_export_offline_samples.png new file mode 100644 index 00000000000..e442c4a7a93 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_export_offline_samples.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_feature_service_detail.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_feature_service_detail.png new file mode 100644 index 00000000000..b4569b134fa Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_feature_service_detail.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_features.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_features.png new file mode 100644 index 00000000000..7d03a9bf02f Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_features.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_offline_data.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_offline_data.png new file mode 100644 index 00000000000..e1a272b3460 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_offline_data.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_online_data.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_online_data.png new file mode 100644 index 
00000000000..482c610ee55 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_import_online_data.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_offline_samples_data.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_offline_samples_data.png new file mode 100644 index 00000000000..a02f861f0ec Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_offline_samples_data.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_preview_online_table.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_preview_online_table.png new file mode 100644 index 00000000000..83f152aabd0 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_preview_online_table.png differ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_request_feature_service.png b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_request_feature_service.png new file mode 100644 index 00000000000..770708d4153 Binary files /dev/null and b/docs/en/app_ecosystem/feat_insight/use_cases/images/taxi_request_feature_service.png differ diff --git a/docs/en/app_ecosystem/feature_platform/index.rst b/docs/en/app_ecosystem/feat_insight/use_cases/index.rst similarity index 52% rename from docs/en/app_ecosystem/feature_platform/index.rst rename to docs/en/app_ecosystem/feat_insight/use_cases/index.rst index 93e31d3a062..554d126f8b4 100644 --- a/docs/en/app_ecosystem/feature_platform/index.rst +++ b/docs/en/app_ecosystem/feat_insight/use_cases/index.rst @@ -1,12 +1,9 @@ ============================= -OpenMLDB Feature Platform +Application Scenarios ============================= .. 
toctree:: :maxdepth: 1 - concept - installation - tutorial - usage - \ No newline at end of file + taxi_tour_duration_prediction + recommend_system \ No newline at end of file diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/recommend_system.md b/docs/en/app_ecosystem/feat_insight/use_cases/recommend_system.md new file mode 100644 index 00000000000..2218ca2c62d --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/use_cases/recommend_system.md @@ -0,0 +1,116 @@ +# E-commerce Recommendation System + +## Background + +In common e-commerce recommendation systems, it is necessary to accurately count the number of times users browse various tagged advertisements in a specific time period (e.g., the last 7 days) before each recommendation request. These statistics will be fed back to the recommendation system for more in-depth rule analysis and decision-making. + +## Data Preparation + +We prepare three data tables. + +First is the request data table, where users query the features for the current window using ID and request time. + +``` +CREATE TABLE recommend_system.request (uid string, event_time timestamp) +``` + +Next is the exposure table, which provides user ID and material ID information. To simplify, we remove other irrelevant columns. + +``` +CREATE TABLE recommend_system.feeds (uid string, material_id string, event_time timestamp) +``` + +Finally, the material table contains basic information about the material, including the material type needed for this scenario. Again, we simplify by removing unrelated fields. + +``` +CREATE TABLE recommend_system.material (material_id string, tag string); +``` + +## Feature Design + +Based on the scenario description, we only need to extract the count of different labels for user ID and materials. We use the following OpenMLDB SQL to perform feature extraction. 
+ +``` +SELECT + uid, + count_cate(material_id, tag) OVER w AS category_count +FROM + (SELECT uid, CAST (null AS string) AS material_id, CAST (null AS string) AS tag, event_time FROM request) +WINDOW + w AS ( + UNION ( + SELECT + uid, feeds.material_id, material.tag AS tag, event_time + FROM feeds + LAST JOIN material ON feeds.material_id = material.material_id) + PARTITION BY uid ORDER BY event_time ROWS_RANGE BETWEEN 7d PRECEDING AND CURRENT ROW) +``` + +The SQL statement works as follows: + +1. Join the exposure table with the material table to obtain the required attributes, such as the material's tag. +2. Expand the request table by adding the `material_id` and `tag` columns, filled with null values, so that its schema matches the output of the first step for the subsequent union. +3. Use Window Union to combine the tables from the first and second steps into one complete table, on which the window and query operations are performed. Window Union is used instead of Join + Window because a Left Join may produce multiple output rows for a single request row, while a Last Join attaches only one row from the secondary table. +4. Finally, use the `count_cate` function to count the materials per tag and obtain the feature. + +## Implementation Process + +### 1. Data Import + +First, create the database and tables; for convenience, the indexes are added in advance. + +``` +CREATE DATABASE recommend_system; + +CREATE TABLE recommend_system.request (uid string, event_time timestamp, INDEX(key=uid, TS=event_time)); + +CREATE TABLE recommend_system.feeds (uid string, material_id string, event_time timestamp, INDEX(key=uid, TS=event_time)); + +CREATE TABLE recommend_system.material (material_id string, tag string); +``` + +Because real data must be desensitized, users can import test data that suits their own situation.
This article only demonstrates the feature deployment process. + +### 2. Feature Definition + +Define features using the SQL statement introduced earlier. + +``` +SELECT + uid, + count_cate(material_id, tag) OVER w AS category_count +FROM + (SELECT uid, CAST (null AS string) AS material_id, CAST (null AS string) AS tag, event_time FROM request) +WINDOW + w AS ( + UNION ( + SELECT + uid, feeds.material_id, material.tag AS tag, event_time + FROM feeds + LAST JOIN material ON feeds.material_id = material.material_id) + PARTITION BY uid ORDER BY event_time ROWS_RANGE BETWEEN 7d PRECEDING AND CURRENT ROW) +``` + +Create the features in the frontend interface: click "Analyze SQL", and the two features to be created are identified automatically. + +![](./images/recommend_create_feature.png) + +After the features are created, you can inspect them on the feature view details page. + +![](./images/recommend_feature_view_detail.png) + +### 3. Feature Deployment + +On the "Online Scenario" page, select the features to be deployed and confirm the creation. + +![](./images/recommend_create_feature_service.png) + +After the feature service is deployed, you can test it by entering request data. + +![](./images/recommend_request_feature_service.png) + +## Summary + +For recommendation systems, feature engineering is a crucial step. FeatInsight provides a simple and fast feature management and deployment process, helping users quickly deploy features to enhance the effectiveness of recommendation systems. Even more complex features can be described and deployed using SQL.
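
To make the semantics of the `count_cate` window feature in `recommend_system.md` easier to check by hand, here is a minimal Python sketch of what the query computes for a single request row. It is an illustration only, not how OpenMLDB executes the SQL. The helper name `count_cate_7d` and the sample rows are hypothetical, and two points are assumptions: the `tag:count` output format (comma-separated, sorted by tag) and the skipping of exposures whose material has no matching tag.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_cate_7d(request, feeds, material, window=timedelta(days=7)):
    """Count a user's exposed materials per tag within the window ending at the request.

    request:  (uid, event_time) tuple, one row of the request table
    feeds:    iterable of (uid, material_id, event_time) exposure rows
    material: dict mapping material_id -> tag (stands in for the LAST JOIN)
    """
    uid, request_time = request
    counts = Counter()
    for feed_uid, material_id, event_time in feeds:
        # PARTITION BY uid
        if feed_uid != uid:
            continue
        # ROWS_RANGE BETWEEN 7d PRECEDING AND CURRENT ROW (both bounds inclusive)
        if not (request_time - window <= event_time <= request_time):
            continue
        tag = material.get(material_id)
        if tag is None:
            # Assumption: exposures without a matching tag are not counted.
            continue
        counts[tag] += 1
    # Assumption: output is "tag:count" pairs, comma-separated, sorted by tag.
    return ",".join(f"{tag}:{n}" for tag, n in sorted(counts.items()))


# Hypothetical sample data for illustration.
material = {"m1": "shoes", "m2": "books", "m3": "shoes"}
feeds = [
    ("u1", "m1", datetime(2024, 1, 9, 8, 0)),
    ("u1", "m2", datetime(2024, 1, 8, 9, 0)),
    ("u1", "m1", datetime(2024, 1, 5, 10, 0)),
    ("u1", "m3", datetime(2024, 1, 1, 10, 0)),  # older than 7 days, excluded
    ("u2", "m1", datetime(2024, 1, 9, 8, 0)),   # different user, excluded
]
request = ("u1", datetime(2024, 1, 10, 12, 0))

print(count_cate_7d(request, feeds, material))  # books:1,shoes:2
```

Comparing a hand-computed value like this with the response of the deployed feature service is one way to sanity-check test data before relying on it.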
+ diff --git a/docs/en/app_ecosystem/feat_insight/use_cases/taxi_tour_duration_prediction.md b/docs/en/app_ecosystem/feat_insight/use_cases/taxi_tour_duration_prediction.md new file mode 100644 index 00000000000..ca58e66fecb --- /dev/null +++ b/docs/en/app_ecosystem/feat_insight/use_cases/taxi_tour_duration_prediction.md @@ -0,0 +1,106 @@ +# Taxi Trip Duration Prediction + +## Background + +The scenario is from Kaggle's [New York City Taxi Trip Duration](https://www.kaggle.com/c/nyc-taxi-trip-duration/overview), where the goal is to predict the trip duration for taxi rides in New York City. The inputs for prediction include the starting and ending latitude and longitude, departure time, weather conditions, etc. Features must be extracted from these inputs before the trip duration can be predicted. + +## Feature Design + +For feature design, we refer to [Taxi Trip Duration Prediction (OpenMLDB+LightGBM)](../../../use_case/taxi_tour_duration_prediction.md). The following OpenMLDB SQL is used for feature engineering and export. + +``` +SELECT + trip_duration, + passenger_count, + sum(pickup_latitude) OVER w AS vendor_sum_pl, + max(pickup_latitude) OVER w AS vendor_max_pl, + min(pickup_latitude) OVER w AS vendor_min_pl, + avg(pickup_latitude) OVER w AS vendor_avg_pl, + sum(pickup_latitude) OVER w2 AS pc_sum_pl, + max(pickup_latitude) OVER w2 AS pc_max_pl, + min(pickup_latitude) OVER w2 AS pc_min_pl, + avg(pickup_latitude) OVER w2 AS pc_avg_pl, + count(vendor_id) OVER w2 AS pc_cnt, + count(vendor_id) OVER w AS vendor_cnt +FROM t1 +WINDOW + w AS (PARTITION BY vendor_id ORDER BY pickup_datetime ROWS_RANGE BETWEEN 1d PRECEDING AND CURRENT ROW), + w2 AS (PARTITION BY passenger_count ORDER BY pickup_datetime ROWS_RANGE BETWEEN 1d PRECEDING AND CURRENT ROW) +``` + +## Implementation Process + +### 1. Data Import + +Create the test database `taxi_trip_duration` and the test table `t1`.
+ +``` +CREATE DATABASE taxi_trip_duration; + +CREATE TABLE taxi_trip_duration.t1 (id string, vendor_id int, pickup_datetime timestamp, dropoff_datetime timestamp, passenger_count int, pickup_longitude double, pickup_latitude double, dropoff_longitude double, dropoff_latitude double, store_and_fwd_flag string, trip_duration int); +``` + +![](./images/taxi_create_table.png) + +Note: in versions before 0.8.4, indexes are not created automatically, so you need to declare them manually when creating the table. + +``` +CREATE TABLE taxi_trip_duration.t1(id string, vendor_id int, pickup_datetime timestamp, dropoff_datetime timestamp, passenger_count int, pickup_longitude double, pickup_latitude double, dropoff_longitude double, dropoff_latitude double, store_and_fwd_flag string, trip_duration int, INDEX(KEY=vendor_id, TS=pickup_datetime), INDEX(KEY=passenger_count, TS=pickup_datetime)); +``` + +Download the dataset from Kaggle. + +``` +kaggle competitions download -c nyc-taxi-trip-duration +``` + +Unzip the archive to get `train.csv` and place it at `/tmp/train.csv`. Then, in the frontend form, use "Import From CSV" to import it as online data. + +![](./images/taxi_import_online_data.png) + +Preview the online table. + +![](./images/taxi_preview_online_table.png) + +Then import the offline data, again using "Import From CSV". + +![](./images/taxi_import_offline_data.png) + +### 2. Feature Creation + +With the SQL statement designed, create a feature view. Click "Analyze SQL", and the list of features is generated automatically from the SQL analysis. + +![](./images/taxi_create_feature.png) + +![](./images/taxi_features.png) + +### 3. Offline Scenario + +To generate offline samples, select all the features in the feature view. The samples are used for model training. + +![](./images/taxi_export_offline_samples.png) + +After the samples are generated, check the path `/tmp/taxi_tour_features/` for the output. The data can be used directly for model training.
For training, please refer to [Taxi Trip Duration Prediction (OpenMLDB+LightGBM)](../../../use_case/taxi_tour_duration_prediction.md). + +![](./images/taxi_offline_samples_data.png) + +### 4. Online Scenario + +After the SQL has been verified in the offline scenario, the features can be deployed online as a feature service. + +![](./images/taxi_create_feature_service.png) + +The details of the feature service can then be inspected. + +![](./images/taxi_feature_service_detail.png) + +Lastly, on the "Request Feature Service" page, send request data to verify that the online and offline results are consistent. + +![](./images/taxi_request_feature_service.png) + +## Summary + +Implementing the taxi trip duration prediction scenario with FeatInsight is straightforward and more intuitive than using the OpenMLDB command-line tools. It also spares data scientists from setting up an environment, since only a browser is needed. FeatInsight likewise simplifies online feature debugging and feature reuse. diff --git a/docs/en/app_ecosystem/feature_platform/concept.md b/docs/en/app_ecosystem/feature_platform/concept.md deleted file mode 100644 index 8a7a6339644..00000000000 --- a/docs/en/app_ecosystem/feature_platform/concept.md +++ /dev/null @@ -1,12 +0,0 @@ -## Introduction - -The OpenMLDB Feature Platform is a sophisticated feature store service, leveraging [OpenMLDB](https://github.com/4paradigm/OpenMLDB) for efficient feature management and orchestration. - - - -* Feature: Data obtained through feature extraction from raw data that can be directly used for model training and inference. -* Feature View: A set of features defined by a single SQL computation statement. -* Data Table: In OpenMLDB, data tables include online storage that supports real-time queries and distributed offline storage.
-* Online Scenario: By deploying online feature services, it provides hard real-time online feature extraction interfaces using online data. -* Offline Scenario: Uses distributed computing to process offline data for feature computation and exports sample files needed for machine learning. -* Online-Offline Consistency: Ensuring that the feature results computed in online and offline scenarios are consistent through the same SQL definitions. diff --git a/docs/en/app_ecosystem/feature_platform/installation.md b/docs/en/app_ecosystem/feature_platform/installation.md deleted file mode 100644 index 1369a3a297f..00000000000 --- a/docs/en/app_ecosystem/feature_platform/installation.md +++ /dev/null @@ -1,55 +0,0 @@ -## Installation - -### Java - -Download the jar file. - -``` -wget https://openmldb.ai/download/feature-platform/openmldb-feature-platform-0.8-SNAPSHOT.jar -``` - -Prepare the config file which may be named as `application.yml`. - -``` -server: - port: 8888 - -openmldb: - zk_cluster: 127.0.0.1:2181 - zk_path: /openmldb - apiserver: 127.0.0.1:9080 -``` - -Start the feature platform server. - -``` -java -jar ./openmldb-feature-platform-0.8-SNAPSHOT.jar -``` - -### Docker - -Prepare the config file `application.yml` and start the docker container. - -``` -docker run -d -p 8888:8888 -v `pwd`/application.yml:/app/application.yml registry.cn-shenzhen.aliyuncs.com/tobe43/openmldb-feature-platform -``` - -### Compiling from Source - -Clone the source code and build from scratch. - -``` -git clone https://github.com/4paradigm/feature-platform - -cd ./feature-platform/frontend/ -npm run build - -cd ../ -mvn clean package -``` - -Start the server with local config file. 
- -``` -./start_server.sh -``` \ No newline at end of file diff --git a/docs/en/app_ecosystem/feature_platform/tutorial.md b/docs/en/app_ecosystem/feature_platform/tutorial.md deleted file mode 100644 index f7b34200706..00000000000 --- a/docs/en/app_ecosystem/feature_platform/tutorial.md +++ /dev/null @@ -1,9 +0,0 @@ -## Tutorial - -Access the feature platform by navigating to http://127.0.0.1:8888/ using any conventional web browser. - -1. Importing Data: Create databases, create data tables, import online data, and import offline data using SQL commands or frontend forms. -2. Creating Features: Define feature views using SQL statements. The feature platform will use a SQL compiler to analyze the features and create corresponding entities. -3. Offline Scenario: Select the desired features to import. You can choose features from different feature views simultaneously and use distributed computing to import sample files into local or distributed storage. -4. Online Scenario: Select the desired features to go live. Publish them as an online feature extraction service with one click, and then use an HTTP client to request and return online feature extraction results. -5. SQL Debugging: Execute any online or offline computing SQL statement and view the execution results and logs on the web frontend. 
\ No newline at end of file diff --git a/docs/en/app_ecosystem/feature_platform/usage.md b/docs/en/app_ecosystem/feature_platform/usage.md deleted file mode 100644 index 2dfd2d78c85..00000000000 --- a/docs/en/app_ecosystem/feature_platform/usage.md +++ /dev/null @@ -1,2 +0,0 @@ -## Application Guide - diff --git a/docs/en/index.rst b/docs/en/index.rst index 3b6bd1e7599..9e3d03d9638 100644 --- a/docs/en/index.rst +++ b/docs/en/index.rst @@ -22,5 +22,5 @@ OpenMLDB Docs (|version|) :hidden: :caption: 📚 Application Ecosystem - app_ecosystem/feature_platform/index + app_ecosystem/feat_insight/index app_ecosystem/sql_emulator/index \ No newline at end of file diff --git a/docs/en/use_case/taxi_tour_duration_prediction.md b/docs/en/use_case/taxi_tour_duration_prediction.md index fb790441793..550ab26a216 100644 --- a/docs/en/use_case/taxi_tour_duration_prediction.md +++ b/docs/en/use_case/taxi_tour_duration_prediction.md @@ -1,6 +1,6 @@ -# Taxi Journey Time Prediction (OpenMLDB+LightGBM) +# Taxi Trip Duration Prediction (OpenMLDB+LightGBM) -This article will use [The Problem of Predicting Taxi Travel Time on Kaggle](https://www.kaggle.com/c/nyc-taxi-trip-duration/overview) as an example to demonstrate how to use the combination of OpenMLDB and LightGBM to create a complete machine-learning application. +This article will use [New York City Taxi Trip Duration](https://www.kaggle.com/c/nyc-taxi-trip-duration/overview) from Kaggle as an example to demonstrate how to use the combination of OpenMLDB and LightGBM to create a complete machine-learning application. Please note that this document employs a pre-compiled Docker image. If you wish to perform tests in your self-compiled and built OpenMLDB environment, you will need to configure and utilize the [Spark Distribution Documentation for Feature Engineering Optimization](https://github.com/4paradigm/Spark/). 
Refer to the [Spark Distribution Documentation for OpenMLDB Optimization](../tutorial/openmldbspark_distribution.md#openmldb-spark-distribution) and the [Installation and Deployment Documentation](../deploy/install_deploy.md#modifyingtheconfigurationfileconftaskmanagerproperties) for more detailed information. diff --git a/docs/zh/app_ecosystem/feat_insight/images/bigscreen.png b/docs/zh/app_ecosystem/feat_insight/images/bigscreen.png index ff341bb40dc..3529371c83a 100644 Binary files a/docs/zh/app_ecosystem/feat_insight/images/bigscreen.png and b/docs/zh/app_ecosystem/feat_insight/images/bigscreen.png differ diff --git a/docs/zh/app_ecosystem/feat_insight/install/docker.md b/docs/zh/app_ecosystem/feat_insight/install/docker.md index 3625d61bd47..e3d65f0e030 100644 --- a/docs/zh/app_ecosystem/feat_insight/install/docker.md +++ b/docs/zh/app_ecosystem/feat_insight/install/docker.md @@ -2,11 +2,11 @@ ## 介绍 -使用官方构建好的 Docker 镜像, 可以快速部署 OpenMLDB 特征服务. +使用官方构建好的 Docker 镜像, 可以快速部署 FeatInsight 特征服务. ## 内置 OpenMLDB 镜像 -使用内置 OpenMLDB 的镜像,可以一键启动 OpenMLDB 集群和 OpenMLDB 特征服务,无需额外部署即可使用特征服务。 +使用内置 OpenMLDB 的镜像,可以一键启动 OpenMLDB 集群和 FeatInsight 特征服务,无需额外部署即可使用特征服务。 ``` docker run -d -p 8888:8888 registry.cn-shenzhen.aliyuncs.com/tobe43/portable-openmldb @@ -17,7 +17,7 @@ docker run -d -p 8888:8888 registry.cn-shenzhen.aliyuncs.com/tobe43/portable-ope ## 不包含 OpenMLDB 镜像 -使用不包含 OpenMLDB 的镜像,需要提前部署 OpenMLDB 集群,然后启动 OpenMLDB 特征服务容器,部署步骤较繁琐但灵活性高。 +使用不包含 OpenMLDB 的镜像,需要提前部署 OpenMLDB 集群,然后启动 FeatInsight 特征服务容器,部署步骤较繁琐但灵活性高。 首先参考 [OpenMLDB 部署文档](../../../deploy/index.rst) 提前部署 OpenMLDB 集群。 diff --git a/docs/zh/app_ecosystem/feat_insight/install/upgrade.md b/docs/zh/app_ecosystem/feat_insight/install/upgrade.md index 1155258a3ec..f03c8f30267 100644 --- a/docs/zh/app_ecosystem/feat_insight/install/upgrade.md +++ b/docs/zh/app_ecosystem/feat_insight/install/upgrade.md @@ -6,6 +6,6 @@ FeatInsight 对外提供 HTTP 接口,底层依赖 OpenMLDB 数据库存储元 ## 单实例升级步骤 -1. 下载新版本的 OpenMLDB 安装包或 Docker 镜像。 -2. 
停止当前正在运行的 OpenMLDB 特征服务实例。 -3. 基于新版本 OpenMLDB 特征服务包启动新实例。 +1. 下载新版本的安装包或 Docker 镜像。 +2. 停止当前正在运行的 FeatInsight 实例。 +3. 基于新版本 FeatInsight 包启动新实例。 diff --git a/docs/zh/app_ecosystem/feat_insight/introduction.md b/docs/zh/app_ecosystem/feat_insight/introduction.md index 49f477d6f41..2ac0c79c4dc 100644 --- a/docs/zh/app_ecosystem/feat_insight/introduction.md +++ b/docs/zh/app_ecosystem/feat_insight/introduction.md @@ -14,7 +14,7 @@ FeatInsight 包括以下几个主要功能: - [特征管理](./functions/manage_feature.md):用于存储原始特征数据和派生特征数据的存储系统。 - [在线场景](./functions/online_scenario.md):上线特征服务,使用在线数据提供硬实时的在线特征抽取接口。 - [离线场景](./functions/offline_scenario.md):对离线数据进行特征计算并导出样本文件,提供离线样本、任务管理功能。 -- [SQL实验室](./functions/sql_playground.md):可调试和执行任意的 OpenMLDB SQL 语句,使用在线模式或离线模型完成特征计算任务。 +- [SQL实验室](./functions/sql_playground.md):可调试和执行任意的 OpenMLDB SQL 语句,使用在线模式或离线模式完成特征计算任务。 - [预计算特征](./functions/computed_features.md):用户可以通过预计算把特征值直接存入 OpenMLDB 在线表中,然后访问在线表数据进行读写特征。 ## 核心特性 diff --git a/docs/zh/app_ecosystem/feat_insight/quickstart.md b/docs/zh/app_ecosystem/feat_insight/quickstart.md index 88a255b198c..e0f8878ec47 100644 --- a/docs/zh/app_ecosystem/feat_insight/quickstart.md +++ b/docs/zh/app_ecosystem/feat_insight/quickstart.md @@ -78,7 +78,7 @@ SELECT id, dayofweek(trx_time) as trx_day FROM test_table ### 3. 生成离线样本 -在“离线场景”页面,可以选择导出离线样本,只要选择刚创建好的特征和提供导出路径即可,前端还提供了“更多选项”可以选择到处格式、运行参数等。 +在“离线场景”页面,可以选择导出离线样本,只要选择刚创建好的特征和提供导出路径即可,前端还提供了“更多选项”可以选择导出格式、运行参数等。 ![](./images/export_test_offline_samples.png)