An embedding is a relatively low-dimensional space into which you can translate high-dimensional vectors. Embeddings make it easier to do machine learning on large inputs, such as sparse vectors representing words. Ideally, an embedding captures some of the semantics of the input by placing semantically similar inputs close together in the embedding space. An embedding can be learned and reused across models. (Embeddings Ref)
Embeddings are commonly used for (OpenAI Ref):
- Search
- Clustering (grouped by similarity)
- Recommendations
- Anomaly detection
MediaPipe is a set of on-device machine learning libraries ready for production deployment, with support for Android, iOS, Web, and Python. One of its solutions, Text Embedding, creates a numerical representation of text data: an embedding.
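As a rough sketch of what that looks like on Android with the MediaPipe Tasks Text library (the model asset name and function names here are illustrative assumptions, not this repository's exact code):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder.TextEmbedderOptions

// Builds a TextEmbedder backed by a TFLite embedding model bundled in the app's assets.
// The model file name is an assumption; use the model shipped with your app.
fun createTextEmbedder(context: Context): TextEmbedder {
    val baseOptions = BaseOptions.builder()
        .setModelAssetPath("universal_sentence_encoder.tflite")
        .build()
    val options = TextEmbedderOptions.builder()
        .setBaseOptions(baseOptions)
        .build()
    return TextEmbedder.createFromOptions(context, options)
}

// Turns a piece of text into its embedding (a numeric vector).
fun embedText(embedder: TextEmbedder, text: String) =
    embedder.embed(text).embeddingResult().embeddings().first()
```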
This repository contains open-source code that uses MediaPipe to create text embeddings and cosine similarity as the similarity measure, written in Kotlin for Android.
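Once two texts have been embedded, their similarity can be measured with cosine similarity. MediaPipe also exposes `TextEmbedder.cosineSimilarity(...)` as a utility for its `Embedding` objects; the plain-Kotlin sketch below shows the same computation over raw float vectors (illustrative only, not this repository's exact code):

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}
```

A result close to 1 means the two texts are semantically similar; values near 0 mean they are largely unrelated.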
Made with ❤ by jggomez.
Copyright 2024 Juan Guillermo Gómez
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.