vlm
Here are 109 public repositories matching this topic...
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents in acing any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
Updated Sep 19, 2024 - Python
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Updated Aug 23, 2024 - Python
A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Updated Sep 18, 2024
Nexa SDK is a comprehensive toolkit for ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS).
Updated Sep 20, 2024 - Python
Aircraft design optimization made fast through computational graph transformations (e.g., automatic differentiation). Composable analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
Updated Sep 9, 2024 - Jupyter Notebook
A curated list of 3D vision papers relating to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites.
Updated Sep 9, 2024
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Updated Jul 25, 2024 - Python
Famous Vision Language Models and Their Architectures
Updated Sep 8, 2024 - Markdown
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Updated Jul 26, 2024
EVE: Encoder-Free Vision-Language Models
Updated Jul 20, 2024 - Python
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Updated Sep 7, 2024 - Jupyter Notebook
Awesome LLM papers and repos covering a comprehensive range of topics.
Updated Aug 22, 2024
Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
Updated Sep 9, 2023 - Python
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Updated Sep 9, 2024 - Python
Seamlessly integrate state-of-the-art transformer models into robotics stacks
Updated Sep 18, 2024 - Python
LLaRA: Large Language and Robotics Assistant
Updated Sep 3, 2024 - Python