CYBER: A General Robotic Operation System for Embodied AI

The development of world models in robotics has long been a cornerstone of advanced research, with most approaches relying heavily on vast, platform-specific datasets. These datasets, while valuable, often limit scalability and generalization to different robotic platforms, restricting their broader applicability.

In contrast, CYBER approaches world modeling from a "first principles" perspective, drawing inspiration from how humans naturally acquire skills through experience and interaction with their environment. CYBER is the first general Robotic Operation System designed to adapt to both teleoperated manipulation and human operation data, enabling robots to learn and predict across a wide range of tasks and environments. It is built with a Physical World Model, a cross-embodied Visual-Language Action Model (VLA), a Perception Model, a Memory Model, and a Control Model to help robots learn, predict, and remember across various tasks and embodiments.

CYBER also provides millions of human operation datasets and baseline models on Hugging Face 🤗 to enhance embodied learning, along with an experimental evaluation toolbox that helps researchers test and evaluate their models in both simulation and the real world.

🌟 Key Features

🛠️ Modular: Built with a modular architecture, allowing flexibility in various environments.
📊 Data-Driven: Leverages millions of human operation datasets to enhance embodied learning.
📈 Scalable: Scales across different robotic platforms, adapting to new environments and tasks.
🔧 Customizable: Allows for customization and fine-tuning to meet specific requirements.
📚 Extensible: Supports the addition of new modules and functionalities, enhancing capabilities.
📦 Open Source: Open-source and freely available, fostering collaboration and innovation.
🔬 Experimental: Supports experimentation and testing, enabling continuous improvement.

🛠️ Modular Components

CYBER is built with a modular architecture, allowing for flexibility and customization. Here are the key components:

🌍 World Model: Learns from physical interactions to understand and predict the environment.
🎬 Action Model: Learns from actions and interactions to perform tasks and navigate.
👁️ Perception Model: Processes sensory inputs to perceive and interpret surroundings.
🧠 Memory Model: Utilizes past experiences to inform current decisions.
🎮 Control Model: Manages control inputs for movement and interaction.

🌍 World Model is now available. Additional models will be released soon.
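
To make the division of labor between the five components concrete, here is a minimal sketch of how they could be chained in one control step. Every name in it (`Agent`, `perceive`, `predict`, `act`, `control`, `step`) is hypothetical and chosen for illustration only; it is not CYBER's actual API, and the toy lambdas merely stand in for real models.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Agent:
    """Hypothetical wiring of the five components; not CYBER's actual API."""

    perceive: Callable[[Any], Any]            # Perception Model: raw input -> observation
    predict: Callable[[Any, List[Any]], Any]  # World Model: observation + history -> predicted state
    act: Callable[[Any], Any]                 # Action Model: predicted state -> high-level action
    control: Callable[[Any], Any]             # Control Model: action -> low-level command
    memory: List[Any] = field(default_factory=list)  # Memory Model: past observations

    def step(self, raw: Any) -> Any:
        obs = self.perceive(raw)
        predicted = self.predict(obs, self.memory)  # uses history *before* this step
        action = self.act(predicted)
        self.memory.append(obs)                     # remember the new observation
        return self.control(action)


# Toy stand-ins so the sketch runs end to end.
agent = Agent(
    perceive=lambda raw: float(raw),
    predict=lambda obs, history: obs + (history[-1] if history else 0.0),
    act=lambda state: "forward" if state > 0 else "stop",
    control=lambda action: {"forward": 1.0, "stop": 0.0}[action],
)
```

Because each component is passed in as a plain callable, any one of them can be swapped without touching the others, which is the property the modular design above is aiming for.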

⚙️ Setup

Pre-requisites

You will need Anaconda installed on your machine. If you don't have it installed, you can follow the installation instructions here.

Installation

You can run the following commands to install CYBER:

bash scripts/build.sh

Alternatively, you can install it manually by following the steps below:

Create a clean conda environment:

 conda create -n cyber python=3.10 && conda activate cyber

Install PyTorch and torchvision:

 conda install pytorch==2.3.0 torchvision==0.18.0 cudatoolkit=11.1 -c pytorch -c nvidia

Install the CYBER package:
```
 pip install -e .
```
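
After installing, it can be useful to confirm that the environment matches the versions pinned in the manual steps. The following is a minimal sketch using only the Python standard library; the helper name `check_versions` is hypothetical, and it assumes the packages register themselves under the distribution names `torch` and `torchvision`.

```python
from importlib import metadata

# Versions pinned in the manual installation steps above.
EXPECTED = {"torch": "2.3.0", "torchvision": "0.18.0"}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu118" are ignored when comparing.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "mismatch"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

Run it inside the activated `cyber` environment; a `mismatch` line suggests the conda step installed a different build than the one pinned above.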

🤗 Hugging Face Integration

CYBER uses Hugging Face to share datasets and models. The datasets listed below are available now; pretrained models will be published on the same platform.

Available Data

Currently, four tasks are available for download:

🤗 Pipette: Bimanual human demonstration dataset of precision pipetting tasks for laboratory manipulation.
🤗 Take Item: Single-arm manipulation demonstrations of object pick-and-place tasks.
🤗 Twist Tube: Bimanual demonstration dataset of coordinated tube manipulation sequences.
🤗 Fold Towels: Bimanual manipulation demonstrations of deformable object folding procedures.
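
As a rough sketch of how one of these tasks could be fetched programmatically, the snippet below wraps `snapshot_download` from the `huggingface_hub` package. The repository IDs are deliberately left as placeholders because they are not stated here, and the helper names `local_dir_for` and `download_task` and the `data/` layout are assumptions made for illustration.

```python
from pathlib import Path

# Task names come from the list above. The repository IDs are PLACEHOLDERS:
# replace them with the real IDs from the project's Hugging Face page.
TASK_REPOS = {
    "Pipette": "<org>/<pipette-dataset>",
    "Take Item": "<org>/<take-item-dataset>",
    "Twist Tube": "<org>/<twist-tube-dataset>",
    "Fold Towels": "<org>/<fold-towels-dataset>",
}


def local_dir_for(task, root="data"):
    """Map a task name to a local folder, e.g. 'Fold Towels' -> data/fold_towels."""
    return Path(root) / task.lower().replace(" ", "_")


def download_task(task, root="data"):
    """Download one task's dataset repository into its local folder."""
    repo_id = TASK_REPOS[task]
    if "<" in repo_id:
        raise ValueError(f"Set the real Hugging Face repository ID for {task!r} first")
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(task, root)
    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=target)
    return target
```

Once the placeholders are filled in, `download_task("Pipette")` would place the files under `data/pipette`.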

Available Models

Our pretrained models will be released on Hugging Face soon:

Cyber-World-Large (Coming Soon)
Cyber-World-Base (Coming Soon)
Cyber-World-Small (Coming Soon)

Using the Models (Coming Soon)

🕹️ Usage

Please refer to the `experiments` directory for more details on data downloading and model training.

💾 File Structure

├── ...
├── docs                   # documentation files and figures 
├── docker                 # docker files for containerization
├── examples               # example code snippets
├── tests                  # test cases and scripts
├── scripts                # scripts for setup and utilities
├── experiments            # model implementation and details
│   ├── configs            # model configurations
│   ├── models             # model training and evaluation scripts
│   ├── notebooks          # sample notebooks
│   └── ...
├── cyber                  # compression, model training, and dataset source code
│   ├── dataset            # dataset processing and loading
│   ├── utils              # utility functions
│   └── models             # model definitions and architectures
│       ├── action         # visual language action model
│       ├── control        # robot platform control model
│       ├── memory         # lifelong memory model
│       ├── perception     # perception and scene understanding model
│       ├── world          # physical world model
│       └── ...
└── ...

📕 References

MAGVIT2 and GENIE are adapted from the 1xGPT Challenge: 1X Technologies. (2024). 1X World Model Challenge (Version 1.1) [Data set].

@inproceedings{wang2024hpt,
  author    = {Lirui Wang and Xinlei Chen and Jialiang Zhao and Kaiming He},
  title     = {Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers},
  booktitle = {NeurIPS},
  year      = {2024}
}

@article{luo2024open,
  title={Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation},
  author={Luo, Zhuoyan and Shi, Fengyuan and Ge, Yixiao and Yang, Yujiu and Wang, Limin and Shan, Ying},
  journal={arXiv preprint arXiv:2409.04410},
  year={2024}
}

📄 Dataset Metadata

| property | value |
| --- | --- |
| name | CyberOrigin Dataset |
| url | https://github.com/CyberOrigin2077/Cyber |
| description | Cyber represents a model implementation that seamlessly integrates state-of-the-art (SOTA) world models with the proposed CyberOrigin Dataset, pushing the boundaries of artificial intelligence and machine learning. |
| provider | name: `CyberOrigin` |
| license | name: `Apache 2.0` |

📫 Contact

If you have technical questions, please open a GitHub issue. For business development or other collaboration inquiries, feel free to contact us through email 📧 ([email protected]). Enjoy! 🎉

