
Standard Storage Solution #2

Open
0x4007 opened this issue Jun 4, 2024 · 14 comments

Comments

@0x4007
Member

0x4007 commented Jun 4, 2024

Standardizing Plug-in Data Storage in Organization-Wide Configuration Repository

Objective

Establish a standardized method for storing plug-in data in the .ubiquibot-config repository, ensuring data integrity and security. An additional benefit is that this allows partners full control over their data and decentralizes the data storage.

Specification

Storage Structure

  • Each plug-in will have its own JSON database file.
  • The filename of each JSON database will be the plug-in ID.
  • This ensures that plug-ins cannot tamper with each other's data.

JSON Database Format

  • Each JSON file will store data specific to its corresponding plug-in.
  • The structure within the JSON file is determined by the plug-in's requirements.

Example

For a plug-in with ID @ubiquibot/command-start-stop, the JSON file will be named ubiquibot-command-start-stop.json.

{
    "dataKey1": "value1",
    "dataKey2": "value2",
    ...
}

Access Control

  • The kernel will manage read and write permissions.
  • Write access will be restricted to ensure plug-ins can only modify their own JSON file.
  • Read access can be granted based on plug-in ID, allowing access to other plug-ins' data as needed.
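The access rules above can be sketched as a small kernel-side check. This is an illustrative sketch, not an existing API: `dbFileFor` and `isAllowed` are hypothetical names, and the filename normalization follows the `@ubiquibot/command-start-stop` → `ubiquibot-command-start-stop.json` example from the spec.

```typescript
type Access = "read" | "write";

// The plugin ID doubles as the database filename, with the leading "@"
// dropped and "/" replaced so it is a valid single filename.
function dbFileFor(pluginId: string): string {
  return pluginId.replace(/^@/, "").replace(/\//g, "-") + ".json";
}

function isAllowed(requesterId: string, targetFile: string, access: Access): boolean {
  if (access === "write") {
    // Write access is restricted to the plugin's own JSON file.
    return targetFile === dbFileFor(requesterId);
  }
  // Read access could be granted per plugin ID; this sketch allows all reads.
  return true;
}
```

Under these rules a plugin can never write another plugin's file, which is the tamper-protection the bullet points describe.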

Implementation

  1. Repository Setup

    • Use the .ubiquibot-config repository as the general-purpose utility repository per organization.
    • Configure GitHub App permissions to allow the kernel to manage repository access.
  2. Kernel Configuration

    • Ensure the kernel has write access to the repository.
    • Implement read access control based on plug-in IDs.

Security Considerations

  • Restrict write permissions to prevent unauthorized modifications.

GitHub App Permissions

  • The kernel requires the following GitHub App permissions:
    • Read and write access to the configuration repository.

Benefits

  • Data Integrity and Security: By isolating each plug-in's data in its own JSON file, we ensure that plug-ins cannot interfere with each other’s data.
  • Partner Control: Partners have full control over their data, enhancing privacy and security.
  • Decentralized Storage: Decentralizing data storage minimizes the risk of data breaches and central points of failure.
  • Simplified Development: Standardizing data storage eliminates the need to handle different data providers when developing plugins. Methods in our SDK will make it simple for plugin developers to store and access data.

Summary

By standardizing the storage of plug-in data in separate JSON files named after the plug-in ID, we ensure data integrity and security. The kernel will manage access control, providing a robust framework for plug-in data management, and simplifying the development process for plugin developers.

@0x4007
Member Author

0x4007 commented Jun 4, 2024

First step is to ensure that assumptions are accurate.

  1. Have the kernel push code to the repository when working in another repository.
  2. Be able to read JSON databases from the other plugins.
  3. Do all of this without requiring overly broad permissions.
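Assumption 1 could be exercised with the GitHub contents API (`PUT /repos/{owner}/{repo}/contents/{path}`). A minimal sketch, assuming the kernel holds an installation token; only the request payload is built here, and the helper name is hypothetical. The actual call would go through something like Octokit's `repos.createOrUpdateFileContents`.

```typescript
// Build the payload for committing a plugin's JSON database to the
// partner's .ubiquibot-config repo via the GitHub contents API.
function buildCommitPayload(org: string, pluginDbFile: string, data: unknown, priorSha?: string) {
  // The contents API requires the file body to be base64-encoded.
  const body = Buffer.from(JSON.stringify(data, null, 2)).toString("base64");
  return {
    owner: org,
    repo: ".ubiquibot-config",
    path: pluginDbFile,
    message: `chore: update ${pluginDbFile}`,
    content: body,
    // `sha` of the previous blob is required when updating an existing file.
    ...(priorSha ? { sha: priorSha } : {}),
  };
}
```

Whether this works with a narrowly scoped `contents: write` permission on a single repo is exactly assumption 3 to verify.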

@gentlementlegen
Member

Having something self-contained is a great idea, and it would probably make plugin development easier if we didn't have to spin up a DB instance for each plugin. However, the JSON format might become limiting at some point, which is why I would suggest something more robust like SQLite.

I think ideally plugins should not rely on the kernel for reading their own content, but be responsible for it themselves. However, we will always reach a limitation when it comes to user / wallet retrieval, as this data should be shared across everything; otherwise we would end up with duplicate DBs, which would be difficult to maintain and update.

We still have one remaining issue, which is the storage itself. Whether we use JSON, SQLite, or any file-based system, we need to store / read / update the content. First, it might raise security issues if the data becomes sensitive. Second, it has atomicity requirements, since many runs could occur in parallel.

@0x4007
Member Author

0x4007 commented Jun 11, 2024

On the fence about SQLite. It's nice that it handles so many catastrophic errors out of the box, but I would also rather ensure that plugin development is as easy as possible for new developers.

Auditing a plaintext JSON object is way easier than working with a database or having to find a database viewer. SQLite stores its data in a binary file format.

@gentlementlegen
Member

Yes, that's a nice thing to consider. For me the advantage is also that it is easier to have:

  • generated types based on the schema
  • query engine, so easier to aggregate, sort etc.
  • migration system, if any change in the schema is needed
  • backup and copies
  • security for data loss (ACID)
  • atomicity
  • lower memory consumption, so less resource hungry (JSON would put the whole file in memory)

With JSON, you would need to write a manual script for any schema change. Each plugin would have its own custom query code, which is very error-prone, tedious to maintain, and far less performant. If two plugins access the data at once, or if the server crashes, there is a very high chance of breaking and losing the whole content. All of these reasons make it quite a trade-off just to be able to view the data.
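The schema-change point can be made concrete. With plain JSON, every change needs a hand-written migration like this sketch (a hypothetical v1 → v2 rename of `wallet` to `walletAddress`), whereas SQLite would handle the same change with a single `ALTER TABLE` inside a transaction.

```typescript
// Hypothetical versioned shapes for a plugin's JSON database.
type V1 = { version: 1; wallet: string };
type V2 = { version: 2; walletAddress: string };

// Manual migration every plugin would have to write and maintain itself.
function migrate(db: V1 | V2): V2 {
  if (db.version === 2) return db; // already migrated
  return { version: 2, walletAddress: db.wallet };
}
```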

For me, IntelliJ comes with a built-in viewer for my DB, so I actually never leave my IDE. VS Code has a similar extension to view them:
https://marketplace.visualstudio.com/items?itemName=qwtel.sqlite-viewer

@rndquu
Member

rndquu commented Jun 13, 2024

Plain JSON storage is useful only for really simple and small plugins. It is not scalable at all compared to any RDBMS. Why don't we let plugin developers choose the storage they want (i.e. what they need for a specific task) instead of forcing them to use a solution that covers only a small subset of storage use cases?

@0x4007
Member Author

0x4007 commented Jun 13, 2024

Let's start with plain JSON files and then we can add more advanced support later if needed. None of our existing plugins have any sort of complex data querying needs.

There is no need to over-engineer things "just in case" if we haven't gotten close to those hypothetical problems in a couple of years of R&D for the existing bot capabilities.

@rndquu
Member

rndquu commented Jun 13, 2024

Let's start with plain JSON files and then we can add more advanced support later if needed. None of our existing plugins have any sort of complex data querying needs.

There is no need to over-engineer things "just in case" if we haven't gotten close to those hypothetical problems in a couple of years of R&D for the existing bot capabilities.

My point is that we don't need to add storage support to the SDK at all, since:

  1. We won't cover all possible use cases
  2. We should give plugin developers the freedom to select whatever storage solution they want that also fits the plugin's use case

There is no need to over-engineer

Exactly, there is no need to implement a "save to JSON file" SDK from scratch when plugin developers can set up this feature in an hour using any npm package

@gentlementlegen
Member

My question would be more "where do we store it?", because in my experience so far the major problem I encounter with plugins is where to store the data. Currently I have access to Supabase, but other contributors don't.

Letting developers choose their own solution is OK, but say they somehow choose Neo4j for their plugin, how do we handle that? We should not rely on an external contributor to host their own instance, so we should definitely be in control of the data. Also, JSON would mean anything can read it, and potentially write to it.

@rndquu
Member

rndquu commented Jun 13, 2024

Letting developers choose their own solution is OK, but say they somehow choose Neo4j for their plugin, how do we handle that?

Why do we need to handle it? Let the developers use Neo4j.

we should definitely be in control of the data

We should be in control of the data related only to the core plugins (conversation rewards, permit generation, etc.). We don't need access to 3rd-party plugins' data.

@0x4007
Member Author

0x4007 commented Jun 14, 2024

It is attractive to DAOs especially to decentralize the storage and to allow them to own their own data. In addition, it makes plugin development simple and straightforward for debugging. That is why JSON storage in the utility repository that the bot already requires (.ubiquibot-config) makes sense.

The implementation logic can be any existing framework, that's fine. But it needs to authenticate via the kernel in order to write to the repository.

@Keyrxng
Member

Keyrxng commented Jul 4, 2024

Is it possible for the kernel to restrict the fetching of repo contents down to a specific file? Or is the intention to pass the data via the payload? Via the payload makes custom handling difficult. If it's possible, the public & private repo approach, with restrictions placed on what is shared only for private repo storage, would be great.


I think we should add support for JSON storage out of the box, but from the SDK, not the kernel, as it'll be the most common case and good DX.

Allow custom solutions, but the burden is on the developer to make them easy to integrate with other plugins.

Two new flags:

  • publicStorage
  • accessibleWhilePrivate

The first determines whether it's in the public storage repo, accessible by all. The second determines whether the kernel will allow it to be shared across plugins while in the private repo. This would be ideal if possible, in my opinion.

These flags should make it possible for the org to configure the visibility of aspects of their storage as they see fit without affecting plugin usage.

If publicStorage: false and accessibleWhilePrivate: false, plugins can't access the data so developers must handle output specifically for each plugin.
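The two proposed flags reduce to a small visibility decision, sketched here with illustrative names (`StorageFlags` and `canOtherPluginsRead` are not existing SDK symbols): other plugins can read the data when it lives in the public storage repo, or when the owner opted in to cross-plugin sharing while private.

```typescript
// Per-plugin storage visibility flags as proposed in this comment.
interface StorageFlags {
  publicStorage: boolean;          // stored in the public storage repo?
  accessibleWhilePrivate: boolean; // shareable across plugins while private?
}

// Kernel-side decision: can a *different* plugin read this data?
function canOtherPluginsRead(flags: StorageFlags): boolean {
  return flags.publicStorage || flags.accessibleWhilePrivate;
}
```

With both flags false, `canOtherPluginsRead` returns false, matching the case where developers must hand output to each plugin specifically.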

@0x4007
Member Author

0x4007 commented Jul 5, 2024

We will continue to use the .ubiquibot-config repo. There's no point in making a separate storage repo.

Not sure about the implementation details otherwise, but I did see under the GitHub App settings that you can share a specific file. Maybe there are some other similar permission settings that could be of use.

@Keyrxng
Member

Keyrxng commented Oct 4, 2024

I've implemented an approach for this here.

  • using ubiquibot-config repository
  • targeting the storage branch
  • path looks like: ubiquibot-config/plugin-storage/telegram-bot/<dbObject>.json

I think we should have a couple of global storage objects that all plugins can use, and I think we should have partners create an app dedicated to their storage needs. AFAIK partners currently only need to create one app (the bot itself), isn't that right? I don't think it's too much to ask to create one more that would remove the DB dependency completely; plus we may need it for safer private access.

user-base.json: it makes sense to track a partner's user base globally, as opposed to each plugin having to build up its own user database. Commands like /wallet, /register, etc., whether via GitHub or Telegram, should fetch/push to the same object. 9 out of 10 plugins will need user-base info, as most current plugins do.

Here is the user-base.json I have created for the telegram-bot plugin. It pretty much covers everything we need about a contributor (minus the notification-specific props). I propose that we do something like this detached from any specific plugin. Here we could also store their level, USD earned, etc. as well, maybe?
[screenshot: user-base.json schema]

  • org-ownership.json: If a partner has multiple orgs, like we do, we can use this to store references to the other orgs the partner owns, which we can leverage to build a single storage location. Prior to any fetch/push, we check whether they specify a main org to use for storage. Setting this up could be done via the TG bot or a UI (as we'd need some kind of validation that doesn't use app_private_key; requires more thought). It'll suck having to use /register, /subscribe, /wallet, etc. 4x for new contributors to be able to set themselves up across all of our orgs.

In my mind it may become hard to work with if we keep all storage completely walled off from any plugin other than the one that created it. I feel we need a few globals that can make interoperability more feasible. We could expose DB shapes, locations, etc. via a plugin's manifest (the author decides) to make things really accessible.

Additionally, I think that "GitHub as a storage layer" should be documented entirely separately from the plugins, in our official Ubiquity OS ecosystem docs, and/or extensively in the README if it needs to be a plugin.
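The global objects proposed in this comment could be typed roughly as below. This is a hypothetical sketch: the real user-base.json lives in the linked screenshot and the field names here are illustrative, not the actual schema.

```typescript
// One entry per contributor, shared by all plugins instead of each
// plugin building its own user database.
interface UserBaseEntry {
  githubId: number;
  githubUsername: string;
  telegramId?: number;     // populated via the Telegram bot, if linked
  walletAddress?: string;  // populated via /wallet or /register
}

// user-base.json: keyed by GitHub username (illustrative choice).
type UserBase = Record<string, UserBaseEntry>;

// org-ownership.json: lets a partner with several orgs point all of
// them at a single storage location.
interface OrgOwnership {
  mainOrg: string;     // the org designated for storage
  ownedOrgs: string[]; // the partner's other orgs
}
```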

@0x4007
Member Author

0x4007 commented Oct 5, 2024

Let's make it possible to read other plugins' values, that's fine. The plugin developer simply must specifically request it by ID, like:

const data = storage.get(`ubiquity-os-marketplace/conversation-rewards`);

In other news, I realize that a simple solution for global storage could be to create a dedicated organization with a special repo that is hard-coded into the kernel to fetch from, or something.

For example:

@ubiquity-os-storage/ubiquibot-config/.github/plugin-store/*.json

We can consider making a batch-writing system when we have scaling problems¹

Footnotes

  1. The writes are committed on separate branches based on the org name, then consolidated daily with a cron job or something. We can make reads intelligently look for updates on the org branch first, then fall back to the consolidated results on the global main branch.
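The read path in that footnote amounts to a branch fallback, sketched here with a hypothetical `fetchFromBranch` standing in for a real repo-contents fetch:

```typescript
// Returns the file contents for a branch, or undefined if absent.
type Fetch = (branch: string) => string | undefined;

// Check the per-org branch (fresh, unconsolidated writes) first,
// then fall back to the consolidated global main branch.
function readWithFallback(org: string, fetchFromBranch: Fetch): string | undefined {
  return fetchFromBranch(org) ?? fetchFromBranch("main");
}
```

A real implementation would also need to merge, not just shadow, the org-branch results into the consolidated ones, which is where the daily cron job comes in.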
