NIP-97: Files hosted on relay #719

Open
wants to merge 4 commits into
base: master

ondra-novak

Second attempt (previous attempt #694)

Working demo client and relay
https://nostr-test.novacisko.cz/files/index.html

The relay is reset every midnight UTC, so use it for testing only.

This proposal is built on NIP-94.

A new feature is the ability to retrieve a URL to the binary content through a field added by the relay (since the relay knows where the file is stored).

@ondra-novak ondra-novak changed the title from "nip-97 version 2" to "NIP-97: Files hosted on relay" on Aug 13, 2023
@vitorpamplona
Collaborator

vitorpamplona commented Aug 13, 2023

Super nice!

Since the f tag won't be useful in any other event type, it should not be an indexable, single-letter tag: first, because there is no need to index a tag that has the same value everywhere, and second, because we only have 26 letters to "index" things. On NIP-94 we have url, magnet URIs and, more recently, a streaming tag for m3u8 files. Those tags define how to download the data. This PR could use another name that represents the relay protocol proposed here. Maybe nip97 could be a good tag name.

What if we could make this backward compatible? If f is not a requirement, any user can now download their past NIP-94 content from the centralized website it's currently stored in and send that to the relay that also has the NIP-94 event. If the NIP-94 event has x, m and size, and the hash matches, the relay could store it.

This would be a huge win to decentralize past NIP-94 content that is now completely stuck with centralized services.

Then, in the same way you have the "Retrieve the content by URL" section, you could have a simple tag "nip97":"true" to tell clients the content is available through the RETRIEVE method as well as through the URL.
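
The backfill check described above could look roughly like the following on the relay side. This is a minimal sketch, not part of the proposal: the function names `tag_value` and `relay_can_store` are hypothetical, and it assumes the NIP-94 conventions that `x` holds the hex-encoded SHA-256 of the file and `size` holds its length in bytes.

```python
import hashlib


def tag_value(event, name):
    """Return the first value of the tag called `name`, or None if absent."""
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == name:
            return tag[1]
    return None


def relay_can_store(event, blob):
    """Hypothetical check: accept `blob` only if it matches the event metadata."""
    x = tag_value(event, "x")        # hex SHA-256 of the file (NIP-94)
    m = tag_value(event, "m")        # MIME type (NIP-94)
    size = tag_value(event, "size")  # file size in bytes (NIP-94)
    if event.get("kind") != 1063 or x is None or m is None or size is None:
        return False
    if not str(size).isdigit() or int(size) != len(blob):
        return False
    return hashlib.sha256(blob).hexdigest() == x.lower()
```

The MIME type is only checked for presence here, since it is supplied by the user and cannot be verified from the hash.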

@ondra-novak
Author

Issue with tag 'f' - in reference to my question in a discussion: link

I wanted it to be indexed so that all files hosted on the relay could be found. The original idea was to assign the event a different kind; then the flag wouldn't have to be there. An indexed flag could of course be used elsewhere, but a new NIP would have to be created for it. In effect, it would be an indexed, valueless property, and an event could carry several 'f' tags to be indexed under several such properties. As for my question: it is not currently possible to query for the existence of a tag, and the solution I was redirected to requires an even bigger change at the protocol level than just introducing such a tag.

About single-letter tags: in NIP-94 itself, I have a minor objection to the x tag. I don't see any reason why files should be indexed by their hash. The same goes for the m tag, although there it might make sense to search for files of a certain type.

@ondra-novak
Author

ondra-novak commented Aug 13, 2023

Then in the same way you have the Retrieve the content by URL section, you could have a simple tag "nip97":"true" to tell clients the content is available in the RETRIEVE method as well as in the URL

Only the client can set tags, which it then signs with its key. The client cannot know whether the relay exposes uploaded files at any URL, nor which URL they will end up at. Before the event is sent, even the relay itself may not know this, for example if its storage is distributed (cloud).

The final URL must be set by the relay. The client must also assume that the URL is dynamic, so it is not wrong for the relay to give different clients different URLs for the same file!

If the relay doesn't have an HTTP interface (other than the one needed to accept a websocket connection), the "file_url" value would be absent, indicating that the file is not available over HTTP.

There is also a question in the forum: link.

It looks like it's OK to put a new item in the event where the relay puts the URL from which the file can be retrieved. I chose "file_url" for this purpose. The client does not include this new item in signature validation, and it does not change the event ID.

It would be nice to design this so that there is no difference between the NIP-94 url and "file_url", but unfortunately that is not possible with this arrangement. It is still possible, however, to insert the url tag into the event if the file is available on a centralized service and to upload it to the relay at the same time. Such an event will have both the url tag and the file_url field.

The main purpose of downloading a file via the HTTP interface is to relieve the websocket connection when retrieving binary content, since HTTP is more efficient for this. For example, files can be downloaded in parallel, which can be advantageous when loading timeline posts. The files can then also be cached on a local CDN.

Of course, HTTP access will simplify the design of web clients. However, I do not recommend placing the link directly in the HTML code: the client should verify that the link leads to the file it requested, for example by calculating the hash of the downloaded content and comparing it.
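
The client-side verification recommended above might be sketched as follows. The helper names `matches_hash` and `fetch_verified` are hypothetical, and the sketch assumes the expected value is the hex-encoded SHA-256 taken from the event's `x` tag, as in NIP-94.

```python
import hashlib
import urllib.request


def matches_hash(data: bytes, expected_sha256_hex: str) -> bool:
    """True if the SHA-256 of `data` equals the expected hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_sha256_hex.lower()


def fetch_verified(file_url: str, expected_sha256_hex: str) -> bytes:
    """Download `file_url` and refuse content whose hash does not match."""
    with urllib.request.urlopen(file_url, timeout=30) as resp:
        data = resp.read()
    if not matches_hash(data, expected_sha256_hex):
        raise ValueError("downloaded content does not match the expected hash")
    return data
```

A web client would do the equivalent before handing the bytes to an image element, rather than pointing the element at the relay-supplied URL directly.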

@vitorpamplona
Collaborator

vitorpamplona commented Aug 13, 2023

Only the client can set tags and then signs them with his key.

Sure, but that's not the idea. The idea is to use a new field, outside the event, set by the relay, and thus not participating in the signature verification hash. It's fully controlled by the relay and dynamic in nature. If the relay has the content, it sets it to true, if not, it doesn't even need to exist. If the client sees the flag, it can then opt for the HTTP interface OR the Websockets interface.

Something like this:

{
   "id":"....",
   "kind": 1063,
   "pubkey":"....",
   ...
   "file_url":"https://cdn.myrelay.example.com/file/abc123acfa7e1256a.jpg", // Can download via HTTP
   "nip97": true // Can download via RETRIEVE. 
}

Neither the file_url nor the nip97 participate in the hash to verify the event. They can be changed at will by the relay.
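
To illustrate why relay-added fields do not affect verification: under NIP-01 the event id is the SHA-256 of a serialized array of six fields only, so anything else in the JSON object is ignored. The sketch below assumes Python's `json.dumps` escaping is close enough to the NIP-01 rules for ordinary content (it can differ for unusual control characters); `event_id` is a hypothetical helper name.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Compute the event id from the six signed fields only (NIP-01)."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact JSON, no extra whitespace, UTF-8 encoded.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```

Because `file_url` and `nip97` are never read by this function, a relay can add, change or drop them without invalidating the id or the signature over it.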

About single letter tags. In NIP-94 itself, I have a minor objection to the x tag - I don't see any use there at all, why files should be indexed according to their hash. Likewise the m tag, although it might already make sense to look for files of a certain type there

I agree, those didn't need to be single letters either.

I wanted it to be indexed, for the possibility to find all files hosted on the relay.

I am not sure this is a good idea. The majority of the file servers don't expose a directory of all files on purpose to avoid constant crawling by bots out there.

@ondra-novak
Author

OK, I understand. So it's just a matter of the relay itself adding the information that the event also has binary content, and the client doesn't have to indicate this in advance with any flag. It's a shame that this makes the opposite check more or less impossible: if the client publishes this event without binary content (command EVENT), a relay that supports this NIP will not stop it. But this is not a big deal.

@vitorpamplona
Collaborator

vitorpamplona commented Aug 13, 2023

if the client publishes this event without binary content

I actually think there is a use case for that. Not all relays need to have all the content. In fact, it seems more viable that binary relays would be separate from regular event relays. The Client just needs to figure out where the content is in the same way Clients figure out where events are.

We could even have a nip97-specific relay type on NIP-65's kind 10002. Each user would point to their "home" content provider and clients would use that information to know where to call RETRIEVE on that user's content.
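
As a rough illustration of that idea, a client could read a user's kind 10002 relay list and pick out the relays marked as content providers. NIP-65 defines "r" tags with an optional "read" or "write" marker; the "nip97" marker used below is purely an assumption for this sketch and is not defined anywhere, and `content_relays` is a hypothetical helper name.

```python
def content_relays(relay_list_event: dict, marker: str = "nip97") -> list:
    """Return relay URLs from a kind 10002 event carrying the given marker.

    The "nip97" marker is hypothetical; NIP-65 itself only defines
    "read" and "write".
    """
    if relay_list_event.get("kind") != 10002:
        return []
    return [
        tag[1]
        for tag in relay_list_event.get("tags", [])
        if len(tag) >= 3 and tag[0] == "r" and tag[2] == marker
    ]
```

A client would then call RETRIEVE only on the relays returned for the author of the file event.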

@ondra-novak
Author

We could even have a nip97-specific relay type on NIP-65's kind 10002. Each user would point to their "home" content provider and clients would use that information to know where to call RETRIEVE on that user's content.

I can't decide that.

In general, I thought that the client would honestly check supported_nips and if publishing to a relay that supports NIP-97, use the FILE command, and if publishing to another relay, then use the old way.

It will be interesting if someone has a hybrid list of relays to publish to. Then he will probably have to stick with the old way of publishing.
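
The per-relay decision described here could be sketched like this, assuming each relay's NIP-11 information document has already been fetched and parsed into a dict. The names `publish_method` and `plan_publish` are hypothetical, the number 97 is the one proposed in this PR rather than an assigned one, and "EVENT" stands for the old way (host the file elsewhere and publish a plain NIP-94 event).

```python
def publish_method(relay_info: dict, nip: int = 97) -> str:
    """Pick "FILE" when the relay advertises the NIP, otherwise "EVENT"."""
    return "FILE" if nip in relay_info.get("supported_nips", []) else "EVENT"


def plan_publish(relay_infos: dict) -> dict:
    """Map each relay URL to the publishing method chosen for it."""
    return {url: publish_method(info) for url, info in relay_infos.items()}
```

With a hybrid relay list the plan contains both methods, which is exactly the case where a client may prefer to fall back to the old way for every relay.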

I hope the transition period won't last long.

Any idea how I could insert an icon into my profile via NIP-97? Use a note: link? Is there a link to the event that would also include a relay (as I do not recommend using that URL)? I only know nprofile:, but that is only for users.

@vitorpamplona
Collaborator

vitorpamplona commented Aug 13, 2023

In general, I thought that the client would honestly check supported_nips and if publishing to a relay that supports NIP-97, use the FILE command, and if publishing to another relay, then use the old way.

In the short term, we will likely ask the user if he/she wants to upload to a server OR to the relay, either in the Settings or a per upload toggle. Automating that decision just confuses everyone because most clients won't support this protocol out of the gate and some clients will never support it.

I hope the transition period won't last long

Don't assume people will only use your NIP. We almost never have full consensus in Nostr. So, we always have to deal with clients not implementing NIPs they don't want to. The best strategy is to offer good value when people decide to implement your NIP and just hope for the best.

Any idea how I could insert an icon into my profile via NIP-97.

It's not well-adopted for images yet, but you could use an nevent1 to point to the NIP-94 record. Then when a client sees it, it can download the image from there. But it requires the client to actually implement that routing.

@ondra-novak
Author

I have updated the proposal. The current demo relay doesn't reflect the changes yet (I plan to incorporate them and deploy the new version this evening, CEST).

Collaborator

@Semisol Semisol left a comment

Any form of files over websockets is not going to end well.

@staab
Member

staab commented Aug 14, 2023

I apologize for the drive-by comment, I don't have time to get into this issue, but why are we not still pursuing #547 for this? Consensus seemed much stronger over there. From a web developer's perspective, using HTTP for file retrieval is much better, since the browser does a lot of the heavy lifting for you (image tags, caching, CSP, etc). You're throwing all that away with websockets.

@vitorpamplona
Collaborator

From a web developer's perspective, using HTTP for file retrieval is much better, since the browser does a lot of the heavy lifting for you (image tags, caching, CSP, etc).

This PR allows relays to offer both an HTTP and a WebSocket connection. Clients can choose which they prefer.

@staab
Member

staab commented Aug 14, 2023

I see, missed that the first time. Why would downloading via websocket ever be preferred to http? It seems to me supporting both introduces a lot of complexity that would be better spent on helping relays monetize file storage as in #547. I do like the integration with NIP 94 here.

@ondra-novak
Author

I apologize for the drive-by comment, I don't have time to get into this issue, but why are we not still pursuing #547 for this? Consensus seemed much stronger over there. From a web developer's perspective, using HTTP for file retrieval is much better, since the browser does a lot of the heavy lifting for you (image tags, caching, CSP, etc). You're throwing all that away with websockets.

As I understand it, that proposal is not blocked in any way, it can be implemented at the same time as this proposal, because it is completely outside the NOSTR network.

My idea was that posts with binary content would travel over the NOSTR network. This is not the same as making NOSTR-RELAY another binary repository. If the binary content is part of an event, it can be replicated, cached, forwarded, and any client is able to retrieve it, even if it was deleted from the original server. This is the foundation of decentralization and censorship resistance. Of course this topic would be for a longer discussion, but my take is that it's just an extension of the original idea of NOSTR to media.

In this proposal, the binary protocol is mainly used for the upload, where the user publishes the event with its binary content. Two methods are available for downloading the binary content: the HTTP interface and the websocket connection. The HTTP interface is optional (though I would hope that most relays would implement it), for the simple technical reason that a full HTTP interface is not mandatory for the NOSTR network, while a websocket connection always is.

@ondra-novak
Author

ondra-novak commented Aug 14, 2023

I see, missed that the first time. Why would downloading via websocket ever be preferred to http? It seems to me supporting both introduces a lot of complexity that would be better spent on helping relays monetize file storage as in #547. I do like the integration with NIP 94 here.

I prefer download over HTTP: it has always been faster, and it allows CDNs to be used to distribute content. Download over websocket is a fallback option.
I can imagine a relay without an HTTP interface.
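
The preference order described here (HTTP first, websocket as fallback) can be expressed as a small selection function. This is a sketch under the assumption that the relay decorates events with the `file_url` and `nip97` fields from the example earlier in this thread; the final field names in the proposal may differ, and `choose_download` is a hypothetical helper name.

```python
from typing import Optional


def choose_download(event: dict) -> Optional[str]:
    """Prefer HTTP when the relay supplies a URL, else fall back to websocket."""
    if event.get("file_url"):
        return "http"        # fetch file_url, possibly via a CDN
    if event.get("nip97") is True:
        return "websocket"   # use the RETRIEVE command on the relay connection
    return None              # this relay does not have the binary content
```

A relay without an HTTP interface would simply never set `file_url`, so clients would always land on the websocket branch.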

I can also imagine a bridging service that forwards events from one side of the network to the other using a pure websocket connection, where forcing an http connection for certain content would add complexity.

About monetization and other topics: I only solved the technical part of the issue; other ideas will need to come from other members of the community.

@ondra-novak
Author

Demo relay and client updated.

@pj8912

pj8912 commented Aug 22, 2023

My old idea for file storage (which I called nostrage, for file storage on nostr) is slightly different. It has two types of file storage:

  • Public
  • Private

Public is available for anyone to see and download. Private is encrypted on upload and needs a key to decrypt, which will be unique per file.
The file encryption type depends on the client. It would be better if a client had a dashboard displaying the filename, type, relays used and encryption key as columns.

Similar to the p2p file storage systems we have, except the file is not split into pieces (shards) and distributed across multiple nodes.

Would this idea work with this NIP, since what I thought about is relay-specific?

The relay might monetize private file storage?

@pj8912

pj8912 commented Aug 22, 2023

what is ["aes-256-gcm",<key>, <iv>], in tags?

@vitorpamplona
Collaborator

vitorpamplona commented Aug 22, 2023

["aes-256-gcm",<key>, <iv>]

Don't be too attached to that format. It was proposed months ago, but I don't think anyone is actually using it. We can change it if we want/need to.

@pj8912

pj8912 commented Aug 22, 2023

@vitorpamplona what do you think about #719 (comment) ?? Is it possible in this nip ?

@ondra-novak
Author

I designed this proposal as a technical solution. However one can build on this. You can use, for example, a zap to an account that is set up to collect storage fees.

Encryption is supposed to be done on the client side so that even the provider of the relay can't see the content.

@vitorpamplona
Collaborator

Is it possible in this nip ?

I see no problem with implementing what you described through this NIP.

@ondra-novak
Author

ondra-novak commented Aug 22, 2023

It is a bit difficult for a relay to decide what is private and what is public content. Private content is probably encrypted, but how do you determine whether something is encrypted? You can't trust the MIME type, because it came from the user and may have been entered incorrectly on purpose.

But you could build it on analyzing the binary content (like the file command does in Linux, for example): if the analysis recognizes the file, the relay can assume it is not encrypted and therefore public.
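
A very reduced version of that analysis is to compare the first bytes of the upload against well-known file signatures, which is part of what the file command does. The sketch below covers only a handful of formats; the names `MAGIC`, `sniff_type` and `looks_public` are hypothetical. It is a heuristic only: an unrecognized file is not necessarily encrypted, and an encrypted file could be given a forged header.

```python
# Well-known file signatures (magic numbers) and the type they indicate.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
    b"%PDF-": "application/pdf",
}


def sniff_type(blob: bytes):
    """Return the detected MIME type, or None if no signature matches."""
    for magic, mime in MAGIC.items():
        if blob.startswith(magic):
            return mime
    return None


def looks_public(blob: bytes) -> bool:
    """Heuristic: a recognizable format is assumed to be unencrypted."""
    return sniff_type(blob) is not None
```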

@pj8912

pj8912 commented Aug 22, 2023

The storage type should be explicitly specified on file upload: file_storage_type: public / private.
If public, retrieve the URL to the binary content.
If private:

  • to be viewed/downloaded, display that the file is private and ask for the key.

Of course, the client will check whether the user has the keys stored for decryption, with which the file will be displayed; otherwise it shows an error message requiring the key.

This might work 🤞

@vitorpamplona
Collaborator

vitorpamplona commented Aug 22, 2023

to be viewed/downloaded display that the file is private and ask for the key.

Why would the relay want the key? That is not supposed to happen. The client is supposed to download the encrypted content from the relay and then decrypt it with the key locally. In that way, the key never leaves the device.

You can use NIP-98 (HTTP Auth) to let the user prove they control the key. But that proof won't tell the server if the user can decrypt it. And will definitely not allow the relay to decrypt by themselves.

@pj8912

pj8912 commented Aug 22, 2023

By "ask for the key" I mean the key required for decryption, stored by the user, like a pop-up. Not the relay; this is on the client side. I'll write "key for decryption" next time 😉.

@ondra-novak
Author

The storage type should be explicitly specified on file upload. file_storage_type : public / private ,

You can't build relay monetization based on this "flag". It is set by the user. If you require a fee for private storage, nobody is going to set this flag.

5 participants