NIP-95 Revisit #1145

Closed
wants to merge 10 commits into from

Conversation

arthurfranca
Contributor

Read here

I was reading the new comments on #345 and came up with this spec.

Differences:

  • a random pubkey (not the uploader's), used just once as the author of the file event(s), is what identifies the file
  • a file can be made of multiple chunks, all authored by that same pubkey
  • a new NIP-65 flag to configure the user's "file relays", because most relays won't accept NIP-95 events
  • an nfile entity

@vitorpamplona
Collaborator

File relays and nfile are good ideas, but why the random pubkey? I don't get it.

@arthurfranca
Contributor Author

but why the random pubkey?

The NIP is missing a section that I'm going to add soon. The idea is that the same chunk event set may have multiple "owners/uploaders" to avoid duplication. Because of that, one reason for using a random pubkey is that the chunk event author does not need to be the main pubkey of the user who first uploaded the file.

The random pubkey would be used as the author of all chunk events of a single file and should never be reused as the author of anything else. To get all chunks, the client filters by { authors: ["<the-random-pubkey>"], kinds: [1064] }. This way the pubkey identifies this "version" of a file (the same file can have many "versions", e.g. splitting it into 3 chunks, 1 chunk, or 10 chunks gives 3 versions).

But some users may misbehave and reuse the key, so maybe it is not a good way to group all the chunks of a file.

@vitorpamplona
Collaborator

Interesting, so the point is to find a way to query the chunks of a single file without knowing all the event ids beforehand and without allowing other people to add a malicious chunk in the middle of your file:

{ authors: ["<key-per-file>"], kinds: [1064] }

Meaning: If the event header has all the ids in a list, no one can add a malicious chunk, but you have to create a filter with all the ids from that event and that filter can be huge.

{ ids: [<huge list of chunk event ids>] }

Alternatively, chunk events can tag an unbound list. But, since it's very easy for anyone else out there to create a new malicious chunk and also point to the same list, the filter must include the author of the header:

{ "#n": ["<list name>"], authors: ["<owner>"], kinds: [1064] }
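
To make the trade-off concrete, here is a rough Python sketch of the three filter shapes discussed above. It assumes the NIP-01 filter field names (`ids`, `authors`, `kinds`, `#n`); the helper names, the 1000-chunk file, and the placeholder keys are hypothetical and not part of any NIP.

```python
import json

CHUNK_KIND = 1064  # kind used for file chunks in this draft


def filter_by_file_key(key_per_file: str) -> dict:
    """One throwaway pubkey per file: the author alone selects the chunks."""
    return {"authors": [key_per_file], "kinds": [CHUNK_KIND]}


def filter_by_ids(chunk_ids: list[str]) -> dict:
    """Header lists every chunk id: nobody can inject, but it grows with the file."""
    return {"ids": list(chunk_ids)}


def filter_by_unbound_list(owner: str, list_name: str) -> dict:
    """Chunks tag a named list; pinning the author keeps strangers' chunks out."""
    return {"#n": [list_name], "authors": [owner], "kinds": [CHUNK_KIND]}


def req_size(nostr_filter: dict) -> int:
    """Bytes of the JSON-encoded filter, a rough proxy for the REQ message size."""
    return len(json.dumps(nostr_filter, separators=(",", ":")))


# Hypothetical file split into 1000 chunks, each with a 64-hex-character event id.
ids = [f"{i:064x}" for i in range(1000)]
by_ids = filter_by_ids(ids)
by_list = filter_by_unbound_list("a" * 64, "b" * 64)
```

In this sketch the `ids` filter grows by roughly 67 bytes per chunk, while the other two stay constant in size regardless of how many chunks the file has.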

@vitorpamplona
Collaborator

vitorpamplona commented Mar 29, 2024

n could be the file hash that is already computed for the header event.

{ "#n": ["<full file hash>"], authors: ["<owner>"], kinds: [1064] }

Collaborator

@vitorpamplona vitorpamplona left a comment

Frankly this random pubkey business seems like a lot of work for what could be achieved with unbound lists.

95.md Outdated

## Upload

To upload a file, first client must convert its bytes to base64. It may do it in chunks made of multiples of 3 bytes or in one go.
Collaborator

3 bytes?

Contributor Author

Multiples of 3 because base64 encodes each group of 3 bytes into 4 characters, so a chunk whose size is a multiple of 3 encodes without padding. It could be 255000 bytes per chunk, for example.
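
A small Python sketch of why the multiple-of-3 rule matters: when every chunk except possibly the last is a multiple of 3 bytes long, the per-chunk base64 strings can simply be concatenated and still decode to the original file. The function name and the tiny 9-byte chunk size are illustrative assumptions, not taken from the draft.

```python
import base64

CHUNK_SIZE = 9  # any multiple of 3; kept tiny so the example is easy to follow


def encode_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into chunk_size-byte pieces and base64-encode each piece."""
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    return [
        base64.b64encode(data[i:i + chunk_size]).decode("ascii")
        for i in range(0, len(data), chunk_size)
    ]


data = bytes(range(25))   # 25 bytes -> chunks of 9, 9 and 7 bytes
chunks = encode_chunks(data)
joined = "".join(chunks)  # equals the base64 of the whole file
```

Only the final chunk can carry `=` padding, so a downloader can append chunk contents in order and decode once at the end.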

Contributor Author

Bad English; I tried to improve the text.

95.md Outdated
Client should upload to user's "file relays", which use the [NIP-65](65.md) `f` flag.
When downloading a file uploaded with this NIP, it should search on the uploader's "file relays".

**Relays must NOT honor `kind:5` deletion events referencing file chunk events.** Deletion
Collaborator

No need for this. I might want to get a separate chunk of the file in each relay. Deleting chunks should be possible and it should be fine.

Contributor Author

I thought of the following flow, maybe it is bogus:

  • userA uploads a file.
  • Another user, userB, sees it in his client and asks the client to copy and upload it too (to make it his file), and both users happen to use the same file relay.
  • Instead of re-uploading it, userB's client just registers userB as an owner/uploader of the file that is already on the relay.

Now that userA and userB both own the file, we can't let userA delete the chunks. userA can only unregister himself as an uploader/owner. The file is deleted once there is no registered uploader left.

Collaborator

Are you trying to register many owners for shared chunks and only allow deletion when all owners request or stop using the chunk?

Contributor Author

Yes, anyone the file relay authorizes can become an owner ("uploader") of a file chunk set.

There is an "uploader event" for that. When no "uploader event" is present on the relay anymore (deleted with kind:5), the file relay is free to automatically delete the chunk set (not using kind:5 here, just auto-deletes).

95.md Outdated
- `["OK", "<kind:1065-event-id>", true, "uploaded: ..."]`: The corresponding `kind:1064` file chunks are already uploaded, trying to re-upload them will fail;
- `["OK", "<kind:1065-event-id>", true, "upload: Missing chunks 1, 2, 7, 10"]`: File isn't uploaded yet or incomplete, user is allowed to upload it on this ws connection;

Trying to send a `kind:1064` event before a `kind:1065` one should fail.
Collaborator

I don't like these custom behaviors for relays.

Contributor Author

The reason for this is that a file relay can't let a client send a big event (a file chunk) only to reply later that the client/user had no authorization to do so. That wastes relay resources, so I imagine a file relay will start with a capped max ws message size for every new ws connection until it sees a kind:1065 event "asking" for authorization to upload kind:1064 events.
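
A rough Python sketch of that per-connection gate, assuming each incoming message is just the event JSON (the real `["EVENT", ...]` envelope and signature checks are left out). The size limits, the `Connection` class, and the response strings are all hypothetical values chosen for illustration.

```python
import json

SMALL_LIMIT = 16 * 1024    # hypothetical cap for a fresh connection
UPLOAD_LIMIT = 512 * 1024  # hypothetical cap once an upload is authorized


class Connection:
    """Per-websocket state: the size cap is raised only after a kind:1065 event."""

    def __init__(self, allowed_uploaders: set[str]) -> None:
        self.allowed_uploaders = allowed_uploaders
        self.max_message_size = SMALL_LIMIT

    def handle(self, raw: str) -> tuple[bool, str]:
        # Reject oversized messages before spending any effort parsing them.
        if len(raw.encode("utf-8")) > self.max_message_size:
            return False, "error: message too large"
        event = json.loads(raw)
        if event["kind"] == 1065:
            if event["pubkey"] not in self.allowed_uploaders:
                return False, "restricted: not authorized to upload"
            self.max_message_size = UPLOAD_LIMIT
            return True, "upload: authorized"
        if event["kind"] == 1064 and self.max_message_size < UPLOAD_LIMIT:
            return False, "blocked: send kind:1065 first"
        return True, ""


conn = Connection(allowed_uploaders={"alice"})
big_chunk = json.dumps({"kind": 1064, "pubkey": "alice", "content": "A" * 100_000})
before = conn.handle(big_chunk)  # rejected: connection is still capped
grant = conn.handle(json.dumps({"kind": 1065, "pubkey": "alice", "content": ""}))
after = conn.handle(big_chunk)   # accepted: cap was raised by the kind:1065 event
```

The design choice here is that the cheap size check runs first, so an unauthorized client never gets the relay to buffer or parse a large chunk.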

@NfNitLoop

This comment was marked as off-topic.

@arthurfranca
Contributor Author

@NfNitLoop I get your point, but authors aren't listed on NIPs anymore. I did steal many ideas while adding some of my own, and glued them together here because many changes needed to be placed together to be cohesive; it would be hard to explain them and ask for them to be considered separately over at #345.

My goal is solely to help come up with the best version of a NIP we can. This one is my vision of what it could look like.

I can change the NIP number and the kinds, it doesn't matter. I just put the text here for whoever may be interested in discussing whether it is better, worse, could be improved further, or should be ditched in favor of a better version.

@arthurfranca
Contributor Author

@vitorpamplona I think I may have confused you by reusing the kind:1065 to mean something other than file metadata.

On this NIP, kind:1065 means "uploader" and would look like this:

{
  kind: 1065,
  pubkey: "<uploader-main-pubkey>",
  ...,
  tags: [
    ["f", "<key-per-file>"]
  ]
}

There could be a NIP-94 event, or a copy with another kind number like you suggested, that would carry the metadata tags, like:

{
  kind: 10xx,
  pubkey: "<user-main-pubkey>",
  ...,
  tags: [
    ["f", "<key-per-file>"],
    ["nip95u", "<uploader-main-pubkey>"], // most times it is the same as "<user-main-pubkey>"
    ["size", "..."],
    ["dim", "..."],
    ["blurhash", "..."]
  ]
}

A NIP-94 event isn't required, because nfile, with or without NIP-54 inline metadata, could be used instead inside a kind:1, for example.

I will change all the kind numbers.

@vitorpamplona
Collaborator

The gains of using random pubkeys to represent files are still not clear to me though.

@arthurfranca
Contributor Author

@vitorpamplona It is true that a file chunk set fits well into the unbound list spec, that addresses a set with the owner pubkey + n tag (that could be set to the sha256 as you said).

But it may not be a perfect fit, because a pubkey may upload the same file (same hash) twice with a different set of chunks. I don't know why it would want to, but it is possible.

Example:
First time it sends 3 chunks of 9 bytes
Second time it sends 1 chunk of 27 bytes

Now we've got two versions of the same file: 4 chunks that shouldn't belong to the same unbound list.

@vitorpamplona
Collaborator

vitorpamplona commented Mar 29, 2024

Yeah, that would require not using the hash of the file as the name of the unbound list, but we could do hash+blocksize as a name and then have 2 tag entries in the header event pointing to each unbound list. The receiver can choose which one to download.
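
As a rough illustration of the hash+blocksize naming, the Python sketch below derives one list name per chunking of the same file. The `<sha256>:<block size>` format with a colon separator and the use of an `n` tag in the header event are assumptions made for this example; the thread does not fix either.

```python
import hashlib


def list_name(data: bytes, block_size: int) -> str:
    """Name an unbound list after the file hash plus the chunk size used."""
    return f"{hashlib.sha256(data).hexdigest()}:{block_size}"


def header_tags(data: bytes, block_sizes: list[int]) -> list[list[str]]:
    """One tag entry per available chunking; the receiver picks one to download."""
    return [["n", list_name(data, size)] for size in block_sizes]


data = bytes(27)                   # the 27-byte file from the example above
tags = header_tags(data, [9, 27])  # 3 chunks of 9 bytes, or 1 chunk of 27 bytes
```

Both names share the same hash prefix, so a client can still tell they describe the same file while keeping the two chunk sets in separate lists.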

@arthurfranca
Contributor Author

Right, I will edit it.

@arthurfranca
Contributor Author

@vitorpamplona now it is using an unbound list. One thing left is adding nfile to NIP-19, but I'm not sure yet how to do it.

@NfNitLoop @frbitten what do you think of this version of NIP-95?

@arthurfranca
Contributor Author

Reviewing this I think it has some problems:

  1. on upload: serializing a somewhat big payload (chunk event) to sign it is bad;
  2. on download: the whole file chunk content has to be put in memory at once to send it as a nostr event, though small chunks wouldn't be that much of a problem;
  3. storage: it feels wrong to possibly store many versions (sets of chunks with different chunk sizes) of the same file (same sha256). Somehow the identifier should be just the sha256 hash instead of the hash plus chunk size;
  4. of course the base64 encoding/decoding step isn't ideal either;

NIP-96 is better. #719 may be good too

@arthurfranca arthurfranca deleted the nip-95-revisit branch May 9, 2024 15:46
3 participants