-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compartmentalization/files API simplification #93
Comments
I like the idea from a code perspective, but I'm still a bit unsure about the memory-use implications. Memory use would basically be unbounded (depending on how many open ActivePieces there are, and the size of the torrent's pieces). Although I guess if that were to become a problem, it could be mitigated through other means (off the top of my head, maybe choose at runtime if ActivePiece should use a []byte or file for storage), and we would still have the benefit of simpler code.
BitTorrent file distribution "works" because no client can/should be "purely leeching". Otherwise, looks good to me. |
100% agree, this proposal makes memory usage unbounded. Here's another proposal: keep an upper boundary of memory used for all ActivePieces, and in RequestBlock, prevent from creating a new ActivePiece if we have already reached the limit. Somewhere around here, check if the sum of active pieces' content + the potential new piece goes beyond the threshold; if true, don't request a new piece, just wait for the other ones to be retrieved, committed to disk and then discarded for memory to be available. As you can guess, my goal is to avoid mixing parts as much as possible to keep everything simple (and, as you raised, under control).
If you're talking about "bittorrent the community", then I totally agree with you. However I'd like to see bittorrent used beyond the "community" use case, because the transport itself is extremely powerful; I'd like bittorrent to be the default transport protocol for distributing Wikipedia dumps, or Internet Archive items, or OSM tiles (or raw data), or my distribution packages. For those cases I expect users to be purely leeching content until completion, then leave the swarm (which would be kind of ok because there will always be an available seed, the upstream project). I'd expect them to be barely seeding content. Basically, I'd like bittorrent to be used when a client can reliably get the torrent (or even just the magnet) to retrieve data wherever it comes from, without having to seed just to improve its choking score. Anyway, point taken, TT should remain "community" oriented in that it should not be directed towards pure leeching (that's a fair assumption). |
I like this, it seems like the most correct approach: "Don't []byte off more than we can chew" :-) . |
A while back rakoo, in issue #93, proposed some improvements and simplifications that could be done to how TT handles active pieces, caching, and FileStore. This commit starts on implementing a few.
Hey there,
While working on the files subset of the code (specifically for putting everything files related, so various
*files.go
, various*cache.go
,sftp.go
) I realized that there is room to make the code simpler, more understandable. Here's a rough list of things I have in mind:Modification of active pieces
A
torrent.ActivePiece
would keep a[]byte
with the whole piece content. In https://github.com/jackpal/Taipei-Torrent/blob/master/torrent/torrent.go#L1061 we don't write the chunk to theFileStore
but to the active piece's buffer; then in https://github.com/jackpal/Taipei-Torrent/blob/master/torrent/torrent.go#L844 we don't read content back from theFileStore
but directly from the active piece's buffer.pieceLength
)FileStore
(which could be slow if it's SFTP and you don't have a cache for some reason), the piece is verified directly from memory. Also, checkPiece and computePieceSum aren't needed anymore.FileStore
is remoteFileStore
doc thatio.WriteAt
is only called for full, verified pieces, and that they aren't read back unless asked by another peerFileStore
as the only relevant outside interfaceFileStore
would get back to what it was before, ieA
TorrentCache
would actually be a specific implementation of theFileStore
interface that wraps anotherFileStore
; there would be no specific interface just for it. From the point of view of the user, there's no indication that theFileStore
is actually a cache.CacheProvider
would be changed so thatNewCache
also has an underlyingFileStore
as argument. All reads are forwarded (if necessary), all writes are forwarded.Combined with the previous change in active pieces, there is no need for a specific
Commit
phase forFileStore
, because all writes are "clean" ie they're only full pieces. Any write of a piece to cache should be immediately forwarded to the underlying FileStore. This will also get rid of all theunfulfilled
, because the cache wraps theFileStore
; the cache would just read the whole piece fromFileStore
and store it locally, then write only the relevant part for the caller to use.I am working on this in my
file
branch (see the storage sub-package) if you want to have a look.Compartmentalization
Instead of having pretty much everything in the
torrent
package, I'd like to move the files stuff in its own package (I have astorage
package). I also externalized thebitset
.Thoughts ?
The text was updated successfully, but these errors were encountered: