Luca and I were recently discussing the possibility of modifying the dhtdump process to perform a continuous dump to disk. It was an interesting idea, so I thought I'd write up what we discussed, for possible future consideration.
It could work something like the following:
Rather than performing a periodic GetAll on each channel, dhtdump opens a special Listen request to the node. (Something like the ListenMulti request discussed in https://github.com/sociomantic/swarm/issues/303; this would, in fact, be ListenAll.)
This request would also need to inform the client of removed records, and should be able to send data in compressed blocks (in this case, update speed is probably less important than bandwidth).
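To illustrate the compressed-block idea, here is a minimal sketch in Python: updates are buffered into a batch, serialized, and compressed as a single block before being sent, trading update latency for bandwidth. The serialization format and function names are illustrative assumptions, not part of any existing swarm protocol.

```python
import json
import zlib

def compress_batch(records):
    # Serialize a batch of (key, value) updates and compress the whole
    # block; the node would send one such block instead of one message
    # per record. (Format is a placeholder for illustration.)
    payload = json.dumps(records).encode("utf-8")
    return zlib.compress(payload)

def decompress_batch(block):
    # Inverse operation, run on the dhtdump side.
    return json.loads(zlib.decompress(block).decode("utf-8"))

batch = [["key1", "record one"], ["key2", "record two"]]
block = compress_batch(batch)
assert decompress_batch(block) == batch
```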
Upon receiving updates from the listener, dhtdump would apply them to a (disk-based) Tokyo Cabinet hash database. In this way, an up-to-date snapshot of the data is maintained in real time on disk.
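A minimal sketch of how dhtdump might apply listener events to the on-disk snapshot, using Python's dbm module as a stand-in for a Tokyo Cabinet hash database (the "put"/"remove" event names are assumptions for illustration):

```python
import dbm
import os
import tempfile

def apply_update(db, event, key, value=None):
    # Mirror each listener event into the on-disk hash database so the
    # snapshot tracks the in-memory node contents in real time.
    if event == "put":
        db[key] = value
    elif event == "remove" and key in db:
        del db[key]

path = os.path.join(tempfile.mkdtemp(), "snapshot")
with dbm.open(path, "c") as db:
    apply_update(db, "put", b"abc", b"record-1")
    apply_update(db, "put", b"def", b"record-2")
    apply_update(db, "remove", b"abc")
    assert b"abc" not in db
    assert db[b"def"] == b"record-2"
```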
Upon shutting down, the node would have to make sure to flush all pending updates to the listeners, to avoid losing data.
The dhtnode would need to be modified to support loading from the TC hashtable disk format into memory.
The big (possible) advantages of such an approach would be:
No big delay when shutting down a dht node.
(In principle) much lower risk of losing data due to crashes: rather than being dumped once every 6 hours (as at present), the data would be flushed to disk very regularly.
Notes:
We may have to check for applications that write unmodified records to the dht. Sending such records to the listener would be a waste of effort. (The special listen request could have the facility to ignore Puts which don't modify the record.)
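The filtering of unmodified Puts could be sketched as follows; this is a hypothetical node-side check (not existing dhtnode code) that compares the incoming value against the stored one before forwarding to listeners:

```python
def should_forward(store, key, new_value):
    # Only forward a Put to listeners if it actually changes the record;
    # a Put that rewrites the existing value is dropped.
    return store.get(key) != new_value

store = {}
forwarded = []
for key, value in [("k", "v1"), ("k", "v1"), ("k", "v2")]:
    if should_forward(store, key, value):
        forwarded.append((key, value))
    store[key] = value

# The duplicate ("k", "v1") Put was not forwarded.
assert forwarded == [("k", "v1"), ("k", "v2")]
```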
We'd need some way of distinguishing the new disk-based hashtable file format from the old dump format. This may be as simple as detecting the difference between a set of files in a sub-dir vs a single .tcm file. (I'm not sure what file format the TC hashtable database uses.)
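If the detection really does come down to "set of files in a sub-dir vs a single .tcm file", it could look something like this sketch (the format names and the assumption that the new format lives in a directory are both hypothetical):

```python
import os
import tempfile

def detect_format(path):
    # Guess which on-disk format a channel uses. The new disk-based
    # hashtable is assumed to live in a sub-directory; the old dump
    # format is a single .tcm file.
    if os.path.isdir(path):
        return "tc-hash-db"
    if path.endswith(".tcm"):
        return "legacy-dump"
    return "unknown"

tmp = tempfile.mkdtemp()
channel_dir = os.path.join(tmp, "channel1")
os.mkdir(channel_dir)
assert detect_format(channel_dir) == "tc-hash-db"
assert detect_format(os.path.join(tmp, "channel2.tcm")) == "legacy-dump"
```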
The major risk would be in the data rate (i.e. the stream of modified records) becoming too high for the listening connection to handle. What exactly to do in this case is unclear and would require more thought.
A big advantage of using the Tokyo Cabinet disk database is that it supports transaction-based writing. So, an update can be written cleanly, with no risk of corruption if a crash occurs halfway through.
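The transactional property can be illustrated with Python's sqlite3 standing in for Tokyo Cabinet's transaction API: either the whole batch of updates is applied, or none of it is, so a crash mid-batch leaves the previous snapshot intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (key TEXT PRIMARY KEY, value BLOB)")

def apply_batch(conn, batch):
    # `with conn` opens a transaction: it commits if the block completes,
    # and rolls back if an error is raised partway through. Tokyo
    # Cabinet's transactions would play the same role for the dump file.
    with conn:
        for key, value in batch:
            conn.execute(
                "INSERT OR REPLACE INTO records VALUES (?, ?)", (key, value))

apply_batch(conn, [("a", b"1"), ("b", b"2")])
assert conn.execute("SELECT COUNT(*) FROM records").fetchone()[0] == 2
```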
A quick experiment with the TC disk table shows that puts can occasionally block for a long time. In tests with 500K puts, we saw one or two requests take 1s or longer (sometimes 4 or 5s) to complete. Despite the function name (tchdbputasync), TC doesn't implement any kind of asynchronous writing internally.
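The measurement itself is simple to reproduce; a sketch of the kind of harness used (the `put` callable and the 1s threshold are the only assumptions here):

```python
import time

def timed_puts(put, items, threshold=1.0):
    # Run each put and record any whose latency exceeds `threshold`
    # seconds; with the TC disk table, a handful of 500K puts were
    # observed to stall like this.
    slow = []
    for key, value in items:
        start = time.monotonic()
        put(key, value)
        elapsed = time.monotonic() - start
        if elapsed >= threshold:
            slow.append((key, elapsed))
    return slow

# Against a plain dict, no put stalls, so nothing is reported.
store = {}
assert timed_puts(store.__setitem__, [("k%d" % i, i) for i in range(1000)]) == []
```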
We concluded that the best approach to try next would be an enhanced dhtdump which Listens to all channels in a DHT node and writes the updates to disk via a TokyoCabinet hash database instance (possibly running in another thread, so as to not block the Listen handling when the disk writes block).
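The "writer in another thread" part of that conclusion can be sketched with a queue between the Listen handler and a disk-writer thread, so a blocking disk write never stalls the network connection (a plain dict stands in for the Tokyo Cabinet database here):

```python
import queue
import threading

updates = queue.Queue()
snapshot = {}  # stand-in for the on-disk Tokyo Cabinet hash database

def disk_writer():
    # Drain updates onto disk; if a write blocks for seconds (as the
    # TC experiment showed it can), only this thread stalls, while the
    # Listen handler keeps queueing updates.
    while True:
        item = updates.get()
        if item is None:  # shutdown sentinel
            break
        key, value = item
        snapshot[key] = value

worker = threading.Thread(target=disk_writer)
worker.start()

# The Listen handler side: push updates without ever touching the disk.
for i in range(100):
    updates.put(("key%d" % i, "value%d" % i))
updates.put(None)
worker.join()
assert len(snapshot) == 100
```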
Now we have the neo Mirror request that forwards deletions to the client, it would be possible to look into implementing this.