Remove lnd gossip workaround, clean up for modern spec #4184

rustyrussell · 2020-11-06T02:51:31Z

We can use a counter not a bitmap since the spec insists replies be in order.
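To illustrate the counter idea, here is a minimal sketch, not the gossipd code: the struct and function names are hypothetical. If replies must arrive in order, it is enough to remember the next block we expect. The one-block overlap tolerance is an assumption, modelled on the commit message's note that LND does not always divide on block boundaries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker (not a gossipd type): one counter replaces a
 * per-block bitmap because replies are required to arrive in order. */
struct range_progress {
	uint64_t next_block;	/* first block not yet covered by a reply */
	uint64_t end_block;	/* one past the last block we asked about */
	bool started;		/* have we accepted at least one reply? */
};

/* Accept a reply covering [first, first + num).  Returns false if the
 * reply is out of order.  Assumption: after the first reply, a peer may
 * restart on the last block of its previous reply, but may not leave a gap. */
static bool range_progress_accept(struct range_progress *p,
				  uint32_t first, uint32_t num)
{
	uint64_t tail = (uint64_t)first + num;
	bool contiguous = (first == p->next_block);
	bool overlaps_one = p->started
		&& (uint64_t)first + 1 == p->next_block;

	if (!contiguous && !overlaps_one)
		return false;
	/* An overlapping reply must not end before what we already have. */
	if (tail < p->next_block)
		return false;

	p->next_block = tail;
	p->started = true;
	return true;
}

/* True once the replies have covered the whole queried range. */
static bool range_progress_done(const struct range_progress *p)
{
	return p->next_block >= p->end_block;
}
```

A bitmap would have to record each block separately; here a gap or a reply that goes backwards is rejected with a single comparison.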

We remove an old LND workaround, and add another.

Then we make our code more efficient for the "query every single channel!" case.

m-schmoock

Some nits and a failing testcase because of the large reply splitting optimization:

tests/test_gossip.py::test_gossip_query_channel_range (line 703)

    # It should definitely have split
    l2.daemon.wait_for_log('queue_channel_ranges full: splitting')

m-schmoock · 2020-11-06T11:28:42Z

gossipd/queries.c

+	u16 i, old_num, added;
+	const struct channel_update_timestamps *ts;
+	/* Zero means "no timestamp" */
+	const static struct channel_update_timestamps zero_ts;


Not 100% sure if it's really safer on some weird compiler or platform, but an explicit {0} initialization can't hurt:

Suggested change

const static struct channel_update_timestamps zero_ts;

const static struct channel_update_timestamps zero_ts = {0, 0};

Sure; it's implied but better to be clear!
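For readers wondering why it is "implied": C zero-initializes objects with static storage duration that have no explicit initializer, so both spellings give the same value. A small sketch with a stand-in struct follows; the type name is hypothetical and the two-u32 layout is an assumption about channel_update_timestamps.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct channel_update_timestamps (layout assumed). */
struct ts_pair {
	uint32_t timestamp_node_id_1;
	uint32_t timestamp_node_id_2;
};

/* No initializer: static storage duration means the language zeroes it. */
static const struct ts_pair implicit_zero;

/* Same value, spelled out for the reader. */
static const struct ts_pair explicit_zero = {0, 0};

/* Zero means "no timestamp" for both directions. */
static bool ts_pair_is_zero(const struct ts_pair *ts)
{
	return ts->timestamp_node_id_1 == 0 && ts->timestamp_node_id_2 == 0;
}

/* Byte-wise comparison of the two statics. */
static bool zero_forms_match(void)
{
	return memcmp(&implicit_zero, &explicit_zero,
		      sizeof(implicit_zero)) == 0;
}
```

The explicit form changes nothing for the compiler; its only benefit is documentation.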

m-schmoock · 2020-11-06T11:29:54Z

gossipd/queries.c

@@ -414,31 +434,14 @@ static void get_checksum_and_timestamp(struct routing_state *rstate,
 }

 /* FIXME: This assumes that the tlv type encodes into 1 byte! */
-static size_t tlv_len(const tal_t *msg)
+static size_t tlv_len(size_t num_entries, size_t size)


Can this / should this be inlined?

Suggested change

static size_t tlv_len(size_t num_entries, size_t size)

static inline size_t tlv_len(size_t num_entries, size_t size)

No, we never inline except in headers; the compiler is pretty smart.

I used to say "inline is the register keyword of the 90s" which just shows how long I've been saying it:)
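As context for the FIXME above tlv_len, here is a rough sketch of how such a length could be computed: one type byte, a BigSize length prefix, then the entries. The diff does not show what the real tlv_len counts, so the helper names and the exact accounting below are assumptions; only the BigSize widths (1, 3, 5 or 9 bytes) come from the spec.

```c
#include <stddef.h>
#include <stdint.h>

/* Encoded width of a BigSize value: 1, 3, 5 or 9 bytes. */
static size_t bigsize_len_sketch(uint64_t v)
{
	if (v < 0xfd)
		return 1;
	if (v <= 0xffff)
		return 3;
	if (v <= 0xffffffff)
		return 5;
	return 9;
}

/* Hypothetical tlv_len: like the original FIXME, this assumes the TLV
 * type itself fits in a single byte. */
static size_t tlv_len_sketch(size_t num_entries, size_t size)
{
	size_t payload = num_entries * size;

	return 1 + bigsize_len_sketch(payload) + payload;
}
```

A function this small is one the compiler will typically inline on its own at -O2, which matches the reply above.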

m-schmoock · 2020-11-06T11:32:33Z

gossipd/gossipd.h

 #include <ccan/list/list.h>
 #include <ccan/short_types/short_types.h>
 #include <ccan/timer/timer.h>
 #include <common/bigsize.h>
 #include <common/node_id.h>
+#include <wire/peer_wire.h>


This creates a duplicate #include in gossipd.c (line 67), which fails the source checks.
I recommend removing it from the .c file.

m-schmoock · 2020-11-06T11:35:11Z

gossipd/queries.c

+	if (timestamps_tlv) {
+		ts = decode_channel_update_timestamps(tmpctx,
+						      timestamps_tlv);
+		if (!ts || tal_count(ts) != tal_count(scids)) {


While we are at it, we can separate the pure decoding error from the count-mismatch error, so each gets its own message:

Suggested change

-		if (!ts || tal_count(ts) != tal_count(scids)) {
+		if (!ts) {
+			return towire_errorfmt(peer, NULL,
+					       "reply_channel_range can't decode timestamps.");
+		}
+		if (tal_count(ts) != tal_count(scids)) {
+			...

m-schmoock · 2020-11-06T12:34:01Z

gossipd/queries.c

+			     query_option_flags, &tstamps, &csums);
+
+	limit = max_entries(query_option_flags);
+	off = 0;


Throughout the rest of this function the offset variable size_t off is never modified, so it is always 0. It is only used for adding to or subtracting from integers and indexes. I think this is a leftover from some earlier state of your code ;)

I assume it's for the splitter logic where a single block can't fit the reply, and therefore the later replies will start from off. But as @m-schmoock points out it isn't set anywhere, so presumably it should be set to n on line 605, before decrementing n-- to make it fit :-)

Good catch, fixed.
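The bug class discussed here is easy to see in a stripped-down splitter. The sketch below is generic and is not the gossipd function: the names are hypothetical and it splits by entry count only. Each reply starts at off, and off must advance by the number of entries just sent, otherwise every later reply starts from 0 again.

```c
#include <stddef.h>

/* Illustrative only: split `total` entries into replies of at most
 * `limit` entries, recording where each reply starts.  Returns the
 * number of replies, or 0 if limit is 0 or `offs` is too small. */
static size_t split_offsets(size_t total, size_t limit,
			    size_t *offs, size_t max_offs)
{
	size_t off = 0, count = 0;

	if (limit == 0)
		return 0;

	while (off < total) {
		size_t n = total - off;

		if (n > limit)
			n = limit;
		if (count == max_offs)
			return 0;
		offs[count++] = off;
		/* The step the review found missing: without it, off
		 * stays 0 and this loop would never terminate. */
		off += n;
	}
	return count;
}
```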

…ge replies.

The spec (since d4bafcb67dcf1e4de4d16224ea4de6b543ae73bf in March 2020) requires that reply_channel_range be in order (and all implementations did this anyway).

But when I tried this, I found that LND doesn't (always) obey this, since it doesn't divide on block boundaries. So we have to loosen the constraints here a little.

We got rid of the old LND compat handling though, since everyone should now be upgraded (there are CVEs out for older LNDs).

Signed-off-by: Rusty Russell <[email protected]>
Changelog-Removed: Support for receiving full gossip from ancient LND nodes.

rustyrussell · 2020-11-09T06:30:42Z

Fixed the un-updated off variable, and fixed up the various changes which broke tests and check-source.


We used to create the entire reply, then if it was too big, split it in half and retry. Now that the main network is larger, this always happens with a full request, which is inefficient.

Instead, produce a reply assuming no compression, then compress as a bonus. This is simpler and more efficient, at the cost of sending more packets.

I also renamed an internal dev var to make it clearer.

Signed-off-by: Rusty Russell <[email protected]>
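To make "assuming no compression" concrete, here is a back-of-the-envelope sketch of how an entry limit could be derived up front. It is not the real max_entries(): the function name, the byte accounting and the fixed 5-byte allowance per TLV are all assumptions for illustration. It takes a 65535-byte message limit, subtracts the fixed reply_channel_range fields, and divides by the uncompressed size of one entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry limit for one uncompressed reply.  The byte counts
 * are assumptions for illustration, not taken from the PR. */
static size_t max_entries_sketch(bool with_timestamps, bool with_checksums)
{
	/* type(2) + chain_hash(32) + first_blocknum(4) + number_of_blocks(4)
	 * + flag byte(1) + len(2) + encoding byte(1) */
	const size_t fixed_overhead = 2 + 32 + 4 + 4 + 1 + 2 + 1;
	/* Assumed worst-case framing for each optional TLV. */
	const size_t tlv_overhead = 5;
	size_t budget = 65535 - fixed_overhead;
	size_t per_entry = 8;		/* one short_channel_id */

	if (with_timestamps) {
		budget -= tlv_overhead;
		per_entry += 8;		/* two 4-byte timestamps */
	}
	if (with_checksums) {
		budget -= tlv_overhead;
		per_entry += 8;		/* two 4-byte checksums */
	}
	return budget / per_entry;
}
```

Because the limit is known before any encoding is done, a reply never has to be built, measured and split in half; compression can then only make each packet smaller than this worst case.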


m-schmoock · 2020-11-09T19:24:45Z

For some strange reason I can't resolve the comments and suggestions. Just ignore them; everything I found has been addressed.

niftynei

ACK cefac90

rustyrussell added gossip lnd-compat labels Nov 6, 2020

rustyrussell added this to the v0.9.2 milestone Nov 6, 2020

m-schmoock requested changes Nov 6, 2020


ElementsProject deleted a comment from m-schmoock Nov 6, 2020

ElementsProject deleted a comment from cdecker Nov 6, 2020

rustyrussell force-pushed the guilt/remove-lnd-gossip-workaround branch from 287a382 to d1e231f Compare November 9, 2020 06:29

rustyrussell added 3 commits November 9, 2020 20:00

gossipd: new struct to hold scids and timestamps together.

bb84988

It's not (yet?) compulsory to have the timestamps, but handing them around together makes sense (a missing timestamp has the same effect as a zero timestamp). Signed-off-by: Rusty Russell <[email protected]>

gossipd: minor cleanups.

cefac90

Thanks to m-schmook's feedback. Signed-off-by: Rusty Russell <[email protected]>

rustyrussell force-pushed the guilt/remove-lnd-gossip-workaround branch from d1e231f to cefac90 Compare November 9, 2020 09:34

niftynei approved these changes Nov 9, 2020


niftynei merged commit 9b0af9f into ElementsProject:master Nov 9, 2020

C-Otto mentioned this pull request Nov 17, 2020

PermanentFailure handling should be more lenient (ChanStatusBorked) lightningnetwork/lnd#4776

Closed
