Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXPERIMENTAL: MPP send and receive support (lowlevel) #3309

Merged
merged 22 commits into from
Dec 12, 2019

Conversation

rustyrussell
Copy link
Contributor

Not added to the pay plugin, but it's in sendpay etc.

There's no documentation changes since this only adds optional params, which are only useful with EXPERIMENTAL_FEATURES=1.

@rustyrussell rustyrussell added this to the 0.7.4 milestone Dec 2, 2019
@rustyrussell rustyrussell force-pushed the part-payment-send branch 2 times, most recently from 58e7c99 to 911b53b Compare December 3, 2019 00:54
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only got as far as 45fe549, but will resume tomorrow. So far it's looking good , though I'm afraid it might introduce a couple of regressions for sendonion or may make it a second class citizen.

EDIT: All my concerns were addressed a couple of commits further down.

  • Not sure if parallel_id is the correct term, since it is multi-part
    payments, not multi-parallel payments 😉
  • 43ba99e is a very nice consolidation 👍

/* We can now have multiple payments in progress for a single hash, so
* add two fields; combination of payment_hash & parallel_id is unique. */
{SQL("ALTER TABLE payments RENAME TO temp_payments;"), NULL},
{SQL("CREATE TABLE payments ("
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe go the extra mile and call this payment_parts instead? We could then pull the common fields into a separate payments table. Would be a nice cleanups and fix that naming nit about pays being the combination of multiple payments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that should be done at the db level; it may be premature optimization?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, that'd be a bigger refactor that we should probably be done during a bit of downtime. It'd just be nice to have an actual payment concept that collects all the attempts that are performed trying to complete it. It may be my OOP/ORM past that is catching up with me 😉

wallet/wallet.h Outdated Show resolved Hide resolved
wallet/wallet.c Outdated Show resolved Hide resolved
lightningd/pay.c Show resolved Hide resolved
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a nice PR, I just have some questions inline and a couple of naming nits.

lightningd/pay.c Outdated Show resolved Hide resolved
tests/test_pay.py Outdated Show resolved Hide resolved
Comment on lines 2634 to 2644
line = l4.daemon.wait_for_log('print_htlc_onion.py: Got onion')
assert "'type': 'tlv'" in line
assert "'forward_amount': '500msat'" in line
assert "'total_msat': '1000msat'" in line
assert "'payment_secret': '{}'".format(paysecret) in line

line = l4.daemon.wait_for_log('print_htlc_onion.py: Got onion')
assert "'type': 'tlv'" in line
assert "'forward_amount': '500msat'" in line
assert "'total_msat': '1000msat'" in line
assert "'payment_secret': '{}'".format(paysecret) in line
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use a loop to emphasize that these are identical and the order is not important.

{
for (size_t i = 0; i < tal_count(set->htlcs); i++) {
/* Don't remove from set */
tal_del_destructor2(set->htlcs[i], htlc_set_hin_destroyed, set);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done independently of whether it was set as a destructor, hence the function sig above being outside of the compile guards, correct? Just checked, tal_del_destructor2 just returns false if the function was never registered. Why not make this call and the function definition above experimental only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's probably even clearer.

lightningd/htlc_set.c Show resolved Hide resolved
lightningd/htlc_set.c Outdated Show resolved Hide resolved
@@ -181,8 +181,7 @@ void htlc_add_set(struct lightningd *ld,
}

if (amount_msat_eq(set->so_far, total_msat)) {
/* FIXME: hand to invoice_try_pay! */
tal_free(set);
invoice_try_pay(ld, set, details);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be take(set) for self-documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, no, it has to dispose of it. take() is for things where it may or may not take ownership.

We've generally used naming to try to indicate which functions are a sink. Should we rename this to 'invoice_resolve_htlc_in'?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea. I keep getting confused by take() :-)

I'm tempted to propose an OWNS annotation and an own() nop-helper that just serves as annotation to say: I'll take ownership no matter what and you MUST take ownership. Purely documentation but we'd know which side of the call is wrong if we have a leak without having to dig through both the caller and callee to see what the semantics should have been. Just fantasizing here of course 😃

'parallel_id': 2})
l1.rpc.call('waitsendpay', [inv['payment_hash'], None, 1])
l1.rpc.call('waitsendpay', {'payment_hash': inv['payment_hash'],
'parallel_id': 2})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need checks that we actually get the parts in the correct call, and that there isn't a place we cross wires :-)

tests/test_pay.py Show resolved Hide resolved
@unittest.skipIf(not DEVELOPER, "needs dev-fail")
@unittest.skipIf(not EXPERIMENTAL_FEATURES, "needs parallel_id support")
def test_partial_payment_htlc_loss(node_factory, bitcoind):
"""Test that we discard a set when the HTLC is lost"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment seems a copy-pasta

@cdecker
Copy link
Member

cdecker commented Dec 3, 2019

Since this is touching the DB in some places I'm running the tests against postgres now.

@cdecker
Copy link
Member

cdecker commented Dec 3, 2019

I assume the changes to pyln-client are deferred until the feature is non-experimental, corret?

@cdecker
Copy link
Member

cdecker commented Dec 3, 2019

One of the newly added fields seems to have the wrong type:

u64 field doesn't match size: expected 8, actual 4

My guess would be this:

{SQL("ALTER TABLE channel_htlcs ADD parallel_id INTEGER;"), NULL},

Since it declares the field as a 4 byte INTEGER, while trying to read it as a 8 byte BIGINT.

@cdecker
Copy link
Member

cdecker commented Dec 3, 2019

One of the newly added fields seems to have the wrong type:

u64 field doesn't match size: expected 8, actual 4

My guess would be this:

{SQL("ALTER TABLE channel_htlcs ADD parallel_id INTEGER;"), NULL},

Since it declares the field as a 4 byte INTEGER, while trying to read it as a 8 byte BIGINT.

Just verified that that fixes the issue. Now all I'm left with is some spurious MPP_TIMEOUTs that seem a bit odd. The timeouts are repeatable on both sqlite3 and postgres:

  • test_channel_spendable_large
  • test_channel_spendable
  • test_invoice_payment_hook_hold

@rustyrussell rustyrussell added Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE! protocol These issues are protocol level issues that should be discussed on the protocol spec repo labels Dec 5, 2019
@cdecker cdecker modified the milestones: 0.7.4, next Dec 11, 2019
This is in preparation for partial payments.  For existing payments,
partid is 0 (arbitrarity) and total_msat is msatoshi.

Signed-off-by: Rusty Russell <[email protected]>
This is in preparation for partial payments.  For existing payments,
partid is 0 (to match the corresponding payment).

Signed-off-by: Rusty Russell <[email protected]>
This is a transient field, so rework things so we don't leave it in
struct htlc_out.  Instead, load htlc_in first and connect htlc_out to
them as we go.

This also changes one place where we use it instead of the am_origin
flag.

Signed-off-by: Rusty Russell <[email protected]>
In particular, we're about to do surgery on the detection-of-previous-payments
logic, and we should not do this in two places.

Signed-off-by: Rusty Russell <[email protected]>
We currently refuse a payment if one is already in flight.  For parallel
payments, it's a bit more subtle: we want to refuse if it we already have
the total-amount-of-invoice in flight.

So we get all the current payments, and sum the pending ones.

Signed-off-by: Rusty Russell <[email protected]>
msatoshi was used to indicate the amount the invoice asked for, but
for parallel sendpay it's required, as it allows our sanity check of
limiting the total payments in flight, ie. it becomes
'total_msat'.

There's a special case for sendonion, which always tells us the value is 0.

Signed-off-by: Rusty Russell <[email protected]>
…URES and payment_secret

Explicit #if EXPERIMENTAL_FEATURES check in case we enable them at different
times, but it requires a payment_secret since we put them in the same field.

This incidently stops it working on legacy nodes.

Signed-off-by: Rusty Russell <[email protected]>
We need to be able to wait for a unique payment, now payment_hash is not
always unique.

Signed-off-by: Rusty Russell <[email protected]>
The cltv_expiry and payment_hash are in hin, so no need to hand them
in here.

Signed-off-by: Rusty Russell <[email protected]>
We now return the same error for various "does not match this
invoice", so it makes sense to encapsulate these checks.  We'll also
want to expose this for multi-part payments.

Signed-off-by: Rusty Russell <[email protected]>
This isn't plumbed in yet, but the idea is that every htlc gets put
into a "set" and then we process them once the set is satisfied.  For
the !EXPERIMENTAL_FEATURES, the set is simply always size 1.

Signed-off-by: Rusty Russell <[email protected]>
AFAICT this only "worked" previously because replay htlc simply failed
them all (no peers are currently connected).  With upcoming changes
(foreshadowed by the comment) this is no longer true:

Attempting to prepare a db_stmt outside of a transaction: wallet/invoices.c:373
lightningd: FATAL SIGNAL 6 (version v0.7.3-188-g45b0af4-modded)
0x55b475590a73 send_backtrace
	common/daemon.c:41
0x55b475590b1d crashdump
	common/daemon.c:54
0x7f16c557b46f ???
	???:0
0x7f16c557b3eb ???
	???:0
0x7f16c555a898 ???
	???:0
0x55b475564c8f fatal
	lightningd/log.c:814
0x55b4755c3ed5 db_prepare_v2_
	wallet/db.c:605
0x55b4755c76b5 invoices_find_unpaid
	wallet/invoices.c:373
0x55b4755ce91c wallet_invoice_find_unpaid
	wallet/wallet.c:1990
0x55b47555861f invoice_check_payment
	lightningd/invoice.c:257
0x55b475557a7c htlc_add_set
	lightningd/htlc_set.c:112
0x55b47557b294 handle_localpay
	lightningd/peer_htlcs.c:332
0x55b47557c63c htlc_accepted_hook_callback
	lightningd/peer_htlcs.c:857
0x55b475585573 plugin_hook_call_
	lightningd/plugin_hook.c:118
0x55b47557c747 plugin_hook_call_htlc_accepted
	lightningd/peer_htlcs.c:882
0x55b47557ca3e peer_accepted_htlc
	lightningd/peer_htlcs.c:991
0x55b47557ffb9 htlcs_resubmit
	lightningd/peer_htlcs.c:2131
0x55b4755620f7 main
	lightningd/lightningd.c:801

Signed-off-by: Rusty Russell <[email protected]>
The invoice_try_pay code now takes a set, rather than a single htlc, but
it's basically the same thing.

Signed-off-by: Rusty Russell <[email protected]>
…bolt11.

The pay plugin has been supplying the bolt11 string since 0.7.0, so only
ancient "pay" commands would be omitted by this change.

You can create a no-bolt11 "sendpay" manually, but then you'll find it
in 'listsendpays'.

Changelog-Removed: JSON: `listpays` won't shown payments made via sendpay without a bolt11 string, or before 0.7.0.
Signed-off-by: Rusty Russell <[email protected]>
This won't usually be visible to the end-user, since the pay plugin doesn't
do multi-part yet (and mpp requires EXPERIMENTAL_FEATURES), but we're ready
once it does.

Signed-off-by: Rusty Russell <[email protected]>
It makes sense, and it's been proposed for addition to the spec to
broad agreement:

	lightning/bolts#712

Signed-off-by: Rusty Russell <[email protected]>
Bastien TEINTURIER <[email protected]> writes:
> It looks like the split on c-lightning side is quite limited at the moment:
> the only option is to split a payment in exactly its two halves,
> otherwise I get rejected because of the rule of overpaying more than
> twice the amount?

We only tested exactly two equal-size payments; indeed, our finalhop
test was backwards.  We only complain if the final hop pays more than
twice msat (technically, this test is still too loose for mpp: the
spec says we should sum to the exact amount).

Reported-by: @t-bast
Signed-off-by: Rusty Russell <[email protected]>
Bastien TEINTURIER <[email protected]> writes:
> One thing I noticed but didn't investigate much: after sending the two
> payments, I tried using `waitsendpay` and it reported an error *208*
> (*"Never attempted payment for
> '98ee736d29d860948e436546a88b0cc84f267de8818531b0fdbe6ce3d080f22a'"*).
> 
> I was expecting the result to be something like: "payment succeeded for
> that payment hash" (the HTLCs were correctly settled).

Indeed, if you waitsendpay without specifying a partid, you are waiting
for 0, which may not exist.  Clarify the error msg.

Reported-by: @t-bast
Signed-off-by: Rusty Russell <[email protected]>
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some very minor things that could be cleaned up in another PR.

ACK e6edb76

wallet/wallet.c Outdated
{
struct db_stmt *stmt;

/* FIXME: Put partid into locally-generated htlc_out, select here! */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow this comment.

@@ -573,11 +573,13 @@ void payment_failed(struct lightningd *ld, const struct htlc_out *hout,
}

/* Save to DB */
payment_store(ld, &hout->payment_hash, /* FIXME: Set partid! */ 0);
wallet_payment_set_status(ld->wallet, &hout->payment_hash, /* FIXME: Set partid! */ 0,
payment_store(ld, &hout->payment_hash, hout->partid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be refactored to pass in hout instead of its fields individually.

@@ -590,7 +592,7 @@ static void forward_htlc(struct htlc_in *hin,
hout = tal(tmpctx, struct htlc_out);
failcode = send_htlc_out(next, amt_to_forward,
outgoing_cltv_value, &hin->payment_hash,
next_onion, hin, &hout);
0, next_onion, hin, &hout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we had a dont-care value for these situations :-)

struct htlc_in_map *htlcs_in,
struct htlc_out_map *htlcs_out);
/* Only defined if COMPAT_V061 */
void fixup_htlcs_out(struct lightningd *ld);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this require compile guards?

@@ -532,12 +532,12 @@ def check_balances():
wait_for(check_balances)

# Repeat will "succeed", but won't actually send anything (duplicate)
assert not l1.daemon.is_in_log('... succeeded')
assert not l1.daemon.is_in_log('Payment 0/1: .* COMPLETE')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For non-experimental builds this doesn't test anything. I'll just also apply #3335 😉

@@ -2567,3 +2567,61 @@ def serialize_payload(n):
except RpcError as e:
assert(e.error['code'] == 204)
assert(e.error['data']['raw_message'] == "400f00000000000003e80000006c")


@unittest.skipIf(not DEVELOPER, "needs DEVELOPER=1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd also require EXPERIMENTAL_FEATURES. Merging #3335 also fixes this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fixed in the next request, but breaks bisectability.

@cdecker cdecker merged commit ec15051 into ElementsProject:master Dec 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE! protocol These issues are protocol level issues that should be discussed on the protocol spec repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants