[BUG] Piece Cid mismatch between deal proposal in Actor state and deal Piece added to Sector (online and offline deals) #7103

Closed
4 tasks done
aarshkshah1992 opened this issue Aug 17, 2021 · 15 comments
Labels
kind/bug Kind: Bug P2 P2: Should be resolved team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs

@aarshkshah1992
Contributor

Checklist

  • This is not a question or a support request. If you have any lotus related questions, please ask in the lotus forum.
  • I am reporting a bug w.r.t one of the M1 tags. If not, choose another issue option here.
  • I am not reporting a bug around deal making. If yes, create a M1 Bug Report For Deal Making.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.

Lotus component

lotus miner/worker - sealing

Lotus Tag and Version

Unknown

Describe the Bug

@stuberman reported an issue where the deal Piece CID stored in the Actor subsystem does not match the PieceCID of the same deal as known by the Sector being sealed.
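As a rough illustration of the mismatch described above, the following is a minimal Python sketch, not the actual Lotus check. The `SectorPiece` type, the `check_pieces` function, and the `onchain` mapping are hypothetical names introduced here. The message layout copies the WARN lines quoted in this issue; which side of `!=` is the sector's CID and which is the on-chain proposal's CID is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SectorPiece:
    """Hypothetical view of one piece as recorded by the sector being sealed."""
    deal_id: Optional[int]  # None for a piece that carries no deal
    piece_cid: str


def check_pieces(sector: int, pieces: List[SectorPiece],
                 onchain: Dict[int, str]) -> List[str]:
    """Compare each deal-bearing piece against the PieceCID of the on-chain proposal.

    `onchain` maps deal ID -> PieceCID from the deal proposal in actor state.
    Deals missing from `onchain` are skipped; that case is out of scope here.
    """
    errors = []
    for i, piece in enumerate(pieces):
        if piece.deal_id is None or piece.deal_id not in onchain:
            continue
        expected = onchain[piece.deal_id]
        if piece.piece_cid != expected:
            # Assumed ordering: sector's CID on the left, proposal's on the right.
            errors.append(
                f"piece {i} (of {len(pieces)}) of sector {sector} refers deal "
                f"{piece.deal_id} with wrong PieceCID: {piece.piece_cid} != {expected}"
            )
    return errors
```

A sector passes this sketch only when every deal-bearing piece carries the same PieceCID as its published proposal; a single mismatch is enough to mark the sector's deals as invalid.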

Logging Information

2021-08-17T01:02:55.149Z	WARN	sectors	storage-sealing/states_sealing.go:194	invalid deals in sector 1507: piece 0 (of 14) of sector 1507 refers deal 2285896 with wrong PieceCID: baga6ea4seaqijqccdoqgqwqbx54vui2eazh6ijf5kku5eq3xwokp6tclivuoqei != baga6ea4seaqlopksik42wyueute5eiyrge6nfqga54o4shsa24va23tz74fcacq
2021-08-17T01:02:55.172Z	WARN	sectors	storage-sealing/states_failed.go:357	piece 0 (of 14) of sector 1507 refers deal 2285896 with wrong PieceCID: baga6ea4seaqijqccdoqgqwqbx54vui2eazh6ijf5kku5eq3xwokp6tclivuoqei != baga6ea4seaqlopksik42wyueute5eiyrge6nfqga54o4shsa24va23tz74fcacq
2021-08-17T01:02:55.172Z	WARN	sectors	storage-sealing/states_failed.go:357	piece 9 (of 14) of sector 1507 refers deal 2285895 with wrong PieceCID: baga6ea4seaqmzq6acl23axubdiv37xipnaz3qqtvwr57ekoaauviescphqnfwpi != baga6ea4seaqf2nua5pmmhcx3cn7t7aqkqh2mxownmnm5u45r5p7txm2r65po6ni
Aug 16 19:55:30 true bafyreicpw7mhurcjlwzgwdryrmsgi6fvep4e72hn5abwuxfuvczk4md5tm 2285895 StorageDealAwaitingPreCommit       f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 1GiB  0 FIL 1494720 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw-12D3KooWSsaFCtzDJUEhLQYDdwoFtdCMqqfk562UMvccFz12kYxU-1629070675753904095  

Aug 16 20:23:55 true bafyreihjjzgaszu4zs6irezzrubo4g4bwrhusysdudbb4ttantlhvgs7l4 2285896 StorageDealAwaitingPreCommit       f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 4MiB  0 FIL 1494720 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw-12D3KooWSsaFCtzDJUEhLQYDdwoFtdCMqqfk562UMvccFz12kYxU-1629070675753904112
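For triage across many log lines, the warnings above can be reduced to their sector, deal, and CID fields. This is a minimal sketch: the regular expression is derived only from the WARN lines quoted in this issue, and `Mismatch` and `parse_mismatch` are hypothetical helper names, not part of Lotus.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the "wrong PieceCID" warnings quoted in this issue.
MISMATCH_RE = re.compile(
    r"piece (?P<piece>\d+) \(of (?P<total>\d+)\) of sector (?P<sector>\d+) "
    r"refers deal (?P<deal>\d+) with wrong PieceCID: "
    r"(?P<left>\S+) != (?P<right>\S+)"
)


class Mismatch(NamedTuple):
    piece: int
    total: int
    sector: int
    deal: int
    left_cid: str   # CID printed before "!="
    right_cid: str  # CID printed after "!="


def parse_mismatch(line: str) -> Optional[Mismatch]:
    """Return the fields of one mismatch warning, or None if the line is unrelated."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    return Mismatch(
        piece=int(m["piece"]),
        total=int(m["total"]),
        sector=int(m["sector"]),
        deal=int(m["deal"]),
        left_cid=m["left"],
        right_cid=m["right"],
    )
```

Because the same mismatch is logged from both `states_sealing.go` and `states_failed.go`, collecting the results into a set of `(sector, deal)` pairs gives the distinct affected deals.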

Repo Steps

  1. Run '...'
  2. Do '...'
  3. See error '...'
    ...
@aarshkshah1992
Contributor Author

cc @jennijuju @jacobheun for tracking.

@stuberman to fill out details.

@jacobheun jacobheun added the team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs label Aug 17, 2021
@stuberman

Some details:

These are two deals that were received from Estuary (see Logging info above)
The deals came in, were accepted and published.
Other deals are coming in from Estuary and sealing without problem before and after this.
Using a separate markets node on a dedicated machine.
lotus-miner version
Daemon: 1.11.1-m1.3.5+mainnet+git.3ff8e256b+api1.2.0
Local: lotus-miner version 1.11.1-m1.3.5+mainnet+git.3ff8e256b

@stuberman

stuberman commented Aug 17, 2021

Seeing this occur in the next batch of Estuary deals:

2021-08-17T12:09:42.881Z WARN sectors storage-sealing/states_sealing.go:194 invalid deals in sector 1508: piece 0 (of 7) of sector 1508 refers deal 2287254 with wrong PieceCID: baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq != baga6ea4seaqdtuc2eyr47qcoj5yti5sn6vvns3tjmne647b3brxz3po7gutikka
2021-08-17T12:09:42.904Z WARN sectors storage-sealing/states_failed.go:357 piece 0 (of 7) of sector 1508 refers deal 2287254 with wrong PieceCID: baga6ea4seaqdsvqopmj2soyhujb72jza76t4wpq5fzifvm3ctz47iyytkewnubq != baga6ea4seaqdtuc2eyr47qcoj5yti5s

lotus-miner sectors status 1508
SectorID: 1508
Status: Removing
CIDcommD:
CIDcommR:
Ticket: f84d48abd9ad4330783b253bec0d25fa879ddf850ba1b348ee0b2e57043cb8e0
TicketH: 1028172
Seed:
SeedH: 0
Precommit:
Commit:
Proof:
Deals: [2287254 2287255 0 0 2287256 0 0]
Retries: 0
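
To look up the deals of a failed sector, the `Deals:` line of the status output above can be turned into a list of deal IDs. This is a minimal sketch; `deal_ids_from_status` is a hypothetical helper, and treating the `0` entries as pieces that carry no deal is an assumption, not something stated in this thread.

```python
import re
from typing import List


def deal_ids_from_status(status_text: str) -> List[int]:
    """Extract the non-zero deal IDs from `lotus-miner sectors status` output."""
    m = re.search(r"^Deals:\s*\[([^\]]*)\]", status_text, re.MULTILINE)
    if m is None:
        return []
    ids = [int(tok) for tok in m.group(1).split()]
    # Assumption: a 0 entry marks a piece without a deal, so it is dropped.
    return [deal_id for deal_id in ids if deal_id != 0]


status = """SectorID: 1508
Status: Removing
Deals: [2287254 2287255 0 0 2287256 0 0]
Retries: 0
"""
print(deal_ids_from_status(status))  # [2287254, 2287255, 2287256]
```

The resulting IDs can then be matched against the verbose deal list, which is what the `grep 228725` below does by prefix.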

lotus-miner storage-deals list -v | grep 228725

Aug 16 23:12:51 true bafyreiemgm7jr3tx76utrng6ngwmaiasn5quqoud5vdy7k3q34weulllvm 2287254 StorageDealAwaitingPreCommit f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 512MiB 0 FIL 1494720 12D3KooWGBWx9gyUFTVQcKMTenQMSyE2ad9m7c9fpjS4NMjoDien-12D3KooWSsaFCtzDJUEhLQYDdwoFtdCMqqfk562UMvccFz12kYxU-1629155331192764681

Aug 17 02:50:53 true bafyreih5ybnawjvcrnt64ybh5srnbqn3kvx6f6gzha2gqiz57fhhaa54zq 2287256 StorageDealAwaitingPreCommit f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 4GiB 0 FIL 1494720 12D3KooWGBWx9gyUFTVQcKMTenQMSyE2ad9m7c9fpjS4NMjoDien-12D3KooWSsaFCtzDJUEhLQYDdwoFtdCMqqfk562UMvccFz12kYxU-1629155331192765192

Aug 17 02:51:36 true bafyreigf3awkkoo3hpsrgefnokx2ru56gqfv6h4j4g3f7jkqfb6egtqplu 2287255 StorageDealAwaitingPreCommit f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 512MiB 0 FIL 1494720 12D3KooWGBWx9gyUFTVQcKMTenQMSyE2ad9m7c9fpjS4NMjoDien-12D3KooWSsaFCtzDJUEhLQYDdwoFtdCMqqfk562UMvccFz12kYxU-1629155331192765197

@stuberman

Logs from f01278 markets and miner machines as requested

markets.tar.gz
miner.tar.gz

@aarshkshah1992
Contributor Author

@stuberman So you've now seen this problem with two separate batches of Estuary deals, right?

@stuberman

stuberman commented Aug 17, 2021

Yes, the two most recent batches. A third batch was just published:

$ lotus-miner storage-deals list | grep Publish
...6sn52ouy  0        StorageDealPublishing                    f1wdxdpqh3hirrhp353i4o6ld7bsw6evh3v7i5jtq                                               32GiB     0 FIL                            522436
...tarjfcri  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  1GiB      0 FIL                            1494720
...2nqofrwu  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...c3e5cx3y  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...k7bpckbi  0        StorageDealPublishing                    f1wdxdpqh3hirrhp353i4o6ld7bsw6evh3v7i5jtq                                               32GiB     0 FIL                            521798
...x2clqqwq  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...ihggr7ka  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...jtj7umny  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...gyq62v54  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...qyximqby  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  1GiB      0 FIL                            1494720
...luva4u6i  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  1GiB      0 FIL                            1494720
...og3aukvq  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...cj6r2zqq  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...ojcgz3u4  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...i2gyfqam  0        StorageDealPublishing                    f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720
...lku4irwa  0        StorageDealPublish                       f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a  512MiB    0 FIL                            1494720

@jacobheun jacobheun added the P2 P2: Should be resolved label Aug 17, 2021
@jacobheun jacobheun added this to the v1.11.2 milestone Aug 17, 2021
@raulk
Member

raulk commented Aug 17, 2021

@stuberman Could you please always use the -v flag? Otherwise we get truncated output.

@neondragon

neondragon commented Aug 18, 2021

I'm seeing this too on f019551.

@neondragon

7 sectors and 63 deals stuck with RecoverDealIDs on f019551 (1.11.1-m1.3.5+mainnet+git.7be207bc5.dirty+api1.2.0).

#7117

@aarshkshah1992
Copy link
Contributor Author

aarshkshah1992 commented Aug 19, 2021

@raulk @jacobheun Looks like @magik6k has a fix for this in #7117.

@aarshkshah1992
Contributor Author

aarshkshah1992 commented Aug 19, 2021

@neondragon

  • Could you please attach your complete miner and markets logs for the session when you saw this?
  • Also, was your markets node restarted at any point while some of those deals were in progress?

@stuberman

I am also seeing this on a single 32GiB deal from bidbot:

2021-08-20T00:33:37.689Z WARN sectors storage-sealing/states_failed.go:357 piece 0 (of 1) of sector 1522 refers deal 2303929 with wrong PieceCID: baga6ea4seaqp4q7ndvzeivdj3ouwa4lcgonnag4pols7m3y435c7ycnd22fucja != baga6ea4seaqdgp7rok7tz3quk44xqbyogq5bqil5ibtfbghusxo5qe57f5fuqbi
2021-08-20T00:33:37.834Z WARN sectors storage-sealing/states_sealing.go:201 invalid deals in sector 1522: piece 0 (of 1) of sector 1522 refers deal 2303929 with wrong PieceCID: baga6ea4seaqp4q7ndvzeivdj3ouwa4lcgonnag4pols7m3y435c7ycnd22fucja != baga6ea4seaqdgp7rok7tz3quk44xqbyogq5bqil5ibtfbghusxo5qe57f5fuqbi

@raulk raulk changed the title [BUG] Piece Cid mismatch between deal proposal in Actor state and deal Piece added to Sector [BUG] Piece Cid mismatch between deal proposal in Actor state and deal Piece added to Sector (online and offline deals) Aug 20, 2021
@stuberman

Seeing these tonight on deals from another Slingshot participant as well, using a split markets node on a dedicated machine.

lotus-miner version
Daemon: 1.11.2-dev+mainnet+git.b5b0598bd+api1.2.0
Local: lotus-miner version 1.11.2-dev+mainnet+git.b5b0598bd

2021-08-23T02:48:47.069Z WARN sectors storage-sealing/states_sealing.go:201 invalid deals in sector 1536: piece 1 (of 4) of sector 1536 refers deal 2321237 with wrong PieceCID: baga6ea4seaqnqyicdbbfvnpjlmokmi45fgroiigxa2uw6nz6f6ojveoxlhizwai != baga6ea4seaqa62vdf2yo26kbrrhcvsy5hyddw66q2adaeokfquflsbu3rhgr6li

@stuberman

stuberman commented Aug 23, 2021

f01278 Logs - market is too big to upload to GitHub
miner.tar.gz

@aarshkshah1992
Contributor Author

Talked to @stuberman out of band: he hasn't seen this since upgrading Markets to v1.11.2-rc1, which contains some fixes we made to the piece handover flow to Lotus.

No other miners have reported this after the upgrade either.

Closing this issue.

6 participants