feat: deal tracker spade sync cron #57

.pipe(ZSTDDecompress())
)
/** @type {SpadeOracle} */
const SpadeOracle = JSON.parse(toString(resDecompressed))
According to the readme you can just call resDecompressed.toString()

}
},
has: async (key) => {
const putCmd = new GetObjectCommand({
Suggested change:
- const putCmd = new GetObjectCommand({
+ const headCmd = new HeadObjectCommand({

}

// Get updated spade oracle contracts
const getUpdatedSpadeOracle = await getSpadeOracleCurrentState(spadeOracleUrl)
Perhaps do this first, so we don't hold all the other state in memory for the duration. I assume this will take longer/be more likely to fail.
Can we swap "spade oracle state" for "deal archive"? So this would be for example:
- const getUpdatedSpadeOracle = await getSpadeOracleCurrentState(spadeOracleUrl)
+ const { ok: nextArchive, error } = await fetchDealArchive(spadeOracleUrl)

// Get diff of contracts
const diffOracleContracts = computeDiffOracleState({
// fallsback to empty map if not found
You can probably skip computing diff altogether if not found.
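The two suggestions in this thread (fetch the remote archive first, and skip the diff when nothing is stored yet) could be combined roughly as follows. This is only a sketch under assumptions: `syncTick`, `fetchLatestArchive`, `archiveStore`, `putDeals` and the `NotFound` error name are hypothetical stand-ins, not names from this PR, and every dependency is assumed to return a `{ ok }` or `{ error }` result.

```javascript
/**
 * Sketch of one sync tick. All dependencies are injected and hypothetical;
 * each is assumed to resolve to either `{ ok }` or `{ error }`.
 */
async function syncTick ({ fetchLatestArchive, archiveStore, putDeals, computeDiff }) {
  // Fetch the remote archive first: it is the slowest step and the most
  // likely to fail, so nothing else is held in memory while waiting for it.
  const latest = await fetchLatestArchive()
  if (latest.error) return latest

  const current = await archiveStore.get()
  // Assumption: a missing archive is reported as an error named 'NotFound'.
  if (current.error && current.error.name !== 'NotFound') return current

  // Nothing stored yet: every entry in the latest archive is new, so the
  // diff can be skipped altogether.
  const toInsert = current.ok ? computeDiff(current.ok, latest.ok) : latest.ok

  const put = await putDeals(toInsert)
  if (put.error) return put

  // Only replace the stored archive once the deals were written.
  return archiveStore.put(latest.ok)
}
```

Replacing the stored archive last means a failed deal write is retried on the next tick, since the next diff is still computed against the old archive.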

*/
export async function spadeOracleSyncTick ({
dealStore,
spadeOracleStore,
We should name this after what is being stored, so I'd call it dealArchiveStore or something.

}

return {
ok: new Map(Object.entries(decode(getRes.ok.value))),
Why doesn't the store decode do this?

spadeOracleUrl
}) {
// Get previous recorded spade oracle contracts
const getPreviousSpadeOracle = await getSpadeOracleState({
We're fetching the current deal archive from our store. Until it is replaced, it is current. "Previous" sounds like it's the archive that's one before current.
Ideally this would just be const { ok: currentArchive } = dealArchiveStore.get().
Yes, I will rename to current! Thanks.
We have Spade now, but we could have other stores. So while changing to dealArchiveStore, we should keep an explicit key for Spade.

}

// Get diff of contracts
const diffOracleContracts = computeDiffOracleState({
Suggested change:
- const diffOracleContracts = computeDiffOracleState({
+ const diff = computeDiff(currentArchive, nextArchive)

// @ts-expect-error not PieceCIDv2
piece: legacyPieceCid,
provider: `${contract.provider}`,
insertedAt: (new Date()).toISOString()
Can we track insertedAt and updatedAt?

.pipe(ZSTDDecompress())
)
/** @type {SpadeOracle} */
const SpadeOracle = JSON.parse(toString(resDecompressed))
Suggested change:
- const SpadeOracle = JSON.parse(toString(resDecompressed))
+ const rawData = JSON.parse(toString(resDecompressed))

const piece = Piece.fromInfo({
  link,
  size: Piece.Size.fromHeight(height)
})
You already have a height, and root is link.multihash.digest, so you could also use Piece.view and skip the whole info creation piece. That said, you would have to do something like:
- const piece = Piece.fromInfo({
-   link,
-   size: Piece.Size.fromHeight(height)
- })
+ const piece = Piece.toView({
+   root: link.multihash.digest,
+   height,
+   padding: 0n
+ })
However, if we do know the original CAR size, it would be great to compute the padding as well instead of just putting 0n.
I think we need the size here and not height: https://github.com/web3-storage/data-segment/blob/main/src/piece.js#L104
This is the aggregate piece CID that Spade Oracle has, while tracking Deals (uses the old Piece Cid v1 + height)
> I think we need the size here and not height: https://github.com/web3-storage/data-segment/blob/main/src/piece.js#L104
Sorry, I had a typo. I meant the Piece.toView function in the suggestion, which takes a {height, root, padding} triple.
> This is the aggregate piece CID that Spade Oracle has, while tracking Deals (uses the old Piece Cid v1 + height)
Ah ok, aggregates don't really have padding so that makes sense. I actually wanted to use different CIDs for aggregates, but also don't really want to argue for them in the spec.
cool thanks, changed to suggestion
Looks like @alanshaw has already reviewed this. I'll let him continue and focus on other things unless you ask me otherwise.
@alanshaw I skipped the old tests for now, given they were timing out sometimes. We will integrate the new aggregator-api tests as a follow-up, and I will get rid of these then anyway.

diffPieceContracts = fetchLatestDealArchiveRes.ok
} else {
diffPieceContracts = computeDiff({
// falls back to empty map if not found
Suggested change:
- // falls back to empty map if not found

return dealStore.put({
  ...contract,
  // @ts-expect-error not PieceCIDv2
  piece: legacyPieceCid,
  provider: `${contract.provider}`,
  insertedAt: (new Date()).toISOString(),
  updatedAt: (new Date()).toISOString()
})
Just so we don't get ever so slightly different dates... and can tell if something was ever updated.
Suggested change:
- return dealStore.put({
-   ...contract,
-   // @ts-expect-error not PieceCIDv2
-   piece: legacyPieceCid,
-   provider: `${contract.provider}`,
-   insertedAt: (new Date()).toISOString(),
-   updatedAt: (new Date()).toISOString()
- })
+ const insertedAt = new Date().toISOString()
+ return dealStore.put({
+   ...contract,
+   // @ts-expect-error not PieceCIDv2
+   piece: legacyPieceCid,
+   provider: `${contract.provider}`,
+   insertedAt,
+   updatedAt: insertedAt
+ })

*/
export async function putDiffToDealStore ({ dealStore, diffPieceContracts }) {
const res = await Promise.all(
Array.from(diffPieceContracts, ([key, value]) => {
It would be easier to understand if these were named after what they represent:
- Array.from(diffPieceContracts, ([key, value]) => {
+ Array.from(diffPieceContracts, ([pieceCidStr, contracts]) => {

for (const [pieceCid, contracts] of updatedPieceContracts.entries() ) {
const currentContracts = currentPieceContracts.get(pieceCid) || []
// Find diff when different length
if (contracts.length !== currentContracts.length) {
Could one contract have expired and another added? i.e. the number of contracts did not change but the contracts did. I would imagine you need to do the diff regardless.
Not sure, but it should not hurt to remove this check.
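To illustrate the point raised in this thread, here is a minimal sketch of a diff that compares contracts by identity rather than by count, so a piece whose contract was replaced (same length, different contents) is still picked up. The contract shape and the `contractKey` helper are assumptions made for illustration. The real fields in the deal archive may differ.

```javascript
/**
 * Hypothetical identity for a contract: provider plus deal ID.
 * The real archive may use different field names.
 */
function contractKey (contract) {
  return `${contract.provider}/${contract.dealId}`
}

/**
 * Returns a Map of piece CID string to the contracts that appear in
 * `nextArchive` but not in `currentArchive`. Both arguments are Maps keyed
 * by piece CID string with arrays of contracts as values.
 */
function computeDiff (currentArchive, nextArchive) {
  const diff = new Map()
  for (const [pieceCid, nextContracts] of nextArchive) {
    const known = new Set((currentArchive.get(pieceCid) || []).map(contractKey))
    const added = nextContracts.filter((c) => !known.has(contractKey(c)))
    if (added.length > 0) diff.set(pieceCid, added)
  }
  return diff
}

// One contract expired and another was added: the length is unchanged,
// but the new contract still ends up in the diff.
const current = new Map([['pieceA', [{ provider: 1, dealId: 10 }]]])
const next = new Map([['pieceA', [{ provider: 2, dealId: 11 }]]])
const diff = computeDiff(current, next)
```

This sketch only reports additions. Contracts that disappeared from the archive are ignored, which matches an insert-only flow but would need extending if expired deals have to be tracked.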

const piecCid = convertPieceCidV1toPieceCidV2(
  parseLink(replica.piece_cid),
  replica.piece_log2_size
)
dealMap.set(piecCid.toString(), replica.contracts.map(c => ({
Suggested change:
- const piecCid = convertPieceCidV1toPieceCidV2(
-   parseLink(replica.piece_cid),
-   replica.piece_log2_size
- )
- dealMap.set(piecCid.toString(), replica.contracts.map(c => ({
+ const pieceCid = convertPieceCidV1toPieceCidV2(
+   parseLink(replica.piece_cid),
+   replica.piece_log2_size
+ )
+ dealMap.set(pieceCid.toString(), replica.contracts.map(c => ({

spadeOracleUrl
}) {
// Get latest deal archive
const fetchLatestDealArchiveRes = await fetchLatestDealArchive(spadeOracleUrl)
Not blocking - I imagine there will be times where the archive has not changed - would it be possible to do a head request to get an etag and avoid pulling a dump with no differences?
Then the deal archive store is more like hash => data and we store a pointer to the latest and delete (or just expire) archives we no longer need.
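The HEAD request idea could look something like the sketch below. It assumes the archive endpoint returns an `ETag` header, which is not confirmed anywhere in this thread, and `fetchIfChanged`, `lastEtag` and `fetchImpl` are hypothetical names. The error messages follow the "report the status code" style suggested elsewhere in this review.

```javascript
/**
 * Fetch the archive only when its ETag differs from the last one seen.
 * Assumption: the server sends an `ETag` header on HEAD and GET responses.
 * `fetchImpl` is injectable for tests and defaults to the global `fetch`.
 */
async function fetchIfChanged (url, lastEtag, fetchImpl = fetch) {
  const head = await fetchImpl(url, { method: 'HEAD' })
  if (!head.ok) {
    return { error: new Error(`unexpected response status checking deal archive: ${head.status}`) }
  }

  const etag = head.headers.get('etag')
  // Same ETag as last time: skip pulling a dump with no differences.
  if (etag && etag === lastEtag) {
    return { ok: { changed: false, etag } }
  }

  const res = await fetchImpl(url)
  if (!res.ok) {
    return { error: new Error(`unexpected response status fetching deal archive: ${res.status}`) }
  }
  // The body is returned as-is; decompression and parsing are left to the caller.
  return { ok: { changed: true, etag: res.headers.get('etag'), body: res.body } }
}
```

If no `ETag` is present, the sketch always falls through to the full download, so behaviour is unchanged for servers that do not support it.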
I will create an issue to improve this and add a comment

async function putLatestDealArchive ({ dealArchiveStore, spadeOracleId, oracleContracts }) {
const putRes = await dealArchiveStore.put({
key: spadeOracleId,
value: encode(Object.fromEntries(oracleContracts))
Can we just store the data we retrieved and avoid re-encoding?
Then we would need to encode/decode all CIDs, given the store functions use CIDs. That was the reason for using dag-json.

if (!res.ok) {
return {
// TODO: Error
error: new Error('could not read')
I typically just report the status code:
- error: new Error('could not read')
+ error: new Error(`unexpected response status fetching deal archive: ${res.status}`)

Adds the deal tracker Spade sync cron. As previously decided, this implements a diff-based approach: the newest Spade Oracle data is fetched and compared with the previously downloaded copy, generating a diff for insertion into DynamoDB.
Decisions:
filecoin-api repo given it is implementation detail, while other CRONs are part of the receipt chain
Notes:
json.zst)
Needs: