
Frontend can't scan table with staled schema #2510

Open
WenyXu opened this issue Sep 27, 2023 · 5 comments
Labels: C-bug (Category Bugs), O-chaos (Found by chaos tests)

Comments

@WenyXu
Member

WenyXu commented Sep 27, 2023

What type of bug is this?

Unexpected error, Other

What subsystems are affected?

Frontend

What happened?

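The Frontend may hold a stale table schema in its cache, e.g., when the heartbeat stream is broken and some cache invalidation messages are lost.

In that case, the Frontend still returns a stream, but an incorrect one: the schema the stream declares can differ from the schema of the record batches it yields.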

                while let Some(batch) = stream.next().await {
                    let batch = batch?;
                    metric.record_output_batch_rows(batch.num_rows());
                    yield Ok(Self::remove_metadata_from_record_batch(batch));
                    if let Some(first_consume_timer) = first_consume_timer.as_mut().take() {
                        first_consume_timer.stop();
                    }
                }
            }
        }));
        Ok(Box::pin(RecordBatchStreamAdaptor {
            schema: self.schema.clone(),
            stream,
            output_ordering: None,
        }))

L190 (`schema: self.schema.clone()`): the adaptor uses the stale schema.
L180: the RecordBatch read from the stream might carry a newer schema.

We should be able to align one schema to the other and tell the upper layer to invalidate the table cache.
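As a rough illustration of that idea, here is a minimal sketch using only the Rust standard library. `Schema`, `Batch`, `Aligned`, and `align_to_schema` are hypothetical, simplified stand-ins (columns hold only `Option<i64>`), not GreptimeDB or Arrow types. The sketch projects an incoming batch onto the expected schema, fills columns the batch lacks with nulls, and returns a flag that the caller could use to invalidate the table cache.

```rust
/// Simplified stand-in for a table schema: just ordered column names.
/// Hypothetical type for illustration, not a GreptimeDB type.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

/// Simplified stand-in for a record batch.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    schema: Schema,
    // One Vec per column, in schema order; None models a null value.
    columns: Vec<Vec<Option<i64>>>,
}

/// Result of aligning a batch to the schema the caller expects.
#[derive(Debug, PartialEq)]
struct Aligned {
    batch: Batch,
    /// True when the batch schema differed from the expected one,
    /// which suggests the cached schema is stale.
    should_invalidate_cache: bool,
}

/// Project `batch` onto `expected`: keep matching columns by name,
/// drop unknown ones, and fill missing ones with nulls.
fn align_to_schema(batch: Batch, expected: &Schema) -> Aligned {
    if batch.schema == *expected {
        return Aligned { batch, should_invalidate_cache: false };
    }
    let num_rows = batch.columns.first().map_or(0, |c| c.len());
    let columns = expected
        .columns
        .iter()
        .map(|name| match batch.schema.columns.iter().position(|c| c == name) {
            Some(idx) => batch.columns[idx].clone(),
            None => vec![None; num_rows],
        })
        .collect();
    Aligned {
        batch: Batch { schema: expected.clone(), columns },
        should_invalidate_cache: true,
    }
}
```

Aligning to the stale schema keeps the returned stream self-consistent for the current query, while the flag lets the upper layer refresh its cache so that later queries see the new columns. Whether to align toward the stale or the newer schema is a design choice this sketch does not settle.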

What operating system did you use?

Doesn't matter

Relevant log output and stack trace

No response

How can we reproduce the bug?

1. Build a Frontend with a table that has a stale schema, e.g., [col: int, col2: int, ts: Timestamp]
2. Build a Datanode with a table that has a new schema, e.g., [col: int, col2: int, col3: int, ts: Timestamp]
3. Scan the table

@killme2008
Contributor

Is this bug still present? If yes, let's resolve it as soon as possible.

@tisonkun
Collaborator

tisonkun commented May 15, 2024

This seems quite a common issue in our internal cache. We have the same cache invalidation issues for schemas and scripts.

For scripts, there is not even a notification to tell us that an entry is stale. For the table schema here, the inconsistency window can last on the order of minutes.

Are we theoretically able to keep the cache up-to-date or at least invalidate it promptly? It seems we would then depend on the meta server ...

cc @GreptimeTeam/db-approver

@tisonkun tisonkun mentioned this issue May 15, 2024
3 tasks
@evenyag
Contributor

evenyag commented May 15, 2024

Maybe we could add a query node or a stream adapter to adjust the schemas of streams from different region servers.

Currently, the mito engine uses a similar approach to adapt SSTs with different schemas.
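A minimal sketch of what such a stream adapter could look like, assuming a synchronous `Iterator` in place of an async stream and a toy batch representation. `NamedColumns` and `SchemaAdapter` are hypothetical names introduced for illustration; this is not how the mito engine or the Frontend implements it.

```rust
/// A batch modeled as (column name, values) pairs; None models a null.
/// Hypothetical stand-in for illustration, not a GreptimeDB type.
type NamedColumns = Vec<(String, Vec<Option<i64>>)>;

/// Wraps an iterator of batches and reshapes each batch to the
/// `target` column list, so every output batch has the same layout.
struct SchemaAdapter<I> {
    inner: I,
    target: Vec<String>,
    /// Number of batches whose columns differed from `target`.
    mismatched: usize,
}

impl<I: Iterator<Item = NamedColumns>> Iterator for SchemaAdapter<I> {
    type Item = NamedColumns;

    fn next(&mut self) -> Option<NamedColumns> {
        let batch = self.inner.next()?;
        // Fast path: the batch already has exactly the target columns.
        let same = batch.len() == self.target.len()
            && batch.iter().zip(&self.target).all(|((name, _), t)| name == t);
        if same {
            return Some(batch);
        }
        self.mismatched += 1;
        let num_rows = batch.first().map_or(0, |(_, values)| values.len());
        // Keep target columns by name, fill missing ones with nulls,
        // and drop columns the target does not know about.
        let adapted = self
            .target
            .iter()
            .map(|name| {
                let values = batch
                    .iter()
                    .find(|(n, _)| n == name)
                    .map(|(_, v)| v.clone())
                    .unwrap_or_else(|| vec![None; num_rows]);
                (name.clone(), values)
            })
            .collect();
        Some(adapted)
    }
}
```

After draining the adapter, a non-zero `mismatched` count could serve as the signal that some region server reported a schema different from the cached one.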

@killme2008
Contributor

What's going on with this issue? @WenyXu

@WenyXu
Member Author

WenyXu commented Jun 26, 2024

> What's going on with this issue? @WenyXu

I'll fix it soon 👀

@WenyXu WenyXu self-assigned this Jun 26, 2024