
Inconsistent segments archiving/deleting for size-based retention in 3.3 #34

Closed
jeqo opened this issue Jul 4, 2023 · 0 comments
jeqo commented Jul 4, 2023

Testing archiving and deletion shows that segments expected to be deleted are not deleted, or segments are archived when they shouldn't be.

How to reproduce

Testing with https://github.com/aiven/tiered-storage-for-apache-kafka/tree/main/demo

Happy path (no TS)

First, create a topic without TS:

docker exec kafka-ts \
        kafka-topics \
        --bootstrap-server kafka:29092 \
        --create \
        --config segment.bytes=1000000  \
        --config retention.bytes=20000000  \
        --partitions 1 \
        --topic t1-nots
Created topic t1-nots.

20MB retention, 1MB segment files. Expected segments: 20-21 (including active)
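
A quick sanity check of that expectation (a sketch using only the topic configs above, nothing cluster-specific):

    // Back-of-the-envelope check using the topic configs above (not demo output):
    val retentionBytes = 20000000L                        // retention.bytes
    val segmentBytes   = 1000000L                         // segment.bytes
    val closedSegments = retentionBytes / segmentBytes    // 20 closed segments retained
    val expectedTotal  = closedSegments + 1               // plus the active segment => ~21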

Fill topic with data (~60MB):

docker exec kafka-ts \
        kafka-producer-perf-test --producer-props bootstrap.servers=kafka:29092 \
        --topic t1-nots \
        --num-records 60000 \
        --record-size 1024 \
        --throughput 1000

Check local segment files:

make topic=t1-nots show_local_data | grep '.log$' | wc -l

Should return 20 or 21 segments as expected.

And the segments should include the latest ~20K messages:

make topic=t1-nots show_local_data | grep '.log$'
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000041217.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000042151.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000043086.log
-rw-r--r--. 1 appuser appuser 975K Jun 27 09:35 00000000000000044021.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000044954.log
# ...
-rw-r--r--. 1 appuser appuser 977K Jun 27 09:35 00000000000000056153.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000057090.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000058024.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:35 00000000000000058960.log
-rw-r--r--. 1 appuser appuser 108K Jun 27 09:35 00000000000000059897.log

With TS enabled

Create a topic with TS enabled and a local retention of 5MB:

docker exec kafka-ts \
        kafka-topics \
        --bootstrap-server kafka:29092 \
        --create \
        --config remote.storage.enable=true \
        --config segment.bytes=1000000  \
        --config retention.bytes=20000000  \
        --config local.retention.bytes=5000000  \
        --partitions 1 \
        --topic t1-ts
Created topic t1-ts.

It’s expected to have 5 or 6 segments locally, and at least 15 segments remotely.
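
The same back-of-the-envelope arithmetic for the tiered topic (again a sketch based only on the configs above):

    // Sanity check for the tiered topic (configs above, not demo output):
    val retentionBytes      = 20000000L   // retention.bytes
    val localRetentionBytes = 5000000L    // local.retention.bytes
    val segmentBytes        = 1000000L    // segment.bytes
    val expectedLocal  = localRetentionBytes / segmentBytes + 1                 // 5-6 incl. active
    val expectedRemote = (retentionBytes - localRetentionBytes) / segmentBytes  // at least 15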

Let’s fill the topic:

docker exec kafka-ts \
        kafka-producer-perf-test --producer-props bootstrap.servers=kafka:29092 \
        --topic t1-ts \
        --num-records 60000 \
        --record-size 1024 \
        --throughput 1000

Monitor remote segments:

make topic=t1-ts show_remote_data_s3_minio | grep '.log$' | wc -l

In the latest test it returns 10, while the local segment count is 6.

Checking the segments in more detail:

Local segments look ok:

-rw-r--r--. 1 appuser appuser 975K Jun 27 09:44 00000000000000055245.log
-rw-r--r--. 1 appuser appuser 975K Jun 27 09:44 00000000000000056179.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:44 00000000000000057115.log
-rw-r--r--. 1 appuser appuser 977K Jun 27 09:44 00000000000000058052.log
-rw-r--r--. 1 appuser appuser 976K Jun 27 09:44 00000000000000058991.log
-rw-r--r--. 1 appuser appuser  74K Jun 27 09:44 00000000000000059929.log

But the remote listing looks suspicious, with only 10 segments:

[2023-06-27 09:44:34 UTC] 975KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000050565-YsYKhWHnR4G_QeHOwoMGUA.log
[2023-06-27 09:44:34 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000051505-SEUI-ie8T2WH2_fSuQsmHA.log
[2023-06-27 09:44:34 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000052441-YvjikTNCR9C3R08dhiZM0Q.log
[2023-06-27 09:44:35 UTC] 975KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000053376-uzqfbrt6QrmJucjAm-DFCg.log
[2023-06-27 09:44:35 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000054308-Z1ZTbREMShW5yhF-RN_pzQ.log
[2023-06-27 09:44:35 UTC] 975KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000055245-EFECwnOSS2-y_Un_dlglSw.log
[2023-06-27 09:44:42 UTC] 975KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000056179-V4cfN4LCS9OATFU9gnshyA.log
[2023-06-27 09:44:43 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000057115-MMFiR96dR4KsFUvYhlySeA.log
[2023-06-27 09:44:43 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000058052-_mAKtQx3SMqARrpu7owLNw.log
[2023-06-27 09:44:43 UTC] 976KiB STANDARD tiered-storage-demo/t1-ts-Etgijt_mRCe-N8WW_Pac6Q/0/00000000000000058991-leZD6nzLSf62IN9C2XW8Aw.log

First of all, it contains fewer segments than expected (15), and its most recent segments overlap with the local segments (even after the test stops): the most recent remote log starts at 00000000000000058991, while the oldest local log starts at 00000000000000055245.

Also, it does not contain the 15 expected segments, leading to data loss compared to the no-TS case: 11 segments are missing.
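
To make the overlap concrete, a small check with the base offsets hard-coded from the two listings above (not read from the cluster):

    // Base offsets taken from the local .log file names and the remote object keys above.
    val localBaseOffsets  = Set(55245L, 56179L, 57115L, 58052L, 58991L, 59929L)
    val remoteBaseOffsets = Set(50565L, 51505L, 52441L, 53376L, 54308L,
                                55245L, 56179L, 57115L, 58052L, 58991L)
    val overlap = localBaseOffsets intersect remoteBaseOffsets
    println(s"in both tiers: ${overlap.toSeq.sorted.mkString(", ")}")
    // prints: 55245, 56179, 57115, 58052, 58991 -- all local segments except the
    // active one are already present in remote storage as well.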

Potential causes

  • TS seems to be shipping segment files "eagerly", i.e. even segments that are still within local.retention.
  • This affects how the total log size is calculated: the retention check seems to assume that remote size + local log size = total log size, so segments present in both tiers are counted twice (a sketch of counting each segment only once follows the snippet below).
    val segmentMetadataList = remoteLogMetadataManager.listRemoteLogSegments(tpId).asScala.toSeq
    if (segmentMetadataList.nonEmpty) {
      fetchLog(tpId.topicPartition()).foreach { log =>
        val retentionMs = log.config.retentionMs
        val totalSize = log.size + segmentMetadataList.map(_.segmentSizeInBytes()).sum
        val (checkTimestampRetention, cleanupTs) = (retentionMs > -1, time.milliseconds() - retentionMs)
        val checkSizeRetention = log.config.retentionSize > -1
        var remainingSize = totalSize - log.config.retentionSize
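
For comparison, a minimal sketch of counting each segment only once; highestCopiedOffset (the last offset already copied to remote storage) is an assumed value here, and the accessors are the same broker-internal ones used in the snippet above, so this illustrates the idea rather than the actual fix:

    // Remote size: everything already in tiered storage.
    val remoteSize = segmentMetadataList.map(_.segmentSizeInBytes()).sum
    // Local size: only segments not yet copied to remote storage, so segments
    // present in both tiers are not counted twice.
    val onlyLocalSize = log.logSegments
      .filter(_.baseOffset > highestCopiedOffset)   // highestCopiedOffset: assumed input
      .map(_.size.toLong)
      .sum
    val totalSize = remoteSize + onlyLocalSize
    var remainingSize = totalSize - log.config.retentionSize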

PS: the current upstream work on implementing retention already seems to handle these conditions: https://github.com/apache/kafka/pull/13561/files#diff-380e4d8859ea9148f21794c09039425c82d9012a392c2dbbe1ce2ec8677a1970R804

@jeqo added the bug, tiered storage, and 3.3 labels on Jul 4, 2023
@github-actions bot added the Stale label on Oct 3, 2023
@github-actions bot closed this as not planned on Sep 25, 2024