Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CRASH] KeyDB 6.3.4 crash on replication db.cpp:2944 'm_fTrackingChanges >= 0' is not true #874

Open
sbaier1 opened this issue Oct 8, 2024 · 2 comments

Comments

@sbaier1
Copy link

sbaier1 commented Oct 8, 2024

Crash report

looks similar to #541 but i'm running 6.3.4 (also tried current HEAD@main FWIW)

Paste the complete crash log between the quotes below. Please include a few lines from the log preceding the crash report to provide some context.

1:16:S 08 Oct 2024 13:42:13.924 * MASTER <-> REPLICA sync started


=== KEYDB BUG REPORT START: Cut & paste starting from here ===
1:16:S 08 Oct 2024 13:42:13.924 # === ASSERTION FAILED ===
1:16:S 08 Oct 2024 13:42:13.924 # ==> db.cpp:2944 'm_fTrackingChanges >= 0' is not true

------ STACK TRACE ------

Backtrace:
keydb-server *:6379(redisDbPersistentData::processChanges(bool)+0x108) [0xaaaae3078338]
keydb-server *:6379(beforeSleep(aeEventLoop*)+0xbac) [0xaaaae303c4cc]
keydb-server *:6379(aeProcessEvents+0x3d0) [0xaaaae30338b0]
keydb-server *:6379(aeMain+0xb4) [0xaaaae30340f4]
keydb-server *:6379(workerThreadMain(void*)+0x100) [0xaaaae3052500]
/lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffffa12cd5c8]
/lib/aarch64-linux-gnu/libc.so.6(+0xe5edc) [0xffffa1335edc]

------ INFO OUTPUT ------
# Server
redis_version:255.255.255
redis_git_sha1:603ebb27
redis_git_dirty:0
redis_build_id:c822dc7b2d5ab23f
redis_mode:standalone
os:Linux 6.10.0-linuxkit aarch64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:11.4.0
process_id:1
process_supervised:no
run_id:ce43730b4900f3d074ce27dbf5828f82ffb6fabc
tcp_port:6379
server_time_usec:1728394933926041
uptime_in_seconds:63
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:341685
executable:/keydb-server
config_file:/tmp4/redis.conf
availability_zone:
features:cluster_mget

# Clients
connected_clients:0
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
current_client_thread:0
thread_0_clients:0

# Memory
used_memory:1660016
used_memory_human:1.58M
used_memory_rss:22654976
used_memory_rss_human:21.61M
used_memory_peak:1679504
used_memory_peak_human:1.60M
used_memory_peak_perc:98.84%
used_memory_overhead:1677696
used_memory_startup:1677696
used_memory_dataset:18446744073709533936
used_memory_dataset_perc:1844674407370955161600.00%
allocator_allocated:2357320
allocator_active:2748416
allocator_resident:6537216
total_system_memory:14640783360
total_system_memory_human:13.64G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.17
allocator_frag_bytes:391096
allocator_rss_ratio:2.38
allocator_rss_bytes:3788800
rss_overhead_ratio:3.47
rss_overhead_bytes:16117760
mem_fragmentation_ratio:13.66
mem_fragmentation_bytes:20996816
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.2.1
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
storage_provider:none
available_system_memory:unavailable

# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1728394870
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:0
total_commands_processed:3
instantaneous_ops_per_sec:0
total_net_input_bytes:1297999586
total_net_output_bytes:0
instantaneous_input_kbps:29962.83
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:0
total_writes_processed:0
instantaneous_lock_contention:1
avg_lock_contention:0.375000
storage_provider_read_hits:0
storage_provider_read_misses:0

# Replication
role:slave
master_global_link_status:down
connected_masters:0
master_host:host.docker.internal
master_port:6378
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_read_repl_offset:1
slave_repl_offset:1
master_link_down_since_seconds:-1
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:e8faa2cd336dccfdc1e23cd9417b00eb2dc2ddb4
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:3.137355
used_cpu_user:0.429532
used_cpu_sys_children:0.000000
used_cpu_user_children:0.001520
server_threads:1
long_lock_waits:0
used_cpu_sys_main_thread:2.814500
used_cpu_user_main_thread:0.331181

# Modules
module:name=search,ver=20614,api=1,filters=0,usedby=[],using=[ReJSON],options=[handle-io-errors]
module:name=timeseries,ver=11202,api=1,filters=0,usedby=[],using=[],options=[handle-io-errors]
module:name=ReJSON,ver=20600,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]

# Commandstats
cmdstat_info:calls=3,usec=46,usec_per_call=15.33,rejected_calls=0,failed_calls=0

# Errorstats

# Cluster
cluster_enabled:0

# Keyspace

# KeyDB
mvcc_depth:0

------ CLIENT LIST OUTPUT ------

------ MODULES INFO OUTPUT ------
# search_version
search_version:2.6.14
search_redis_version:255.255.255 - oss

# search_index
search_number_of_indexes:0

# search_fields_statistics

# search_dialect_statistics
search_dialect_1:0
search_dialect_2:0
search_dialect_3:0

# search_runtime_configurations
search_concurrent_mode:OFF
search_enableGC:ON
search_minimal_term_prefix:2
search_maximal_prefix_expansions:200
search_query_timeout_ms:500
search_timeout_policy:return
search_cursor_read_size:1000
search_cursor_max_idle_time:300000
search_max_doc_table_size:1000000
search_max_search_results:1000000
search_max_aggregate_results:-1
search_search_pool_size:20
search_index_pool_size:8
search_gc_scan_size:100
search_min_phonetic_term_length:3

# ReJSON_trace
ReJSON_trace:   0: redis_module::base_info_func
             at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/redis-module-1.0.1/src/lib.rs:73:37
   1: rejson::__info_func
             at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/redis-module-1.0.1/src/macros.rs:120:13
   2: _Z18modulesCollectInfoPcPKcii
             at /opt/keydb/src/module.cpp:7270:24
   3: _Z14logModulesInfov
             at /opt/keydb/src/debug.cpp:1818:40
   4: _Z16printCrashReportv
             at /opt/keydb/src/debug.cpp:2077:19
      _serverAssert
             at /opt/keydb/src/debug.cpp:1014:25
   5: _ZN21redisDbPersistentData14processChangesEb
             at /opt/keydb/src/db.cpp:2944:5
   6: _Z11beforeSleepP11aeEventLoop
             at /opt/keydb/src/server.cpp:2951:55
   7: aeProcessEvents
             at /opt/keydb/src/ae.cpp:755:35
   8: aeMain
             at /opt/keydb/src/ae.cpp:823:24
   9: _Z16workerThreadMainPv
             at /opt/keydb/src/server.cpp:7386:15
  10: <unknown>
  11: <unknown>


------ FAST MEMORY TEST ------
1:16:S 08 Oct 2024 13:42:13.961 # main thread terminated
1:16:S 08 Oct 2024 13:42:13.961 # Bio thread for job type #0 terminated
1:16:S 08 Oct 2024 13:42:13.962 # Bio thread for job type #1 terminated
1:16:S 08 Oct 2024 13:42:13.962 # Bio thread for job type #2 terminated

Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

=== KEYDB BUG REPORT END. Make sure to include from START to END. ===

       Please report the crash by opening an issue on github:

           https://github.com/JohnSully/KeyDB/issues

  Suspect RAM error? Use keydb-server --test-memory to verify it.

Aditional information

Context: I am evaluating KeyDB as an alternative for a current Redis (7.2.0 FWIW) setup. I was trying to use replicaof to run KeyDB as a read-replica for an existing redis replica. The replica contains various time-series and JSON objects.

Out of curiosity, I've also tried running the same image on amd64 and with server-threads 1, neither of which resolved the issue, so it seems to be thread-safety (for the modules) + architecture independent.

The crash happens during replication, while no clients are connected yet.

The configuration file is unremarkable. the minimal config that reproduces the issue is just loadmodule + replicaof to connect to the replica.

  1. OS distribution and version

I'm using a container build based on ubuntu 22.04 including a patch from another issue here to get redisearch to work with KeyDB.

  1. Steps to reproduce (if any)
  • build container
  • run replicaof from a Redis 7.2.0 deployment containing JSON+timeseries objects

Dockerfile FWIW

FROM ubuntu:22.04 AS builder

# Avoid interactive prompts
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV TZ=America/New_York

# Consolidated dependencies to improve build speed + layer efficiency
RUN apt-get update && \
    apt-get install -y --no-install-recommends locales && \
    locale-gen en_US.UTF-8 && \
    update-locale LANG=en_US.UTF-8 && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    build-essential \
    curl \
    git \
    unzip \
    libssl-dev \
    pkg-config \
    python3 \
    autoconf \
    automake \
    libtool \
    clang \
    libclang-dev \
    nasm \
    autotools-dev \
    libjemalloc-dev \
    tcl \
    tcl-dev \
    redis \
    uuid-dev \
    libcurl4-openssl-dev \
    libbz2-dev \
    libzstd-dev \
    liblz4-dev \
    libsnappy-dev \
    && rm -rf /var/lib/apt/lists/*

# Install Rust (required for building RedisJSON)
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

RUN mkdir -p /opt/keydb && \
    cd /opt/keydb && \
    git clone https://github.com/Snapchat/KeyDB.git --branch v6.3.4 . && \
    git submodule update --init --recursive && \
    make -j4 && \
    make install


RUN apt-get update && \
    apt-get install -y \
    wget \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /usr/lib/redis/modules

# via https://github.com/Snapchat/KeyDB/issues/708#issuecomment-1817843245
COPY fix-search.patch /tmp/
# RedisSearch module
RUN mkdir -p /opt/redis-modules/redisearch && \
    cd /opt/redis-modules/redisearch && \
    git clone --depth 1 https://github.com/RediSearch/RediSearch.git --branch v2.6.14 . && \
    git submodule update --init --recursive && \
    git apply /tmp/fix-search.patch && \
    make setup && \
    rm -rf /var/lib/apt/lists/* && \
    make build DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/redisearch*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redisearch/bin/linux-*-release/search/redisearch.so /usr/lib/redis/modules/redisearch.so && \
    make clean

# RedisTimeSeries module
RUN mkdir -p /opt/redis-modules/redistimeseries && \
    cd /opt/redis-modules/redistimeseries && \
    git clone --depth 1 https://github.com/RedisTimeSeries/RedisTimeSeries.git --branch v1.12.2 . && \
    git submodule update --init --recursive

RUN cd /opt/redis-modules/redistimeseries && \
    ./deps/readies/bin/getpy3 && \
    sbin/system-setup.py && \
    make deps && \
    rm -rf /var/lib/apt/lists/* && \
    make build DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redistimeseries/bin/linux-*-release/redistimeseries.so /usr/lib/redis/modules/redistimeseries.so && \
    make clean


# Download and compile Redis modules

# RedisJSON module
RUN mkdir -p /opt/redis-modules/redisjson && \
    cd /opt/redis-modules/redisjson && \
    git clone --depth 1 https://github.com/RedisJSON/RedisJSON.git --branch v2.6.0 . && \
    git submodule update --init --recursive && \
    ./deps/readies/bin/getpy3 --modern && \
    ./deps/readies/bin/system-setup.py && \
    rm -rf /var/lib/apt/lists/* && \
    make DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redisjson/bin/linux-*-release/target/release/librejson.so /usr/lib/redis/modules/librejson.so && \
    make clean

RUN cd /opt/redis-modules/redisjson && \
    mkdir -p /usr/lib/redis/modules/ && \
    cp bin/linux-*-release/rejson.so /usr/lib/redis/modules/

RUN apt-get update && apt-get install -y \
    cmake \
    libuv1-dev \
    && rm -rf /var/lib/apt/lists/*

FROM ubuntu:22.04


# Consolidated dependencies to improve build speed + layer efficiency
RUN apt-get update && \
    apt-get install -y --no-install-recommends locales && \
    locale-gen en_US.UTF-8 && \
    update-locale LANG=en_US.UTF-8 && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /usr/lib/redis/modules

COPY --from=builder /usr/lib/redis/modules/redisearch.so /usr/lib/redis/modules
COPY --from=builder /usr/lib/redis/modules/librejson.so /usr/lib/redis/modules
COPY --from=builder /usr/lib/redis/modules/redistimeseries.so /usr/lib/redis/modules
COPY --from=builder /usr/local/bin/keydb-server /usr/local/bin

# Configure KeyDB to load the modules
RUN mkdir -p /etc/keydb && \
    echo "loadmodule /usr/lib/redis/modules/librejson.so" >> /etc/keydb/redis.conf && \
    echo "loadmodule /usr/lib/redis/modules/redisearch.so" >> /etc/keydb/redis.conf && \
    echo "loadmodule /usr/lib/redis/modules/redistimeseries.so" >> /etc/keydb/redis.conf

# Expose default KeyDB ports
EXPOSE 6379

# Metadata
LABEL org.opencontainers.image.version="keydb-6.3.4-redisearch-2.6.14-redistimeseries-1.12.2-redisjson-2.6.0"

# Command to start KeyDB
CMD ["keydb-server", "/etc/keydb/redis.conf"]

patch

Index: src/spec.c
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/spec.c b/src/spec.c
--- a/src/spec.c	(revision 1f6278e1b3c9730a693f8b8b2a651b53a7dad5ed)
+++ b/src/spec.c	(date 1726578731076)
@@ -2480,6 +2480,7 @@
                                  void *data) {
   if (subevent == REDISMODULE_SUBEVENT_LOADING_RDB_START ||
       subevent == REDISMODULE_SUBEVENT_LOADING_AOF_START ||
+      subevent == REDISMODULE_SUBEVENT_LOADING_FLASH_START ||
       subevent == REDISMODULE_SUBEVENT_LOADING_REPL_START) {
     Indexes_Free(specDict_g);
     if (legacySpecDict) {
Index: src/redismodule.h
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/redismodule.h b/src/redismodule.h
--- a/src/redismodule.h	(revision 1f6278e1b3c9730a693f8b8b2a651b53a7dad5ed)
+++ b/src/redismodule.h	(date 1726578763608)
@@ -336,6 +336,7 @@
 #define REDISMODULE_SUBEVENT_PERSISTENCE_RDB_START 0
 #define REDISMODULE_SUBEVENT_PERSISTENCE_AOF_START 1
 #define REDISMODULE_SUBEVENT_PERSISTENCE_SYNC_RDB_START 2
+#define REDISMODULE_SUBEVENT_LOADING_FLASH_START 6
 #define REDISMODULE_SUBEVENT_PERSISTENCE_ENDED 3
 #define REDISMODULE_SUBEVENT_PERSISTENCE_FAILED 4
 #define _REDISMODULE_SUBEVENT_PERSISTENCE_NEXT 5
@sbaier1 sbaier1 changed the title [CRASH] [CRASH] KeyDB 6.3.4 crash on replication Oct 8, 2024
@sbaier1 sbaier1 changed the title [CRASH] KeyDB 6.3.4 crash on replication [CRASH] KeyDB 6.3.4 crash on replication db.cpp:2944 'm_fTrackingChanges >= 0' is not true Oct 8, 2024
@keithchew
Copy link

Hi @sbaier1

Just on a side note, the patch above will only get KeyDB to startup with redisearch module with FLASH. However, once started up, the indexes will not be created so it will not actually work. You will need additional work on startup to:

  1. Create indexes
  2. Feed data from FLASH to module to initialise indexes

I am unfamiliar with redisjson and timeseries, but I suspect they might also need some work like redisearch to with FLASH. If you disabled FLASH, will replication start to work?

@sbaier1
Copy link
Author

sbaier1 commented Oct 17, 2024

Hi @sbaier1

Just on a side note, the patch above will only get KeyDB to startup with redisearch module with FLASH. However, once started up, the indexes will not be created so it will not actually work. You will need additional work on startup to:

  1. Create indexes
  2. Feed data from FLASH to module to initialise indexes

I am unfamiliar with redisjson and timeseries, but I suspect they might also need some work like redisearch to with FLASH. If you disabled FLASH, will replication start to work?

i think that was mostly a red hering, i just included all the info for complete-ness. it crashes in the same way regardless of

  • whether FLASH is enabled or disabled
  • whether redisearch is even loaded at all

at this point i could imagine it's simply due to the master i'm replicating being an upstream redis 7.2.0 vs keydb being based on 6.x, so maybe it's just a format error causing a crash (which is still cosmetically not great though)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants