Bugs found in distributed systems
Below are some of the bugs my fuzzing tools have found in distributed systems and databases, grouped by project.
Braft/brpc
memory leak
Fail to rename
No exception handling after error?
Check failed: meta.term == header.term
data loss after restart?
memory leak if binding failed
Node cannot respond after one node stepped down
TaskControl may interrupt non-existent threads
C-Raft
ASan reported heap-buffer-overflow
ASan reported bad-free
ASan reported SEGV
example/server failed to start after killing
Fix memory leak when crc check failed.
NuRaft
Got FATL logs when testing with the calc example
Exit if ctrl-d received
RedisRaft
Fix heap buffer overflow
Assertion `cache->start_idx + cache->len == idx' failed
RedisRaft panic
Assertion failed.
RethinkDB
Memory bugs reported by ASan
Guarantee failed: [iterator_and_is_new.second] value to be inserted already exists.
Aerospike
stack-buffer-underflow
error creating fabric published endpoint list
dirty read if the connection is not stable?
ClickHouse
Distributed table cannot be used after rename column
Distributed table cannot find column with “greater” query
Not found column in substr
Constraint check throws Missing columns exception
Does distributed table support select with order?
Always use default database when creating distributed table
Cannot drop table after a wrong create distributed table statement executed
Cannot create table after drop
Create Distributed table can succeed if the database does not exist
Fatal error when createDatabase
Got weird fatal log: floating point inexact result
Logical error: ‘It’s new replica, but database is not empty’
etcd
runtime error: slice bounds out of range
ZooKeeper
Committing zxid 0x100000003 but next pending txn 0x100000002
NullPointerException in SendAckRequestProcessor
ZooKeeper crashes after commit fails