The server is abnormal after data is written in the cluster environment #3784
-
When I write some data to IoTDB using the cluster version, sometimes an exception occurs in the server log, so I ran the test on two versions. Case [1]: released v0.12, three nodes and three replications, with 20 storage groups, 100000 devices, and 30 sensors per device. I write continuously for a few minutes at a time, then stop for a few minutes and continue writing. This is the abnormal information in the server log. P.S. I deleted the IP information :)
Case [2]: master 199519d, three nodes and three replications, with 20 storage groups, 100000 devices, and 50 sensors per device. After two hours of uninterrupted writing, I tried to write again, but the client writes were rejected. I found that the server log reports an error message; it seems that the raft log failed during the commit.
Now, I am not familiar with the processing logic here. If this is caused by some bugs, let's see how to fix it together.
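For reference, a minimal sketch of the kind of write client described above, assuming the standard IoTDB Java Session API; the host, port, device paths, and loop sizes are placeholders, not the actual test configuration.

```java
import org.apache.iotdb.session.Session;

import java.util.Arrays;
import java.util.List;

public class WriteWorkloadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection parameters; the real test used a 3-node cluster.
        Session session = new Session("127.0.0.1", 6667, "root", "root");
        session.open();

        List<String> measurements = Arrays.asList("s0", "s1", "s2");

        // Continuously write one record per device, mimicking the described workload
        // (20 storage groups x 100000 devices, 30-50 sensors per device).
        for (long time = 0; time < 1000; time++) {
            for (int d = 0; d < 10; d++) { // small device count for the sketch
                String deviceId = "root.sg0.d" + d;
                List<String> values = Arrays.asList("1.0", "2.0", "3.0");
                session.insertRecord(deviceId, time, measurements, values);
            }
        }
        session.close();
    }
}
```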
-
UPDATE: Today I tried other machines and performed the same experiment: three nodes and three replications, 20 storage groups, 100000 devices, and 50 sensors per device. The problem in case [1] has reappeared. Besides, when I wrote again, some connections became abnormal after about an hour of writing.
There might be some bugs here, and I'm going to try to figure out what's going on.
-
Case 1 is more likely a concurrent read and write happening on … . Firstly, it declares a … . I have looked at the code around accessing … . This is only my opinion from scanning the code.
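To illustrate the kind of hazard suspected here, a generic Java sketch (not IoTDB's actual code): a map declared without synchronization and iterated by one thread while another thread mutates it can fail unpredictably, for example with a ConcurrentModificationException.

```java
import java.util.HashMap;
import java.util.Map;

public class ConcurrentAccessSketch {
    // A plain HashMap is not thread-safe; this stands in for the suspected structure.
    private static final Map<String, Integer> registry = new HashMap<>();

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                registry.put("key" + i, i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 1_000; i++) {
                // Iterating while the writer mutates the map can throw
                // ConcurrentModificationException or observe corrupted state.
                for (Integer v : registry.values()) {
                    if (v == null) {
                        System.out.println("unexpected null");
                    }
                }
            }
        });
        writer.start();
        reader.start();
        // A fix would be ConcurrentHashMap or external synchronization.
    }
}
```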
-
For case [2], if my understanding is right, it is because … .
-
For case [2], we found the root cause. The issue happened because some entry is so large that it can't be put into logDataBuffer, which will throw a BufferOverflowException. The relevant function call is in RaftLogManager.java.
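As a rough sketch of that failure mode (not the actual RaftLogManager code; only the field name logDataBuffer and the exception come from the finding above): putting a serialized entry that is larger than a fixed-size ByteBuffer's remaining space makes put() throw a BufferOverflowException instead of growing the buffer.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class LogBufferSketch {
    // Illustrative fixed-size buffer standing in for logDataBuffer.
    private final ByteBuffer logDataBuffer = ByteBuffer.allocate(64 * 1024);

    void append(byte[] serializedEntry) {
        try {
            // If the entry is larger than the remaining capacity,
            // put() throws BufferOverflowException rather than resizing.
            logDataBuffer.put(serializedEntry);
        } catch (BufferOverflowException e) {
            System.err.println("entry of " + serializedEntry.length
                + " bytes does not fit into logDataBuffer");
            throw e;
        }
    }

    public static void main(String[] args) {
        LogBufferSketch sketch = new LogBufferSketch();
        sketch.append(new byte[128 * 1024]); // larger than the buffer: overflows
    }
}
```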
-
This discussion will be frozen. Because there are several replies and these replies cannot be moved automatically, I will lock and freeze this discussion. Further discussion of this issue can go to #3856 or https://issues.apache.org/jira/browse/IOTDB-1583