The server is abnormal after data is written in the cluster environment #3784
-
When I write some data to IoTDB using the cluster version, sometimes an exception occurs in the server log, so I ran the test on two versions. Case [1]: released v0.12, three nodes and three replications, with 20 storage groups, 100000 devices, and 30 sensors per device. I write continuously for a few minutes at a time, then stop for a few minutes and continue writing. This is the abnormal information in the server log. P.S. I deleted the IP information :)
Case [2]: master 199519d, three nodes and three replications, with 20 storage groups, 100000 devices, and 50 sensors per device. After two hours of uninterrupted writing, I tried to write again, but the client writes were rejected. I found that the server log reports an error message; it seems that the raft log failed during the commit.
Now, I am not familiar with the processing logic here. If this is caused by some bugs, let's see how to fix it together.
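For reference, a minimal sketch of the kind of write client described above, assuming the standard IoTDB Java Session API; the host, port, device paths, and loop sizes are placeholders, not the actual test configuration.

```java
import org.apache.iotdb.session.Session;

import java.util.Arrays;
import java.util.List;

public class WriteWorkloadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection parameters; the real test used a 3-node cluster.
        Session session = new Session("127.0.0.1", 6667, "root", "root");
        session.open();

        List<String> measurements = Arrays.asList("s0", "s1", "s2");

        // Continuously write one record per device, mimicking the described workload
        // (20 storage groups x 100000 devices, 30-50 sensors per device).
        for (long time = 0; time < 1000; time++) {
            for (int d = 0; d < 10; d++) { // small device count for the sketch
                String deviceId = "root.sg0.d" + d;
                List<String> values = Arrays.asList("1.0", "2.0", "3.0");
                session.insertRecord(deviceId, time, measurements, values);
            }
        }
        session.close();
    }
}
```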
-
UPDATE: Today I tried other machines and performed the same experiment: three nodes and three replications, 20 storage groups, 100000 devices, and 50 sensors per device. The problem in case [1] has reappeared. Besides, when I wrote again, some connections became abnormal after about an hour of writing.
There might be some bugs here, and I'm going to try to figure out what's going on.
-
Case 1 is more likely a concurrent read and write happening on … . Firstly, it declares a … . I have looked at the code around accessing … . This is only my opinion from scanning the code.
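To illustrate the kind of hazard suspected here, a generic Java sketch (not IoTDB's actual code): a map declared without synchronization and iterated by one thread while another thread mutates it can fail unpredictably, for example with a ConcurrentModificationException.

```java
import java.util.HashMap;
import java.util.Map;

public class ConcurrentAccessSketch {
    // A plain HashMap is not thread-safe; this stands in for the suspected structure.
    private static final Map<String, Integer> registry = new HashMap<>();

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                registry.put("key" + i, i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 1_000; i++) {
                // Iterating while the writer mutates the map can throw
                // ConcurrentModificationException or observe corrupted state.
                for (Integer v : registry.values()) {
                    if (v == null) {
                        System.out.println("unexpected null");
                    }
                }
            }
        });
        writer.start();
        reader.start();
        // A fix would be ConcurrentHashMap or external synchronization.
    }
}
```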
-
For case [2], if my understanding is right, it is because … .
-
For case [2], we found the root cause. The issue happened because some entry is so large that it can't be put into logDataBuffer, which will throw a BufferOverflowException. The relevant function call is in RaftLogManager.java.
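As a rough sketch of that failure mode (not the actual RaftLogManager code; only the field name logDataBuffer and the exception come from the finding above): putting a serialized entry that is larger than a fixed-size ByteBuffer's remaining space makes put() throw a BufferOverflowException instead of growing the buffer.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class LogBufferSketch {
    // Illustrative fixed-size buffer standing in for logDataBuffer.
    private final ByteBuffer logDataBuffer = ByteBuffer.allocate(64 * 1024);

    void append(byte[] serializedEntry) {
        try {
            // If the entry is larger than the remaining capacity,
            // put() throws BufferOverflowException rather than resizing.
            logDataBuffer.put(serializedEntry);
        } catch (BufferOverflowException e) {
            System.err.println("entry of " + serializedEntry.length
                + " bytes does not fit into logDataBuffer");
            throw e;
        }
    }

    public static void main(String[] args) {
        LogBufferSketch sketch = new LogBufferSketch();
        sketch.append(new byte[128 * 1024]); // larger than the buffer: overflows
    }
}
```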
-
This discussion will be frozen. Because there are several replies and these replies cannot be moved automatically, I will lock and freeze this discussion. Further discussion of this issue can go to #3856 or https://issues.apache.org/jira/browse/IOTDB-1583