Error: KafkaStorageException Too many open files


Problem Description

During big data processing in the Flink cluster, both the produced and the consumed data go through Kafka; if an exception occurs during Flink processing, the corresponding restart mechanism or checkpoint strategy takes over. After the project went live, more and more devices were connected, Kafka topics were dynamically created in ever-growing numbers, and the Flink processing started to throw exceptions.

Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 27 record(s) for topic-call_XAucjhIN-0:120000 ms has passed since batch creation

One broker in the Kafka cluster then went down, with the following error in its log:

[2022-08-01 14:55:22,453] ERROR Error while writing to checkpoint file /home/kafka-logs/fan_sink_29-1/leader-epoch-checkpoint (kafka.server.LogDirFailureChannel)
java.io.FileNotFoundException: /home/kafka-logs/topic_min/leader-epoch-checkpoint.tmp (too many open files)
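
Before touching any configuration, it is worth confirming that the broker really is running out of file descriptors. A quick check (a sketch; it assumes a single broker per host whose main class is kafka.Kafka):

# Number of file descriptors the broker currently holds
[root@kafka103 ~]# ls /proc/$(pgrep -f kafka.Kafka)/fd | wc -l
# Limit that the running process actually has
[root@kafka103 ~]# grep "Max open files" /proc/$(pgrep -f kafka.Kafka)/limits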

Solution

The fix consists of the following steps:

Modify the operating system limits

[root@kafka101 ~]# vi /etc/security/limits.conf

root soft nofile 65536
root hard nofile 65536
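
Two things to keep in mind here: limits.conf only takes effect for new login sessions, and it should cover the user that actually runs the broker (root in this setup). It also does not apply to services started by systemd, which is why the kafka.service unit is changed separately below. To confirm the new shell limits after logging in again:

# Soft and hard open-file limits of the current shell
[root@kafka101 ~]# ulimit -n
[root@kafka101 ~]# ulimit -Hn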

Find the files containing "kafka" to locate kafka.service

[root@kafka103 ~]# cd /

[root@kafka103 ~]# find / -name "*kafka*"

/etc/systemd/system/kafka.service
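
If find is too slow or returns too many hits, systemctl can show the unit file and the limit currently in effect directly (assuming the service is registered under the name kafka):

[root@kafka103 ~]# systemctl cat kafka
[root@kafka103 ~]# systemctl show kafka -p LimitNOFILE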

Modify the service configuration - increase the open file limit

[root@kafka103 ~]# cd /etc/systemd/system/

[root@kafka103 ~]# vi kafka.service

# Increase the maximum number of open files (this directive goes in the [Service] section)
LimitNOFILE=65535
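
For orientation, the edited unit looks roughly like the sketch below; Description, User and the ExecStart/ExecStop paths are only illustrative assumptions, and the LimitNOFILE line is the actual change:

[Unit]
Description=Apache Kafka broker
After=network.target

[Service]
# Keep whatever User= and ExecStart= the existing unit already defines; the paths here are examples
User=root
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
# Raise the per-process open file limit for the broker
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target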

[root@kafka103 ~]# systemctl daemon-reload

Restart Kafka

[root@kafka103 ~]# systemctl stop kafka
[root@kafka103 ~]# systemctl start kafka

Verify the limits of the Kafka process

[root@kafka103 system]# ps -ef|grep kafka
The Kafka process ID found here is 19694
[root@kafka103 system]# cat /proc/19694/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             2062355              2062355              processes 
Max open files            65535                65535                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       2062355              2062355              signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited           

Max open files is now 65535.
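
As a final sanity check that the broker recovered from the KafkaStorageException, confirm that it is serving topics again (a sketch; the install path and the localhost:9092 listener are assumptions, and kafka-topics.sh needs a Kafka version that supports --bootstrap-server):

[root@kafka103 system]# systemctl status kafka
[root@kafka103 system]# /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list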

With this, the "too many open files" problem is resolved.
