RabbitMQ fails to start after an abnormal server shutdown

created at 08-19-2021

cause of the problem

RabbitMQ went down when the server shut down abnormally. After the server was restored, I tried to start MQ, but the startup failed with the following error:

[root@bogon rabbitMq]# rabbitmqctl  start_app
Error: unable to perform an operation on node 'rabbit@iZbp128yw4rvtfbytgv4y7Z'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@iZbp128yw4rvtfbytgv4y7Z
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: [rabbit@iZbp128yw4rvtfbytgv4y7Z]

rabbit@iZbp128yw4rvtfbytgv4y7Z:
  * connected to epmd (port 4369) on iZbp128yw4rvtfbytgv4y7Z
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic 
  * can't establish TCP connection to the target node, reason: timeout (timed out)
  * suggestion: check if host 'iZbp128yw4rvtfbytgv4y7Z' resolves, is reachable and ports 25672, 4369 are not blocked by firewall

Current node details:
 * node name: 'rabbitmqcli-5146-rabbit@iZbp128yw4rvtfbytgv4y7Z'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: mVWZ9hzwnH+BCsNzXPPxQA==

Check the status:

rabbitmqctl status

Error message:

Error: unable to perform an operation on node 'rabbit@iZbp128yw4rvtfbytgv4y7Z'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@iZbp128yw4rvtfbytgv4y7Z
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rabbit@iZbp128yw4rvtfbytgv4y7Z']

rabbit@iZbp128yw4rvtfbytgv4y7Z:
  * connected to epmd (port 4369) on iZbp128yw4rvtfbytgv4y7Z
  * epmd reports: node 'rabbit' not running at all
                  no other nodes on iZbp128yw4rvtfbytgv4y7Z
  * suggestion: start the node

Current node details:
 * node name: 'rabbitmqcli-11079-rabbit@iZbp128yw4rvtfbytgv4y7Z'
 * effective user's home directory: /root
 * Erlang cookie hash: vVAgrz18VW8gDZQB2YRW8A==

Attempted solutions

  • The RabbitMQ process may not have shut down completely, or its ports may have been taken by another application, which would make later restarts fail.
    The default ports used by rabbitmq-server are 5672, 15672, 25672, and 4369.
    I ran lsof -i:4369 to find the process occupying the port and killed it with kill -9 <pid>. After a restart the same error was still reported. Trying to kill the process on port 4369 again reported that the process did not exist, yet querying the port once more showed that another process was now occupying it.
  • Error: unable to perform an operation on node 'rabbit@iZbp128yw4rvtfbytgv4y7Z'. This message suggested a hostname-to-IP mapping problem, so I added an entry to /etc/hosts that maps the hostname iZbp128yw4rvtfbytgv4y7Z to the local IP, saved the file, and ran source /etc/hosts. Starting MQ still failed.
ifconfig # Look up this machine's internal IP: 10.25.0.181
# Method one:
echo "10.25.0.181 iZbp128yw4rvtfbytgv4y7Z" >> /etc/hosts # Append the mapping directly to the hosts file
# Method two:
vim /etc/hosts # Press i to enter insert mode
# Add an IP mapping for this machine
10.25.0.181 iZbp128yw4rvtfbytgv4y7Z
# Press Esc to leave insert mode
:wq # Save the file and exit
source /etc/hosts # Not needed: /etc/hosts is not a shell script, and changes to it normally take effect immediately. Running this only produces errors: -bash: 127.0.0.1: Command not found -bash: ::1: Command not found -bash: 10.25.0.181: Command not found
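
The two checks above can be made repeatable. The following is a minimal sketch, not the exact procedure used in this article: check_rabbitmq_ports and add_host_mapping are hypothetical helper names, and the hosts file path is a parameter so the helper can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of RabbitMQ's default ports have a listener.
check_rabbitmq_ports() {
  local port
  for port in 5672 15672 25672 4369; do
    # -nP skips name lookups; -sTCP:LISTEN restricts the match to listening sockets.
    if lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  done
}

# Hypothetical helper: append "IP HOSTNAME" to a hosts-format file only if
# HOSTNAME is not already mapped there.
add_host_mapping() {
  local ip="$1" host="$2" file="${3:-/etc/hosts}"
  # Compare the hostname against every name field, ignoring comments.
  if awk -v h="$host" '
       /^[[:space:]]*#/ { next }
       { for (i = 2; i <= NF; i++) { if ($i ~ /^#/) break; if ($i == h) found = 1 } }
       END { exit found ? 0 : 1 }' "$file"; then
    echo "already mapped: $host"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
    echo "added: $ip $host"
  fi
}

# Example (values from this article; run as root):
#   check_rabbitmq_ports
#   add_host_mapping 10.25.0.181 iZbp128yw4rvtfbytgv4y7Z /etc/hosts
#   getent hosts iZbp128yw4rvtfbytgv4y7Z   # verify that the name resolves
```

Checking for an existing entry before appending avoids the duplicate lines that repeated echo >> runs would leave behind.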

My final solution

journalctl -xe # View the system log to find out why MQ failed to start

The system log is as follows:

-- Subject: Unit rabbitmq-server.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rabbitmq-server.service has failed.
-- 
-- The result is failed.
8月 19 11:15:38 iZbp128yw4rvtfbytgv4y7Z systemd[1]: Unit rabbitmq-server.service entered failed state.
8月 19 11:15:38 iZbp128yw4rvtfbytgv4y7Z systemd[1]: rabbitmq-server.service failed.
8月 19 11:15:48 iZbp128yw4rvtfbytgv4y7Z systemd[1]: rabbitmq-server.service holdoff time over, scheduling restart.
8月 19 11:15:48 iZbp128yw4rvtfbytgv4y7Z systemd[1]: Stopped RabbitMQ broker.
-- Subject: Unit rabbitmq-server.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rabbitmq-server.service has finished shutting down.
8月 19 11:15:48 iZbp128yw4rvtfbytgv4y7Z systemd[1]: Starting RabbitMQ broker...
-- Subject: Unit rabbitmq-server.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rabbitmq-server.service has begun starting up.
8月 19 11:15:49 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Configuring logger redirection
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ##  ##      RabbitMQ 3.8.19
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ##  ##
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ##########  Copyright (c) 2007-2021 VMware, Inc. or its affiliates.
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ######  ##
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Erlang:      23.3.4.4 [emu]
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: TLS Library: OpenSSL - OpenSSL 1.0.2k-fips  26 Jan 2017
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Doc guides:  https://rabbitmq.com/documentation.html
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Support:     https://rabbitmq.com/contact.html
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Tutorials:   https://rabbitmq.com/getstarted.html
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Monitoring:  https://rabbitmq.com/monitoring.html
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Logs: /var/log/rabbitmq/rabbit@iZbp128yw4rvtfbytgv4y7Z.log
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: /var/log/rabbitmq/rabbit@iZbp128yw4rvtfbytgv4y7Z_upgrade.log
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Config file(s): (none)
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Starting broker...
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: BOOT FAILED
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ===========
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Error during startup: {error,
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: {cannot_delete_plugins_expand_dir,
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: ["/var/lib/rabbitmq/mnesia/rabbit@iZbp128yw4rvtfbytgv4y7Z-plugins-expand",
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: {cannot_delete,
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: "/var/lib/rabbitmq/mnesia/rabbit@iZbp128yw4rvtfbytgv4y7Z-plugins-expand/cowboy-2.8.0/ebin/cowboy_app.beam",
8月 19 11:15:53 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: eacces}]}}
8月 19 11:15:55 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: {"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{cannot_delete_plugins_expand_dir,[\"/var/lib/rabbitmq/mnesia/rabbit@iZbp128yw4rvtfb
8月 19 11:15:55 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{cannot_delete_plugins_expand_dir,["/var/lib/rabbitmq/mnesia/rabbit@iZbp128yw4rvtfbyt
8月 19 11:15:55 iZbp128yw4rvtfbytgv4y7Z rabbitmq-server[16624]: Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
8月 19 11:15:55 iZbp128yw4rvtfbytgv4y7Z systemd[1]: rabbitmq-server.service: main process exited, code=exited, status=1/FAILURE
8月 19 11:15:55 iZbp128yw4rvtfbytgv4y7Z systemd[1]: Failed to start RabbitMQ broker.
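
Most of this journal output is systemd bookkeeping around the part that matters. As a rough sketch (show_boot_failure is a name made up for illustration), the lines from the BOOT FAILED marker up to the crash dump notice can be filtered out like this:

```shell
#!/usr/bin/env bash
# Hypothetical filter: read journal text on stdin and print everything from
# the first "BOOT FAILED" line up to the first "Crash dump" line that follows
# it (or to the end of input if there is none).
show_boot_failure() {
  awk '/BOOT FAILED/            { printing = 1 }
       printing                 { print }
       printing && /Crash dump/ { exit }'
}

# Example: limit journalctl to the RabbitMQ unit and disable the pager.
#   journalctl -u rabbitmq-server --no-pager | show_boot_failure
```

Because the service is in a restart loop, the same failure repeats in the journal; the filter stops after the first occurrence.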

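The log shows that the boot failed because RabbitMQ could not delete its plugins-expand directory (cannot_delete_plugins_expand_dir with eacces, i.e. permission denied). To fix this, delete rabbit@iZbp128yw4rvtfbytgv4y7Z.pid, rabbit@iZbp128yw4rvtfbytgv4y7Z, and rabbit@iZbp128yw4rvtfbytgv4y7Z-plugins-expand from the /var/lib/rabbitmq/mnesia directory, then start the service with **systemctl start rabbitmq-server**.
Note: These files and directories hold data such as exchanges, queues, and users. Deleting them is equivalent to resetting MQ (the queues will be emptied), so proceed with caution.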
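
Because removing these entries resets the node, a more cautious variant is to move them aside first so they can be inspected or restored later. This is only a sketch under assumptions: move_node_data is a hypothetical helper, the backup location is arbitrary, and the service is assumed to be stopped before it runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a node's pid file, data directory, and
# plugins-expand directory out of the mnesia directory instead of deleting them.
move_node_data() {
  local mnesia_dir="$1" node="$2" backup_dir="$3"
  local name
  mkdir -p "$backup_dir" || return 1
  for name in "${node}.pid" "${node}" "${node}-plugins-expand"; do
    # Skip entries that do not exist (for example, a missing pid file).
    if [ -e "$mnesia_dir/$name" ]; then
      mv "$mnesia_dir/$name" "$backup_dir/" || return 1
      echo "moved: $name"
    fi
  done
}

# Example (paths from this article; the backup path is an assumption; run as root):
#   systemctl stop rabbitmq-server
#   move_node_data /var/lib/rabbitmq/mnesia rabbit@iZbp128yw4rvtfbytgv4y7Z /root/rabbitmq-backup
#   systemctl start rabbitmq-server
```

The node still starts from an empty state afterwards, so the caution in the note above applies equally; the only difference is that the old data remains on disk.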

created at: 08-19-2021
edited at: 08-19-2021