1. Connecting to RabbitMQ as the guest user fails, and the RabbitMQ log reports: "guest" user can only connect via localhost
The RabbitMQ website documents a fix:
Note: this worked on CentOS, but connecting as guest failed on Ubuntu.
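Since RabbitMQ 3.3.0 the guest user is limited to loopback connections by default; the usual remedies are to create a dedicated user or to empty the loopback_users setting. Below is a minimal sketch of the first option. The user name "openstack", the password and the helper names are placeholders for illustration, not values taken from these notes; only the rabbitmqctl add_user and set_permissions subcommands are real.

```python
import subprocess


def user_setup_commands(user, password, vhost="/"):
    """Build the rabbitmqctl invocations that create a non-guest user.

    `user`, `password` and `vhost` are placeholders chosen by the operator.
    """
    return [
        ["rabbitmqctl", "add_user", user, password],
        # configure / write / read permission patterns, in that order
        ["rabbitmqctl", "set_permissions", "-p", vhost, user, ".*", ".*", ".*"],
    ]


def run_commands(commands, runner=subprocess.check_call):
    """Run each command on the broker host; raises if any command fails."""
    for argv in commands:
        runner(argv)


if __name__ == "__main__":
    # Print instead of executing so the sketch is safe to run anywhere.
    for argv in user_setup_commands("openstack", "CHANGE_ME"):
        print(" ".join(argv))
```

Calling run_commands(user_setup_commands(...)) on the broker host would execute the commands; clients then connect with the new credentials instead of guest.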
2. With HAProxy in front of RabbitMQ (OpenStack HA environment), the following error appears frequently:
...
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit (40, 11), # Channel.exchange_declare_ok
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 67, in wait
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit self.channel_id, allowed_methods)
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 237, in _wait_method
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit self.method_reader.read_method()
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 189, in read_method
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit raise m
2015-04-06 20:12:45.187 18618 TRACE oslo.messaging._drivers.impl_rabbit IOError: Socket closed
Solution:
See the reference here.
# List the master and slave processes of each mirrored queue
rabbitmqctl list_queues name pid slave_pids synchronised_slave_pids
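To make the output of that command easier to act on, the sketch below flags mirrored queues whose slaves are not all synchronised. It assumes tab-separated columns in the order name, pid, slave_pids, synchronised_slave_pids, with pids printed in angle brackets; the function name is made up for illustration.

```python
import re

# A pid as printed by rabbitmqctl, e.g. <rabbit@controller1.3.100.0>
PID = re.compile(r"<[^>]+>")


def unsynchronised_slaves(list_queues_output):
    """Map queue name -> slave pids that are not yet synchronised."""
    result = {}
    for line in list_queues_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip banner lines such as "Listing queues ..."
        name, _master_pid, slaves, synced = fields
        lagging = set(PID.findall(slaves)) - set(PID.findall(synced))
        if lagging:
            result[name] = sorted(lagging)
    return result
```

Feeding it the captured stdout of the rabbitmqctl command returns an empty dict when every slave has caught up with its master.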
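3. The RabbitMQ cluster log reports "handshake timeout"
Solution: increase the handshake timeout value.
Note: in the RabbitMQ 3.3.5 source, the handshake timeout is hard-coded to 10000 milliseconds.
For details, see (search for the keyword handshake_timeout)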
-export([system_continue/3, system_terminate/4, system_code_change/4]).
-export([init/2, mainloop/4, recvloop/4]).
-export([conserve_resources/3, server_properties/1]).

-define(HANDSHAKE_TIMEOUT, 10). % hard-coded, in seconds
-define(NORMAL_TIMEOUT, 3).
-define(CLOSING_TIMEOUT, 30).
-define(CHANNEL_TERMINATION_TIMEOUT, 3).
-define(SILENT_CLOSE_DELAY, 3).
-define(CHANNEL_MIN, 1).
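4. RabbitMQ connections lack heartbeat detection or TCP keepalive (OpenStack Juno + RabbitMQ 3.3.5). A firewall between the controller nodes and the compute nodes usually makes this problem worse.
Solution: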
1. Edit /etc/rabbitmq/rabbitmq.config (the Erlang-term configuration file):
[
  {rabbit, [{tcp_listen_options, [binary,
                                  {packet, raw},
                                  {reuseaddr, true},
                                  {backlog, 128},
                                  {nodelay, true},
                                  {exit_on_close, false},
                                  {keepalive, true}]}  % the key addition
  ]}
].
2. Operating-system-level tuning:
echo "5" > /proc/sys/net/ipv4/tcp_keepalive_time
echo "5" > /proc/sys/net/ipv4/tcp_keepalive_probes
echo "1" > /proc/sys/net/ipv4/tcp_keepalive_intvl
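With these values the kernel sends the first keepalive probe after 5 seconds of idleness and gives up after 5 unanswered probes sent 1 second apart, so a dead peer is detected after roughly 5 + 5 × 1 = 10 seconds. The sketch below shows that arithmetic and, as an assumption about how a client could mirror the same policy, applies the values to a single socket rather than system-wide. The function names are illustrative, and the TCP_KEEP* constants are Linux-specific.

```python
import socket


def dead_peer_detection_seconds(keepalive_time, keepalive_probes, keepalive_intvl):
    """Approximate time for the kernel to declare an idle, unreachable peer dead.

    The first probe is sent after `keepalive_time` seconds of idleness; the
    connection is dropped once `keepalive_probes` probes, spaced
    `keepalive_intvl` seconds apart, have gone unanswered.
    """
    return keepalive_time + keepalive_probes * keepalive_intvl


def enable_keepalive(sock, idle=5, probes=5, interval=1):
    """Apply the same values to one socket instead of the whole system."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The TCP_KEEP* constants are not available on every platform, so guard each.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPCNT", probes),
                        ("TCP_KEEPINTVL", interval)):
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)
    return sock


if __name__ == "__main__":
    print(dead_peer_detection_seconds(5, 5, 1))
```

Such aggressive values keep idle connections visible to a stateful firewall, at the cost of extra probe traffic on every idle TCP connection on the host.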
Bug description:
OpenStack Juno patch that adds RabbitMQ heartbeat support:
5. RabbitMQ cluster split-brain
An analysis of network partitions
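A partition can be spotted in the output of rabbitmqctl cluster_status, which on RabbitMQ 3.x includes a {partitions,[...]} term that is empty when the cluster is healthy. The following is a minimal sketch of that check, assuming the Erlang-term output format; the function name is made up for illustration.

```python
import re


def has_partition(cluster_status_output):
    """Return True if a partition is reported, False if the list is empty,
    and None if the output contains no partitions term at all."""
    # Drop whitespace so line wrapping in the printed term does not matter.
    compact = re.sub(r"\s+", "", cluster_status_output)
    if "{partitions," not in compact:
        return None
    return "{partitions,[]}" not in compact
```

The input is expected to be the captured stdout of rabbitmqctl cluster_status from any cluster node.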
6. The msg_store_persistent and msg_store_transient concepts
[root@controller1 rabbit(keystone_admin)]# du -h /var/lib/rabbitmq/mnesia/rabbit/msg_store_transient/
48G     /var/lib/rabbitmq/mnesia/rabbit/msg_store_transient/
After the rabbitmq-server service was restarted, all *.rdq files under the msg_store_transient directory were deleted.
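To watch how much space the transient store is taking before resorting to a restart, the du check can be scripted. This is a minimal sketch, assuming the *.rdq files sit directly inside the store directory; the function names are illustrative and the default path is the one from the du command in these notes.

```python
import sys
from pathlib import Path


def rdq_usage(store_dir):
    """Return (file_count, total_bytes) for the *.rdq files directly in store_dir."""
    files = [p for p in Path(store_dir).glob("*.rdq") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def human_readable(num_bytes):
    """Format a byte count with binary units, similar in spirit to `du -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


if __name__ == "__main__":
    default = "/var/lib/rabbitmq/mnesia/rabbit/msg_store_transient/"
    path = sys.argv[1] if len(sys.argv) > 1 else default
    count, total = rdq_usage(path)
    print(f"{count} rdq files, {human_readable(total)} in {path}")
```

Running it periodically against both msg_store_transient and msg_store_persistent shows which store is growing.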
Reference links: