Monitoring Zookeeper

Collect and monitor the general performance metrics of Zookeeper.

Prerequisites

Zookeeper four-letter-word commands

The current implementation uses the four-letter-word commands provided by Zookeeper to collect metrics. You need to add the required four-letter-word commands to the Zookeeper whitelist yourself.

Steps

1. Find the Zookeeper configuration file, which is usually zoo.cfg.

2. Add one of the following settings to the configuration file:

# Option 1: add only the required commands to the whitelist
4lw.commands.whitelist=stat, ruok, conf, isro

# Option 2: add all commands to the whitelist
4lw.commands.whitelist=*

3. Restart the Zookeeper service:

zkServer.sh restart
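
After the restart, one way to confirm that the whitelist took effect is to send `ruok` to the client port and look for the `imok` reply. The following is only a minimal sketch: it assumes netcat is available as `nc` and that Zookeeper listens on `localhost:2181`, and the helper names `zk_4lw` and `zk_is_ok` are hypothetical, not part of Zookeeper or of this monitoring tool.

```shell
# Hypothetical helper: send one four-letter-word command to Zookeeper via netcat.
# Usage: zk_4lw <command> [host] [port]   (host and port defaults are assumptions)
zk_4lw() {
  local cmd="$1" host="${2:-localhost}" port="${3:-2181}"
  printf '%s' "$cmd" | nc "$host" "$port"
}

# Hypothetical helper: interpret a ruok reply read from stdin.
# A healthy server answers "imok". A command that is missing from the
# whitelist answers with a "not in the whitelist" message instead.
zk_is_ok() {
  local reply
  reply="$(cat)"
  [ "$reply" = "imok" ]
}

# Example (needs a running server):
#   zk_4lw ruok localhost 2181 | zk_is_ok && echo "four-letter-word commands are enabled"
```

If the check fails, review the `4lw.commands.whitelist` line in zoo.cfg and make sure the service was actually restarted.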

netcat protocol

The current implementation requires netcat to be installed on the Linux server where Zookeeper is deployed.

netcat installation steps

yum install -y nc

If the terminal displays the following message, the installation succeeded:

Complete!
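
As a quick follow-up check, the sketch below tests whether the `nc` binary is on the `PATH`. The helper name `have_cmd` is hypothetical. The `yum` command above applies to RHEL-family distributions, and the Debian/Ubuntu package name in the comment is an assumption about your environment.

```shell
# Hypothetical helper: succeed if the given command can be found on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Example:
#   if have_cmd nc; then
#     echo "netcat is installed"
#   else
#     echo "netcat is missing"
#     # RHEL/CentOS:    yum install -y nc
#     # Debian/Ubuntu:  apt-get install -y netcat-openbsd   (assumed package name)
#   fi
```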

Configuration parameters

| Parameter name | Parameter help description |
| --- | --- |
| Monitoring Host | IPv4 address, IPv6 address, or domain name of the monitored host. Note ⚠️: do not include a protocol header (e.g. https://, http://) |
| Monitoring name | Name that identifies this monitor. It must be unique |
| Port | Port exposed by Zookeeper. Default: 2181 |
| Query timeout | Timeout for the Zookeeper connection, in ms. Default: 3000 ms |
| Username | Username for the Linux connection to the host where Zookeeper runs |
| Password | Password for the Linux connection to the host where Zookeeper runs |
| Collection interval | Interval between periodic data collections, in seconds. The minimum interval is 30 seconds |
| Whether to detect | Whether to check the availability of the monitored target before adding the monitor. Add and modify operations proceed only if the check succeeds |
| Description remarks | Additional notes that identify and describe this monitor |
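
Two of these constraints are easy to check before filling in the form: the host must not carry a protocol header, and the collection interval must be at least 30 seconds. The sketch below illustrates both rules. The helper names `strip_protocol` and `valid_interval` are hypothetical and are not part of the monitoring tool.

```shell
# Hypothetical helper: drop a leading protocol header such as http:// or https://.
# Values without "://" (plain IPv4, IPv6, or domain names) are printed unchanged.
strip_protocol() {
  printf '%s\n' "${1#*://}"
}

# Hypothetical helper: succeed only if the collection interval is a whole
# number of seconds that is at least 30.
valid_interval() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 30 ]
}

# Example:
#   strip_protocol 'https://zk1.example.com'   # prints zk1.example.com
#   valid_interval 15 || echo "interval too short"
```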

Collected metrics

Metric set: conf

| Metric name | Metric unit | Metric help description |
| --- | --- | --- |
| clientPort | none | Client port |
| dataDir | none | Data snapshot file directory. By default, a snapshot is generated every 100000 operations |
| dataDirSize | KB | Data snapshot file size |
| dataLogDir | none | Transaction log file directory. In production, place it on a separate disk |
| dataLogSize | KB | Transaction log file size |
| tickTime | ms | Heartbeat interval between servers, or between clients and servers |
| minSessionTimeout | ms | Minimum session timeout. Defaults to tickTime x 2. If a client requests a shorter timeout, this value is used instead |
| maxSessionTimeout | ms | Maximum session timeout. Defaults to tickTime x 20. If a client requests a longer timeout, this value is used instead |
| serverId | none | Server id |
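
The `conf` command returns its values as `key=value` lines, and the default session timeout bounds follow directly from tickTime. The sketch below shows one way to read a value and to work out those defaults. The helper names are hypothetical and the sample values in the comments are made up for illustration.

```shell
# Hypothetical helper: read "key=value" lines (the shape of `conf` output)
# from stdin and print the value of the requested key.
conf_get() {
  awk -F'=' -v key="$1" '$1 == key { print $2 }'
}

# Default session timeout bounds in ms, derived from tickTime in ms:
# minimum = tickTime x 2, maximum = tickTime x 20.
default_min_session_timeout() { echo $(( $1 * 2 )); }
default_max_session_timeout() { echo $(( $1 * 20 )); }

# Example (needs a running server with conf on the whitelist):
#   tick="$(echo conf | nc localhost 2181 | conf_get tickTime)"
#   default_min_session_timeout "$tick"
#   default_max_session_timeout "$tick"
```

With a tickTime of 2000 ms, for instance, the defaults work out to 4000 ms and 40000 ms.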

Metric set: stats

| Metric name | Metric unit | Metric help description |
| --- | --- | --- |
| zk_version | none | Server version |
| zk_server_state | none | Server role |
| zk_num_alive_connections | number | Number of alive connections |
| zk_avg_latency | ms | Average latency |
| zk_outstanding_requests | number | Number of outstanding requests |
| zk_znode_count | number | Number of znodes |
| zk_packets_sent | number | Number of packets sent |
| zk_packets_received | number | Number of packets received |
| zk_watch_count | number | Number of watches |
| zk_max_file_descriptor_count | number | Maximum number of file descriptors |
| zk_approximate_data_size | KB | Approximate data size |
| zk_open_file_descriptor_count | number | Number of open file descriptors |
| zk_max_latency | ms | Maximum latency |
| zk_ephemerals_count | number | Number of ephemeral nodes |
| zk_min_latency | ms | Minimum latency |
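
These metric names match the tab-separated `name<TAB>value` lines printed by Zookeeper's `mntr` four-letter-word command. Assuming that command is the source, it also has to be on the whitelist (the `*` form covers it). The sketch below reads a single metric from such output. The helper name `stat_get` is hypothetical.

```shell
# Hypothetical helper: read tab-separated "name<TAB>value" lines from stdin
# and print the value for one metric name.
stat_get() {
  awk -F'\t' -v key="$1" '$1 == key { print $2 }'
}

# Example (needs a running server with mntr on the whitelist):
#   echo mntr | nc localhost 2181 | stat_get zk_avg_latency
```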