[Linux-ha-jp] STONITH errors during split-brain

renay****@ybb*****
Tue Mar 17 09:45:52 JST 2015


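Fukuda-san,

Good morning, this is Yamauchi.

Just in case, I am sending an excerpt from an example I have on hand that uses multiple stonith resources.
(Please watch the line breaks when you actually use it.)

The example below is a configuration for the PM 1.1 (Pacemaker 1.1) series:
for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.

The stonith plugins themselves are stonith-helper and ssh.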


(snip)
### Group Configuration ###
group grpStonith1 \
prmStonith1-1 \
prmStonith1-2

group grpStonith2 \
prmStonith2-1 \
prmStonith2-2

### Fencing Topology ###
fencing_topology \
nodea: prmStonith1-1 prmStonith1-2 \
nodeb: prmStonith2-1 prmStonith2-2
(snip)
primitive prmStonith1-1 stonith:external/stonith-helper \
params \
pcmk_reboot_retries="1" \
pcmk_reboot_timeout="40s" \
hostlist="nodea" \
dead_check_target="192.168.28.60 192.168.28.70" \
standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
run_online_check="yes" \
op start interval="0s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="ignore"

primitive prmStonith1-2 stonith:external/ssh \
params \
pcmk_reboot_timeout="60s" \
hostlist="nodea" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="3600s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="ignore"

primitive prmStonith2-1 stonith:external/stonith-helper \
params \
pcmk_reboot_retries="1" \
pcmk_reboot_timeout="40s" \
hostlist="nodeb" \
dead_check_target="192.168.28.61 192.168.28.71" \
standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
run_online_check="yes" \
op start interval="0s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="ignore"

primitive prmStonith2-2 stonith:external/ssh \
params \
pcmk_reboot_timeout="60s" \
hostlist="nodeb" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="3600s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="ignore"
(snip)
location rsc_location-grpStonith1-2 grpStonith1 \
rule -INFINITY: #uname eq nodea
location rsc_location-grpStonith2-3 grpStonith2 \
rule -INFINITY: #uname eq nodeb
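
To make the execution order in the fencing_topology section above easier to check, here is a minimal Python sketch that turns such a stanza into an ordered list of fencing levels per node. It is only an illustration, not part of Pacemaker or crmsh: the function name parse_fencing_topology and the SAMPLE constant are hypothetical, and the parser assumes the simple "node: device device ..." form used in this mail, where space-separated devices are successive levels and comma-separated devices share one level. Other target forms are not handled.

```python
# Hypothetical helper for illustration only; not part of Pacemaker or crmsh.
# Assumes the simple "node: dev dev ..." form of fencing_topology shown above.

SAMPLE = r"""
fencing_topology \
nodea: prmStonith1-1 prmStonith1-2 \
nodeb: prmStonith2-1 prmStonith2-2
"""


def parse_fencing_topology(text):
    """Return {node: [[devices at level 1], [devices at level 2], ...]}."""
    # Join backslash-continued lines into one logical line.
    logical = " ".join(
        line.strip().rstrip("\\").strip() for line in text.splitlines()
    )
    tokens = logical.split()
    if not tokens or tokens[0] != "fencing_topology":
        raise ValueError("not a fencing_topology stanza")

    topology = {}
    current = None
    for tok in tokens[1:]:
        if tok.endswith(":"):
            # A token such as "nodea:" starts the level list for that node.
            current = tok[:-1]
            topology[current] = []
        elif current is None:
            # Stanzas without a node prefix are outside this sketch.
            raise ValueError("device listed before any 'node:' target")
        else:
            # Each space-separated token is one level; commas group
            # several devices into the same level.
            topology[current].append(tok.split(","))
    return topology


if __name__ == "__main__":
    for node, levels in parse_fencing_topology(SAMPLE).items():
        for number, devices in enumerate(levels, start=1):
            print(f"{node}: level {number}: {', '.join(devices)}")
```

Run against the excerpt, this lists prmStonith1-1 as level 1 and prmStonith1-2 as level 2 for nodea, which matches the order described at the top of this mail.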


That is all.




----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****> 
>Date: 2015/3/17, Tue 09:30
>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
> 
>
>Yamauchi-san,
>
>Good morning, this is Fukuda.
>
>Thank you for the sample and the reference URLs.
>
>Thank you in advance.
>
>That is all.
>
>
>
>2015-03-16 21:48 GMT+09:00 <renay****@ybb*****>:
>
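>>Fukuda-san,
>>
>>Good evening, this is Yamauchi.
>>
>>There appears to be a fencing_topology sample from last year's OSC Tokyo at the following location.
>>
>> * http://linux-ha.sourceforge.jp/wp/wp-content/uploads/osc2014_crm.txt
>>
>>With fencing_topology you can control which nodes are targeted and which stonith agents are executed.
>>
>>-----------------
>>fencing_topology \
>>server01: prmStonith1 \
>>server02: prmStonith2
>>-----------------
>>
>>In this format, each line gives a target node followed by the stonith agent(s) to execute ... [multiple agents are allowed].
>>
>>Upstream information is also available here:
>>
>> * http://clusterlabs.org/wiki/Fencing_topology
>>
>>That is all.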
>>
>>
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>To: "linux****@lists*****" <linux****@lists*****>
>>>Date: 2015/3/16, Mon 19:24
>>>Subject: Re: [Linux-ha-jp] STONITH errors during split-brain
>>>
>>>
>>>Matsushima-san,
>>>
>>>Good evening, this is Fukuda.
>>>Thank you for the prompt reply.
>>>
>>>Here is the output of crm_mon -rfA.
>>>
>>>Last updated: Mon Mar 16 18:26:37 2015
>>>Last change: Mon Mar 16 18:04:31 2015
>>>Stack: heartbeat
>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>Version: 1.1.12-561c4cf
>>>2 Nodes configured
>>>10 Resources configured
>>>
>>>
>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Full list of resources:
>>>
>>> Resource Group: HAvarnish
>>>     vip_208    (ocf::heartbeat:IPaddr2):       Stopped
>>>     varnishd   (lsb:varnish):  Stopped
>>> Resource Group: grpStonith1
>>>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>     Stonith1-2 (stonith:external/xen0):        Stopped
>>>     Stonith1-3 (stonith:meatware):     Stopped
>>> Resource Group: grpStonith2
>>>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>     Stonith2-2 (stonith:external/xen0):        Stopped
>>>     Stonith2-3 (stonith:meatware):     Stopped
>>> Clone Set: clone_ping [ping]
>>>     Stopped: [ lbv1.beta.com lbv2.beta.com ]
>>>
>>>Node Attributes:
>>>* Node lbv1.beta.com:
>>>* Node lbv2.beta.com:
>>>
>>>Migration summary:
>>>* Node lbv2.beta.com:
>>>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Mon Mar 16 18:23:47 2015'
>>>   ping: migration-threshold=1 fail-count=1000000 last-failure='Mon Mar 16 18:23:47 2015'
>>>* Node lbv1.beta.com:
>>>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Mon Mar 16 18:23:48 2015'
>>>   ping: migration-threshold=1 fail-count=1000000 last-failure='Mon Mar 16 18:23:55 2015'
>>>
>>>Failed actions:
>>>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=39, status=Error, last-rc-change='Mon Mar 16 18:23:44 2015', queued=0ms, exec=2014ms
>>>    ping_start_0 on lbv2.beta.com 'unknown error' (1): call=40, status=complete, last-rc-change='Mon Mar 16 18:23:45 2015', queued=0ms, exec=995ms
>>>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=39, status=Error, last-rc-change='Mon Mar 16 18:23:45 2015', queued=0ms, exec=2009ms
>>>    ping_start_0 on lbv1.beta.com 'unknown error' (1): call=41, status=complete, last-rc-change='Mon Mar 16 18:23:54 2015', queued=0ms, exec=182ms
>>>
>>>
>>>There was no standard output or standard error output, so here is the log (/var/log/ha-debug) instead.
>>>
>>>Node 1 side (lbv1)
>>>
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: info: Pacemaker support: yes
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: WARN: File /etc/ha.d//haresources exists.
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: WARN: This file is not used because pacemaker is enabled
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>Mar 16 18:22:47 lbv1.beta.com heartbeat: [1914]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: WARN: Core dumps could be lost if multiple dumps occur.
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: info: **************************
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1914]: info: Configuration validated. Starting heartbeat 3.0.6
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: heartbeat: version 3.0.6
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: Heartbeat generation: 1423534103
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: seed is -1702799346
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: glib: ucast: bound send socket to device: eth1
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: glib: ucast: set SO_REUSEADDR
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: glib: ucast: bound receive socket to device: eth1
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>>>Mar 16 18:22:48 lbv1.beta.com heartbeat: [1957]: info: Local status now set to: 'up'
>>>Mar 16 18:22:53 lbv1.beta.com heartbeat: [1957]: info: Link lbv2.beta.com:eth1 up.
>>>Mar 16 18:22:53 lbv1.beta.com heartbeat: [1957]: info: Status update for node lbv2.beta.com: status up
>>>Mar 16 18:22:53 lbv1.beta.com heartbeat: [1957]: debug: get_delnodelist: delnodelist=
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Comm_now_up(): updating status to active
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Local status now set to: 'active'
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: Status update for node lbv2.beta.com: status active
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2868]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 2868)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2866]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 2866)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2871]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 2871)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2869]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 2869)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2867]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 2867)
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [2870]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 2870)
>>>Mar 16 18:22:54 lbv1.beta.com ccm: [2866]: info: Hostname: lbv1.beta.com
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>Mar 16 18:22:54 lbv1.beta.com heartbeat: [1957]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>Mar 16 18:22:55 lbv1.beta.com heartbeat: [1957]: info: the send queue length from heartbeat to client cib is set to 1024
>>>Mar 16 18:22:58 lbv1.beta.com heartbeat: [1957]: WARN: 1 lost packet(s) for [lbv2.beta.com] [33:35]
>>>Mar 16 18:22:58 lbv1.beta.com heartbeat: [1957]: info: No pkts missing from lbv2.beta.com!
>>>Mar 16 18:22:59 lbv1.beta.com heartbeat: [1957]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>Mar 16 18:22:59 lbv1.beta.com heartbeat: [1957]: WARN: 1 lost packet(s) for [lbv2.beta.com] [40:42]
>>>Mar 16 18:22:59 lbv1.beta.com heartbeat: [1957]: info: No pkts missing from lbv2.beta.com!
>>>ping(ping)[3164]:    2015/03/16_18:23:54 WARNING: Could not update default_ping_set = 100: rc=127
>>>
>>>Node 2 side (lbv2)
>>>
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: info: Pacemaker support: yes
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: File /etc/ha.d//haresources exists.
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: This file is not used because pacemaker is enabled
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: Core dumps could be lost if multiple dumps occur.
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: info: **************************
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1925]: info: Configuration validated. Starting heartbeat 3.0.6
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: heartbeat: version 3.0.6
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: Heartbeat generation: 1423534179
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: seed is 2086609325
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: glib: ucast: bound send socket to device: eth1
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: glib: ucast: set SO_REUSEADDR
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: glib: ucast: bound receive socket to device: eth1
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.132
>>>Mar 16 18:22:47 lbv2.beta.com heartbeat: [1977]: info: Local status now set to: 'up'
>>>Mar 16 18:22:48 lbv2.beta.com heartbeat: [1977]: info: Link lbv1.beta.com:eth1 up.
>>>Mar 16 18:22:48 lbv2.beta.com heartbeat: [1977]: info: Status update for node lbv1.beta.com: status up
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: debug: get_delnodelist: delnodelist=
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Comm_now_up(): updating status to active
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Local status now set to: 'active'
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [1977]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3026]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 3026)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3023]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 3023)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3025]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 3025)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3024]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 3024)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3022]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 3022)
>>>Mar 16 18:22:53 lbv2.beta.com heartbeat: [3027]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 3027)
>>>Mar 16 18:22:54 lbv2.beta.com ccm: [3022]: info: Hostname: lbv2.beta.com
>>>Mar 16 18:22:54 lbv2.beta.com heartbeat: [1977]: info: the send queue length from heartbeat to client ccm is set to 1024
>>>Mar 16 18:22:54 lbv2.beta.com heartbeat: [1977]: info: the send queue length from heartbeat to client attrd is set to 1024
>>>Mar 16 18:22:54 lbv2.beta.com heartbeat: [1977]: info: Status update for node lbv1.beta.com: status active
>>>Mar 16 18:22:54 lbv2.beta.com heartbeat: [1977]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>Mar 16 18:22:54 lbv2.beta.com heartbeat: [1977]: info: the send queue length from heartbeat to client cib is set to 1024
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: quorum plugin: majority
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: total_node_count=2, total_quorum_votes=200
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: quorum plugin: twonodes
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: debug: total_node_count=2, total_quorum_votes=200
>>>Mar 16 18:22:58 lbv2.beta.com ccm: [3022]: info: Break tie for 2 nodes cluster
>>>Mar 16 18:22:58 lbv2.beta.com heartbeat: [1977]: WARN: 1 lost packet(s) for [lbv1.beta.com] [30:32]
>>>Mar 16 18:22:58 lbv2.beta.com heartbeat: [1977]: info: No pkts missing from lbv1.beta.com!
>>>Mar 16 18:22:58 lbv2.beta.com heartbeat: [1977]: info: the send queue length from heartbeat to client crmd is set to 1024
>>>Mar 16 18:22:59 lbv2.beta.com heartbeat: [1977]: WARN: 1 lost packet(s) for [lbv1.beta.com] [35:37]
>>>Mar 16 18:22:59 lbv2.beta.com heartbeat: [1977]: info: No pkts missing from lbv1.beta.com!
>>>Mar 16 18:22:59 lbv2.beta.com ccm: [3022]: debug: quorum plugin: majority
>>>Mar 16 18:22:59 lbv2.beta.com ccm: [3022]: debug: cluster:linux-ha, member_count=2, member_quorum_votes=200
>>>Mar 16 18:22:59 lbv2.beta.com ccm: [3022]: debug: total_node_count=2, total_quorum_votes=200
>>>ping(ping)[3144]:    2015/03/16_18:23:46 WARNING: Could not update default_ping_set = 100: rc=127
>>>
>>>
>>>
>>>Thank you in advance.
>>>
>>>That is all.
>>>
>>>
>>>
>>>
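>>>On March 16, 2015 at 18:53, Takehiro Matsushima <takeh****@gmail*****> wrote:
>>>
>>>>Fukuda-san,
>>>>
>>>>Good evening, this is Matsushima.
>>>>Let me quickly check one point first.
>>>>
>>>>I am also concerned that the start of the ping RA is failing with "unknown error",
>>>>so I would appreciate it if you could quote the logs that look relevant for ping and Stonith Helper,
>>>>including anything each RA wrote to standard output and standard error.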
>>>>
>>>>----
>>>>Takehiro Matsushima
>>>>
>>>>_______________________________________________
>>>>Linux-ha-japan mailing list
>>>>Linux****@lists*****
>>>>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>
>>>>
>>>
>>>
>>>--
>>>
>>>ELF Systems
>>>Masamichi Fukuda
>>>mail to: masamichi_fukud****@elf-s*****
>>>
>>>
>>>
>>
>>
>
>
>-- 
>
>ELF Systems
>Masamichi Fukuda
>mail to: masamichi_fukud****@elf-s*****
>
>



