[Linux-ha-jp] About STONITH errors during split-brain


renay****@ybb*****
Tue, 17 Mar 2015 22:28:59 JST


Fukuda-san,

Good evening, this is Yamauchi.

So nothing seems to have changed...

For now, probably tomorrow, I will try the combination of

Heartbeat 3.0.6
the latest Pacemaker

on RHEL, and check with a similar configuration (the resource will be Dummy, and external/xen0 becomes external/ssh) whether stonith-helper works.

# If we could see the output from stonith-helper's -x option, it would be a bit easier to narrow down the problem...
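One possible lead, though this is only a guess from the quoted logs and not something I have confirmed: the stonithd lines below, "failed to exec "stonith"" followed by "failed:  2", look like an ENOENT from exec, i.e. the stonith CLI from cluster-glue may not be on the PATH that stonithd inherits. A minimal sketch of such a check, where check_cmd is just an illustrative helper:

```shell
#!/bin/sh
# "failed to exec \"stonith\"" with "failed:  2" resembles errno 2
# (ENOENT): the external/* plugins shell out to the `stonith` CLI,
# so it must be visible on the PATH seen by stonithd.
# check_cmd is an illustrative helper, not part of any tool here.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: MISSING from PATH"
    fi
}

check_cmd sh       # present on any POSIX system
check_cmd stonith  # on the cluster nodes this should report "found"
```

If this reports MISSING when run as root on lbv1/lbv2, extending the PATH that stonithd sees (or installing the cluster-glue binaries into a standard location) would be the next thing to try.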


That is all.



----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****> 
>Date: 2015/3/17, Tue 21:24
>Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
> 
>
>Yamauchi-san
>
>Good evening, this is Fukuda.
>Thank you for the information about the latest version.
>
>I installed it right away.
>
>Here is the state after startup.
>
>The failed actions appear to be unchanged.
>
>
>
># crm_mon -rfA
>Last updated: Tue Mar 17 21:03:49 2015
>Last change: Tue Mar 17 20:30:58 2015
>Stack: heartbeat
>Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>Version: 1.1.12-e32080b
>2 Nodes configured
>8 Resources configured
>
>
>Online: [ lbv1.beta.com lbv2.beta.com ]
>
>Full list of resources:
>
> Resource Group: HAvarnish
>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>     varnishd   (lsb:varnish):  Started lbv1.beta.com
> Resource Group: grpStonith1
>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>     Stonith1-2 (stonith:external/xen0):        Stopped
> Resource Group: grpStonith2
>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>     Stonith2-2 (stonith:external/xen0):        Stopped
> Clone Set: clone_ping [ping]
>     Started: [ lbv1.beta.com lbv2.beta.com ]
>
>Node Attributes:
>* Node lbv1.beta.com:
>    + default_ping_set                  : 100
>* Node lbv2.beta.com:
>    + default_ping_set                  : 100
>
>Migration summary:
>* Node lbv1.beta.com: 
>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>* Node lbv2.beta.com: 
>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>
>Failed actions:
>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>
>
>
>
>Here are the logs.
>
>
># less /var/log/ha-debug
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist= 
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109  gid 113 (pid 4250)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109  gid 113 (pid 4246)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109  gid 113 (pid 4249)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109  gid 113 (pid 4245)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0  gid 0 (pid 4248)
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 4247)
>Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>
>
>
># less /var/log/error
>
>Mar 17 21:02:47 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:48 lbv1 attrd[4249]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:02:53 lbv1 stonith-ng[4247]:    error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>
># cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>Mar 17 21:03:24 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>Mar 17 21:03:27 lbv1 crmd[4250]:   notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>Mar 17 21:03:29 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>Mar 17 21:03:34 lbv1 crmd[4250]:   notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:37 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:37 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:   notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>Mar 17 21:03:39 lbv1 stonith-ng[4247]:  warning: log_operation: Stonith2-1:4377 [ failed:  2 ]
>Mar 17 21:03:39 lbv1 crmd[4250]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>Mar 17 21:03:40 lbv1 crmd[4250]:   notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:  warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>Mar 17 21:03:42 lbv1 pengine[4253]:   notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>Mar 17 21:03:47 lbv1 crmd[4250]:   notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>
>Thank you in advance.
>
>Regards
>
>
>
>On Tue, 17 Mar 2015 at 18:31, <renay****@ybb*****> wrote:
>
>Fukuda-san,
>>
>>Good evening, this is Yamauchi.
>>
>>Since it has not been tagged, as of today the latest version is:
>>
>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>
>>
>>You can download it via [Download ZIP] on the right-hand side.
>>
>>That is all.
>>
>>
>>----- Original Message -----
>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>
>>>To: "renay****@ybb*****" <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>Date: 2015/3/17, Tue 18:07
>>>Subject: About STONITH errors during split-brain
>>>
>>>
>>>Yamauchi-san
>>>
>>>
>>>Hello, this is Fukuda.
>>>
>>>
>>>I had a look at this page:
>>>https://github.com/ClusterLabs/pacemaker/tags
>>>
>>>
>>>
>>>pacemaker 1.1.12 561c4cf seems to be the latest there.
>>>Sorry to trouble you, but could you tell me where to find the latest version after that?
>>>
>>>
>>>Thank you in advance.
>>>
>>>
>>>Regards
>>>
>>>
>>>
>>>On Tuesday, 17 Mar 2015, <renay****@ybb*****> wrote:
>>>
>>>Fukuda-san,
>>>>
>>>>Hello, this is Yamauchi.
>>>>
>>>>Yes, that is old.
>>>>
>>>>Pacemaker's support for Heartbeat 3.0.6 is surprisingly recent.
>>>>Please install something newer. (Again, you will need to build it from source...)
>>>>
>>>>
>>>>
>>>>It is available from the upstream GitHub:
>>>> * https://github.com/ClusterLabs/pacemaker
>>>>
>>>>
>>>>In some cases the latest master may produce errors; if that happens, I think it is best to work your way back to older revisions.
>>>>
>>>>That is all.
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>Date: 2015/3/17, Tue 16:06
>>>>>Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>
>>>>>
>>>>>Yamauchi-san
>>>>>
>>>>>Hello, this is Fukuda.
>>>>>
>>>>>In an earlier mail you replied that I should install the latest heartbeat and pacemaker.
>>>>>So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12.
>>>>>
>>>>>heartbeat configuration: Version = "3.0.6"
>>>>>pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>
>>>>>Does this mean pacemaker is still too old?
>>>>>
>>>>>Sorry for the trouble; thank you in advance.
>>>>>
>>>>>Regards
>>>>>
>>>>>
>>>>>
>>>>>On Tue, 17 Mar 2015 at 14:59, <renay****@ybb*****> wrote:
>>>>>
>>>>>Fukuda-san,
>>>>>>
>>>>>>Hello, this is Yamauchi.
>>>>>>
>>>>>>It just occurred to me: I gave the answer below in our earlier exchange. Is everything in order on that point?
>>>>>>
>>>>>>
>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker: OK
>>>>>>>>>>>>
>>>>>>>>>>>> Apparently Heartbeat also needs to be the latest 3.0.6 in this combination.
>>>>>>>>>>>>  * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>
>>>>>>Looking at the version in the crm_mon output below, it appears to be 1.1.12.
>>>>>>Combining with Heartbeat 3.0.6 requires a fairly recent Pacemaker.
>>>>>>
>>>>>>># crm_mon -rfA
>>>>>>>
>>>>>>>Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>Last change: Tue Mar 17 14:01:43 2015
>>>>>>>Stack: heartbeat
>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>Version: 1.1.12-561c4cf
>>>>>>
>>>>>>I think you will need at least everything after the following change:
>>>>>>
>>>>>>https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
>>>>>>
>>>>>>
>>>>>>
>>>>>>That is all.
>>>>>>
>>>>>>
>>>>>>
>>>>>>----- Original Message -----
>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>
>>>>>>>Date: 2015/3/17, Tue 14:38
>>>>>>>Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>
>>>>>>>
>>>>>>>Yamauchi-san
>>>>>>>
>>>>>>>Hello, this is Fukuda.
>>>>>>>
>>>>>>>Is adding -x to stonith-helper's shebang line what you meant?
>>>>>>>I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>
>>>>>>>crm_mon looks unchanged from before.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>># crm_mon -rfA
>>>>>>>
>>>>>>>Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>Last change: Tue Mar 17 14:01:43 2015
>>>>>>>Stack: heartbeat
>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>Version: 1.1.12-561c4cf
>>>>>>>2 Nodes configured
>>>>>>>8 Resources configured
>>>>>>>
>>>>>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>
>>>>>>>Full list of resources:
>>>>>>>
>>>>>>> Resource Group: HAvarnish
>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>> Resource Group: grpStonith1
>>>>>>>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>     Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>> Resource Group: grpStonith2
>>>>>>>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>     Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>
>>>>>>>Node Attributes:
>>>>>>>* Node lbv1.beta.com:
>>>>>>>    + default_ping_set                  : 100
>>>>>>>* Node lbv2.beta.com:
>>>>>>>    + default_ping_set                  : 100
>>>>>>>
>>>>>>>Migration summary:
>>>>>>>* Node lbv2.beta.com:
>>>>>>>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>* Node lbv1.beta.com:
>>>>>>>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>
>>>>>>>Failed actions:
>>>>>>>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>
>>>>>>>I searched for other logs as well.
>>>>>>>
>>>>>>>This is from heartbeat startup.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>># less /var/log/pm_logconv.out
>>>>>>>Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>
>>>>>>>
>>>>>>># less /var/log/error
>>>>>>>Mar 17 14:12:20 lbv1 crmd[13269]:    error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>
>>>>>>>
>>>>>>>This is syslog grepped for stonith:
>>>>>>>
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0  gid 0 (pid 13266)
>>>>>>>Mar 17 14:11:34 lbv1 stonithd[13266]:   notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: setup_cib: Watching for stonith topology changes
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:   notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>Mar 17 14:11:40 lbv1 stonithd[13266]:  warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>Mar 17 14:11:41 lbv1 stonithd[13266]:   notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>Mar 17 14:12:04 lbv1 stonithd[13266]:   notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:   notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>Mar 17 14:12:20 lbv1 stonithd[13266]:  warning: log_operation: Stonith2-1:13386 [ failed:  2 ]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Thank you in advance.
>>>>>>>
>>>>>>>Regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Tue, 17 Mar 2015 at 13:32, <renay****@ybb*****> wrote:
>>>>>>>
>>>>>>>Fukuda-san,
>>>>>>>>
>>>>>>>>Hello, this is Yamauchi.
>>>>>>>>
>>>>>>>>So it seems the problem is in stonith-helper's start operation.
>>>>>>>>
>>>>>>>>If you change the first line of stonith-helper to
>>>>>>>>
>>>>>>>>#!/bin/bash -x
>>>>>>>>
>>>>>>>>
>>>>>>>>and then start the cluster, that may tell us something.
>>>>>>>>
>>>>>>>>Incidentally, I would expect stonith-helper to write a log somewhere as well...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>That is all.
>>>>>>>>
>>>>>>>>----- Original Message -----
>>>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>>>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>>>
>>>>>>>>>Date: 2015/3/17, Tue 12:31
>>>>>>>>>Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Yamauchi-san
>>>>>>>>>cc: Matsushima-san
>>>>>>>>>
>>>>>>>>>Hello, this is Fukuda.
>>>>>>>>>
>>>>>>>>>xen0 was in the same directory:
>>>>>>>>>
>>>>>>>>># pwd
>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>
>>>>>>>>># ls
>>>>>>>>>drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>
>>>>>>>>>Thank you in advance.
>>>>>>>>>
>>>>>>>>>Regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>On 2015-03-17 at 10:53 GMT+09:00, <renay****@ybb*****> wrote:
>>>>>>>>>
>>>>>>>>>Fukuda-san,
>>>>>>>>>>cc: Matsushima-san
>>>>>>>>>>
>>>>>>>>>>Hello, this is Yamauchi.
>>>>>>>>>>
>>>>>>>>>>>There was no standard output or standard error output.
>>>>>>>>>>>
>>>>>>>>>>>Is something wrong with stonith-helper?
>>>>>>>>>>>Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>stonith-helper is located here:
>>>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>
>>>>>>>>>>Is xen0 also in this directory?
>>>>>>>>>>If it is not, that is the problem; please try copying the stonith-helper file, keeping its attributes and so on intact, into the same directory as xen0.
>>>>>>>>>>
>>>>>>>>>>If it works after that, it means there is a problem with the pm_extras installation.
>>>>>>>>>>
>>>>>>>>>>That is all.
>>>>>>>>>>
>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>>>>>To: 山内英生 <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>>>>>
>>>>>>>>>>>Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Yamauchi-san
>>>>>>>>>>>cc: Matsushima-san
>>>>>>>>>>>
>>>>>>>>>>>Good morning, this is Fukuda.
>>>>>>>>>>>Thank you for the crm example.
>>>>>>>>>>>
>>>>>>>>>>>I adapted it to our environment right away.
>>>>>>>>>>>
>>>>>>>>>>>$ cat test.crm
>>>>>>>>>>>### Cluster Option ###
>>>>>>>>>>>property \
>>>>>>>>>>>    no-quorum-policy="ignore" \
>>>>>>>>>>>    stonith-enabled="true" \
>>>>>>>>>>>    startup-fencing="false" \
>>>>>>>>>>>    stonith-timeout="710s" \
>>>>>>>>>>>    crmd-transition-delay="2s"
>>>>>>>>>>>
>>>>>>>>>>>### Resource Default ###
>>>>>>>>>>>rsc_defaults \
>>>>>>>>>>>    resource-stickiness="INFINITY" \
>>>>>>>>>>>    migration-threshold="1"
>>>>>>>>>>>
>>>>>>>>>>>### Group Configuration ###
>>>>>>>>>>>group HAvarnish \
>>>>>>>>>>>    vip_208 \
>>>>>>>>>>>    varnishd
>>>>>>>>>>>
>>>>>>>>>>>group grpStonith1 \
>>>>>>>>>>>    Stonith1-1 \
>>>>>>>>>>>    Stonith1-2
>>>>>>>>>>>
>>>>>>>>>>>group grpStonith2 \
>>>>>>>>>>>    Stonith2-1 \
>>>>>>>>>>>    Stonith2-2
>>>>>>>>>>>
>>>>>>>>>>>### Clone Configuration ###
>>>>>>>>>>>clone clone_ping \
>>>>>>>>>>>    ping
>>>>>>>>>>>
>>>>>>>>>>>### Fencing Topology ###
>>>>>>>>>>>fencing_topology \
>>>>>>>>>>>    lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>    lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>
>>>>>>>>>>>### Primitive Configuration ###
>>>>>>>>>>>primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        ip="192.168.17.208" \
>>>>>>>>>>>        nic="eth0" \
>>>>>>>>>>>        cidr_netmask="24" \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive varnishd lsb:varnish \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>    params \
>>>>>>>>>>>        name="default_ping_set" \
>>>>>>>>>>>        host_list="192.168.17.254" \
>>>>>>>>>>>        multiplier="100" \
>>>>>>>>>>>        dampen="1" \
>>>>>>>>>>>    op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_retries="1" \
>>>>>>>>>>>        pcmk_reboot_timeout="40s" \
>>>>>>>>>>>        hostlist="lbv1.beta.com" \
>>>>>>>>>>>        dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>        run_online_check="yes" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_timeout="60s" \
>>>>>>>>>>>        hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>        dom0="xen0.beta.com" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_retries="1" \
>>>>>>>>>>>        pcmk_reboot_timeout="40s" \
>>>>>>>>>>>        hostlist="lbv2.beta.com" \
>>>>>>>>>>>        dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>        run_online_check="yes" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>    params \
>>>>>>>>>>>        pcmk_reboot_timeout="60s" \
>>>>>>>>>>>        hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>        dom0="xen0.beta.com" \
>>>>>>>>>>>    op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>    op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>
>>>>>>>>>>>### Resource Location ###
>>>>>>>>>>>location HA_location-1 HAvarnish \
>>>>>>>>>>>    rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>    rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-2 HAvarnish \
>>>>>>>>>>>    rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-3 grpStonith1 \
>>>>>>>>>>>    rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>
>>>>>>>>>>>location HA_location-4 grpStonith2 \
>>>>>>>>>>>    rule -INFINITY: #uname eq lbv2.beta.com
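A side note on the standby_check_command lines in the configuration above: crm_resource -r varnishd -W prints which node is currently running the resource, and piping that through grep -q `hostname` exits 0 only on that node, which is how stonith-helper can tell the active node from the standby. A minimal sketch of that decision, where simulate_standby_check and the sample output line are illustrative stand-ins (no real crm_resource call is made):

```shell
#!/bin/sh
# standby_check_command boils down to: does the crm_resource -W output
# mention my own hostname? Exit 0 => this node runs varnishd (active).
# simulate_standby_check is an illustrative stand-in: $1 plays the role
# of the crm_resource output, $2 the local hostname.
simulate_standby_check() {
    printf '%s\n' "$1" | grep -q "$2"
}

sample="resource varnishd is running on: lbv1.beta.com"
if simulate_standby_check "$sample" "lbv1.beta.com"; then
    echo "lbv1: active"        # grep matched -> exit 0
fi
if ! simulate_standby_check "$sample" "lbv2.beta.com"; then
    echo "lbv2: standby"       # grep found nothing -> exit 1
fi
```

This also shows why the check must run as a user who can reach the cluster: if crm_resource itself fails, the pipeline's exit status no longer reflects where the resource is running.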
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>After loading this configuration, the messages differ from yesterday's.
>>>>>>>>>>>The ping messages are gone.
>>>>>>>>>>>
>>>>>>>>>>># crm_mon -rfA
>>>>>>>>>>>Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>Stack: heartbeat
>>>>>>>>>>>Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>Version: 1.1.12-561c4cf
>>>>>>>>>>>2 Nodes configured
>>>>>>>>>>>8 Resources configured
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>
>>>>>>>>>>>Full list of resources:
>>>>>>>>>>>
>>>>>>>>>>> Resource Group: HAvarnish
>>>>>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
>>>>>>>>>>>     varnishd   (lsb:varnish):  Started lbv1.beta.com
>>>>>>>>>>> Resource Group: grpStonith1
>>>>>>>>>>>     Stonith1-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>     Stonith1-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>> Resource Group: grpStonith2
>>>>>>>>>>>     Stonith2-1 (stonith:external/stonith-helper):      Stopped
>>>>>>>>>>>     Stonith2-2 (stonith:external/xen0):        Stopped
>>>>>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>
>>>>>>>>>>>Node Attributes:
>>>>>>>>>>>* Node lbv1.beta.com:
>>>>>>>>>>>    + default_ping_set                  : 100
>>>>>>>>>>>* Node lbv2.beta.com:
>>>>>>>>>>>    + default_ping_set                  : 100
>>>>>>>>>>>
>>>>>>>>>>>Migration summary:
>>>>>>>>>>>* Node lbv2.beta.com:
>>>>>>>>>>>   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>* Node lbv1.beta.com:
>>>>>>>>>>>   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>
>>>>>>>>>>>Failed actions:
>>>>>>>>>>>    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Here is the /var/log/ha-debug log.
>>>>>>>>>>>
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>
>>>>>>>>>>>There was no standard output or standard error output.
>>>>>>>>>>>
>>>>>>>>>>>Is something wrong with stonith-helper?
>>>>>>>>>>>Since stonith-helper is a shell script, I did not pay much attention to how it was installed.
>>>>>>>>>>>stonith-helper is located here:
>>>>>>>>>>>/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Thank you in advance.
>>>>>>>>>>>
>>>>>>>>>>>Regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>On 2015-03-17 at 9:45 GMT+09:00, <renay****@ybb*****> wrote:
>>>>>>>>>>>
>>>>>>>>>>>Fukuda-san,
>>>>>>>>>>>>
>>>>>>>>>>>>Good morning, this is Yamauchi.
>>>>>>>>>>>>
>>>>>>>>>>>>Just in case, here is an excerpt from an example I have at hand that uses multiple stonith resources.
>>>>>>>>>>>>(In practice, be careful about the line breaks.)
>>>>>>>>>>>>
>>>>>>>>>>>>The example below is a configuration for the PM 1.1 series:
>>>>>>>>>>>>on nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
>>>>>>>>>>>>on nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>>>>>>>>>>>>
>>>>>>>>>>>>The stonith plugins themselves are helper and ssh.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>### Group Configuration ###
>>>>>>>>>>>>group grpStonith1 \
>>>>>>>>>>>>prmStonith1-1 \
>>>>>>>>>>>>prmStonith1-2
>>>>>>>>>>>>
>>>>>>>>>>>>group grpStonith2 \
>>>>>>>>>>>>prmStonith2-1 \
>>>>>>>>>>>>prmStonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>>### Fencing Topology ###
>>>>>>>>>>>>fencing_topology \
>>>>>>>>>>>>nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>>>>>>>nodeb: prmStonith2-1 prmStonith2-2
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_retries="1" \
>>>>>>>>>>>>pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>hostlist="nodea" \
>>>>>>>>>>>>dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>>>>>>>standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>run_online_check="yes" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith1-2 stonith:external/ssh \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>hostlist="nodea" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_retries="1" \
>>>>>>>>>>>>pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>hostlist="nodeb" \
>>>>>>>>>>>>dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>run_online_check="yes" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>>primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>params \
>>>>>>>>>>>>pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>hostlist="nodeb" \
>>>>>>>>>>>>op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>(snip)
>>>>>>>>>>>>location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>rule -INFINITY: #uname eq nodeb
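[Editor's note] The standby_check_command in the configuration above works purely by exit status: `crm_resource -r prmRES -W` prints which node is currently running prmRES, and `grep -qi \`hostname\`` succeeds only when that node is the local one. A minimal sketch of the same pattern, with a fixed string standing in for the crm_resource output (the node name is the example one from the configuration; the exact output wording is an assumption):

```shell
# Sketch of the standby_check_command pattern: succeed (exit 0) only when
# the local hostname appears in the "where is the resource running" output.
# `echo` stands in for `crm_resource -r prmRES -W` here.
is_active_node() {
    # $1: simulated crm_resource output; $2: local hostname
    echo "$1" | grep -qi "$2"
}

if is_active_node "resource prmRES is running on: nodea" "nodea"; then
    echo "this node runs prmRES"
fi
```

Because stonith-helper only consults the command's exit status, any check that exits 0 on the active node can be substituted here.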
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>That is all.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>--
>>>>>>>>>>>
>>>>>>>>>>>ELF Systems
>>>>>>>>>>>Masamichi Fukuda
>>>>>>>>>>>mail to: masamichi_fukud****@elf-s*****
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>Linux-ha-japan mailing list
>>>>>>>>>>Linux****@lists*****
>>>>>>>>>>http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan
>>>>>>>>>>



