Commit

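Working repository for patches that make Anthy support input and conversion in historical kana orthography (正かなづかひ, seikana-zukai). Forked from the master branch of <git://git.debian.org/git/collab-maint/anthy.git>.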


Commit MetaInfo

Revision: 33f494486f19f71b87af834157b2c5b2112c09b1 (tree)
Time: 2012-06-27 20:46:48
Author: MORIYAMA Hiroshi <hiroshi@kvd....>
Committer: MORIYAMA Hiroshi

Log Message

Add a new script depgraph/anthy-depgraph-gendai-to-seikana.rb

A Ruby script that adapts Anthy's dependent-word dictionary to historical kana orthography.
It was used for commit a800bf8c71a688c747e72e4fc848b1436ad3306b.
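
As a rough illustration of what the script does to each dictionary string, the sketch below enlarges small kana to their full-size forms and keeps the original spelling alongside the new one. The helper name `seikana_variants` and the two constants are assumptions made for this example and do not appear in the commit. The sketch uses `String#tr` instead of the script's chained `gsub` calls, and it leaves out the script's per-word corrections.

```ruby
# encoding: utf-8

# Small kana and the full-size kana that replace them, paired by position.
# (Hypothetical constants, introduced only for this sketch.)
SMALL_KANA = 'ぁぃぅぇぉっゃゅょゎ'
FULL_KANA  = 'あいうえおつやゆよわ'

# Hypothetical helper: returns the original spelling followed by its
# full-size-kana variant, dropping the variant when nothing changed.
def seikana_variants(cond)
  [cond, cond.tr(SMALL_KANA, FULL_KANA)].uniq
end

p seikana_variants('ましょ')  # => ["ましょ", "ましよ"]
p seikana_variants('ます')    # => ["ます"]
```

Keeping both spellings mirrors the script, which appends each converted string to a copy of the original conditions and removes duplicates afterwards.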

Change Summary

Modification

--- /dev/null
+++ b/depgraph/anthy-depgraph-gendai-to-seikana.rb
@@ -0,0 +1,108 @@
+#! ruby -Eeuc-jp
+# Encoding: EUC-JP
+#
+# Script that adapts Anthy's dependent-word dictionary (depgraph) to historical kana orthography.
+#
+# $Id$
+#
+# Copyright (C) 2012 MORIYAMA Hiroshi
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+## Usage:
+
+# % for f in *.depword *.txt *.table; do \
+#   ./anthy-depgraph-gendai-to-seikana.rb "$f" >"$f".tmp && /bin/mv "$f".tmp "$f"; \
+# done
+
+## Code:
+
+def parse_anthy_depgraph (file_or_string)
+  entries = []
+
+  file_or_string.each_line do |line|
+    items = line.strip.split(/\s/)
+
+    if line[0] == ?# # comment line
+      entries << [line]
+      next
+    else
+      entry = [items.shift, trans_conditions = [], trans_nodes = []]
+    end
+
+    items.each do |item|
+      if item.match(/"/)
+        trans_conditions << item.gsub(/\A"|"\z/, '')
+      else
+        trans_nodes << item
+      end
+    end
+
+    entries << entry
+  end
+
+  entries
+end
+
+if __FILE__ == $PROGRAM_NAME
+  depgraph_entries = parse_anthy_depgraph(ARGF)
+
+  depgraph_entries.each do |ent|
+    if ent.length == 1 # comment line
+      puts ent
+      next
+    end
+
+    head_node = ent[0]
+    trans_conds = ent[1]
+    new_trans_conds = trans_conds.map{|s| s.dup }
+
+    trans_conds.each do |s|
+      s = s.
+        gsub(/ぁ/, 'あ').gsub(/ぃ/, 'い').gsub(/ぅ/, 'う').gsub(/ぇ/, 'え').
+        gsub(/ぉ/, 'お').gsub(/っ/, 'つ').gsub(/ゃ/, 'や').gsub(/ゅ/, 'ゆ').
+        gsub(/ょ/, 'よ').gsub(/ゎ/, 'わ')
+
+      #
+      # Fix conversion mistakes.
+      #
+      s = s.gsub(/じや/, 'ぢや') # じゃ -> ぢや
+      s = s.gsub(/([ちぢ])やう/, '\1やふ') # ちゃう -> ちやふ
+      s = s.gsub(/([ちぢ])やわ/, '\1やは') # きちゃわない -> きちやはない
+      s = s.gsub(/\Aそうだつ\z/, 'さうだつ') # そうだっ-た -> さうだつ-た
+      s = s.gsub(/でしよう/, 'でせう') # でしょ-う -> でせ-う
+      s = s.gsub(/でしよ"/, 'でしょ"') # でしょ。-> でしょ。
+
+      s = 'う' if head_node == '@形容詞語幹' && s == 'ゆう' # 美しう
+      s = 'にょ' if head_node == '@よ' && s == 'によ' # 良いにょ(良いよ)
+
+      if head_node == '@ます'
+        s = 'ましょ' if s == 'ましよ'
+        s = 'ましぇん' if s == 'ましえん'
+      end
+
+      if head_node == '@ます(かもの後)'
+        s = 'ましぇん' if s == 'ましえん'
+      end
+
+      new_trans_conds << s
+    end
+
+    ent[1] = new_trans_conds.uniq.map{|s| '"' + s + '"' }
+    puts ent.flatten.join(' ')
+  end
+end
+
+## anthy-depgraph-gendai-to-seikana.rb ends here.
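
To make the tokenizing step in `parse_anthy_depgraph` easier to follow, here is a minimal sketch of how one line is split into a head node, quoted transition conditions, and the remaining nodes. The helper `split_depgraph_line` is hypothetical, and the sample line is made up for illustration; it is not taken from a real depgraph file and may not match the actual syntax.

```ruby
# encoding: utf-8

# Hypothetical helper mirroring the script's tokenizing step: the first
# whitespace-separated field is the head node, fields containing a double
# quote are transition conditions, and everything else is a node.
def split_depgraph_line(line)
  head, *items = line.strip.split(/\s/)
  conds, nodes = items.partition { |item| item.include?('"') }
  # Strip one leading and one trailing quote, as the script does.
  [head, conds.map { |c| c.gsub(/\A"|"\z/, '') }, nodes]
end

# Made-up sample line, used only to show the three parts.
head, conds, nodes = split_depgraph_line(%(@ます "ましょ" "ません" @終端\n))
p head
p conds
p nodes
```

Because conditions and nodes are collected into separate arrays, the script's final `ent.flatten.join(' ')` writes all conditions before all nodes, whatever order they had on the input line.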