naoa
null+****@clear*****
Sun Oct 26 09:57:29 JST 2014
naoa	2014-10-26 09:57:29 +0900 (Sun, 26 Oct 2014)

  New Revision: a44c8de55b0674d05f99cbf93df359ee59f04cb6
  https://github.com/groonga/groonga/commit/a44c8de55b0674d05f99cbf93df359ee59f04cb6

  Merged 145db6d: Merge pull request #227 from naoa/doc-add-token-filters

  Message:
    doc reference: add token filters

  Added files:
    doc/source/example/reference/token_filters/example-table-create.log
    doc/source/example/reference/token_filters/stem.log
    doc/source/example/reference/token_filters/stop_word.log
    doc/source/reference/token_filters.rst
  Modified files:
    doc/locale/en/LC_MESSAGES/reference.po
    doc/locale/ja/LC_MESSAGES/reference.po
    doc/source/reference.rst

  Modified: doc/locale/en/LC_MESSAGES/reference.po (+70 -0)
===================================================================
--- doc/locale/en/LC_MESSAGES/reference.po    2014-10-26 08:57:04 +0900 (453e3e9)
+++ doc/locale/en/LC_MESSAGES/reference.po    2014-10-26 09:57:29 +0900 (1593b77)
@@ -14559,6 +14559,76 @@ msgstr ""
 "database (sharding) or reduce each key size to handle 4GiB or more larger "
 "total key size."
 
+msgid "Token filters"
+msgstr "Token filters"
+
+msgid "Groonga has token filter module that some processes tokenized token."
+msgstr "Groonga has token filter modules that process tokenized tokens."
+
+msgid "Token filter module can be added as a plugin."
+msgstr "Token filter modules can be added as plugins."
+
+msgid ""
+"You can customize tokenized token by registering your token filters plugins "
+"to Groonga."
+msgstr ""
+"You can customize tokenized tokens by registering your token filter plugins "
+"to Groonga."
+
+msgid ""
+"A token filter module is attached to a table. A token filter can have zero "
+"or N token filter module. You can attach a normalizer module to a table "
+"``token_filters`` option in :doc:`/reference/commands/table_create`."
+msgstr ""
+"Token filter modules are attached to a table. A table can have zero or more "
+"token filter modules. You can attach token filter modules to a table with "
+"the ``token_filters`` option in :doc:`/reference/commands/table_create`."
+
+msgid ""
+"Here is an example ``table_create`` that uses ``TokenFilterStopWord`` token "
+"filter module:"
+msgstr ""
+"Here is an example ``table_create`` that uses the ``TokenFilterStopWord`` "
+"token filter module:"
+
+msgid "Available token filters"
+msgstr "Available token filters"
+
+msgid "Here are the list of available token filters:"
+msgstr "Here is the list of available token filters:"
+
+msgid "``TokenFilterStopWord``"
+msgstr ""
+
+msgid "``TokenFilterStem``"
+msgstr ""
+
+msgid ""
+"``TokenFilterStopWord`` removes stopword from tokenized token in searching "
+"the documents."
+msgstr ""
+"``TokenFilterStopWord`` removes stop words from tokenized tokens when "
+"searching documents."
+
+msgid ""
+"``TokenFilterStopWord`` can specify stopword after adding the documents, "
+"because It removes token in searching the documents."
+msgstr ""
+"With ``TokenFilterStopWord``, you can specify stop words after adding "
+"documents, because it removes tokens at search time."
+
+msgid "The stopword is specified ``stopword`` column on lexicon table."
+msgstr ""
+"A stop word is specified via the ``is_stop_word`` column on the lexicon "
+"table."
+
+msgid "Here is an example that uses ``TokenFilterStopWord`` token filter:"
+msgstr "Here is an example that uses the ``TokenFilterStopWord`` token filter:"
+
+msgid "``TokenFilterStem`` stemming tokenized token."
+msgstr "``TokenFilterStem`` stems tokenized tokens."
+
+msgid "Here is an example that uses ``TokenFilterStem`` token filter:"
+msgstr "Here is an example that uses ``TokenFilterStem`` token filter:"
+
 msgid "Tokenizers"
 msgstr "Tokenizers"

  Modified: doc/locale/ja/LC_MESSAGES/reference.po (+74 -0)
===================================================================
--- doc/locale/ja/LC_MESSAGES/reference.po    2014-10-26 08:57:04 +0900 (3b7521d)
+++ doc/locale/ja/LC_MESSAGES/reference.po    2014-10-26 09:57:29 +0900 (4650225)
@@ -13797,6 +13797,80 @@ msgstr ""
 "テーブルを分割したり、データベースを分割したり(シャーディング)、それぞれの"
 "キーのサイズを減らしてください。"
 
+msgid "Token filters"
+msgstr "Token filters"
+
+msgid "Groonga has token filter module that some processes tokenized token."
+msgstr ""
+"Groongaにはトークナイズされたトークンに所定の処理を行うトークンフィルター"
+"モジュールがあります。"
+
+msgid "Token filter module can be added as a plugin."
+msgstr "トークンフィルターモジュールはプラグインとして追加できます。"
+
+msgid ""
+"You can customize tokenized token by registering your token filters plugins "
+"to Groonga."
+msgstr ""
+"トークンフィルタープラグインをGroongaに追加することでトークナイズされたトークン"
+"をカスタマイズできます。"
+
+msgid ""
+"A token filter module is attached to a table. A token filter can have zero "
+"or N token filter module. You can attach a normalizer module to a table "
+"``token_filters`` option in :doc:`/reference/commands/table_create`."
+msgstr ""
+"トークンフィルターモジュールはテーブルに関連付いています。テーブルは0個かN個"
+"のトークンフィルターモジュールを持つことができます。"
+":doc:`/reference/commands/table_create` の ``token_filters`` オプションで"
+"テーブルにトークンフィルターオプションを関連付けることができます。"
+
+msgid ""
+"Here is an example ``table_create`` that uses ``TokenFilterStopWord`` token "
+"filter module:"
+msgstr ""
+"以下は ``TokenFilterStopWord`` トークンフィルターモジュールを使う"
+" ``table_create`` の例です。"
+
+msgid "Available token filters"
+msgstr "利用可能なトークンフィルター"
+
+msgid "Here are the list of available token filters:"
+msgstr "以下は組み込みのトークンフィルターのリストです。"
+
+msgid "``TokenFilterStopWord``"
+msgstr ""
+
+msgid "``TokenFilterStem``"
+msgstr ""
+
+msgid ""
+"``TokenFilterStopWord`` removes stopword from tokenized token in searching "
+"the documents."
+msgstr ""
+"``TokenFilterStopWord`` は、文書を検索する時にトークナイズされたトークンから"
+"ストップワードを除去します。"
+
+msgid ""
+"``TokenFilterStopWord`` can specify stopword after adding the documents, "
+"because It removes token in searching the documents."
+msgstr ""
+"``TokenFilterStopWord`` は、文書を検索する時のみトークン除去するため、文書"
+"を追加した後でストップワードを指定することもできます。"
+
+msgid "The stopword is specified ``stopword`` column on lexicon table."
+msgstr ""
+"ストップワードは、語彙表の ``stopword`` カラムで指定します。"
+
+msgid "Here is an example that uses ``TokenFilterStopWord`` token filter:"
+msgstr "以下は ``TokenFilterStopWord`` トークンフィルターを使う例です。"
+
+msgid "``TokenFilterStem`` stemming tokenized token."
+msgstr "``TokenFilterStem`` は、トークナイズされたトークンをステミングします。"
+
+msgid "Here is an example that uses ``TokenFilterStem`` token filter:"
+msgstr "以下は ``TokenFilterStem`` トークンフィルターを使う例です。"
+
 msgid "Tokenizers"
 msgstr ""

  Added: doc/source/example/reference/token_filters/example-table-create.log (+7 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/example-table-create.log    2014-10-26 09:57:29 +0900 (ac93153)
@@ -0,0 +1,7 @@
+Execution example::
+
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStopWord
+  # [[0,0.0,0.0],true]

  Added: doc/source/example/reference/token_filters/stem.log (+59 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/stem.log    2014-10-26 09:57:29 +0900 (790bf4b)
@@ -0,0 +1,59 @@
+Execution example::
+
+  register token_filters/stem
+  # [[0,0.0,0.0],true]
+  table_create Memos TABLE_NO_KEY
+  # [[0,0.0,0.0],true]
+  column_create Memos content COLUMN_SCALAR ShortText
+  # [[0,0.0,0.0],true]
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStem
+  # [[0,0.0,0.0],true]
+  column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+  # [[0,0.0,0.0],true]
+  load --table Memos
+  [
+  {"content": "I develop Groonga"},
+  {"content": "I'm developing Groonga"},
+  {"content": "I developed Groonga"}
+  ]
+  # [[0,0.0,0.0],3]
+  select Memos --match_columns content --query "develops"
+  # [
+  #   [
+  #     0,
+  #     0.0,
+  #     0.0
+  #   ],
+  #   [
+  #     [
+  #       [
+  #         3
+  #       ],
+  #       [
+  #         [
+  #           "_id",
+  #           "UInt32"
+  #         ],
+  #         [
+  #           "content",
+  #           "ShortText"
+  #         ]
+  #       ],
+  #       [
+  #         1,
+  #         "I develop Groonga"
+  #       ],
+  #       [
+  #         2,
+  #         "I'm developing Groonga"
+  #       ],
+  #       [
+  #         3,
+  #         "I developed Groonga"
+  #       ]
+  #     ]
+  #   ]
+  # ]
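[Editorial aside: the stem.log example just above returns all three records for the query "develops" because both the indexed tokens and the query tokens are reduced to a common stem before comparison. The following is a minimal conceptual sketch of that mechanism in plain Python; the naive suffix stripper and whitespace tokenizer are stand-ins and are not how TokenFilterStem or TokenBigram actually work.]

```python
# Toy sketch of search-time + index-time stemming, as illustrated by
# TokenFilterStem in stem.log. NOT Groonga's implementation.

def naive_stem(token):
    # Crude suffix stripping; a real token filter would use a proper
    # stemming library. Shown only to make the mechanism visible.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def stemmed_tokens(text):
    # Stand-in tokenizer: lowercase, split on whitespace, then stem.
    return {naive_stem(t) for t in text.lower().split()}

documents = [
    "I develop Groonga",
    "I'm developing Groonga",
    "I developed Groonga",
]

# "develops" stems to "develop", and so do "developing" and "developed",
# which is why all three memos match.
query_stems = stemmed_tokens("develops")
hits = [d for d in documents if query_stems <= stemmed_tokens(d)]
print(len(hits))  # prints 3, mirroring the 3 hits in stem.log
```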
 Added: doc/source/example/reference/token_filters/stop_word.log (+62 -0) 100644
===================================================================
--- /dev/null
+++ doc/source/example/reference/token_filters/stop_word.log    2014-10-26 09:57:29 +0900 (7084ec2)
@@ -0,0 +1,62 @@
+Execution example::
+
+  register token_filters/stop_word
+  # [[0,0.0,0.0],true]
+  table_create Memos TABLE_NO_KEY
+  # [[0,0.0,0.0],true]
+  column_create Memos content COLUMN_SCALAR ShortText
+  # [[0,0.0,0.0],true]
+  table_create Terms TABLE_PAT_KEY ShortText \
+    --default_tokenizer TokenBigram \
+    --normalizer NormalizerAuto \
+    --token_filters TokenFilterStopWord
+  # [[0,0.0,0.0],true]
+  column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+  # [[0,0.0,0.0],true]
+  column_create Terms is_stop_word COLUMN_SCALAR Bool
+  # [[0,0.0,0.0],true]
+  load --table Terms
+  [
+  {"_key": "and", "is_stop_word": true}
+  ]
+  # [[0,0.0,0.0],1]
+  load --table Memos
+  [
+  {"content": "Hello"},
+  {"content": "Hello and Good-bye"},
+  {"content": "Good-bye"}
+  ]
+  # [[0,0.0,0.0],3]
+  select Memos --match_columns content --query "Hello and"
+  # [
+  #   [
+  #     0,
+  #     0.0,
+  #     0.0
+  #   ],
+  #   [
+  #     [
+  #       [
+  #         2
+  #       ],
+  #       [
+  #         [
+  #           "_id",
+  #           "UInt32"
+  #         ],
+  #         [
+  #           "content",
+  #           "ShortText"
+  #         ]
+  #       ],
+  #       [
+  #         1,
+  #         "Hello"
+  #       ],
+  #       [
+  #         2,
+  #         "Hello and Good-bye"
+  #       ]
+  #     ]
+  #   ]
+  # ]

  Modified: doc/source/reference.rst (+1 -0)
===================================================================
--- doc/source/reference.rst    2014-10-26 08:57:04 +0900 (80a5e87)
+++ doc/source/reference.rst    2014-10-26 09:57:29 +0900 (2055254)
@@ -17,6 +17,7 @@ Reference manual
    reference/column
    reference/normalizers
    reference/tokenizers
+   reference/token_filters
    reference/query_expanders
    reference/pseudo_column
    reference/grn_expr

  Added: doc/source/reference/token_filters.rst (+97 -0) 100644
===================================================================
--- /dev/null
+++ 
doc/source/reference/token_filters.rst    2014-10-26 09:57:29 +0900 (bbe9c02)
@@ -0,0 +1,97 @@
+.. -*- rst -*-
+
+.. highlightlang:: none
+
+.. groonga-command
+.. database: token_filters
+
+Token filters
+=============
+
+Summary
+-------
+
+Groonga has token filter modules that process tokenized tokens.
+
+Token filter modules can be added as plugins.
+
+You can customize tokenized tokens by registering your token filter
+plugins to Groonga.
+
+Token filter modules are attached to a table. A table can have zero or
+more token filter modules. You can attach token filter modules to a table
+with the ``token_filters`` option in :doc:`/reference/commands/table_create`.
+
+Here is an example ``table_create`` that uses the ``TokenFilterStopWord``
+token filter module:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/example-table-create.log
+.. table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto --token_filters TokenFilterStopWord
+
+Available token filters
+-----------------------
+
+Here is the list of available token filters:
+
+* ``TokenFilterStopWord``
+* ``TokenFilterStem``
+
+``TokenFilterStopWord``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``TokenFilterStopWord`` removes stop words from tokenized tokens
+when searching documents.
+
+With ``TokenFilterStopWord``, you can specify stop words after adding
+documents, because it removes tokens at search time.
+
+A stop word is specified via the ``is_stop_word`` column on the lexicon
+table.
+
+Here is an example that uses the ``TokenFilterStopWord`` token filter:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/stop_word.log
+.. register token_filters/stop_word
+.. table_create Memos TABLE_NO_KEY
+.. column_create Memos content COLUMN_SCALAR ShortText
+.. table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto --token_filters TokenFilterStopWord
+.. column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+.. column_create Terms is_stop_word COLUMN_SCALAR Bool
+.. load --table Terms
+.. [
+.. {"_key": "and", "is_stop_word": true}
+.. ]
+.. load --table Memos
+.. [
+.. {"content": "Hello"},
+.. {"content": "Hello and Good-bye"},
+.. {"content": "Good-bye"}
+.. ]
+.. select Memos --match_columns content --query "Hello and"
+
+``TokenFilterStem``
+^^^^^^^^^^^^^^^^^^^
+
+``TokenFilterStem`` stems tokenized tokens.
+
+Here is an example that uses the ``TokenFilterStem`` token filter:
+
+.. groonga-command
+.. include:: ../example/reference/token_filters/stem.log
+.. register token_filters/stem
+.. table_create Memos TABLE_NO_KEY
+.. column_create Memos content COLUMN_SCALAR ShortText
+.. table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto --token_filters TokenFilterStem
+.. column_create Terms memos_content COLUMN_INDEX|WITH_POSITION Memos content
+.. load --table Memos
+.. [
+.. {"content": "I develop Groonga"},
+.. {"content": "I'm developing Groonga"},
+.. {"content": "I developed Groonga"}
+.. ]
+.. select Memos --match_columns content --query "develops"
+
+See also
+--------
+
+* :doc:`/reference/commands/table_create`
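[Editorial aside: the stop_word.log example returns only 2 records for the query "Hello and" because the stop word "and" is removed from the query's token stream at search time, while the indexed documents are untouched. A minimal conceptual sketch of that behavior in plain Python follows; the whitespace tokenizer and linear scan are hypothetical stand-ins, not TokenBigram or Groonga's index search.]

```python
# Toy sketch of search-time stop-word removal, as illustrated by
# TokenFilterStopWord in stop_word.log. NOT Groonga's implementation.

def tokenize(text):
    # Stand-in for TokenBigram + NormalizerAuto: lowercase word tokens.
    return text.lower().split()

def filter_stop_words(tokens, stop_words):
    # Applied only to the query, mirroring search-time token removal;
    # this is why stop words can be registered after documents exist.
    return [t for t in tokens if t not in stop_words]

def search(documents, query, stop_words):
    query_tokens = filter_stop_words(tokenize(query), stop_words)
    return [doc for doc in documents
            if all(t in tokenize(doc) for t in query_tokens)]

documents = ["Hello", "Hello and Good-bye", "Good-bye"]
stop_words = {"and"}

# "Hello and" degrades to just "hello", so both records containing
# "Hello" match, mirroring the 2 hits in stop_word.log.
print(search(documents, "Hello and", stop_words))
```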