Grouping and Paging Search Results with Lucene in Java

Author: Eve Cole  Updated: 2025-06-07 20:32:01

Grouping search results with GroupingSearch
About the org.apache.lucene.search.grouping package

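This module groups Lucene search results: hits that share the same value in a specified single-valued field are collected together. For example, if you group by the "author" field, all documents with the same "author" value fall into one group.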

Grouping requires the following inputs:

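1. groupField: the field to group by. For example, if you group by the "author" field, every book in a group has the same author. Documents that lack this field are placed together in a separate group.

2. groupSort: how the groups themselves are sorted.

3. topNGroups: how many groups to keep. For example, 10 means only the top 10 groups are kept.

4. groupOffset: which slice of the top groups to retrieve, i.e. how many leading groups to skip. For example, 3 means 7 groups are returned (assuming topNGroups is 10). This is very useful for pagination, for instance when only 5 groups are shown per page.

5. withinGroupSort: how the documents inside each group are sorted. Note that this can differ from groupSort.

6. withinGroupOffset: which slice of the top documents to retrieve inside each group, i.e. how many leading documents to skip.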
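The way these inputs interact can be sketched without Lucene at all. The following is a minimal plain-Java illustration, not Lucene's implementation: `GroupingSketch`, `Hit`, and `group` are hypothetical names, sorting is assumed to be by descending score for both groupSort and withinGroupSort, and each group is assumed to be ranked by its best hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of the grouping inputs; this is not Lucene code. */
final class GroupingSketch {

    /** A hypothetical scored hit: document id, group field value, relevance score. */
    static final class Hit {
        final int id;
        final String author;
        final double score;

        Hit(int id, String author, double score) {
            this.id = id;
            this.author = author;
            this.score = score;
        }
    }

    /**
     * Groups hits by author, ranks groups by their best score (groupSort),
     * keeps the first topNGroups, then skips the first groupOffset of those.
     * Inside each group, hits are ordered by score (withinGroupSort) and the
     * first withinGroupOffset hits are skipped.
     */
    static Map<String, List<Hit>> group(List<Hit> hits, int topNGroups,
                                        int groupOffset, int withinGroupOffset) {
        // Collect hits per group field value, preserving first-seen order.
        Map<String, List<Hit>> byAuthor = new LinkedHashMap<>();
        for (Hit h : hits) {
            byAuthor.computeIfAbsent(h.author, k -> new ArrayList<>()).add(h);
        }

        // withinGroupSort: best score first inside every group.
        Comparator<Hit> byScoreDesc = (a, b) -> Double.compare(b.score, a.score);
        for (List<Hit> docs : byAuthor.values()) {
            docs.sort(byScoreDesc);
        }

        // groupSort: each group is represented by its best hit.
        List<Map.Entry<String, List<Hit>>> groups = new ArrayList<>(byAuthor.entrySet());
        groups.sort((a, b) -> byScoreDesc.compare(a.getValue().get(0), b.getValue().get(0)));

        // Keep groups in the range [groupOffset, topNGroups).
        Map<String, List<Hit>> page = new LinkedHashMap<>();
        int end = Math.min(topNGroups, groups.size());
        for (int i = Math.max(0, groupOffset); i < end; i++) {
            List<Hit> docs = groups.get(i).getValue();
            int from = Math.min(Math.max(0, withinGroupOffset), docs.size());
            page.put(groups.get(i).getKey(), new ArrayList<>(docs.subList(from, docs.size())));
        }
        return page;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(1, "a", 0.5), new Hit(2, "b", 0.9),
                new Hit(3, "a", 0.7), new Hit(4, "c", 0.1));
        // Keep the top 2 groups but skip the first one.
        System.out.println(group(hits, 2, 1, 0).keySet());
    }
}
```

With topNGroups = 10 and groupOffset = 3, the loop above visits at most groups 3 through 9, which is where the "7 groups" in item 4 comes from.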

The GroupingSearch class makes grouping search results straightforward.

From the GroupingSearch API documentation:

Convenience class to perform grouping in a non-distributed environment.

Warning: this API is experimental and may change in incompatible ways in the next release.

Version 4.3.1 is used here.

Some important methods:

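GroupingSearch.setCaching(int maxDocsToCache, boolean cacheScores): enables caching, with the cache limited to a maximum number of documents
GroupingSearch.setCachingInMB(double maxCacheRAMMB, boolean cacheScores): caches the results of the first-pass search for use by the second-pass search, with the cache limited to a size in MB
GroupingSearch.setGroupDocsLimit(int groupDocsLimit): specifies how many documents each group returns. If not specified, only one document per group is returned by default.
GroupingSearch.setGroupSort(Sort groupSort): specifies how groups are sorted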

Sample code:

1. First, the indexing code:

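import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.IntField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class IndexHelper {
    private Document document;
    private Directory directory;
    private IndexWriter indexWriter;

    // Lazily create one in-memory directory shared by the writer and the searcher
    public Directory getDirectory() {
        directory = (directory == null) ? new RAMDirectory() : directory;
        return directory;
    }

    private IndexWriterConfig getConfig() {
        return new IndexWriterConfig(Version.LUCENE_43, new IKAnalyzer(true));
    }

    private IndexWriter getIndexWriter() {
        try {
            return new IndexWriter(getDirectory(), getConfig());
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    public IndexSearcher getIndexSearcher() throws IOException {
        return new IndexSearcher(DirectoryReader.open(getDirectory()));
    }

    /**
     * Creates an index entry for the grouping test.
     * @param id      document id
     * @param author  author name, the field used for grouping
     * @param content text to be searched
     */
    public void createIndexForGroup(int id, String author, String content) {
        indexWriter = getIndexWriter();
        document = new Document();
        document.add(new IntField("id", id, Field.Store.YES));
        // StringField is indexed as a single token, so it can serve as the group field
        document.add(new StringField("author", author, Field.Store.YES));
        document.add(new TextField("content", content, Field.Store.YES));
        try {
            indexWriter.addDocument(document);
            indexWriter.commit();
            indexWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}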

2. Grouping:

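import java.io.IOException;

import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.grouping.GroupDocs;
import org.apache.lucene.search.grouping.GroupingSearch;
import org.apache.lucene.search.grouping.TopGroups;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class GroupTest {

    public void group(IndexSearcher indexSearcher, String groupField, String content)
            throws IOException, ParseException {
        GroupingSearch groupingSearch = new GroupingSearch(groupField);
        // Sort groups by relevance score
        groupingSearch.setGroupSort(new Sort(SortField.FIELD_SCORE));
        groupingSearch.setFillSortFields(true);
        groupingSearch.setCachingInMB(4.0, true);
        groupingSearch.setAllGroups(true);
        //groupingSearch.setAllGroupHeads(true);
        // Return up to 10 documents per group
        groupingSearch.setGroupDocsLimit(10);

        QueryParser parser = new QueryParser(Version.LUCENE_43, "content", new IKAnalyzer(true));
        Query query = parser.parse(content);
        // groupOffset = 0, groupLimit = 1000
        TopGroups<BytesRef> result = groupingSearch.search(indexSearcher, query, 0, 1000);

        System.out.println("Search hits: " + result.totalHitCount);
        System.out.println("Groups in result: " + result.groups.length);

        for (GroupDocs<BytesRef> groupDocs : result.groups) {
            System.out.println("Group: " + groupDocs.groupValue.utf8ToString());
            System.out.println("Records in group: " + groupDocs.totalHits);
            //System.out.println("groupDocs.scoreDocs.length: " + groupDocs.scoreDocs.length);
            for (ScoreDoc scoreDoc : groupDocs.scoreDocs) {
                System.out.println(indexSearcher.doc(scoreDoc.doc));
            }
        }
    }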

3. A simple test:

    public static void main(String[] args) throws IOException, ParseException {
        IndexHelper indexHelper = new IndexHelper();
        indexHelper.createIndexForGroup(1, "SweetPotato", "Open Source China");
        indexHelper.createIndexForGroup(2, "SweetPotato", "Open Source Community");
        indexHelper.createIndexForGroup(3, "SweetPotato", "code design");
        indexHelper.createIndexForGroup(4, "SweetPotato", "Design");
        indexHelper.createIndexForGroup(5, "Jiexian", "Lucene Development");
        indexHelper.createIndexForGroup(6, "Jiexian", "Lucene Practical Combat");
        indexHelper.createIndexForGroup(7, "Jiexian", "Open Source Lucene");
        indexHelper.createIndexForGroup(8, "Jiexian", "open solr");
        indexHelper.createIndexForGroup(9, "Sanxian", "Sanxian Open Source Lucene");
        indexHelper.createIndexForGroup(10, "Sanxian", "Sanxian Open Source Solr");
        indexHelper.createIndexForGroup(11, "Sanxian", "Open Source");

        GroupTest groupTest = new GroupTest();
        // Group by the "author" field and search the content for "open source"
        groupTest.group(indexHelper.getIndexSearcher(), "author", "open source");
    }
}

4. Test results:

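Two ways to page results
Lucene offers two ways to page through search results.

1. Page directly over the search results. This approach is suitable when the data volume is relatively small. The core of the paging code looks like this: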

ScoreDoc[] sd = XXX;
// start position of the records for this page
int begin = pageSize * (currentPage - 1);
// end position of the records for this page (exclusive)
int end = Math.min(begin + pageSize, sd.length);
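The index arithmetic above can be wrapped in a small helper. This is a minimal sketch under stated assumptions: `PageSlice` and `page` are hypothetical names, plain `int` document numbers stand in for `ScoreDoc` objects, and pages are 1-based. Unlike the fragment above, it also clamps `begin`, so a page past the end yields an empty result instead of an invalid range.

```java
import java.util.Arrays;

/** Offset-based paging over an already collected result array. */
final class PageSlice {

    /**
     * Returns the entries of the 1-based page currentPage,
     * or an empty array when that page lies past the end of hits.
     */
    static int[] page(int[] hits, int pageSize, int currentPage) {
        if (pageSize <= 0 || currentPage <= 0) {
            throw new IllegalArgumentException("pageSize and currentPage must be positive");
        }
        // Use long arithmetic so large page numbers cannot overflow int.
        int begin = (int) Math.min((long) pageSize * (currentPage - 1), hits.length);
        int end = (int) Math.min((long) begin + pageSize, hits.length);
        return Arrays.copyOfRange(hits, begin, end);
    }

    public static void main(String[] args) {
        int[] hits = {10, 11, 12, 13, 14};
        System.out.println(Arrays.toString(page(hits, 2, 3)));
    }
}
```

Because every page is cut from the full result array, the search still has to collect all hits up to the requested page, which is why this approach fits small result sets best.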

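2. Use searchAfter(...)

Lucene provides five overloads of this method; use whichever fits your needs. The main parameters are:

ScoreDoc after: the last ScoreDoc of the previous search result, i.e. the element at index (number of ScoreDocs minus 1)

Query query: the query to run

int n: the number of results returned per query, i.e. the number of results per page

A simple usage example: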

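// A Map can be used to hold the search results you need
Map<String, Object> resultMap = new HashMap<String, Object>();
// null on the first page; on later pages, the "after" value saved by the previous call
ScoreDoc after = null;
Query query = XX;
TopDocs td = search.searchAfter(after, query, size);
// record the number of hits
resultMap.put("num", td.totalHits);
ScoreDoc[] sd = td.scoreDocs;
for (ScoreDoc scoreDoc : sd) {
    // usual search-result processing
}
if (sd.length > 0) {
    // the last ScoreDoc of this page: index = number of ScoreDocs minus 1
    after = sd[sd.length - 1];
}
// save after for the next search, i.e. the next page starts after it
resultMap.put("after", after);
return resultMap;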
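To make the role of the `after` parameter concrete, here is a small plain-Java simulation of cursor-style paging. It does not call Lucene: `SearchAfterSketch`, `Hit`, `searchAfter`, and `allPages` are hypothetical names, and the ordering (descending score, ties broken by ascending document number) is an assumption chosen to resemble relevance sorting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Plain-Java simulation of "search after" paging; this is not Lucene code. */
final class SearchAfterSketch {

    /** Hypothetical stand-in for a scored document. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Assumed result order: higher score first, ties broken by lower doc number. */
    static final Comparator<Hit> ORDER = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
    };

    /** Returns up to n hits that sort strictly after the cursor; null means first page. */
    static List<Hit> searchAfter(Hit after, List<Hit> allHits, int n) {
        List<Hit> sorted = new ArrayList<>(allHits);
        sorted.sort(ORDER);
        List<Hit> page = new ArrayList<>();
        for (Hit h : sorted) {
            if (page.size() >= n) {
                break;
            }
            if (after != null && ORDER.compare(h, after) <= 0) {
                continue; // at or before the cursor: already shown on an earlier page
            }
            page.add(h);
        }
        return page;
    }

    /** Walks every page by feeding the last hit of each page into the next call. */
    static List<List<Hit>> allPages(List<Hit> hits, int n) {
        List<List<Hit>> pages = new ArrayList<>();
        Hit after = null;
        while (true) {
            List<Hit> page = searchAfter(after, hits, n);
            if (page.isEmpty()) {
                break;
            }
            pages.add(page);
            after = page.get(page.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(0, 1.0f), new Hit(1, 3.0f),
                new Hit(2, 2.0f), new Hit(3, 2.0f), new Hit(4, 0.5f));
        for (List<Hit> page : allPages(hits, 2)) {
            StringBuilder line = new StringBuilder();
            for (Hit h : page) {
                line.append(h.doc).append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

The point of the cursor is that each call only needs the last hit of the previous page, not the page number, so the caller has to carry that one value between requests, just as the example above stores `after` in the result map.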