MHTextSearch

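A fast, minimal embedded full-text indexing library written in Objective-C, built on top of Objective-LevelDB.

Installation

By far the easiest way to integrate this library into your project is with CocoaPods.

Install CocoaPods, if you have not already
In your Podfile, add the line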
```
 pod 'MHTextSearch'
```
Run pod install
Add libc++.dylib to your project's linked libraries.

Simple API

Creating an embedded text index

MHTextIndex *index = [MHTextIndex textIndexInLibraryWithName:@"my.awesome.index"];

Indexing any object

You can tell an MHTextIndex instance to index your objects (any object):

[index indexObject:anyObjectYouWant];
[index updateIndexForObject:anotherPreviousIndexedObject];
[index removeIndexForObject:anotherPreviousIndexedObject];

For this to work, however, you need to tell the index which NSData * identifier uniquely refers to each object:

[index setIdentifier:^NSData *(MyCustomObject *object){
    return object.indexID; // an NSData instance
}];

You also need to provide details about each object, such as which pieces of text to index:

[index setIndexer:^MHIndexedObject *(MyCustomObject *object, NSData *identifier){
    MHIndexedObject *indx = [MHIndexedObject new];
    indx.strings = @[ object.title, object.description ]; // Indexed strings
    indx.weight = object.awesomenessLevel;                // Weight given to this object when sorting results
    indx.context = @{ @"title": object.title };           // An NSDictionary that will be given alongside search results
    return indx;
}];

Finally, if you want easy access to the original object when you get search results, you can tell the index how to fetch it:

[index setObjectGetter:^MyCustomObject *(NSData *identifier){
    return [MyCustomObject customObjectFromIdentifier:identifier];
}];

That's it! That is all you need to get a full-text index going. As you would expect, MHTextSearch splits the text into words and breaks them down into tokens, taking diacritics and capitalization into account (well, Foundation does most of the work here).

You can then start searching:

[index enumerateObjectsForKeyword:@"duck" options:0 withBlock:^(MHSearchResultItem *item,
                                                                NSUInteger rank,
                                                                NSUInteger count,
                                                                BOOL *stop){

    item.weight;      // As provided by you earlier
    item.rank;        // The effective rank in the search result
    item.object;      // The first time it is used, it will use the block
                      // you provided earlier to get the object
    item.context;     // The dictionary you provided in the "indexer" block
    item.identifier;  // The object identifier you provided in the "identifier" block

    NSIndexPath *token = item.resultTokens[0];
    /* resultTokens is an NSArray of NSIndexPath instances, each containing 3 indices:
     *   - mh_string : the string in which the token occurred
     *                 (here, 0 for the object's title)
     *   - mh_word : the position in the string where the word containing
     *               the token occurred
     *   - mh_token : the position in the word where the token occurred
     */

    NSRange tokenRange = [item rangeOfToken:token];
    /* This gives the exact range of the matched token in the string where it was found.
     *
     * So, according to the example setup given from the start,
     * if token.mh_string == 0, the token was found in the object's "title",
     * and [item.object.title substringWithRange:tokenRange] would yield "duck" (minus
     * capitalization and diacritics).
     */
}];

You can also get the whole array of MHSearchResultItem instances at once:

NSArray *resultSet = [index searchResultForKeyword:@"duck"
                                           options:NSEnumerationReverse];
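
As a rough sketch of how the returned array might be consumed, the loop below logs each result's rank and the matched portion of its title. It assumes the indexer block shown earlier (the title is string 0 and is also stored under the "title" key of the context dictionary). The header name `MHTextSearch.h` and the helper name `LogDuckMatches` are assumptions made for illustration, not part of the documented API.

```objective-c
#import <Foundation/Foundation.h>
#import "MHTextSearch.h" // assumed umbrella header name; adjust to your setup

// Hypothetical helper: logs where "duck" matched in each result's title.
static void LogDuckMatches(MHTextIndex *index) {
    NSArray *resultSet = [index searchResultForKeyword:@"duck" options:0];
    for (MHSearchResultItem *item in resultSet) {
        NSString *title = item.context[@"title"];
        for (NSIndexPath *token in item.resultTokens) {
            if (token.mh_string != 0) continue; // keep only tokens found in the title
            NSRange range = [item rangeOfToken:token];
            // Guard against a stale context whose title no longer matches the index.
            if (title && NSMaxRange(range) <= title.length) {
                NSLog(@"rank %lu: matched \"%@\" in \"%@\"",
                      (unsigned long)item.rank,
                      [title substringWithRange:range], title);
            }
        }
    }
}
```

Reading the title from the context dictionary rather than from `item.object` avoids calling the object getter block for every result.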

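Subclassing

If providing blocks to specify behavior is not your thing, you can also override the following methods:

-[MHTextIndex getIdentifierForObject:], which by default uses the identifier block
-[MHTextIndex getIndexInfoForObject:andIdentifier:], which by default uses the indexer block
-[MHTextIndex compareResultItem:withItem:reversed:], which is used to order the search result set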
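
A subclass overriding two of these methods might look like the sketch below. Only the selectors come from the list above. The return types, the parameter types, the meaning of `reversed`, the header names, and the class name `MyTextIndex` are all assumptions, so check them against the library headers before relying on this.

```objective-c
#import <Foundation/Foundation.h>
#import "MHTextSearch.h"   // assumed umbrella header name
#import "MyCustomObject.h" // the example model class used throughout this README

// Hypothetical subclass; signatures are inferred from the selectors, not confirmed.
@interface MyTextIndex : MHTextIndex
@end

@implementation MyTextIndex

// Assumed to return the NSData identifier that uniquely refers to the object.
- (NSData *)getIdentifierForObject:(id)object {
    MyCustomObject *custom = object;
    return custom.indexID;
}

// Assumed to return an NSComparisonResult used to sort the result set.
- (NSComparisonResult)compareResultItem:(MHSearchResultItem *)item1
                               withItem:(MHSearchResultItem *)item2
                               reversed:(BOOL)reversed {
    // Heavier items first; flip the order when reversed is YES (assumed meaning).
    NSComparisonResult result = [@(item2.weight) compare:@(item1.weight)];
    return reversed ? -result : result;
}

@end
```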

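Using with Core Data

You can use NSManagedObject lifecycle methods to trigger changes to the text index. The following example is taken from http://www.adevelopingstory.com/blog/2013/04/adding-full-text-search-cearch-core-core-data.html and adapted for this project: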

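// textIndex is an MHTextIndex instance reachable from this NSManagedObject subclass.

- (void)prepareForDeletion
{
    [super prepareForDeletion];

    // Drop the index entry only if this object was ever indexed.
    if (self.indexID.length) {
        [textIndex removeIndexForObject:self.indexID];
    }
}

// Builds a fresh 16-byte identifier from a random UUID.
+ (NSData *)createIndexID {
    NSUUID *uuid = [NSUUID UUID];
    uuid_t uuidBytes;
    [uuid getUUIDBytes:uuidBytes];
    return [NSData dataWithBytes:uuidBytes length:16];
}

- (void)willSave
{
    [super willSave];

    if (self.indexID.length) {
        // Already indexed: refresh the existing entry.
        [textIndex updateIndexForObject:self.indexID];
    } else {
        // First save: assign an identifier, then index.
        self.indexID = [[self class] createIndexID];
        [textIndex indexObject:self.indexID];
    }
}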

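Operations and queues

MHTextIndex uses an NSOperationQueue under the hood to coordinate indexing operations. It is exposed as a property named indexingQueue, so you can set its maxConcurrentOperationCount property to control how concurrent indexing is. Since the underlying database library that performs the I/O is thread-safe, concurrency is not an issue. This also means you can explicitly wait for indexing operations to finish:

[index.indexingQueue waitUntilAllOperationsAreFinished];

The three indexing methods, -[MHTextIndex indexObject:], -[MHTextIndex updateIndexForObject:] and -[MHTextIndex removeIndexForObject:], all return NSOperation instances. If you need to, you can take advantage of them through their completionBlock property or the -[NSOperation waitUntilFinished] method.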
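
The following sketch shows one way the returned operation and the queue might be used together. The header name and the helper name `IndexAndNotify` are hypothetical, and the concurrency limit of 2 is an arbitrary example value.

```objective-c
#import <Foundation/Foundation.h>
#import "MHTextSearch.h" // assumed umbrella header name

// Hypothetical helper: indexes one object and logs when indexing is done.
static void IndexAndNotify(MHTextIndex *index, id object) {
    // Allow at most two indexing operations to run at once (example value).
    index.indexingQueue.maxConcurrentOperationCount = 2;

    NSOperation *operation = [index indexObject:object];
    operation.completionBlock = ^{
        // Called on an unspecified thread once the operation has finished.
        NSLog(@"Finished indexing %@", object);
    };
}
```

If the returned operation is already enqueued, which the text above suggests but does not state, a very quick operation could finish before the block is attached, and the block would then never run. Calling `[operation waitUntilFinished]` instead avoids that race, at the cost of blocking the calling thread.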

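Searching is also concurrent, but it uses a dispatch_queue_t (not yet exposed or tunable).

Performance and fine tuning

There are a few knobs you can play with to make MHTextSearch better fit your needs.

An MHTextIndex instance has a skipStopWords boolean property, true by default, which avoids indexing very common English words. (TODO: make this work with other languages)
It also has a minimalTokenLength property, equal to 2 by default. This sets the minimum number of letters a token must have to be indexed, which greatly reduces the size of the index as well as indexing and searching time. When set to 2, it skips indexing single-letter words and the last letter of every word.
When indexing long-form text (documents rather than simple names), you can turn on the discardDuplicateTokens boolean property on MHTextIndex. This makes the index consider only the first occurrence of each indexed token in a given piece of text. If you only need to know whether a token appears in a text, rather than where every occurrence appears, you can speed up indexing by a factor of 3 to 5.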
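
As an illustration, an index meant for long documents might be configured as in the sketch below. It assumes the three properties are writable after the index is created. The header name, the helper name `MakeDocumentIndex`, the index name and the token length of 3 are example choices, not recommendations from the library.

```objective-c
#import <Foundation/Foundation.h>
#import "MHTextSearch.h" // assumed umbrella header name

// Hypothetical factory for an index tuned for long-form documents.
static MHTextIndex *MakeDocumentIndex(void) {
    MHTextIndex *index = [MHTextIndex textIndexInLibraryWithName:@"my.document.index"];
    index.skipStopWords = YES;          // the documented default, shown for clarity
    index.minimalTokenLength = 3;       // example value; the documented default is 2
    index.discardDuplicateTokens = YES; // keep only the first occurrence of each token
    return index;
}
```

With discardDuplicateTokens turned on, search results can still tell you that a token appears in a text, but not every position where it appears.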

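The benchmark graphs show indexing and searching time (in seconds) as a function of the size of the indexed text, ranging from 500 KB to about 10 MB. The benchmarks were run on an iPhone 5.

Testing

If you want to run the tests, you will need Xcode 5, as the test suite uses the new XCTest.

Clone this repository and, once inside it:

$ cd MHTextSearch\ iOS\ Tests
$ pod install
$ cd .. && open *.xcworkspace

Currently, all tests are set up to work with the iOS test suite.

License

Distributed under the MIT license.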