Librarian
1.0.0

Encontraré lo que había estado en tu mente, pero no ha estado allí.
Comandos CLI simples en sus archivos de texto que desea buscar de manera inteligente aplicando una técnica de búsqueda de texto completo sin software engorroso.
Las abstracciones importables expuestas como comandos CLI están diseñados para reutilizarse en su necesidad. Puede conectarlos a su servicio.
Utilidad de un solo archivo . Python y Sqlite solamente.
Python$ ./configure --enable-loadable-sqlite-extensions -Python cumplió binarios de SQLite con extensiones habilitadas para cargar fts5 . No se requiere más configuración en el código.fts5stemmer.so - Biblioteca compartida compilada para indexar archivos de destino con su idioma preferible. $ python librarian.py -h
usage: librarian.py [-h] [--db DB] [--table TABLE] [--debug] [--sql-trace] {index,match,update} ...
optional arguments:
-h, --help show this help message and exit
--db DB DB file path. (default: librarian.db)
--table TABLE Table name to store files content. (default: documents)
--debug Flag of print additional events. (default: False)
--sql-trace Flag of print sqlite statements. (default: False)
available commands:
Use -h with each of them to get help.
{index,match,update}
index Command to build a db and index. Have to be run once.
match Command to run query on indexed files.
update Command to check if content is changed and update in the database.$ python librarian.py index -h
usage: librarian.py index [-h] [--file-extensions string] [--language LANGUAGE] target
positional arguments:
target Directory or a file to build an index on.
optional arguments:
-h, --help show this help message and exit
--file-extensions string
List of file extensions separated by space which to scan only. (default: frozenset({ ' .md ' }))
--language LANGUAGE list of available languages https://snowballstem.org/algorithms/ (default: russian)
$ python librarian.py match -h
usage: librarian.py match [-h] [--limit LIMIT] [--fields field,...] [--format {raw,csv}] [--snippet 1, ' ' , ' ' , ' ' , ' ' ,10] query
positional arguments:
query Sqlite query term executed by " MATCH " statement. Syntax can be found on https://sqlite.org/fts5.html#full_text_query_syntax.
optional arguments:
-h, --help show this help message and exit
--limit LIMIT Max count of results. (default: 5)
--fields field,... List of document fields to retrieve separated by comma, order is preserved. Choices: ( ' path ' , ' extension ' , ' size ' , ' created ' , ' modified ' , ' hash ' , ' rank ' , ' snippet ' , ' rowid ' ). (default:
( ' path ' , ' snippet ' ))
--format {raw,csv} Choose a results output format. (default: csv)
--snippet 1, ' ' , ' ' , ' ' , ' ' ,10
Snippet properties settings https://sqlite.org/fts5.html#the_snippet_function (default: None)$ python librarian.py update -h
usage: librarian.py update [-h] [--clean]
optional arguments:
-h, --help show this help message and exit
--clean Delete missing file records (default: False)Vamos a construir un DB.
$ python librarian.py --debug --sql-trace index ~ /Text-files-documents
SELECT fts5(NULL)
CREATE REAL TABLE IF NOT EXISTS documents
USING FTS5(path, content, extension, sizee, created, modified, tokenize= ' snowball russian ' ) ;
Excluded: /home/user/Text-files-documents/Today
Excluded: /home/user/Text-files-documents/Yesterday
Indexed: /home/user/Text-files-documents/Project/README.md
...Y pruébalo.
$ python librarian.py match --fields created,path тесты
[(1620583081.3579128,
' /home/user/Text-files-documents/Project/README.md ' ,
' Для запуска тестов нужно поднимать докер-сервисыn nn### Описание запуска ' ),
...