Aspen lets you search a large corpus of plain text files via your browser.
Put all your files in one place, like ~/ebooks/:
$ tree ~/ebooks
/Users/ian/ebooks
└── Project Gutenberg/
    ├── Beowulf.txt
    ├── Dracula.txt
    ├── Frankenstein.txt
$ docker-compose up -d
Creating network "aspen_default" with the default driver
Creating elasticsearch ... done
Creating aspen ... done
Use the included convert utility, which wraps Apache Tika, to convert other document formats, such as .docx, to plain text. Pass a file name relative to your data directory:
$ ls ~/ebooks
Project Gutenberg Test.docx
$ docker-compose run aspen convert Test.docx
Starting elasticsearch ... done
Test.docx doesn't exist, trying /data/Test.docx
Creating /data/Test.txt...
...
OK
$ ls ~/ebooks
Project Gutenberg Test.docx Test.txt
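If you have many documents to convert, a small helper can print one convert command per file. This is a minimal sketch, not part of Aspen: the function name list_convert_commands is hypothetical, it only looks for .docx files, and it assumes convert accepts paths inside subdirectories as long as they are relative to the data directory. It prints the commands rather than running them, so you can review them first.

```shell
# Hypothetical helper (not shipped with Aspen): print one convert command
# per .docx file found under the given data directory.
list_convert_commands() {
  data_dir="$1"
  (
    # Work inside the data directory so find reports relative paths.
    cd "$data_dir" || exit 1
    find . -type f -name '*.docx' | while IFS= read -r f; do
      # Strip the leading "./" so the path is relative to the data directory.
      printf 'docker-compose run aspen convert "%s"\n' "${f#./}"
    done
  )
}
```

For example, list_convert_commands ~/ebooks > convert-all.sh writes the commands to a file, which you could then run with sh convert-all.sh from the directory that contains docker-compose.yml. File names containing double quotes or dollar signs would need extra escaping.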
Start by resetting Elasticsearch to make sure everything is working:
$ docker-compose run aspen es-reset
Starting elasticsearch ... done
Results from DELETE: { acknowledged: true }
✓ Done.
Now import all .txt documents. The import script will try to figure out the title of the document automatically:
$ docker-compose run aspen import
Starting elasticsearch ... done
→ Base directory is /app/public/data
▲ Ignoring non-text path: Test.docx
→ Test.txt → Test Document
→ Project Gutenberg/Beowulf.txt → The Project Gutenberg EBook of Beowulf
→ Project Gutenberg/Dracula.txt → The Project Gutenberg EBook of Dracula, by Bram Stoker
→ Project Gutenberg/Frankenstein.txt → Project Gutenberg's Frankenstein, by Mary Wollstonecraft (Godwin) Shelley
✓ Done!
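To preview what a title might look like before importing, you can print the first non-blank line of a file. This is only a rough sketch based on the output above, where the titles match the opening line of each Project Gutenberg file. It is an assumption, not the logic the import script actually uses, and the function name guess_title is hypothetical.

```shell
# Hypothetical preview helper; NOT the logic used by the import script.
# Prints the first line of a file that contains at least one field.
guess_title() {
  awk 'NF { print; exit }' "$1"
}
```

Running guess_title on each file in your data directory gives a quick way to spot files whose opening line would make a poor title.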
You can also run import with a directory or file name relative to the data directory. For example, import Project\ Gutenberg or import Project\ Gutenberg/Dracula.txt.
Sometimes plain text documents act strangely. Maybe bin/import can't extract a title or maybe the search highlights are off. The file might have the wrong line endings or one of those annoying UTF-8 BOM headers. Try running dos2unix on your text files to fix them.
Go to http://localhost:3000/ and start searching!
It's easiest to use Elasticsearch via Docker.
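If dos2unix isn't available, the same cleanup can be approximated with standard tools. The sketch below is an assumption-laden stand-in, not part of Aspen: the function name clean_text_file is hypothetical, it removes every carriage return in the file (not only those at line ends), and it strips a UTF-8 BOM only when it appears at the very start of the file.

```shell
# Hypothetical helper (not part of Aspen): normalize one text file in place.
clean_text_file() {
  src="$1"
  tmp="$(mktemp)" || return 1
  # The UTF-8 BOM is the byte sequence EF BB BF (octal 357 273 277).
  bom="$(printf '\357\273\277')"
  # tr deletes all carriage returns; sed removes a BOM at the start of line 1.
  tr -d '\r' < "$src" | LC_ALL=C sed "1s/^${bom}//" > "$tmp" || return 1
  # Write back through the original path so its permissions are preserved.
  cat "$tmp" > "$src" && rm -f "$tmp"
}
```

Try it on a copy of a problem file first, for example clean_text_file Test.txt, and then run the import again.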
You can get Node and Yarn via Homebrew on Mac, or you can download Node.js v8.5 or later and run npm install -g yarn to get Yarn.
For document conversion (bin/convert) you'll want:
On macOS you can brew install node tika unrtf par.
$ git clone git@github.com:statico/aspen.git
$ cd aspen
$ yarn install
See steps 1-4 in the "Using Docker" section above. In short, get your text files together in one place, set up Elasticsearch, and import them with the bin/import command.
Aspen is built using Next.js, which is Node + ES6 + Express + React + hot reloading + lots more. Simply run:
$ yarn run dev
...and go to http://localhost:3000
If you are working on server.js and want automatic server restarting, do:
$ yarn global add nodemon
$ nodemon -w server.js -w lib -x yarn -- run dev