markov_text - A Text Generator Based on Higher-Order Markov Chains

Build with CMake:

    cmake -B build
    cmake --build build

Run ./build/markov_text -h for help.
An example usage is given below. First, construct the chain:
    ./build/markov_text -c corpus -O 3 -o out

This constructs an order-3 Markov chain from the large text file corpus and saves it as four files whose names start with out. Note that -O 3 (order 3) and -o out (output file path out) are the defaults and can be omitted, so ./build/markov_text -c corpus is equivalent to the command above.
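As a rough illustration of what an order-3 chain contains, the Python sketch below maps every run of three consecutive tokens to the tokens seen immediately after it. It is not the tool's implementation: the function name build_chain, the whitespace tokenization, and the in-memory dictionary are all assumptions made for the example, and the on-disk format of the four output files is not modelled.

```python
from collections import defaultdict


def build_chain(tokens, order=3):
    """Map each run of `order` consecutive tokens to the tokens that follow it.

    Hypothetical helper for illustration; not part of markov_text itself.
    """
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return dict(chain)


# Assumption: tokens are whitespace-separated words.
tokens = "the cat sat on the cat sat down".split()
chain = build_chain(tokens, order=3)

for state, followers in chain.items():
    print(" ".join(state), "->", followers)
```

In this toy corpus the state "the cat sat" occurs twice, so it has two possible followers, while the final three tokens "cat sat down" have no entry because nothing follows them.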
Then to generate text, run:
    ./build/markov_text -g out -s 100

This generates at most 100 tokens from the chain stored in the files whose names start with out. Note that -s 100 (generate at most 100 tokens) is the default and can be omitted, so ./build/markov_text -g out is equivalent to the command above.
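The "at most" in the token limit can be illustrated with a small Python sketch of a generation loop that stops either when the limit is reached or when the current state has no follower. This is a sketch under stated assumptions, not the tool's code: build_chain and generate are hypothetical names, tokens are split on whitespace, the walk starts from the first three tokens of the corpus, and the starting tokens are counted towards the limit.

```python
import random
from collections import defaultdict


def build_chain(tokens, order):
    """Hypothetical helper: map each `order`-token state to its followers."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        chain[tuple(tokens[i:i + order])].append(tokens[i + order])
    return dict(chain)


def generate(chain, start, max_tokens, rng=random):
    """Generate at most `max_tokens` tokens, stopping early at a dead end."""
    out = list(start)[:max_tokens]  # assumption: start tokens count towards the limit
    state = tuple(start)
    while len(out) < max_tokens:
        followers = chain.get(state)
        if not followers:
            break  # no next state: generation ends before the limit
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window by one token
    return out


tokens = "a b c d e f".split()  # a corpus of 6 unique tokens
chain = build_chain(tokens, order=3)
text = generate(chain, start=tokens[:3], max_tokens=100)
print(len(text), " ".join(text))  # prints: 6 a b c d e f
```

Although 100 tokens are requested, only 6 are produced, because the state "d e f" sits at the end of the corpus and has no follower.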
The reason the limit is "at most" N tokens is that text generation ends early if the Markov chain has no next state. This can happen when the current sequence of tokens is a unique sequence that appears at the end of the input text file. It can be reproduced by creating a file with K unique tokens and then requesting N > K tokens. In this case, at most K tokens will be produced.

Contributions and feedback are more than welcome!