Auto-i18n: Automatic multilingual translation tool using ChatGPT
Auto-i18n is a tool that uses ChatGPT to automatically translate Markdown files into multiple languages. It fully automates blog post i18n (internationalization): push a post to your GitHub repository, and GitHub Actions translates it into multiple languages automatically. (English, Spanish, and Arabic are currently supported; more languages will be added in the future.)
The main features of Auto-i18n:
- Batch multilingual translation: Auto-i18n can translate all Markdown documents under a path into multiple languages in one run, which greatly improves the efficiency of multilingual projects.
- Front Matter compatibility: Auto-i18n is compatible with Markdown Front Matter syntax, and you can customize translation or replacement rules for different fields.
- Fixed content replacement: Auto-i18n also supports fixed content replacement. If you want recurring fields to be rendered the same way in every document, this feature keeps your documentation consistent.
- Automated workflow: You can use GitHub Actions to automate the translation process. Translation runs and updates the documents without manual intervention, so you can focus on content.
Get started quickly
- Clone the repository locally, rename env_template.py to env.py, and provide your ChatGPT API key. If you don't have your own API key, you can apply for a free one at GPT_API_free; you can also use go-chatgpt-api to turn the web version of ChatGPT into an API.
- Install the required modules: pip install -r requirements.txt
- Run python auto-translater.py. The program automatically processes all Markdown files in the test directory testdir/to-translate and batch-translates them into English, Spanish, and Arabic. (More languages will be added in the future.)
Detailed description
The running logic of the program auto-translater.py is as follows:
- The program automatically processes all Markdown files in the test directory testdir/to-translate. You can list files that should not be translated in the exclude_list variable.
- The names of processed files are recorded in the automatically generated processed_list.txt. Files listed there are not translated again the next time the program runs.
- Articles originally written in English are not re-translated into English or translated back into Chinese; they are only translated into the other languages. For the program to recognize such an article, add the line > This post was originally written in English. to it (with a blank line above and below). Please refer to the test article_en.md.
- To force re-translation of a specific article (for example, because the translation is inaccurate or the article has changed), add the line [translate] to the article (again with a blank line above and below). This overrides the exclude_list and processed_list rules and forces translation. Please refer to the test article_force-mark.md.
- If the Markdown file contains Front Matter, each field is handled according to the rules in front_matter_translation_rules in the program:
  - Automatic translation: the field is translated by ChatGPT. Applicable to the article title or description.
  - Fixed field replacement: applicable to category or tag fields. For example, it prevents the same Chinese tag name from being translated into different English tags, which would cause index errors.
  - No processing: if a field does not appear in either of the two rules above, the original text is kept unchanged. Applicable to date, url, etc.
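The rules above can be sketched in a few lines of Python. This is only an illustration of the described logic, not the code in auto-translater.py: the function names, the language codes, and the shape of the rules dictionary are assumptions made for this example, and the marker check does not enforce the surrounding blank lines.

```python
# Illustrative sketch only: names and data shapes here are assumptions,
# not copied from auto-translater.py.
FORCE_MARK = "[translate]"
ENGLISH_MARK = "> This post was originally written in English."
TARGET_LANGUAGES = ("en", "es", "ar")


def has_marker(text, marker):
    """True if the marker appears on a line of its own."""
    return any(line.strip() == marker for line in text.splitlines())


def plan_translation(filename, text, exclude_list, processed_list):
    """Return the target languages for one Markdown file ([] means skip)."""
    forced = has_marker(text, FORCE_MARK)
    # The force marker overrides both the exclude list and the processed list.
    if not forced and (filename in exclude_list or filename in processed_list):
        return []
    if has_marker(text, ENGLISH_MARK):
        # English originals are not translated into English again.
        return [lang for lang in TARGET_LANGUAGES if lang != "en"]
    return list(TARGET_LANGUAGES)


def process_front_matter_field(key, value, rules, translate):
    """Apply a per-field rule: translate, replace from a fixed table, or keep."""
    rule = rules.get(key)
    if rule == "translate":
        return translate(value)
    if isinstance(rule, dict):
        # Fixed replacement: unknown values are left unchanged.
        return rule.get(value, value)
    return value
```

Here translate stands in for whatever function calls ChatGPT. Passing it in as a parameter keeps the rule handling testable without network access.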
GitHub Actions Automation Guide
You can create .github/workflows/ci.yml in your project repository. When an update to the GitHub repository is detected, GitHub Actions runs the translation automatically and commits the results back to the original repository.
You can use ci_template.yml as a template for ci.yml.
You need to add two secrets under the repository's Settings - Secrets and variables - Repository secrets: CHATGPT_API_BASE and CHATGPT_API_KEY. You also need to comment out the import env statement in auto-translater.py.
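With import env commented out, the program has to get its API settings from somewhere else. A minimal sketch, assuming the workflow exposes the two secrets as environment variables of the same names (check ci_template.yml for how they are actually passed); load_api_settings is a hypothetical helper, not a function from the project:

```python
import os


def load_api_settings(environ=os.environ):
    """Read the API base URL and key from environment variables.

    Hypothetical helper: assumes the secrets are exported under the same
    names as the repository secrets.
    """
    names = ("CHATGPT_API_BASE", "CHATGPT_API_KEY")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["CHATGPT_API_BASE"], environ["CHATGPT_API_KEY"]
```

Failing early with the names of the missing variables makes a misconfigured secret easier to spot in the Actions log than an authentication error later in the run.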
Error Troubleshooting
- To check whether your ChatGPT API key works, you can test it with the program verify-api-key.py. If you use the official API from a region where it cannot be reached directly, you need a local proxy.
- If Front Matter in Markdown cannot be recognized normally, you can use the program detect_front_matter.py to test.
- If you run into problems with GitHub Actions, first check that the path references are correct (for example, dir_to_translate, dir_translated_en, dir_translated_es, dir_translated_ar, and processed_list).
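When Front Matter is not recognized, it can help to check the file against a simple pattern by hand. The following is a minimal sketch of such a check, assuming YAML-style Front Matter delimited by --- lines at the very top of the file. It is not the logic of detect_front_matter.py, and split_front_matter is a name made up for this example.

```python
import re

# Front Matter must start on the first line; spaces or tabs after the
# delimiters and Windows line endings are tolerated.
FRONT_MATTER_RE = re.compile(
    r"\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\Z)", re.DOTALL
)


def split_front_matter(text):
    """Return (front_matter, body); front_matter is None if none is found."""
    match = FRONT_MATTER_RE.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]
```

With this particular pattern, a blank line or any other character before the opening --- causes the block to be missed, which is one thing worth checking in a file that fails detection.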
Problems to be solved
- In some special circumstances, the translation may be inaccurate or some fields are not translated. It is recommended to manually verify the article after translation.
- (Solved) If Markdown contains Front Matter, the original content of the Front Matter is retained. Translation of some Front Matter parameters is under development.
Contributing
Contributions to this project are welcome! If you want to contribute code, report issues, or make suggestions, see the Contribution Guide.
Copyright and License
This project is released under the MIT license.
Questions and support
If you encounter any problems with Auto-i18n or need technical support, please feel free to open an issue.
My blog uses Auto-i18n for multilingual support; you can visit Power's Wiki to see a demo.
Acknowledgements
- Thanks to chatanywhere/GPT_API_free for the free ChatGPT API key.
- Thanks to linweiyuan/go-chatgpt-api for the method of converting the web version of ChatGPT into an API.