Document Converter
Currently, four official parsing engines are available:
Engine Name | Supported Types |
---|---|
MinerU | Webpages / Files |
MinerU API | Webpages / Files |
Jina | Webpages |
Markitdown | Webpages / Files |
The currently provided Jina engine only supports webpage parsing, not file conversion. If you’re parsing file-type documents, please choose one of the other three engines.
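The support table above can be written as a small lookup, which makes the engine choice explicit. This is only an illustrative sketch: the `SUPPORTED_TYPES` mapping and the `engines_for` helper are hypothetical names and not part of the product, and the labels `"webpage"` and `"file"` are assumed shorthand for the two types in the table.

```python
# Supported input types per engine, copied from the table above.
# The type labels "webpage" and "file" are assumed shorthand.
SUPPORTED_TYPES = {
    "MinerU": {"webpage", "file"},
    "MinerU API": {"webpage", "file"},
    "Jina": {"webpage"},
    "Markitdown": {"webpage", "file"},
}


def engines_for(input_type: str) -> list[str]:
    """Return the engines that can parse the given input type."""
    if input_type not in {"webpage", "file"}:
        raise ValueError(f"unknown input type: {input_type!r}")
    return [name for name, types in SUPPORTED_TYPES.items() if input_type in types]


print(engines_for("file"))  # ['MinerU', 'MinerU API', 'Markitdown']
```

For file-type documents the lookup returns the three engines other than Jina, matching the note above.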
Note that after clicking the configuration button for an installed engine, the corresponding configuration options will appear at the bottom of the pop-up window. You can follow the example to set the parameters.
It is recommended to set the `openai_api_key` when using the Markitdown engine, especially for parsing image files. Without this key, image parsing will not work. You can apply for the key on the OpenAI official website: https://openai.com
For the MinerU API engine, you will need to apply for an API Key yourself. For detailed instructions, visit the official site: https://mineru.net
Jina also requires an API Key. You can apply for one from the official website: https://jina.ai
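The key requirements above could be checked before running a conversion, roughly as sketched below. Only `openai_api_key` is a parameter name taken from this page. The `api_key` name used for MinerU API and Jina is a hypothetical placeholder, as are `KEY_RULES` and `check_engine_config`; the actual option names are the ones shown in each engine's configuration pop-up.

```python
# Key requirements per engine, as described in the notes above.
# "api_key" is a hypothetical placeholder; only "openai_api_key" comes from the text.
KEY_RULES = {
    "Markitdown": ("openai_api_key", "recommended"),
    "MinerU API": ("api_key", "required"),
    "Jina": ("api_key", "required"),
}


def check_engine_config(engine: str, config: dict) -> list[str]:
    """Return warnings for keys that the engine needs but the config lacks."""
    rule = KEY_RULES.get(engine)
    if rule is None:
        # No key requirement is documented for this engine.
        return []
    param, level = rule
    if config.get(param):
        return []
    return [f"{engine}: {param} is {level} but not set"]


print(check_engine_config("Jina", {}))
```

An empty list means no key problem was found for that engine. An engine with no documented key requirement, such as MinerU, always passes.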
If you are using the MinerU API engine, please note that MinerU officially does not support requests from non-China IPs. If you encounter errors while using this engine, check whether a local proxy is enabled and whether your local `HTTP_PROXY` / `HTTPS_PROXY` environment variables are set. If they are, please remove them.
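The proxy check can be sketched in Python as follows. The `clear_proxy_env` helper is a hypothetical name. It also looks at the lowercase `http_proxy` / `https_proxy` variants, which this page does not mention; they are included as an assumption because many HTTP clients honor them too. Note that this only changes the environment of the current process and its children, so the variables may still need to be removed from the shell or system settings the application is started from.

```python
import os

# Uppercase names come from the text; lowercase variants are an added assumption.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def clear_proxy_env(env=None) -> list[str]:
    """Remove proxy variables from env and return the names that were set."""
    env = os.environ if env is None else env
    removed = [name for name in PROXY_VARS if name in env]
    for name in removed:
        # pop with a default tolerates case-insensitive environments (e.g. Windows).
        env.pop(name, None)
    return removed


if __name__ == "__main__":
    removed = clear_proxy_env()
    print("Removed proxy variables:", removed or "none")
```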
Once you’ve installed the desired engine, you can proceed with the corresponding configuration. After configuration, the engine will be ready for document conversion.