Help me! How to improve retrieval performance? #7805

ericwu0930 · 2025-05-23T01:08:58Z


The retrieval process is too slow. Are there any ways to speed it up?

I have read the related issues. The rerank option is turned off, and I use ES (Elasticsearch) as the doc engine.

What's more, there are around 700 files in the dataset. Among them, 400 files have an average of 10 chunks each, with each chunk containing 512 tokens. The remaining files have an average of 4 chunks each, also with 512 tokens per chunk.
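
For scale, the figures above can be turned into a rough index size. This is only back-of-the-envelope arithmetic: the count of 300 "remaining" files is an assumption derived from "around 700" minus 400, and every chunk is assumed to hold the full 512 tokens.

```python
# Rough index size from the numbers quoted in the post.
# Assumption: "around 700" files means 400 + 300, and every chunk is full.
TOKENS_PER_CHUNK = 512

file_groups = [
    (400, 10),  # 400 files with about 10 chunks each
    (300, 4),   # remaining files with about 4 chunks each
]

total_chunks = sum(files * chunks for files, chunks in file_groups)
total_tokens = total_chunks * TOKENS_PER_CHUNK

print(total_chunks)  # 5200
print(total_tokens)  # 2662400
```

Under these assumptions the dataset is on the order of five thousand chunks.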

Here is the list of elapsed times.

What's more, in my own AI agent, retrieval from RAGFlow via the /api/v1/retrieval endpoint is also unacceptably slow. It usually takes more than 6 seconds, and sometimes longer, to return a result. So I don't think this issue is related to the configuration of the chat assistant.
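
To reproduce the latency outside the chat assistant, the endpoint can be timed directly. The sketch below uses only the Python standard library. The request body fields `question` and `dataset_ids`, the Bearer-token header, and the base URL are assumptions that should be checked against the RAGFlow HTTP API reference for your version; the helper names are hypothetical.

```python
import json
import time
import urllib.request


def build_retrieval_request(base_url, api_key, question, dataset_ids):
    """Build (but do not send) a POST request for the retrieval endpoint.

    Assumption: the endpoint accepts a JSON body with "question" and
    "dataset_ids" and a Bearer token in the Authorization header.
    """
    payload = {"question": question, "dataset_ids": dataset_ids}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/retrieval",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Example usage (placeholders, not real values):
# req = build_retrieval_request("http://localhost:9380", "<API_KEY>",
#                               "What is the fee for fund X?", ["<DATASET_ID>"])
# body, seconds = timed(send, req)
# print(f"retrieval took {seconds:.2f}s")
```

Running this several times from the same host as the agent helps separate network latency from time spent inside RAGFlow.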

@dosu-bot · 2025-05-23T01:12:29Z

dosubot[bot]
bot May 23, 2025

Here are several methods to improve retrieval performance in RAGFlow when using Elasticsearch (ES):

1. Disable features that slow down retrieval:

- Make sure the reranker is off (which you have already done).
- Turn off "Use RAPTOR to enhance retrieval" in your knowledge base configuration.
- Disable the "Auto-keyword" and "Auto-question" features, as these depend on the LLM and can be time-consuming.
- If you are not using knowledge graphs, ensure GraphRAG is disabled, as it adds significant overhead [source].

2. Use a GPU for embedding:
If you have access to a GPU, use it to reduce embedding time, which can speed up both indexing and retrieval [source].

3. Adjust retrieval parameters:

- Lower the similarity threshold in the Retrieval Test settings if the system is filtering too aggressively. This increases the number of retrieved chunks; any effect on speed is indirect [source].
- Tune the keyword similarity weight (the defaults are 0.7 for keyword similarity and 0.3 for vector cosine similarity). Adjusting these weights can affect retrieval efficiency and result quality [source].
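
How the two parameters above interact can be illustrated with a small sketch. It assumes the hybrid score is a plain weighted sum of a keyword similarity and a vector cosine similarity, both in [0, 1], followed by a threshold filter. This illustrates the idea only and is not RAGFlow's actual scoring code; the function names and the chunk fields are hypothetical.

```python
def hybrid_score(keyword_sim, vector_sim, keyword_weight=0.7):
    """Weighted sum of keyword and vector similarity (both expected in [0, 1])."""
    if not 0.0 <= keyword_weight <= 1.0:
        raise ValueError("keyword_weight must be between 0 and 1")
    return keyword_weight * keyword_sim + (1.0 - keyword_weight) * vector_sim


def filter_by_threshold(chunks, threshold, keyword_weight=0.7):
    """Return (score, chunk_id) pairs that reach the threshold, best first."""
    scored = [
        (hybrid_score(c["keyword_sim"], c["vector_sim"], keyword_weight), c["id"])
        for c in chunks
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Made-up similarities, purely for illustration.
chunks = [
    {"id": "a", "keyword_sim": 0.9, "vector_sim": 0.2},
    {"id": "b", "keyword_sim": 0.1, "vector_sim": 0.8},
    {"id": "c", "keyword_sim": 0.0, "vector_sim": 0.3},
]

print([cid for _, cid in filter_by_threshold(chunks, threshold=0.2)])
print([cid for _, cid in filter_by_threshold(chunks, threshold=0.05)])
```

In this toy example a lower threshold lets more chunks through, which means more results to post-process rather than fewer.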

4. Optimize Elasticsearch configuration:

- Ensure you are using Elasticsearch version 8 or higher.
- Confirm that your mapping file exists at conf/mapping.json.
- Check your ES container or service settings for memory and disk watermarks. For example, in the default Docker Compose, ES is set with:
  - cluster.routing.allocation.disk.watermark.low=5gb
  - cluster.routing.allocation.disk.watermark.high=3gb
  - cluster.routing.allocation.disk.watermark.flood_stage=2gb
- Adjust memory limits (mem_limit) as needed for your hardware [source].
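
The version requirement above can be checked from a script. The sketch below assumes the Elasticsearch root endpoint (`GET /`) is reachable and returns its usual JSON document with a `version.number` field. The URL is a placeholder, authentication is not handled (a secured cluster will need credentials), and the helper names are hypothetical.

```python
import json
import urllib.request


def es_major_version(info):
    """Extract the major version from the JSON returned by GET / on Elasticsearch."""
    return int(info["version"]["number"].split(".")[0])


def fetch_es_info(es_url, timeout=10):
    """Fetch the root info document from an unauthenticated ES endpoint."""
    with urllib.request.urlopen(es_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (placeholder URL):
# info = fetch_es_info("http://localhost:9200")
# if es_major_version(info) < 8:
#     print("Elasticsearch is older than version 8")
```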

5. Identify bottlenecks:
Use the lamp icon in the RAGFlow interface to analyze which step is causing the slowdown (e.g., embedding, retrieval, or post-processing). This can help you target your optimizations more effectively [source].

6. Use the simplest document parser when possible:
If your documents are plain text and do not require OCR or layout analysis, select "Naive" instead of "DeepDoc" in the Document parser dropdown to reduce parsing and indexing time [source].

7. Keep RAGFlow updated:
Performance improvements are made regularly, so ensure you are running the latest version [source].

If you have already tried these and retrieval is still slow, it may be helpful to share more about your ES hardware setup, network latency, or specific retrieval settings for deeper troubleshooting.


1 reply

ZhenhangTung May 23, 2025
Collaborator

@ericwu0930 Got it. Will take a deeper look.

ZhenhangTung · 2025-05-23T03:39:36Z

ZhenhangTung
May 23, 2025
Collaborator

@ericwu0930 Could you share some use cases showing how you are using RAGFlow? How many files would you like to put in RAGFlow in the long run?

1 reply

ericwu0930 May 25, 2025
Author

We have developed an intelligent fund customer service agent that integrates RAGFlow as a knowledge base tool. We use the api/v1/retrieval endpoint to query the knowledge base.

For certain funds, we have created corresponding knowledge base documents, one per fund. Each file is approximately 15 KB in size, and there are around 700 documents in total.

Based on the logs and source code analysis, the major time-consuming steps in the retrieval process are question embedding, fetching answers from ES, and re-ranking.

In a typical 5-second retrieval operation:
- Embedding the question takes 300 ms.
- Fetching answers from ES takes 200 ms.
- The remaining 4+ seconds are spent on re-ranking the retrieved results.
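
The breakdown above can be expressed as shares of the total. This is simple arithmetic on the quoted numbers, assuming the three stages account for the whole 5-second request, so the re-ranking figure is derived rather than measured.

```python
# Figures quoted in the post; re-ranking is derived as the remainder.
total_ms = 5000
stages_ms = {"question embedding": 300, "ES fetch": 200}
stages_ms["re-ranking"] = total_ms - sum(stages_ms.values())

shares = {name: ms / total_ms for name, ms in stages_ms.items()}

for name, ms in stages_ms.items():
    print(f"{name}: {ms} ms ({shares[name]:.0%})")
# question embedding: 300 ms (6%)
# ES fetch: 200 ms (4%)
# re-ranking: 4500 ms (90%)
```

Under that assumption, embedding and the ES query together make up about a tenth of the request, and the re-ranking stage accounts for the rest.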
