Summary
Enhance the RAG4EIC chatbot interface to allow users to choose which vector database(s) to query: arxiv, indico, or both. Add two checkboxes, one for arxiv and one for indico. If both are checked, the chatbot should search across both databases for relevant responses.
Requirements
- Add two checkboxes to the chatbot UI (likely in the sidebar or top of the chat area):
  - "Search arxiv database"
  - "Search indico database"
- If both are checked, the query must be searched through both databases, and results merged for response generation.
- If only one is checked, search only the selected database.
- If neither is checked, show a warning or disable question submission until a selection is made.
- Results from both databases must be combined, deduplicated (if necessary), and passed as context to the LLM.
- Indicate to the user which database(s) were used in generating the answer (e.g., via a message or icon in the chat).
Implementation Plan
UI Changes
- In the chatbot UI (likely streamlit_app/pages/2_RAG-ChatBot.py), add two checkboxes for "arxiv" and "indico" (using st.checkbox).
- Store their states in st.session_state["use_arxiv"] and st.session_state["use_indico"].
- If neither is checked, disable the chat input or display a warning.
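A minimal sketch of the UI step, assuming the checkboxes live in the sidebar and default to checked. The helper names (selected_sources, render_source_selector, render_chat_input) and the warning text are hypothetical; only st.sidebar.checkbox, st.session_state, st.warning and st.chat_input are Streamlit calls. Streamlit is imported inside the rendering functions so that the pure helper can be tested without it.

```python
from typing import List


def selected_sources(use_arxiv: bool, use_indico: bool) -> List[str]:
    """Return the names of the ticked databases, always in the same order."""
    sources = []
    if use_arxiv:
        sources.append("arxiv")
    if use_indico:
        sources.append("indico")
    return sources


def render_source_selector() -> List[str]:
    """Draw both checkboxes and return the list of selected databases."""
    import streamlit as st  # lazy import: keeps selected_sources testable on its own

    # Passing key= makes Streamlit store each widget value in st.session_state.
    st.sidebar.checkbox("Search arxiv database", value=True, key="use_arxiv")
    st.sidebar.checkbox("Search indico database", value=True, key="use_indico")

    sources = selected_sources(
        st.session_state["use_arxiv"], st.session_state["use_indico"]
    )
    if not sources:
        st.warning("Select at least one database to search.")
    return sources


def render_chat_input(sources: List[str]):
    """Show the chat box, disabled while no database is selected."""
    import streamlit as st

    return st.chat_input("Ask a question", disabled=not sources)
```

In the page script this would be called as `sources = render_source_selector()` followed by `question = render_chat_input(sources)`. Showing the warning and disabling the input together is one option; the issue allows either.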
Backend Logic
- Instantiate or select the appropriate retriever(s) based on user selection:
  - If both are checked, run the query on both retrievers and merge results.
  - If only one is checked, run only that retriever.
- For merging, combine results from both databases, deduplicate by unique document ID or content, and preserve source info for citations.
- Ensure downstream code (LLM prompt, citation, etc.) can handle the merged results and source metadata.
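The merge step could look roughly like the sketch below. It is not the project's retriever code: RetrievedDoc, the Retriever callable type and retrieve_and_merge are hypothetical stand-ins, and it assumes document IDs are unique across both databases (when an ID is missing it falls back to the normalized content as the deduplication key).

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class RetrievedDoc:
    doc_id: str       # hypothetical unique ID; may be empty
    content: str      # retrieved passage text
    source: str = ""  # database name, filled in while merging


# Hypothetical retriever interface: query string in, list of documents out.
Retriever = Callable[[str], List[RetrievedDoc]]


def retrieve_and_merge(
    query: str, retrievers: Dict[str, Retriever], sources: Iterable[str]
) -> List[RetrievedDoc]:
    """Query each selected retriever, tag hits with their source, drop duplicates."""
    merged: List[RetrievedDoc] = []
    seen = set()
    for name in sources:
        for doc in retrievers[name](query):
            # Deduplicate by ID when present, otherwise by normalized content.
            key = doc.doc_id or doc.content.strip().lower()
            if key in seen:
                continue
            seen.add(key)
            # Keep the source so citations can say where the passage came from.
            merged.append(replace(doc, source=name))
    return merged
```

With this shape the single-database case needs no special handling: passing a one-element `sources` list runs only that retriever. A duplicate keeps the source of the database that returned it first, so the order of `sources` matters.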
Indico Database Integration
- If not already in place, implement or update the vectorstore/retriever logic for the indico database, mirroring the arxiv setup.
- Ensure both retrievers provide results in compatible formats.
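One way to enforce compatible formats is a small check applied to the output of each retriever before merging. This is only a sketch: the field names in REQUIRED_FIELDS and the function check_compatible are assumptions for illustration, not the schema the existing arxiv retriever uses.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical common schema that both retrievers would agree on.
REQUIRED_FIELDS = ("doc_id", "content", "source")


def check_compatible(results: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the results unchanged, or raise if any entry lacks a required field."""
    checked = []
    for index, item in enumerate(results):
        missing = [name for name in REQUIRED_FIELDS if name not in item]
        if missing:
            raise ValueError(
                f"result {index} is missing fields: {', '.join(missing)}"
            )
        checked.append(item)
    return checked
```

Running the same check on arxiv and indico results during development would surface a schema mismatch early instead of in the LLM prompt or citation code.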
User Feedback
- Display a message or icon in the chat indicating which database(s) were searched for each response.
- Optionally, allow users to filter or expand/collapse results by source in the chat history.
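The per-response indication could be a short text line built from the selected databases and the sources of the retrieved passages. The function name format_sources_note and the wording of the message are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def format_sources_note(searched: List[str], doc_sources: Iterable[str]) -> str:
    """Build a one-line note listing each searched database and its hit count."""
    if not searched:
        return "No database selected."
    counts = Counter(doc_sources)
    # A searched database with no hits is still listed, with a count of 0.
    parts = [f"{name} ({counts.get(name, 0)})" for name in searched]
    return "Databases searched: " + ", ".join(parts)
```

In the Streamlit page the note could be rendered under each answer with `st.caption(format_sources_note(sources, [d.source for d in docs]))`, where `sources` and `docs` are whatever the page already holds for the selection and the merged results.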
Testing
- Test all combinations: arxiv only, indico only, both checked, neither checked.
- Check result merging, deduplication, and source indication in answers.
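The four combinations can be covered in a single test loop. The sketch below uses only the standard library; the search function and the FAKE_RETRIEVERS data are stand-ins for the real backend, and raising ValueError when nothing is selected is just one possible way to model the "neither checked" case.

```python
import itertools
import unittest


def search(query, use_arxiv, use_indico, retrievers):
    """Stand-in backend: query the selected databases and deduplicate by ID."""
    if not (use_arxiv or use_indico):
        raise ValueError("select at least one database")
    hits, seen = [], set()
    for name, enabled in (("arxiv", use_arxiv), ("indico", use_indico)):
        if not enabled:
            continue
        for doc_id, text in retrievers[name](query):
            if doc_id not in seen:
                seen.add(doc_id)
                hits.append((doc_id, text, name))
    return hits


# Fake retrievers that share one document so deduplication is exercised.
FAKE_RETRIEVERS = {
    "arxiv": lambda q: [("a1", "arxiv text"), ("shared", "same text")],
    "indico": lambda q: [("i1", "indico text"), ("shared", "same text")],
}


class SourceSelectionTest(unittest.TestCase):
    def test_all_combinations(self):
        for use_arxiv, use_indico in itertools.product([True, False], repeat=2):
            with self.subTest(use_arxiv=use_arxiv, use_indico=use_indico):
                if not (use_arxiv or use_indico):
                    with self.assertRaises(ValueError):
                        search("q", use_arxiv, use_indico, FAKE_RETRIEVERS)
                    continue
                hits = search("q", use_arxiv, use_indico, FAKE_RETRIEVERS)
                expected = {
                    name
                    for name, on in (("arxiv", use_arxiv), ("indico", use_indico))
                    if on
                }
                # Only the selected databases may appear as sources.
                self.assertEqual({source for _, _, source in hits}, expected)
                # No document ID may appear twice after merging.
                ids = [doc_id for doc_id, _, _ in hits]
                self.assertEqual(len(ids), len(set(ids)))
```

Saved as a test module, this runs with `python -m unittest`.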
Future Improvements
- Allow users to prioritize one database over the other (e.g., arxiv first, then indico fallback)
- Add support for more databases in the future (use a list of checkboxes or a multi-select)
- Allow users to see search results from each database separately before fusion
References
- See app_utilities.py and pages/2_RAG-ChatBot.py for retriever and chat logic
- Streamlit checkbox docs: https://docs.streamlit.io/library/api-reference/widgets/st.checkbox
- Vectorstore integration patterns (LangChain docs)