[Feature] Add Database Selection Checkboxes (arxiv, indico, both) to Chatbot #37

@karthik18495

Summary

Enhance the RAG4EIC chatbot interface to allow users to choose which vector database(s) to query: arxiv, indico, or both. Add two checkboxes, one for arxiv and one for indico. If both are checked, the chatbot should search across both databases for relevant responses.

Requirements

  • Add two checkboxes to the chatbot UI (likely in the sidebar or top of the chat area):
    • "Search arxiv database"
    • "Search indico database"
  • If both are checked, the query must be searched through both databases, and results merged for response generation.
  • If only one is checked, search only the selected database.
  • If neither is checked, show a warning or disable question submission until a selection is made.
  • Results from both databases must be combined, deduplicated (if necessary), and passed as context to the LLM.
  • Indicate to the user which database(s) were used in generating the answer (e.g., via a message or icon in the chat).
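The selection and validation rules above could be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes Streamlit's `st.sidebar.checkbox`, `st.warning`, and `st.chat_input`, and the helper names `selected_databases` and `render_database_selector`, the default checkbox values, and the warning text are all hypothetical.

```python
from typing import List, Optional, Tuple


def selected_databases(use_arxiv: bool, use_indico: bool) -> List[str]:
    """Return the names of the ticked databases, in a fixed order."""
    selected = []
    if use_arxiv:
        selected.append("arxiv")
    if use_indico:
        selected.append("indico")
    return selected


def render_database_selector() -> Tuple[List[str], Optional[str]]:
    """Draw both checkboxes and the chat input; return (databases, question)."""
    import streamlit as st  # imported lazily so the pure helper above is testable

    # key= makes Streamlit store each checkbox state in st.session_state.
    # The default values here are assumptions, not part of the issue.
    st.sidebar.checkbox("Search arxiv database", value=True, key="use_arxiv")
    st.sidebar.checkbox("Search indico database", value=False, key="use_indico")

    databases = selected_databases(
        st.session_state["use_arxiv"], st.session_state["use_indico"]
    )
    if not databases:
        st.warning("Select at least one database before asking a question.")

    # Submission is disabled until at least one database is selected.
    question = st.chat_input("Ask a question", disabled=not databases)
    return databases, question
```

Keeping the selection logic in a plain function separates it from the widgets, so the four checkbox combinations can be checked without running a Streamlit session.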

Implementation Plan

  1. UI Changes

    • In the chatbot UI (likely streamlit_app/pages/2_RAG-ChatBot.py), add two checkboxes for "arxiv" and "indico" (using st.checkbox).
    • Store their states in st.session_state["use_arxiv"] and st.session_state["use_indico"].
    • If neither is checked, disable the chat input or display a warning.
  2. Backend Logic

    • Instantiate or select the appropriate retriever(s) based on user selection:
      • If both are checked, run the query on both retrievers and merge results.
      • If only one is checked, run only that retriever.
    • For merging, combine results from both databases, deduplicate by unique document ID or content, and preserve source info for citations.
    • Ensure downstream code (LLM prompt, citation, etc.) can handle the merged results and source metadata.
  3. Indico Database Integration

    • If vectorstore/retriever logic for the indico database does not already exist, implement it; otherwise update it so that it mirrors the arxiv retriever.
    • Ensure both retrievers provide results in compatible formats.
  4. User Feedback

    • Display a message or icon in the chat indicating which database(s) were searched for each response.
    • Optionally, allow users to filter or expand/collapse results by source in the chat history.
  5. Testing

    • Test all combinations: arxiv only, indico only, both checked, neither checked.
    • Check result merging, deduplication, and source indication in answers.
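The merge and deduplication step from the Backend Logic item could look roughly like this. It is a sketch under stated assumptions: `RetrievedDoc` is a hypothetical common result format, each retriever is modelled as a plain callable from a query string to a list of documents, and document IDs are assumed to be unique across both databases. The real retrievers in the project may expose a different interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class RetrievedDoc:
    """Hypothetical result format that both retrievers are assumed to share."""
    doc_id: str
    content: str
    source: str  # "arxiv" or "indico", kept for citations


Retriever = Callable[[str], List[RetrievedDoc]]


def retrieve_merged(
    query: str, retrievers: Dict[str, Retriever], selected: Iterable[str]
) -> List[RetrievedDoc]:
    """Query each selected retriever and merge the results without duplicates."""
    merged: List[RetrievedDoc] = []
    seen = set()
    for name in selected:
        for doc in retrievers[name](query):
            # Deduplicate by document ID, falling back to content if ID is empty.
            key = doc.doc_id or doc.content
            if key in seen:
                continue
            seen.add(key)
            merged.append(doc)
    return merged


def sources_note(selected: Iterable[str]) -> str:
    """Build the message telling the user which database(s) were searched."""
    names = list(selected)
    if not names:
        return "No database selected"
    return "Searched: " + ", ".join(names)
```

With stub retrievers in place of the real ones, the combinations listed under Testing (arxiv only, indico only, both, neither) can be exercised by passing different `selected` lists. When a document appears in both databases, this sketch keeps the copy from whichever database is queried first.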

Future Improvements

  • Allow users to prioritize one database over the other (e.g., arxiv first, then indico fallback)
  • Add support for more databases in the future (use a list of checkboxes or a multi-select)
  • Allow users to see search results from each database separately before fusion
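The prioritization idea (arxiv first, then indico as a fallback) might be sketched like this. The function name `retrieve_with_fallback`, the `min_results` threshold, and the callable retriever interface are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Retriever = Callable[[str], List[str]]


def retrieve_with_fallback(
    query: str,
    retrievers: Dict[str, Retriever],
    priority: Sequence[str],
    min_results: int = 1,
) -> Tuple[Optional[str], List[str]]:
    """Try databases in priority order; return the first with enough results."""
    for name in priority:
        docs = retrievers[name](query)
        if len(docs) >= min_results:
            return name, docs
    # No database returned enough results.
    return None, []
```

Calling it with `priority=["arxiv", "indico"]` queries indico only when arxiv returns fewer than `min_results` documents.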
