TODO list #5

@shenwei356

  • Search: support restricting a search to a given TaxID, e.g., searching only within a species or a genus.
    • No need to rebuild the index. Just filter matches in the seed-matching step.
    • Existing information:
      • genomes.map.bin stores genome id - internal id pairs.
    • Files needed:
      • a genome accession -> taxid mapping file
      • taxdump files from NCBI or created by TaxonKit
    • Relationship: internal id -> taxid
    • Check: isAChild = LCA(taxid_target, taxid_test) == taxid_target, i.e., a genome is kept only if the target TaxID is its own TaxID or one of its ancestors.
    • Implemented: https://github.com/shenwei356/LexicMap/tree/search-by-taxid
  • Create a table to explain the changes and compatibility of the index format.
  • Add a daemon process for searching via RESTful API.
  • Add a utility tool to edit genome names in the index via a regular expression, which only needs to edit the file genomes.map.bin.
  • utils subseq: accept search results as input, for batch sequence extraction.
    • Parallelise it.
  • Add a new command to combine multiple (dozens of) indexes.
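
The TaxID check in the first item can be sketched as follows. This is a minimal Python illustration of the idea, not the code in the search-by-taxid branch. All names here (`lineage`, `lca`, `is_a_child`, `filter_by_taxid`, `id2taxid`) are hypothetical. The parent map is a toy tree that uses real NCBI TaxIDs but omits intermediate ranks. It assumes the root is its own parent, as in NCBI's nodes.dmp.

```python
from typing import Dict, Iterable, List


def lineage(taxid: int, parent: Dict[int, int]) -> List[int]:
    """Return [taxid, its parent, ..., root]. The root is its own parent."""
    path = [taxid]
    while parent[taxid] != taxid:
        taxid = parent[taxid]
        path.append(taxid)
    return path


def lca(a: int, b: int, parent: Dict[int, int]) -> int:
    """Lowest common ancestor: the first node on b's lineage that is also on a's."""
    ancestors_of_a = set(lineage(a, parent))
    for node in lineage(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("no common ancestor; the taxonomy is not a single tree")


def is_a_child(taxid_target: int, taxid_test: int, parent: Dict[int, int]) -> bool:
    """True if taxid_test equals taxid_target or descends from it."""
    return lca(taxid_target, taxid_test, parent) == taxid_target


def filter_by_taxid(internal_ids: Iterable[int], id2taxid: Dict[int, int],
                    taxid_target: int, parent: Dict[int, int]) -> List[int]:
    """Keep only internal genome ids whose TaxID falls under taxid_target."""
    return [i for i in internal_ids
            if i in id2taxid and is_a_child(taxid_target, id2taxid[i], parent)]


# Toy taxonomy (child -> parent), intermediate ranks omitted:
# 1 root, 2 Bacteria, 561 Escherichia, 562 E. coli, 83333 E. coli K-12,
# 1279 Staphylococcus, 1280 S. aureus.
parent = {1: 1, 2: 1, 561: 2, 562: 561, 83333: 562, 1279: 2, 1280: 1279}

# Hypothetical internal id -> taxid relationship.
id2taxid = {0: 562, 1: 83333, 2: 1280}

# Restrict to the genus Escherichia (561): genomes 0 and 1 pass, genome 2 does not.
kept = filter_by_taxid([0, 1, 2], id2taxid, 561, parent)
```

Because only the internal id -> taxid relationship and the taxonomy are consulted, a filter of this shape can run at match time without touching the index, which is the point of the "no need to rebuild" note. In practice the set of allowed internal ids could be computed once per query rather than per match.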
