- [x] Search: support restricting the search by TaxID, e.g., searching only within a species or a genus.
    - No need to rebuild the index. Just filter matches in the [seed-matching step](https://github.com/shenwei356/LexicMap/blob/v0.7.0/lexicmap/cmd/lib-index-search.go#L1037).
    - Existing information:
        - `genomes.map.bin` stores genome id - internal id pairs.
    - Files needed:
        - a genome accession -> taxid mapping file
        - taxdump files from NCBI or created by TaxonKit
    - Relationship: internal id -> taxid
    - Check: `isAChild = LCA(taxid_target, taxid_test) == taxid_target`.
    - Implemented: https://github.com/shenwei356/LexicMap/tree/search-by-taxid
- [x] Create a table to explain the changes and compatibility of the index format.
- [ ] <s>Add a daemon process for searching via RESTful API.</s>
- [x] Add a utility tool to edit genome names in the index via a regular expression, which only needs to edit the file `genomes.map.bin`.
- [x] `utils subseq`: accept search results as input, for batch sequence extraction.
    - [x] Parallelise it.
- [ ] Add a new command to combine multiple indexes (dozens).
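
The TaxID check in the first item, `LCA(taxid_target, taxid_test) == taxid_target`, can be illustrated with a minimal Go sketch. It is not the code from the `search-by-taxid` branch: the `Taxonomy` type, the `KeepGenome` helper, and the internal-id-to-TaxID map are hypothetical names introduced here. The sketch assumes the taxonomy has already been loaded into a child -> parent map (for example from the taxdump `nodes.dmp`), that the tree is well formed (no cycles), and it uses a toy tree with intermediate ranks omitted.

```go
package main

import "fmt"

// Taxonomy is a hypothetical in-memory taxonomy: child TaxID -> parent TaxID.
// The root points to itself (as TaxID 1 does in NCBI's nodes.dmp).
type Taxonomy map[uint32]uint32

// lineage returns the path from taxid up to the root, inclusive.
// It assumes the parent map contains no cycles other than root -> root.
func (t Taxonomy) lineage(taxid uint32) []uint32 {
	var path []uint32
	for {
		path = append(path, taxid)
		parent, ok := t[taxid]
		if !ok || parent == taxid {
			return path
		}
		taxid = parent
	}
}

// LCA returns the lowest common ancestor of a and b,
// or 0 if the two lineages share no node.
func (t Taxonomy) LCA(a, b uint32) uint32 {
	seen := make(map[uint32]struct{})
	for _, x := range t.lineage(a) {
		seen[x] = struct{}{}
	}
	for _, x := range t.lineage(b) {
		if _, ok := seen[x]; ok {
			return x // first shared node walking up from b
		}
	}
	return 0
}

// IsAChild reports whether test equals target or lies below it,
// using the check from the list: LCA(target, test) == target.
func (t Taxonomy) IsAChild(target, test uint32) bool {
	return t.LCA(target, test) == target
}

// KeepGenome is a hypothetical filter for the seed-matching step:
// a match is kept only if its genome's TaxID falls under the target.
// idx2taxid stands for the "internal id -> taxid" relationship.
func KeepGenome(t Taxonomy, idx2taxid map[uint64]uint32, target uint32, genomeIdx uint64) bool {
	taxid, ok := idx2taxid[genomeIdx]
	if !ok {
		return false // genomes without a known TaxID are skipped
	}
	return t.IsAChild(target, taxid)
}

func main() {
	// Toy tree: 1 (root) > 543 > {561 > 562, 590 > 28901}.
	tax := Taxonomy{1: 1, 543: 1, 561: 543, 562: 561, 590: 543, 28901: 590}
	idx2taxid := map[uint64]uint32{0: 562, 1: 28901}

	for idx := uint64(0); idx < 3; idx++ {
		fmt.Println(idx, KeepGenome(tax, idx2taxid, 561, idx))
	}
}
```

Because the filter only needs the internal id -> TaxID relationship and the parent map, the index itself stays untouched, which matches the "no need to rebuild the index" note. Precomputing the lineage of the target once, instead of on every call, would be a natural optimisation for a real search loop.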