I'm looking for the exact invocation used to generate the 16SMicrobial database that you can download from here:
https://ftp.ncbi.nlm.nih.gov/blast/db/
I'm hoping to create the same type of BLAST database, with the same type of metadata, from my own custom sequences.
Platform isn't an issue, but let's say Ubuntu 14.04 or 16.04.
I would like to replicate the creation of the database as closely as possible. The most important feature is the taxonomic information as can be seen here:
The databases on the FTP site contain taxonomic information for each sequence, include the identifier indices for lookups, and can be up to four times smaller than the FASTA. The original FASTA can be generated from the BLAST database using blastdbcmd.
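For reference, this is roughly how I look at what the downloaded database contains. It is only a sketch: it assumes BLAST+ is installed and that the unpacked 16SMicrobial files are in the current directory (or on BLASTDB), and it skips itself otherwise.

```shell
#!/bin/sh
# Assumption: the 16SMicrobial database has been downloaded and unpacked here.
DB=16SMicrobial

if command -v blastdbcmd >/dev/null 2>&1 && blastdbcmd -db "$DB" -info >/dev/null 2>&1; then
    # Regenerate the original FASTA from the BLAST database.
    blastdbcmd -db "$DB" -entry all -out "$DB.fasta"
    # Show accession, taxid and title for the first few entries.
    blastdbcmd -db "$DB" -entry all -outfmt '%a %T %t' | head -n 5
    status=dumped
else
    echo "blastdbcmd or the $DB database was not found; nothing to dump" >&2
    status=skipped
fi
```

The `%T` column is the per-sequence taxid that I want to reproduce in my own database.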
Creating a BLAST database from a set of FASTA sequences with makeblastdb is not an issue and can be done with:
makeblastdb -in <your_file.fasta> -dbtype nucl -out <database_name>
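My best guess from the makeblastdb help is that the taxonomic information is attached with `-parse_seqids` and `-taxid_map`, as in the sketch below. I don't know whether this matches what NCBI actually runs. The file names, sequence IDs and sequences are placeholders I made up for illustration; the taxids are the NCBI Taxonomy IDs for E. coli (562) and B. subtilis (1423).

```shell
#!/bin/sh
set -eu

# Hypothetical two-sequence input; IDs and sequences are placeholders.
cat > custom_16S.fasta <<'EOF'
>seq1 placeholder sequence assigned to Escherichia coli
AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACG
>seq2 placeholder sequence assigned to Bacillus subtilis
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTCGAGCG
EOF

# Taxid map: one "<sequence ID> <NCBI taxid>" pair per line.
printf 'seq1 562\nseq2 1423\n' > taxid_map.txt

if command -v makeblastdb >/dev/null 2>&1; then
    # -parse_seqids builds the identifier indices for lookups;
    # -taxid_map attaches a taxid to each sequence ID.
    makeblastdb -in custom_16S.fasta -dbtype nucl \
        -parse_seqids -taxid_map taxid_map.txt \
        -title "Custom 16S" -out custom_16S
    # Check that each sequence now carries its taxid.
    blastdbcmd -db custom_16S -entry all -outfmt '%a %T'
else
    echo "makeblastdb not found; install BLAST+ to build the database" >&2
fi
```

As far as I understand, resolving taxids to scientific names additionally requires the taxdb files (taxdb.tar.gz from the same FTP directory) to be visible to BLAST, for example via the BLASTDB environment variable.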
My question is specifically about the invocation NCBI uses to add the metadata present in its 16SMicrobial BLAST database, as I am keen to replicate the process as closely as possible.