Methodology
Under the hood, AuriClass conceptually works as follows:
Note
Whether input files are FastQ or FastA is guessed by default based on pyfastx
parsing. This can be overridden by specifying --fastq or --fasta.
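The guess-then-override logic can be illustrated with a minimal sketch. This is not AuriClass's code and it does not use pyfastx: as a simplified stand-in it only inspects the first non-empty line of the file. The function names guess_format and resolve_format, and the force_fastq / force_fasta parameters (standing in for the --fastq and --fasta options), are hypothetical.

```python
import gzip
from pathlib import Path


def guess_format(path):
    """Return 'fastq' or 'fasta' based on the first non-empty line.

    Simplified stand-in for parser-based detection: FastA records start
    with '>' and FastQ records start with '@'.
    """
    path = Path(path)
    # Check the gzip magic bytes instead of trusting the file extension.
    with open(path, "rb") as handle:
        is_gzipped = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzipped else open
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            break  # first real line matches neither format
    raise ValueError(f"{path} does not look like FastA or FastQ")


def resolve_format(path, force_fastq=False, force_fasta=False):
    """Apply an explicit override first, otherwise fall back to guessing."""
    if force_fastq and force_fasta:
        raise ValueError("choose at most one of the two overrides")
    if force_fastq:
        return "fastq"
    if force_fasta:
        return "fasta"
    return guess_format(path)
```

Checking the overrides before touching the file means an explicit choice always wins, even for input that the guess would reject.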
Steps that differ between FastQ and FastA data:
- Sketching is performed differently: filtering for minimal kmer coverage is required for FastQ.
- Genome size is estimated for FastQ using mash sketch, and parsed from FastA using pyfastx.
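The two differences above can be illustrated in plain Python. This is a conceptual sketch only, not how Mash or pyfastx work internally: it counts exact k-mers without canonicalisation or MinHash, and it reads FastA lines directly instead of using pyfastx. The function names and the k and min_count parameters are hypothetical.

```python
from collections import Counter


def filter_kmers(reads, k=21, min_count=2):
    """Return the k-mers seen at least min_count times across all reads.

    K-mers seen only once in raw reads are often sequencing errors,
    which is why a minimum coverage filter matters for FastQ input.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer for kmer, n in counts.items() if n >= min_count}


def fasta_genome_size(lines):
    """Sum the sequence lengths in an iterable of FastA lines."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))
```

For an assembled genome the size can be read directly from the sequence lengths, whereas for raw reads it has to be estimated, because the reads cover each position many times.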
The FastqAuriclass and FastaAuriclass classes are based on BasicAuriclass, which handles most of the attribute setting. The specific classes define some format-specific methods and determine how all methods are called by the run() method.
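The pattern described here, a shared base class plus format-specific subclasses that each define their own run(), might look roughly like the sketch below. It is not the actual AuriClass source: the class names BasicWorkflow, FastqWorkflow and FastaWorkflow stand in for the real classes, and every attribute and method other than run() is hypothetical.

```python
class BasicWorkflow:
    """Hypothetical base class: sets shared attributes and shared steps."""

    def __init__(self, name, read_paths):
        self.name = name
        self.read_paths = list(read_paths)
        self.steps_run = []  # records the order in which steps were called

    def compare_with_reference(self):
        # Shared step: identical for both input formats.
        self.steps_run.append("compare_with_reference")

    def run(self):
        raise NotImplementedError("subclasses define the order of steps")


class FastqWorkflow(BasicWorkflow):
    """Hypothetical FastQ variant: sketches with a coverage filter."""

    def sketch_with_coverage_filter(self):
        self.steps_run.append("sketch_with_coverage_filter")

    def run(self):
        self.sketch_with_coverage_filter()
        self.compare_with_reference()
        return self.steps_run


class FastaWorkflow(BasicWorkflow):
    """Hypothetical FastA variant: sketches without filtering."""

    def sketch_without_filter(self):
        self.steps_run.append("sketch_without_filter")

    def run(self):
        self.sketch_without_filter()
        self.compare_with_reference()
        return self.steps_run
```

Keeping the step order inside each subclass's run() lets the base class stay free of format checks.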