Methodology

Under the hood, AuriClass conceptually works as follows:

Note

By default, AuriClass guesses whether the input files are FastQ or FastA based on how pyfastx parses them. This guess can be overridden by specifying --fastq or --fasta.
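The idea of guessing the format with an explicit override can be illustrated with a minimal standard-library sketch. This is not AuriClass's actual detection code (which relies on pyfastx parsing); the function name `guess_format` and its `force` parameter are hypothetical, and the check here only looks at the first record marker.

```python
import gzip


def guess_format(path, force=None):
    """Return "fastq" or "fasta" for a sequence file.

    Hypothetical helper: `force` mimics an explicit --fastq/--fasta override.
    """
    if force is not None:
        if force not in ("fastq", "fasta"):
            raise ValueError(f"unknown format override: {force!r}")
        return force

    # Transparently handle gzip-compressed input based on the file extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"  # FastA headers start with ">"
            if line.startswith("@"):
                return "fastq"  # FastQ headers start with "@"
            break
    raise ValueError(f"could not guess the format of {path}")
```

An explicit override is useful when a file is truncated or otherwise unusual enough that automatic detection fails, for example `guess_format("sample.txt", force="fasta")`.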

Steps that differ between FastQ and FastA data:

  • Sketching is performed differently: FastQ data requires filtering for a minimum k-mer coverage.
  • Genome size is estimated from FastQ data using mash sketch, and parsed from FastA data using pyfastx.
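To illustrate why these two steps differ, here is a toy sketch in plain Python. It is not how AuriClass, mash, or pyfastx implement them: `filter_kmers` and `fasta_genome_size` are hypothetical helpers, the k-mer counting is exact rather than MinHash-based, and reverse complements are ignored. K-mers that contain a sequencing error tend to occur only once in a read set, so a minimum count removes most of them; an assembly has no such noise, and its size is simply the sum of its sequence lengths.

```python
from collections import Counter


def filter_kmers(reads, k, min_count):
    """Return the set of k-mers seen at least `min_count` times in `reads`."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer for kmer, n in counts.items() if n >= min_count}


def fasta_genome_size(lines):
    """Sum the lengths of all sequence lines, skipping ">" header lines."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


# The third read ends in a (simulated) sequencing error.
reads = ["ACGTA", "ACGTA", "ACGTT"]
solid_kmers = filter_kmers(reads, k=4, min_count=2)

assembly = [">contig1\n", "ACGT\n", "AC\n", ">contig2\n", "GG\n"]
size = fasta_genome_size(assembly)
```

In this example the error-containing k-mer `CGTT` occurs once and is dropped, while `ACGT` and `CGTA` are kept.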

The FastqAuriclass and FastaAuriclass classes are based on BasicAuriclass, which handles most of the attribute setting. Each specific class defines its own format-specific methods and determines how all methods are called by its run() method.
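The class layout described above could look roughly like the following sketch. Only the three class names and run() come from the description; the constructor arguments, the attribute names, and the step methods (`sketch_reads`, `sketch_assembly`, `estimate_genome_size`, `parse_genome_size`, `compare_to_references`) are hypothetical placeholders that only record which step ran.

```python
class BasicAuriclass:
    """Shared base: sets common attributes and holds format-independent steps."""

    def __init__(self, name, read_paths):
        self.name = name
        self.read_paths = list(read_paths)
        self.steps_run = []  # hypothetical log, used here only for illustration

    def compare_to_references(self):
        # Placeholder for a step shared by both input formats.
        self.steps_run.append("compare_to_references")

    def run(self):
        raise NotImplementedError("subclasses define the order of steps")


class FastqAuriclass(BasicAuriclass):
    def sketch_reads(self):
        # Placeholder: read sketching with minimum k-mer coverage filtering.
        self.steps_run.append("sketch_reads")

    def estimate_genome_size(self):
        # Placeholder: genome size estimated while sketching reads.
        self.steps_run.append("estimate_genome_size")

    def run(self):
        self.sketch_reads()
        self.estimate_genome_size()
        self.compare_to_references()
        return self.steps_run


class FastaAuriclass(BasicAuriclass):
    def parse_genome_size(self):
        # Placeholder: genome size read directly from the assembly.
        self.steps_run.append("parse_genome_size")

    def sketch_assembly(self):
        # Placeholder: assembly sketching without coverage filtering.
        self.steps_run.append("sketch_assembly")

    def run(self):
        self.parse_genome_size()
        self.sketch_assembly()
        self.compare_to_references()
        return self.steps_run
```

Keeping run() in the subclasses lets each format choose its own steps and their order, while shared attributes and steps live in one place.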