.TH YAHA "1" "July 2018" "YAHA version 0.1.83" "User Commands"
.SH NAME
YAHA \- find split-read mappings on single-end queries
.SH DESCRIPTION
.PP
Usage (default parameter values shown in parentheses):
.PP
To create an index:
.IP
yaha \fB\-g\fR genomeFilename [\-H maxHits (65525)] [\-L wordLen (15)] [\-S Skip\-distance (1)]
.PP
The genome file can be a FASTA file or a nib2 file (created by a previous yaha index operation).
.PP
To align queries:
.IP
yaha \fB\-x\fR yahaIndexFile [\-q queryFile|(stdin)] [\-o8|(\fB\-osh\fR)|\-oss outputFile|(stdout)] [AdditionalOptions]
.PP
The query file can be either a FASTA file or a FASTQ file.
.br
\fB\-o8\fR produces alignment output in modified Blast8 format.
.br
\fB\-osh\fR produces alignment output in SAM format with hard clipping.
.br
\fB\-oss\fR produces alignment output in SAM format with soft clipping.
.IP
[\-t numThreads (1)]
.SS "Additional General Alignment Options:"
.IP
[\-BW BandWidth (5)] [\-G maxGap (50)] [\-H maxHits (650)] [\-M minMatch (25)] [\-MD MaxDesert (50)]
[\-P minPercent\-identity (0.9)] [\-X Xdropoff (25)]
.PP
[\-AGS (Y)|N] controls use of Affine Gap Scoring.
If \fB\-AGS\fR is off, a simple edit distance calculation is done.
If on, the following are used:
.IP
[\-GEC GapExtensionCost (2)] [\-GOC GapOpenCost (5)] [\-MS MatchScore (1)] [\-RC ReplacementCost (3)]
.PP
[\-OQC (Y)|N] controls use of the Optimal Query Coverage Algorithm.
If \fB\-OQC\fR is off, all alignments meeting the above criteria are output.
If on, a set of "primary" alignments is found that optimally covers the length of the query, using the following options:
.IP
[\-BP BreakpointPenalty (5)] [\-MGDP MaxGenomicDistancePenalty (5)] [\-MNO MinNonOverlap (minMatch)]
.PP
The cost of inserting a breakpoint in the Optimal Coverage Set is BP*MIN(MGDP, Log10(genomic distance between reference loci)).
.PP
[\-FBS Y|(N)] controls inclusion of "secondary" alignments similar to a primary alignment found using OQC.
If \fB\-FBS\fR is on, the following are used.
A "secondary" alignment must satisfy BOTH criteria.
.IP
[\-PRL PercentReciprocalLength (0.9)] [\-PSS PercentSimilarScore (0.9)]
.PP
Additional experimental parameters:
.PP
To compress a FASTA file to a nib2 file without creating an index:
.IP
yaha \fB\-g\fR fastaGenomeFile \fB\-c\fR
.PP
To uncompress a nib2 file back into a FASTA file:
.IP
yaha \fB\-g\fR nib2GenomeFile \fB\-u\fR
.PP
For finer control of alignments:
.IP
[\-I maxIntron (maxGap)] allows separate control of max deletion size vs. maxGap for insertion size.
.PP
For a more detailed help message, type yaha \fB\-xh\fR.
.SH "SEE ALSO"
https://github.com/GregoryFaust/yaha
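.SH EXAMPLES
.PP
The following invocations are an illustrative sketch assembled only from the options documented in this page; they have not been checked against a particular yaha build.
The file names genome.fa, reads.fq, and out.sam are hypothetical, and yahaIndexFile stands for whatever index file the indexing step produces.
.PP
Build an index, stating the default word length and skip distance explicitly:
.IP
yaha \-g genome.fa \-L 15 \-S 1
.PP
Align queries with four threads and write SAM output with soft clipping:
.IP
yaha \-x yahaIndexFile \-q reads.fq \-oss out.sam \-t 4
.PP
As a worked instance of the breakpoint cost BP*MIN(MGDP, Log10(genomic distance)), using the defaults BP = 5 and MGDP = 5:
.IP
genomic distance 1000: 5 * MIN(5, Log10(1000)) = 5 * 3 = 15
.br
genomic distance 10000000: 5 * MIN(5, Log10(10000000)) = 5 * 5 = 25
.PP
The Log10 term is capped at MGDP, so with these defaults every genomic distance of 100000 or more incurs the same cost of 25.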