README.markdown @ 125faa034b9a default tip

Fix README typo
author Steve Losh <steve@stevelosh.com>
date Tue, 24 Mar 2026 10:51:35 -0400

Sometimes you just want to make a quick FASTQ file for testing.

Usage:

1. Write your FASTQ spec in a `foo.lisp` file (see below for the syntax).
2. Run `quick-fastq foo.lisp` (or `cat foo.lisp | quick-fastq` if you prefer) to dump a random FASTQ to stdout.

## Syntax

`quick-fastq` will read two Common Lisp forms (using the standard reader for
now, so don't run it on untrusted data).  The format of the input is:

    bindings
    expr

`expr` is an expression describing how to generate a random read.

* A literal string like `"ATCG"` generates those bases with random quality scores.
* An integer like `123` generates that many random bases with random quality scores.
* A vector like `#(expr1 expr2 …)` evaluates each expression and concatenates the results.
* A symbol like `x` looks up the value in the bindings (see below).
* A list performs some operation on the form inside, depending on the symbol at
  the head of the list:
  * `(qN expr)`, where `N` is 0-90, evaluates `expr` and sets its quality scores to `N` (e.g. `(q12 500)` generates 500 random bases with a qscore of `12`).
  * `(rev expr)` reverses `expr` (you can also use `(r expr)` as a shortcut).
  * `(comp expr)` complements `expr` (you can also use `(c expr)` as a shortcut).
  * `(revcomp expr)` is equivalent to `(rev (comp expr))` (you can also use `(rc expr)` as a shortcut).
  * `(first n expr)` takes the first `n` bases of `expr` (you can also use `(f n expr)` as a shortcut).
  * `(last n expr)` takes the last `n` bases of `expr` (you can also use `(l n expr)` as a shortcut).
  * `(rep n expr)` concatenates `n` copies of `expr` (you can also use `(tr n expr)` as a shortcut).
  * `(snp freq expr)` modifies `expr` to add SNPs at a rate of `freq` (`freq` must be between 0 and 1).
  * `(ins freq expr)` modifies `expr` to insert bases at a rate of `freq` (`freq` must be between 0 and 1).
  * `(del freq expr)` modifies `expr` to delete bases at a rate of `freq` (`freq` must be between 0 and 1).
  * `(err freq expr)` is equivalent to `(ins freq (del freq (snp freq expr)))` (`freq` must be between 0 and 1).
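
As a small worked example of how these forms compose (a sketch derived only from the rules above, not from running the tool), consider this spec with no bindings:

    ()
    #((q30 "ACGT") (first 6 (rep 5 "AT")) (rc "AACC"))

Read literally, the vector should produce the bases `ACGTATATATGGTT`: the literal `ACGT` with every quality score set to 30, then the first six bases of `ATATATATAT`, then the reverse complement of `AACC` (`GGTT`).  Only the first four bases have a fixed quality; the rest get random scores.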

`bindings` must be a (possibly empty) list of bindings, each of the form
`(symbol expr)`.  `expr` will be evaluated and bound to `symbol`.  Bindings are
performed in order as if by `let*`.  Several keyword symbols have special
meanings:

* Binding `:entries` to an integer `n` will generate that many FASTQ entries instead of just a single one.
* Binding `:seed` to an integer will seed the RNG with a specific seed, to make runs reproducible.
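
For instance, a spec along these lines would ask for ten entries from a fixed seed.  This is a sketch that assumes the keyword bindings are written in the same `(symbol expr)` shape as ordinary bindings; the values `10` and `42` are arbitrary:

    ((:entries 10)
     (:seed 42)
     (x 100)
     (y (rc x)))
    #(x 500 y)

Because bindings behave like `let*`, `y` can refer to `x` here.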

## Examples

Generate a random 1000bp read:

    ()
    1000

Generate a read with the same 100bp beginning and end, with 500bp of random
bases in the middle:

    ((x 100))
    #(x 500 x)

Generate a gapped foldback chimeric read, with the second half having a lower
quality than the first:

    ((x (q40 1000))
     (f (q20 (revcomp x))))
    #(x 25 f)

Generate a read with a tandem repeat in the middle:

    ()
    #(1000 (rep 200 "ATTT") 1000)

Generate a foldback chimeric read with a double tandem duplication in the
foldback strand, with simulated sequencing error, and (as a hack) small chunks
of low-quality bases to mark the transitions between sections:

    ((x 1000)
     (lq (q1 10))
     (a (first 800 x))
     (b (last 200 x))
     (dup (last 150 a))
     (f (revcomp #(lq a lq dup lq (rc dup) lq dup lq b))))

    (err 0.01 #(x f))