CLI: `aggregate`

Synopsis

Aggregate TWO data into a rasterized matrix of size [x,y] for plotting.
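As a rough illustration of what rasterizing onto a fixed grid means, the sketch below maps a genomic position onto one of a fixed number of bins along one axis. This is a generic linear binning, not Tomahawk's actual implementation; the helper name `bin_index` and the choice of linear scaling are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: map a position onto a bin index in [0, nbins).
# Assumes simple linear scaling over the inclusive range [min, max];
# Tomahawk's own binning may differ.
# usage: bin_index POS MIN MAX NBINS
bin_index() {
    echo $(( ($1 - $2) * $4 / ($3 - $2 + 1) ))
}

# Position 1500 in the range 1000..1999, split into 10 bins
bin_index 1500 1000 1999 10
```

Applying the same mapping to both positions of a record gives the (x, y) cell whose values are then reduced.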

Examples

tomahawk aggregate -f r2 -r mean -c 50 -i input.twk -O b -o agg.twa
tomahawk aggregate -f r2 -r mean -c 50 -i input.twk > agg.mat
tomahawk aggregate -f r2 -r mean -c 50 -i input.twk -x 4000 -y 4000 -t 8 -O b -o agg.twa

Options

Options:
  -i   FILE   input TWO file (required)
  -o   FILE   output file path (default: -)
  -O   <b,u>  b: compressed binary representation, u: uncompressed matrix (default: b)
  -x,y INT    number of X/Y-axis bins (default: 1000)
  -f   STRING aggregation function: can be one of (r2,r,d,dprime,dp,p,hets,alts,het,alt) (required)
  -r   STRING reduction function: can be one of (mean,count,n,min,max,sd,total) (required)
  -I   STRING filter interval <contig>:pos-pos (TWK/TWO) or linked interval <contig>:pos-pos,<contig>:pos-pos
  -c   INT    min cut-off value used in reduction function: value < c will be set to 0 (default: 5)
  -t   INT    number of parallel threads: each thread will use 40(x*y) bytes
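The per-thread figure in the `-t` description can be turned into a rough memory budget before choosing `-x`, `-y` and `-t`. The sketch below is plain shell arithmetic; the helper name `twk_agg_mem_bytes` is hypothetical, and multiplying the per-thread figure by the thread count to obtain a total is an assumption that the help text does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: estimate memory for `tomahawk aggregate`.
# Based on the help text "each thread will use 40(x*y) bytes".
# Assumption: total memory scales linearly with the thread count.
# usage: twk_agg_mem_bytes X_BINS Y_BINS THREADS
twk_agg_mem_bytes() {
    echo $(( 40 * $1 * $2 * $3 ))
}

# 4000 x 4000 bins on 8 threads (the third example above)
twk_agg_mem_bytes 4000 4000 8
```

With the default 1000 x 1000 grid, the same formula gives 40,000,000 bytes per thread.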

Valid aggregation and reduction function names (case insensitive):

Aggregation	Description
`R`	Pearson correlation coefficient
`R2`	Squared Pearson correlation coefficient
`D`	Coefficient of linkage disequilibrium
`Dprime`	Scaled coefficient of linkage disequilibrium
`Dp`	Alias for `dprime`
`P`	Fisher's exact test P-value of the 2x2 haplotype contingency table
`Hets`	Number of (0,1) or (1,0) associations
`Alts`	Number of (1,1) associations
`Het`	Alias for `hets`
`Alt`	Alias for `alts`

Reduction	Description
`Mean`	Mean of the aggregated values in a bin
`Max`	Largest number in aggregate bin
`Min`	Smallest number in aggregate bin
`Sd`	Standard deviation of the aggregated values in a bin
`Count`	Total number of records in a bin
`N`	Alias for `count`
`Total`	Sum of the aggregated values in a bin
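To make the reduction functions concrete, the sketch below applies them to the values that fall into a single bin, using `awk`. It only illustrates the arithmetic and is not Tomahawk code: the helper name `reduce_bin` and the one-value-per-line input are assumptions, and `sd` is left out because the table does not say whether it is a population or sample standard deviation.

```shell
#!/bin/sh
# Hypothetical helper: reduce the values of one bin read from stdin
# (one number per line) with the named reduction function.
# Names are matched case-insensitively, as in the table above.
# usage: reduce_bin mean|count|n|min|max|total
reduce_bin() {
    awk -v fn="$1" '
        BEGIN { fn = tolower(fn) }
        NR == 1 { min = $1; max = $1 }
        {
            sum += $1
            if ($1 < min) min = $1
            if ($1 > max) max = $1
        }
        END {
            if (NR == 0) { print 0; exit }   # empty bin
            if (fn == "mean") print sum / NR
            else if (fn == "count" || fn == "n") print NR
            else if (fn == "min") print min
            else if (fn == "max") print max
            else if (fn == "total") print sum
            else { print "unknown reduction: " fn > "/dev/stderr"; exit 1 }
        }'
}

# Three r2 values landing in the same bin
printf '%s\n' 0.25 0.5 0.75 | reduce_bin mean
```

Swapping `mean` for `count`, `min`, `max` or `total` on the same input shows how each reduction summarizes the bin differently.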