CLI: aggregate

Synopsis

Aggregate TWO data into a rasterized matrix of size [x,y] for plotting.

Examples

tomahawk aggregate -f r2 -r mean -c 50 -i input.twk -O b -o agg.twa
tomahawk aggregate -f r2 -r mean -c 50 -i input.twk > agg.mat
tomahawk aggregate -f r2 -r mean -i input.twk -x 4000 -y 4000 -c 50 -t 8 -O b -o agg.twa

Options

Options:
  -i   FILE   input TWO file (required)
  -o   FILE   output file path (default: -)
  -O   <b,u>  b: compressed binary representation, u: uncompressed matrix (default: b)
  -x,y INT    number of X/Y-axis bins (default: 1000)
  -f   STRING aggregation function: can be one of (r2,r,d,dprime,dp,p,hets,alts,het,alt)(required)
  -r   STRING reduction function: can be one of (mean,count,n,min,max,sd,total)(required)
  -I   STRING filter interval <contig>:pos-pos (TWK/TWO) or linked interval <contig>:pos-pos,<contig>:pos-pos
  -c   INT    min cut-off value used in reduction function: value < c will be set to 0 (default: 5)
  -t   INT    number of parallel threads: each thread will use 40(x*y) bytes
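The memory note on -t can be turned into a quick estimate before choosing -x, -y and -t. The sketch below assumes that "40(x*y)" means 40 multiplied by the number of bins, and that total usage is that figure times the thread count. The function name aggregate_memory_bytes is hypothetical and not part of tomahawk.

```python
def aggregate_memory_bytes(x_bins: int, y_bins: int, threads: int) -> int:
    """Estimate working memory, assuming 40 * (x * y) bytes per thread."""
    per_thread = 40 * x_bins * y_bins  # assumption: 40 bytes per matrix cell
    return per_thread * threads


# Settings from the third example: -x 4000 -y 4000 -t 8
total = aggregate_memory_bytes(4000, 4000, 8)
print(f"{total / 1e9:.2f} GB")  # 5.12 GB
```

With the default 1000 x 1000 bins the same estimate gives 40 MB per thread, so the bin counts matter far more than the thread count.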

Valid aggregation and reduction function names (case insensitive):

Aggregation  Description
R            Pearson correlation coefficient
R2           Squared Pearson correlation coefficient
D            Coefficient of linkage disequilibrium
Dprime       Scaled coefficient of linkage disequilibrium
Dp           Alias for dprime
P            Fisher's exact test P-value of the 2x2 haplotype contingency table
Hets         Number of (0,1) or (1,0) associations
Alts         Number of (1,1) associations
Het          Alias for hets
Alt          Alias for alts

Reduction    Description
Mean         Mean of the aggregated values in a bin
Max          Largest aggregated value in a bin
Min          Smallest aggregated value in a bin
Count        Total number of records in a bin
N            Alias for count
Total        Sum of the aggregated values in a bin
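
To illustrate how an aggregation value and a reduction function combine into a matrix, here is a minimal Python sketch. It is not tomahawk's implementation: the record layout (pos_a, pos_b, value), the equal-width binning over a shared range, and the names REDUCERS and aggregate are all assumptions made for illustration. The -c cut-off and the sd reduction are left out because their exact definitions are not given here.

```python
from collections import defaultdict
from statistics import mean

# Reduction names follow the list above; these are plain-Python
# stand-ins, not tomahawk's own code.
REDUCERS = {
    "mean": mean,
    "max": max,
    "min": min,
    "count": len,
    "n": len,  # alias for count
    "total": sum,
}


def aggregate(records, lo, hi, bins, reduction):
    """Bin (pos_a, pos_b, value) records into a bins x bins matrix.

    Assumes both axes cover the half-open range [lo, hi) with
    equal-width bins; each cell holds the reduced value of its records.
    """
    reduce_fn = REDUCERS[reduction.lower()]  # names are case insensitive
    width = (hi - lo) / bins
    cells = defaultdict(list)
    for pos_a, pos_b, value in records:
        i = min(int((pos_a - lo) / width), bins - 1)
        j = min(int((pos_b - lo) / width), bins - 1)
        cells[(i, j)].append(value)
    matrix = [[0.0] * bins for _ in range(bins)]  # empty cells stay 0
    for (i, j), values in cells.items():
        matrix[i][j] = float(reduce_fn(values))
    return matrix


# Hypothetical records: two fall in cell (0, 1), one in cell (1, 1).
records = [(10, 90, 1.0), (15, 95, 0.5), (60, 70, 0.25)]
matrix = aggregate(records, lo=0, hi=100, bins=2, reduction="Mean")
print(matrix)  # [[0.0, 0.75], [0.0, 0.25]]
```

Swapping reduction="Mean" for "Count" on the same records would put 2.0 in cell (0, 1) and 1.0 in cell (1, 1), which is the difference between plotting average LD strength and plotting record density.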