CLI: aggregate
Synopsis
Aggregate TWO data into a rasterized matrix of size [x,y] for plotting.
Examples
tomahawk aggregate -f r2 -r mean -c 50 -i input.twk -O b -o agg.twa
tomahawk aggregate -f r2 -r mean -c 50 -i input.twk > agg.mat
tomahawk aggregate -f r2 -r mean -i input.twk -x 4000 -y 4000 -c 50 -t 8 -O b -o agg.twa
Options
Options:
  -i FILE      input TWO file (required)
  -o FILE      output file path (default: -)
  -O <b,u>     b: compressed binary representation, u: uncompressed matrix (default: b)
  -x,-y INT    number of X/Y-axis bins (default: 1000)
  -f STRING    aggregation function: one of (r2,r,d,dprime,dp,p,hets,alts,het,alt) (required)
  -r STRING    reduction function: one of (mean,count,n,min,max,sd,total) (required)
  -I STRING    filter interval <contig>:pos-pos (TWK/TWO) or linked interval <contig>:pos-pos,<contig>:pos-pos
  -c INT       min cut-off value used in reduction function: value < c will be set to 0 (default: 5)
  -t INT       number of parallel threads: each thread will use 40*(x*y) bytes
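The per-thread memory note for `-t` can be turned into a quick estimate before choosing bin counts. The sketch below is only arithmetic, not part of tomahawk: it assumes that "40(x*y) bytes" means 40 × x × y bytes per thread and that total usage grows linearly with the thread count. The function name `aggregate_memory_bytes` is hypothetical.

```python
def aggregate_memory_bytes(x_bins=1000, y_bins=1000, threads=1, bytes_per_cell=40):
    """Rough memory estimate for the aggregation matrices.

    Assumption: each thread holds one x_bins * y_bins matrix at
    bytes_per_cell bytes per cell (40, per the -t help text).
    """
    return bytes_per_cell * x_bins * y_bins * threads


if __name__ == "__main__":
    # Default 1000 x 1000 bins on one thread: 40,000,000 bytes (about 40 MB).
    print(aggregate_memory_bytes())
    # 4000 x 4000 bins on 8 threads, as in the third example command:
    # 5,120,000,000 bytes (about 5.12 GB).
    print(aggregate_memory_bytes(4000, 4000, threads=8))
```

Under these assumptions, raising `-x` and `-y` from 1000 to 4000 multiplies the per-thread footprint by 16, so bin counts and thread counts are worth choosing together.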
Valid aggregation and reduction function names (case insensitive):
| Aggregation | Description |
|---|---|
| R | Pearson correlation coefficient |
| R2 | Squared Pearson correlation coefficient |
| D | Coefficient of linkage disequilibrium |
| Dprime | Scaled coefficient of linkage disequilibrium |
| Dp | Alias for dprime |
| P | Fisher's exact test P-value of the 2x2 haplotype contingency table |
| Hets | Number of (0,1) or (1,0) associations |
| Alts | Number of (1,1) associations |
| Het | Alias for hets |
| Alt | Alias for alts |
| Reduction | Description |
|---|---|
| Mean | Mean of the aggregated values in a bin |
| Max | Largest aggregated value in a bin |
| Min | Smallest aggregated value in a bin |
| Count | Total number of records in a bin |
| N | Alias for count |
| Total | Sum of the aggregated values in a bin |
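To make the interplay of bins, reduction function and cut-off concrete, here is a minimal Python sketch of the general idea. It is not tomahawk's implementation. Everything in it is hypothetical: the function name `aggregate`, the record layout `(pos_a, pos_b, value)`, and the `lo`/`hi` coordinate range. It also assumes that the `-c` cut-off applies to the number of records in a bin, which is one reading of the help text, consistent with the examples passing `-c 50` alongside `r2` values that cannot exceed 1. The `sd` reduction is left out.

```python
from collections import defaultdict


def aggregate(records, x_bins, y_bins, lo, hi, reduction="mean", cutoff=5):
    """Bin (pos_a, pos_b, value) records into a y_bins x x_bins matrix."""
    # Collect the values that fall into each (row, col) cell.
    cells = defaultdict(list)
    span = hi - lo
    for pos_a, pos_b, value in records:
        col = min(int((pos_a - lo) / span * x_bins), x_bins - 1)
        row = min(int((pos_b - lo) / span * y_bins), y_bins - 1)
        cells[(row, col)].append(value)

    # Reduction functions from the table above (names are case insensitive).
    reducers = {
        "mean": lambda v: sum(v) / len(v),
        "max": max,
        "min": min,
        "count": len,
        "n": len,
        "total": sum,
    }
    reduce_fn = reducers[reduction.lower()]

    matrix = [[0.0] * x_bins for _ in range(y_bins)]
    for (row, col), values in cells.items():
        # Assumed cut-off semantics: bins with fewer than `cutoff`
        # records are left at 0.
        if len(values) >= cutoff:
            matrix[row][col] = float(reduce_fn(values))
    return matrix


if __name__ == "__main__":
    records = [(10, 20, 0.5), (15, 25, 1.0), (90, 95, 0.9)]
    # Two records share the first cell; the lone record in the last
    # cell falls below the cut-off of 2 and is zeroed.
    print(aggregate(records, 2, 2, 0, 100, "mean", cutoff=2))
```

With a cut-off of 2, only the cell holding two records keeps its mean. Lowering the cut-off to 1 would retain the single-record cell as well.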