gwu78 | 8 years ago

"The task is to sum the values for each key and print the key with the largest sum."

What is the smart way to do this in kdb+?

This is my naive, sloppy 15-minute approach.

Warning: Noob. May offend experienced k programmers.

   k)`t insert+:`k`v!("CI";"\t")0:`:tsvfile
   k)f:{select (*:k),(sum v) from t where k=x}
   k)a:f["A"]
   k)b:f["B"]
   k)c:f["C"]
   k)select k from a,b,c where v=(max v)
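A possible shorter form of the same idea, sketched in q rather than k: group with a `by` clause instead of calling `f` once per key. This is only a sketch under assumptions. The table is built inline from the five-row A/B/C example quoted elsewhere in this thread instead of being read from `tsvfile`, and `s` is a hypothetical name.

```q
/ toy table built inline; the column names k and v follow the code above
t:([] k:"ABBCA"; v:4 5 8 9 6)

/ one row per key with the summed values: A 10, B 13, C 9
s:select sum v by k from t

/ sort by the summed column, descending, and keep the first row: B 13
1#`v xdesc s
```

The `by` clause avoids hard-coding the keys, so it should carry over unchanged to files with many distinct keys.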

qesa | 8 years ago

Using the file from the original,

    1#desc sum each group (!/) (" II";"\t") 0: `:tsvfile
This took about 3 seconds, 2.5 of which were spent reading the file.

EDIT:

    q)\ts d: (!/) (" II";"\t") 0: `:tsvfile
    2489 134218576
    q)\ts 1#desc sum each group d
    486 253055104

gwu78 | 8 years ago

I was using the first example with a char in the first column.

   A 4
   B 5
   B 8
   C 9
   A 6
How would I solve this with only a dict?

Regarding the 1gram file at https://storage.googleapis.com/books/ngrams/books/googlebook...

This is the result I got

   3| 1742563279
using

   q)\ts d:(!/)(" II";"\t")0:`:1gram
   1897 134218176
   q)\ts 1#desc sum each group d
   371 238872864
or

   k)\ts d:(!/)(" II";"\t")0:`:1gram
   1897 134218176
   k)\ts desc:{$[99h=@x;(!x)[i]!r i:>r:. x;0h>@x;'`rank;x@>x]}
   0 3152
   k)\ts 1#desc (sum'=:d)
   372 238872864
No doubt I must be doing some things wrong.
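
On the dict-only question for the five-row A/B/C example, here is a minimal sketch of two ways it might be done in q. The data is written inline rather than read from a file, and the names `lbl`, `val`, `s` and `d` are hypothetical. One thing worth keeping in mind: `group` applied to a dictionary collects the dictionary's keys under each distinct value, so for the `sum each group d` pattern the numbers to be summed have to be the keys of `d` and the labels its values.

```q
/ toy data from the A/B/C example, written inline (an assumption, not the tsv file)
lbl:"ABBCA"
val:4 5 8 9 6

/ variant 1: group the labels into positions, then index the values with that dict
s:sum each val group lbl     / A 10, B 13, C 9
1#desc s                     / entry with the largest sum: B 13

/ variant 2: same shape as `sum each group d`, with d built as value!label
d:val!lbl                    / summands are the keys, labels are the values
1#desc sum each group d      / again B 13
```

If the file is read with `("CI";"\t")`, `(!/)` yields label!value, so the two columns would presumably need to be swapped (for example with `reverse`) before the second variant applies.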