top | item 23058526


snapdangle | 5 years ago

I'm awaiting delivery of a printed Mastering Dyalog APL book while reading this! I landed on kdb+ after seeking a reasonable alternative to the so-called "best in class" Elasticsearch/Kibana tooling, fell in love with K once I understood that the syntactic terseness is all for the sake of fitting the entire interpreter into L2 cache, and have now landed at the decision that learning Iverson's classic is the only way to satisfy my desire to live a life free of them dang stinking loops!


dan-robertson|5 years ago

I don’t believe that the terseness of k is necessary to fit into the I$ (instruction cache). I think you could have reasonably longer operators and do fine.

I think it’s partly about style, partly about having a small number of operators (which compose well together), and partly about using simple data structures (it doesn’t take much code to iterate an array).
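To make the "small number of composable operators" point concrete, here is a rough sketch in plain Python of how a K expression like `+/x*x` (sum of squares) decomposes into two whole-array primitives; the Python names are my own illustration, not anything from K's implementation:

```python
from functools import reduce
import operator

# K composes a handful of array primitives rather than writing loops.
# Sketch of the K expression  +/x*x  (sum of squares) in Python:
x = [1, 2, 3, 4]

squares = [v * v for v in x]           # x*x : itemwise multiply over the array
total = reduce(operator.add, squares)  # +/  : fold (+) over the result

print(total)  # 30
```

The same two building blocks (itemwise apply and fold) recombine to express averages, dot products, and so on, which is why a small operator set goes a long way.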

I’m not even particularly convinced that fitting in the instruction cache is a trick that magically makes everything fast anyway. Most of the memory accesses in a typical data-processing program (i.e. the kind of program one would write in k) will be to the data itself, and hopefully those will be nearly always linear and mostly sequential.

snapdangle|5 years ago

The original K design was laid out in the 1980s, when the constraints were even tighter than they are today. Using very short operators means that not only does the interpreter easily fit into cache, but the custom function definitions you have written will as well.

When dealing with high-performance computing or real-time processing of high volumes of data, any fetch to RAM to load a function for dispatch is going to have _some_ impact in a tight loop. Add that up for all the libraries you have loaded for your application versus a ground-up implementation in K... Does that whole thing live in L3 along with the VM or interpreter + dependencies underneath it? It's doubtful.

My experience was simply using Kx's free Developer IDE and seeing the performance differential on datasets myself. YMMV, but my (admittedly limited) experience leads me to believe that there is a serious case to be made for the performance advantages of having all your computational logic living as close to your computational cores as possible.

See also the PhD thesis by the author of the OP article, where he presents a language where:

"The entire source code to the compiler written in this method requires only 17 lines of simple code compared to roughly 1000 lines of equivalent code in the domain-specific compiler construction framework, Nanopass, and requires no domain specific techniques, libraries, or infrastructure support."

Linked from the article, available here: https://scholarworks.iu.edu/dspace/handle/2022/24749