kamov | 1 year ago

Sorry, I don't want to be that guy, but I'm curious why use Java and not something like Rust? Is it because of the Lucene ecosystem?

marginalia_nu|1 year ago

The short story is I'm productive in Java, and I enjoy its mature ecosystem and stable APIs. Systems programming is awkward, but it's also a very small part of the project and for what it is, an interesting challenge.

What I'm building is what Lucene does (i.e. document indexing) and then the rest of the search engine as well including crawling and serving traffic.

int_19h|1 year ago

Have you tried C# for those kinds of things? It's normally the same level of abstraction as Java, and has a similarly mature ecosystem with stable APIs, but it also has unsigned types, unmanaged data and function pointers (with pointer arithmetic), unions, the equivalent of alloca, etc., for when you need to go low-level.

YoshiRulz|1 year ago

Have you ever tried Kotlin? For non-system-level stuff on the JVM, it's IMO a direct upgrade to Java. `value class` for type aliases, `when`, and better null handling. But then, modern Java is doing all that stuff too, I hear.

mrkeen|1 year ago

I did this.

I previously worked at a search engine which did its index building and querying in C, with its higher-level stuff (web-apps, scheduling, tooling, etc.) in Java.

Later when I built my own version, I started with C for the low-level and Haskell for the high-level. I made a few iterations in C, but eventually rewrote it in Rust, and I was pretty happy with that choice.

I was more familiar with C, and it was a really good fit for what I was writing. Terms and Documents just become termIds and docIds (numbers) sitting in indexes (arrays). Memory-mapping is a really comfortable way to do things: files are just arrays; let the OS sort out when to page things in and out of disk.
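To make the "files are just arrays" idea concrete, here is a minimal Java sketch rather than the C code described above. The class name `MappedPostings`, its methods, and the file layout (sorted docIds stored as raw 4-byte ints) are all assumptions made for illustration; none of it comes from either search engine mentioned in the thread.

```java
import java.io.IOException;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example: a posting list (sorted docIds) stored as a flat int array in a file.
class MappedPostings {

    /** Writes the docIds as raw 4-byte ints (big-endian, the ByteBuffer default). */
    public static void write(Path file, int[] docIds) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE with this size grows the file to fit.
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4L * docIds.length);
            mapped.asIntBuffer().put(docIds);
            mapped.force(); // ask the OS to flush the dirty pages to disk
        }
    }

    /** Maps the file read-only; the mapping remains valid after the channel is closed. */
    public static IntBuffer open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).asIntBuffer();
        }
    }

    /** Binary search over the mapped, sorted docIds; the OS pages data in as it is touched. */
    public static boolean contains(IntBuffer docIds, int target) {
        int lo = 0, hi = docIds.limit() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int value = docIds.get(mid);
            if (value < target) lo = mid + 1;
            else if (value > target) hi = mid - 1;
            else return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("postings", ".bin");
        write(file, new int[] {3, 17, 42, 99});
        IntBuffer docIds = open(file);
        System.out.println(contains(docIds, 42));
        System.out.println(contains(docIds, 5));
    }
}
```

Note that `MappedByteBuffer` is indexed with `int`, so a single mapping of this kind cannot exceed 2 GiB, which is the limitation complained about further down.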

But where C fell down for me was in changing the code. The meaning of the data and code was lost in nested for-loops, void function pointers, etc. Rust gave me a better shot at both writing and rewriting the code.

Java for the low-level was a non-starter for me for a few reasons, but the biggest two were startup time and difficulty of the mmap api (31-bits of address-space for a file? c'mon!).

marginalia_nu|1 year ago

> Java for the low-level was a non-starter for me for a few reasons, but the biggest two were startup time and difficulty of the mmap api (31-bits of address-space for a file? c'mon!).

Neither of these is an issue anymore, though, especially not with GraalVM's native images.
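On the mmap side, a minimal sketch assuming Java 22 or later, where `FileChannel.map` has an overload that takes an `Arena` and returns a `MemorySegment` addressed with `long` offsets. The class `LongMappedFile` and its method are hypothetical names for illustration, not code from the project being discussed.

```java
import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example; requires Java 22+ (java.lang.foreign is final there).
class LongMappedFile {

    /**
     * Maps sizeBytes of the file, stores one long at byteOffset, and reads it back.
     * Size and offset are both longs, so they are not capped at 2 GiB.
     * byteOffset must be a multiple of 8 because JAVA_LONG is an aligned layout.
     */
    public static long writeAndReadBack(Path file, long sizeBytes, long byteOffset, long value)
            throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
             Arena arena = Arena.ofConfined()) {
            // The mapping lives until the arena is closed, at which point it is unmapped.
            MemorySegment segment = ch.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes, arena);
            segment.set(ValueLayout.JAVA_LONG, byteOffset, value);
            return segment.get(ValueLayout.JAVA_LONG, byteOffset);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".bin");
        System.out.println(writeAndReadBack(file, 4096, 8, 123456789L));
    }
}
```

The example uses a small mapping to keep it cheap to run; passing a size above 2 GiB goes through the same code path, subject to available address space and disk.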