Show HN: Fahmatrix – A Lightweight, Pandas-Like DataFrame Library for Java
46 points | mousomashakel | 9 months ago | github.com
After working extensively with Python’s data stack, I often ran into limitations related to speed, especially in larger or long-running data workflows. So I built Fahmatrix from scratch to offer similar APIs for manipulating CSVs, performing summary statistics, slicing rows/columns, and more — but all in Java.
Features:
Lightweight and dependency-free
CSV/TSV import with auto-headers
Series/DataFrame structures (like pandas)
describe(), mean(), stdDev(), percentile() and more
Fast parallel operations on numeric columns
Java 17+ support
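For readers unfamiliar with what statistics such as mean(), stdDev() and percentile() compute on a numeric column, here is a minimal plain-Java sketch. It is not Fahmatrix's code or API: the class name ColumnStats and the method signatures are hypothetical, and the choices of sample standard deviation (dividing by n - 1) and linear interpolation for percentiles are assumptions borrowed from pandas' defaults. The parallel streams only illustrate the general idea of parallel operations on a numeric column.

```java
import java.util.Arrays;

// Hypothetical helper, NOT part of Fahmatrix: summary statistics on one numeric column.
class ColumnStats {

    // Arithmetic mean, computed with a parallel stream; NaN for an empty column.
    static double mean(double[] values) {
        return Arrays.stream(values).parallel().average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1); assumption: mirrors pandas' default.
    static double stdDev(double[] values) {
        if (values.length < 2) {
            return Double.NaN;
        }
        double m = mean(values);
        double sumSq = Arrays.stream(values)
                .parallel()
                .map(v -> (v - m) * (v - m))
                .sum();
        return Math.sqrt(sumSq / (values.length - 1));
    }

    // Percentile with linear interpolation between neighbouring ranks; p is in [0, 100].
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        double rank = p / 100.0 * (sorted.length - 1);
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        double[] column = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println("mean   = " + mean(column));
        System.out.println("stdDev = " + stdDev(column));
        System.out.println("median = " + percentile(column, 50));
    }
}
```

A dataframe library would typically run helpers like these once per numeric column to build a describe()-style summary.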
Docs: https://moustafa-nasr.github.io/Fahmatrix/
GitHub: https://github.com/moustafa-nasr/fahmatrix
I’d love feedback from the Java and data communities — especially if you’ve ever wanted a simple dataframe utility in Java without needing full-scale ML libraries.
Happy to answer any questions!
uwemaurer|9 months ago
https://github.com/jtablesaw/tablesaw
https://github.com/dflib/dflib
My preferred way is to just use the DuckDB Java API. I haven't seen anything better in terms of performance/efficiency. Also, a SQL query is often easier to write.
mousomashakel|9 months ago
theanonymousone|9 months ago
rickette|9 months ago
mousomashakel|9 months ago
skanga|9 months ago
mousomashakel|9 months ago
owlstuffing|9 months ago
I’m currently using manifold-sql with duckdb for this.
mousomashakel|9 months ago
gitroom|9 months ago
[deleted]
jurgenaut23|9 months ago
[deleted]