ellimilial's comments

ellimilial | 3 years ago | on: The fastest tool for querying large JSON files is written in Python (benchmark)

If it fits on a single machine - jq, flat files, JSON lines / avro if relatively flat. Change to a tabular format if nesting is not required.
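A minimal sketch of the JSON Lines route, assuming one JSON object per line and a flat schema. The helper names (`iter_jsonl`, `to_csv`) and the sample records are hypothetical and only for illustration; it streams the records, filters them, and writes the flat result out as CSV.

```python
import csv
import io
import json


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def to_csv(records, fields):
    """Write the chosen top-level fields of each record as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return out.getvalue()


# Hypothetical sample data standing in for a large .jsonl file on disk.
sample = io.StringIO(
    '{"id": 1, "lang": "python", "stars": 10}\n'
    '{"id": 2, "lang": "nim", "stars": 3}\n'
)

# Filter while streaming, so only matching records are held in memory.
popular = [r for r in iter_jsonl(sample) if r["stars"] > 5]
print(to_csv(popular, ["id", "lang"]))
```

With a real file, `open(path)` would replace the `io.StringIO` sample; the generator keeps memory use flat regardless of file size.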

Postgres JSONB works, but it requires maintaining a heavy server process. So does Lucene/elasticsearch.

I have been yearning for an embeddable store (in line with SQLite, with support that both works and keeps the data compressed like JSONB). I know there were some attempts; I tried some of those, mostly monstrosities.

ellimilial | 4 years ago | on: Dual use of artificial-intelligence-powered drug discovery

Try a nociception, drug design or pharmacy group at your local university. Chances are they’ve been doing this for years.

ellimilial | 4 years ago | on: Who Founded Venice? (2021)

Great content, but incredibly difficult to trawl through.

ellimilial | 4 years ago | on: DynamoDB 10 years later

It's always great fun compounding new, manual 'indexes' when you discover you need another query.

ellimilial | 4 years ago | on: “Coding is basically just ifs and for loops.”

With a lot of fuzziness, some state and temporal stuff.

ellimilial | 4 years ago | on: Show HN: A generator of Fake Italian Coffee names

Yes please, it took me over 5 secs to figure out what to click, felt like ages.

ellimilial | 4 years ago | on: Show HN: A generator of Fake Italian Coffee names

Diffusione Urino

Sure it's delish, very appealing.

ellimilial | 4 years ago | on: Designing Low Upkeep Software

For https://biokeanos.com (biomedical data catalog and discovery tool), keeping things on disk and versioning by pushing to a repo has already saved me months on debugging and maintenance alone.

ellimilial | 4 years ago | on: Why I Use Nim instead of Python for Data Processing

Isn’t this counting the lines with any G/C in them vs the total number of G/C literals?
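To make the distinction concrete, here is a small sketch in Python (not the article's Nim code; the function names and the sample reads are made up for illustration). One function counts lines that contain at least one G or C, the other counts every G and C character.

```python
def count_lines_with_gc(lines):
    """Count lines that contain at least one 'G' or 'C'."""
    return sum(1 for line in lines if "G" in line or "C" in line)


def count_gc_chars(lines):
    """Count every 'G' and 'C' character across all lines."""
    return sum(line.count("G") + line.count("C") for line in lines)


# Hypothetical reads: the two measures disagree as soon as a line
# holds more than one G/C, or none at all.
reads = ["GGCC", "ATAT", "GATA"]
print(count_lines_with_gc(reads))  # 2 lines contain a G or C
print(count_gc_chars(reads))       # 5 G/C characters in total
```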

ellimilial | 4 years ago | on: Why I Use Nim instead of Python for Data Processing

A context might be useful.

From what I gather, the author is a researcher in a bioinformatics-related field. This may indicate that they tend to work either alone or in a relatively small group. The domain is small-scope data processing/manipulation, research/exploratory code, likely short-lived or even one-off.

The progress in this context will possibly be governed by sheer processing speed (e.g. it’s unlikely anyone will delve deep into the code, a lot of iterations to ‘just get it done’ instead of testing etc.).

If this is more or less correct, the point that Nim might be more useful than Python for the author sounds very sensible to me. It’s a nice spot between command line tools and more functionality-loaded languages.

ellimilial | 4 years ago | on: Turing Oversold?

The team being 'all-British' for obvious security reasons. Which I imagine might have felt like adding insult to injury to the 'little people', who, despite cracking the code, were not permitted to continue working on it. Making them, you know, 'little people'.

ellimilial | 4 years ago | on: A dubious writing style emerging in science

The wording of the abstract indicates that this article itself has been auto-generated, right? :)

ellimilial | 4 years ago | on: CRISPR Editing in Primates

A fair number of drugs inhibiting it had to be recalled after https://pubmed.ncbi.nlm.nih.gov/16554806 and subsequent regulatory actions.

ellimilial | 4 years ago | on: CRISPR Editing in Primates

Quite possible. Nobody wants another hERG fiasco.

ellimilial | 4 years ago | on: The data model behind Notion's flexibility

@setr explained it really well. A side note, NoSQL also includes graph databases, dedicated to this type of node/relationship traversal.

ellimilial | 4 years ago | on: Flat Data

Hi Jason, thank you very much for the background and the explanation. It is fascinating to see the progress in this direction.

I started raising my eyebrow (in the best possible sense) upon seeing parts of tooling very similar to ours but simpler and, more importantly, without moving parts. We operate in the biomedical data space and deal with flat/static data a lot. For example, we power https://biokeanos.com with data-in-repo, so Flat Data was immediately interesting.

It is really inspiring to see GitHub Actions making a foray in this direction, definitely something to keep an eye on.

ellimilial | 4 years ago | on: Flat Data

Thank you for the response and for clearing up the 'billion rows' / surly bonds confusion I had from reading the project's Why Flat Data? section. I think I understand the target use case slightly better now.

One of the strong arguments for object-like storage (S3 etc) in the context of plain / flat data is scalability and availability for large scale processing frameworks. Databases are only occasionally relevant.

ellimilial | 4 years ago | on: Flat Data

Very interesting how GitHub comes up with more and more interesting 'actions' to turn repos into 'platforms' and moves us closer to a serverless future.

@idan how does it scale with the size (including storage)? Is 'a billion rows' a goal or an actual tested use case?

ellimilial | 4 years ago | on: Try This One Weird Trick Russian Hackers Hate

[...] all currently have favorable relations with the Kremlin, including [...] Georgia, [...] Ukraine.

One might wonder what unfavourable relations with the Kremlin look like then.

ellimilial | 4 years ago | on: Ask HN: What lessons did you learn from your best or worst colleagues?

This really struck a chord, thank you for sharing.