al_james's comments

al_james | 7 years ago | on: Intel Announces Optane DIMMs

That's very, very interesting. Many thanks for sharing.

So it's still a serialize/deserialize cycle, but the access libs built on top of the persistent memory look interesting.

al_james | 7 years ago | on: Intel Announces Optane DIMMs

What I can't work out from both the article and the comments here: from an application point of view, do I use this like I use memory, or do I use it like I use a disk?

No matter how fast a disk is, using it means either some expensive serialization/deserialization step (and also the associated memory access to create the 'working' object that my logic actually works on) or writing my algorithms to forgo in-memory objects (and the associated features offered by my programming language, e.g. classes / objects or whatever) and working from the raw byte values.

What I really want, and what would be a game changer as to how we use things, is for my programming language's heap to be made persistent (or at least a part of it). In this case instead of:

  var mything = new Thing();
  load_thing_from_disk(mything);

I might have:

  persistent var mything = new Thing();

Done. However, this also introduces more questions, like transactional commits to memory etc. (as few apps are coded to ensure consistency of memory across reboots).

However, I can't help thinking that some way to harness persistent fast memory without needing some complex disk->logic mapping would be a game changer.
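For what it's worth, something in this direction can already be approximated with an ordinary memory-mapped file. The sketch below is only an illustration of the idea and not an Optane API: `Thing`, `open_persistent` and `bump` are made-up names, the object is a fixed-size struct rather than a real language heap, and it does nothing about the transactional consistency questions above.

```python
import ctypes
import mmap
import os
import tempfile


class Thing(ctypes.Structure):
    # Hypothetical fixed-layout object; the fields live directly in the mapped file.
    _fields_ = [("counter", ctypes.c_uint64), ("score", ctypes.c_double)]


def open_persistent(path, struct_type):
    """Map `path` into memory and return (object, mapping).

    The returned object is a view over the mapped bytes, so there is no
    load/serialize step: attribute writes go straight to the mapping.
    """
    size = ctypes.sizeof(struct_type)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)  # a new file starts out zero-filled
        buf = mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
    return struct_type.from_buffer(buf), buf


def bump(path):
    """Increment the persistent counter and return its new value."""
    thing, buf = open_persistent(path, Thing)
    thing.counter += 1
    value = thing.counter
    del thing    # drop the view before closing the mapping
    buf.flush()  # ask the OS to write the dirty page back
    buf.close()
    return value


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "thing.bin")
    bump(demo_path)
    print(bump(demo_path))  # the first increment survived the close/reopen
```

Each call to `bump` reopens the file from scratch, so the counter only keeps growing because the state lives in the mapping rather than in the process.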

Edited: spelling and wording

al_james | 7 years ago | on: EC2 Instance Update – C5 Instances with Local NVMe Storage

Yup. We run Postgres on i3 instances on their native SSDs and it's way faster for OLAP use cases. Check out avien.io for a hosted RDS-style solution that does this.

al_james | 7 years ago | on: Ask HN: Who is hiring? (May 2018)

Ometria.com | London UK | FULL-TIME ONSITE | Several roles: Backend Python, Frontend Javascript, Data science, QA, Machine learning

Ometria's mission is to help retailers create marketing experiences their customers will love. We understand the challenges that retailers face, and we offer them a very innovative solution that provides insights on their customers, and tools to reach them more effectively across numerous channels.

Backed by top VC funds and successful entrepreneurs, and working alongside over a hundred of the fastest growing retailers, we are now looking for more developers to join our small but growing engineering team.

We are hiring for:

- Backend python developers

- Frontend javascript developers (Ampersand JS, but considering moving to React)

- Data Scientist (Python stack)

- Machine Learning engineers

- Engineering manager

- VP engineering

- QA engineer

https://www.ometria.com/careers/ (Not all jobs are on that page yet, feel free to contact me personally at "al <at> ometria.com")

al_james | 8 years ago | on: Building Real Time Analytics APIs at Scale

Yeah, that's a good point. Redshift does not have the same level of 'probabilistic counting' functions that can be used from rollups. Redshift does have HLL (SELECT APPROXIMATE COUNT(DISTINCT col)), however that can only be applied when scanning the full data. I am not sure it's possible to store an HLL object in a rollup and later aggregate them.
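To illustrate why storing the HLL object matters: HyperLogLog sketches can be merged by taking the register-wise maximum, so a per-day sketch kept in a rollup can later be combined into a per-week distinct count without rescanning raw data. The code below is a toy HyperLogLog written for this comment, not Redshift's or Citus's implementation; `new_sketch`, `add`, `merge` and `estimate` are names I made up, and the register count is an arbitrary choice.

```python
import hashlib
import math

P = 12        # number of index bits (assumed here)
M = 1 << P    # 4096 registers, roughly 1.6% standard error


def new_sketch():
    return [0] * M


def add(sketch, item):
    # 64-bit hash of the item
    h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
    idx = h >> (64 - P)                       # top P bits pick a register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 52 bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the leftmost 1-bit
    sketch[idx] = max(sketch[idx], rank)


def merge(a, b):
    # Register-wise max: the result is identical to a sketch of the union.
    return [max(x, y) for x, y in zip(a, b)]


def estimate(sketch):
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)  # small-range (linear counting) correction
    return raw


def sketch_of(items):
    s = new_sketch()
    for item in items:
        add(s, item)
    return s


if __name__ == "__main__":
    day1 = sketch_of(f"user-{i}" for i in range(0, 20000))
    day2 = sketch_of(f"user-{i}" for i in range(10000, 30000))
    # Approximate distinct users across both days, from the stored sketches only.
    print(round(estimate(merge(day1, day2))))
```

The rollup table would store one such sketch per bucket; the query side only ever needs `merge` and `estimate`.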

al_james | 8 years ago | on: Building Real Time Analytics APIs at Scale

Thanks Ozgun. That's a great video in 3.

al_james | 8 years ago | on: Building Real Time Analytics APIs at Scale

A great article, and I am a big fan of Algolia, Citus and Redshift. However, this article ends up making an odd apples-to-oranges comparison.

They state that "However, achieving sub-second aggregation performances on very large datasets is prohibitively expensive with RedShift". This suggests that they want to do sub-second aggregations across raw event data. However, later in the article, the solution they build is to use rollup tables for sub-second responses.

You can also do rollup tables in Redshift, and I can assure you (if you enable the fast query acceleration option) you can get sub-second queries from the rolled up lower-cardinality tables. If you want even better response times, you can store the rollups in plain old Postgres and use something like dblink or postgres_fdw to perform the periodic aggregations on Redshift and insert into the local rollup tables (see [1]). In this model the solution ends up being very similar to their solution with Citus, and I would predict that this is cheaper than Citus Cloud as Redshift really is a great price point for a hosted system.

So the question of performing sub-second aggregations across the raw data remains unanswered. However, that really is the ideal end game, as you can then offer way more flexibility in terms of filtering than any rollup-based solution.

Right now, research suggests ClickHouse, Redshift or BigQuery are probably the fastest solutions for that. Not sure about Druid, I don't know it. GPU databases appear to be the future of this. I would be interested to see benchmarks of Citus under this use case. I should imagine that Citus is also way better if you have something like a mixed OLAP and OLTP workload (e.g. you need the analytics and the row data to match exactly at all times).
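To make the rollup trade-off concrete, here is a tiny sketch using SQLite from Python purely as a stand-in (it is not Redshift, Citus or the article's schema; the `events` and `events_hourly` tables and their columns are invented for illustration). The rollup answers the pre-aggregated question from far fewer rows, but any dimension that was not rolled up can only be filtered on the raw data.

```python
import sqlite3


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (ts TEXT, country TEXT, browser TEXT);
        CREATE TABLE events_hourly (
            hour TEXT, country TEXT, n INTEGER,
            PRIMARY KEY (hour, country)
        );
    """)
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            ("2018-04-01 10:05:00", "UK", "firefox"),
            ("2018-04-01 10:20:00", "UK", "chrome"),
            ("2018-04-01 10:45:00", "FR", "chrome"),
            ("2018-04-01 11:10:00", "UK", "chrome"),
        ],
    )
    # Periodic aggregation step: raw events -> one row per (hour, country).
    conn.execute("""
        INSERT INTO events_hourly
        SELECT strftime('%Y-%m-%d %H:00', ts) AS hour, country, COUNT(*)
        FROM events
        GROUP BY hour, country
    """)
    return conn


def uk_total_from_rollup(conn):
    # Fast path: sums a handful of pre-aggregated rows.
    return conn.execute(
        "SELECT SUM(n) FROM events_hourly WHERE country = 'UK'"
    ).fetchone()[0]


def uk_chrome_from_raw(conn):
    # 'browser' was not kept in the rollup, so this filter needs the raw table.
    return conn.execute(
        "SELECT COUNT(*) FROM events WHERE country = 'UK' AND browser = 'chrome'"
    ).fetchone()[0]


if __name__ == "__main__":
    conn = build_demo()
    print(uk_total_from_rollup(conn), uk_chrome_from_raw(conn))
```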

Aside: It would be great to see Citus benchmarked against the 1.1 billion taxi rides benchmark by Mark Litwintschik. Any chance of that?

[1] https://aws.amazon.com/blogs/big-data/join-amazon-redshift-a...

[2] http://tech.marksblogg.com/benchmarks.html

al_james | 8 years ago | on: Ask HN: Who is hiring? (April 2018)

Ometria.com | London UK | FULL-TIME ONSITE | Several roles: Backend Python, Frontend Javascript, QA, Machine learning

Ometria's mission is to help retailers create marketing experiences their customers will love. We understand the challenges that retailers face, and we offer them a very innovative solution that provides insights on their customers, and tools to reach them more effectively across numerous channels.

Backed by top VC funds and successful entrepreneurs, and working alongside over a hundred of the fastest growing retailers, we are now looking for more developers to join our small but growing engineering team.

We are hiring for:

- Backend python developers

- Frontend javascript developers (Ampersand JS, but considering moving to React)

- Machine Learning engineers

- Engineering manager

- VP engineering

- QA engineer

https://www.ometria.com/careers/ (Not all jobs are on that page yet, feel free to contact me personally at "al <at> ometria.com")

al_james | 8 years ago | on: A new storage engine for PostgreSQL to provide better control over bloat

We use pg_repack and hit that exact problem (leftover triggers etc.). In the end we resorted to creating and dropping the pg_repack extension after each run.

(edit: Missed an "and")

al_james | 8 years ago | on: Ask HN: Who is hiring? (March 2018)

Ometria.com | London UK | FULL-TIME ONSITE | Several roles: Backend Python, Frontend Javascript, QA, Machine learning

Ometria's mission is to help retailers create marketing experiences their customers will love. We understand the challenges that retailers face, and we offer them a very innovative solution that provides insights on their customers, and tools to reach them more effectively across numerous channels.

Backed by top VC funds and successful entrepreneurs, and working alongside over a hundred of the fastest growing retailers, we are now looking for more developers to join our small but growing engineering team.

We are hiring for:

- Backend python developers

- Frontend javascript developers (Ampersand JS, but considering moving to React)

- Machine Learning engineers

- Engineering manager

- QA engineers

https://www.ometria.com/careers/ (Not all jobs are on that page yet, feel free to contact me personally at "al <at> ometria.com")

al_james | 8 years ago | on: Ask HN: Who is hiring? (February 2018)

Ometria.com | London UK | FULL-TIME ONSITE | Several roles: Backend Python, Frontend Javascript, QA, Machine learning

Ometria's mission is to help retailers create marketing experiences their customers will love. We understand the challenges that retailers face, and we offer them a very innovative solution that provides insights on their customers, and tools to reach them more effectively across numerous channels.

Backed by top VC funds and successful entrepreneurs, and working alongside over a hundred of the fastest growing retailers, we are now looking for more developers to join our small but growing engineering team.

We are hiring for:

- Backend python developers

- Frontend javascript developers (Ampersand JS, but moving to React)

- Machine Learning engineers

- Engineering managers

- QA engineers

https://www.ometria.com/careers/ (Not all jobs are on that page yet, feel free to contact me personally at "al <at> ometria.com")

al_james | 8 years ago | on: Ask HN: Who is hiring? (January 2018)

Ometria.com | London | Full time | ONSITE

Ometria's mission is to help retailers create marketing experiences their customers will love. We understand the challenges that retailers face, and we offer them a very innovative solution that provides insights on their customers, and tools to reach them more effectively across numerous channels.

Backed by top VC funds and successful entrepreneurs, and working alongside over a hundred of the fastest growing retailers, we are now looking for more developers to join our small but growing engineering team.

We are hiring for:

- Backend python developers

- Frontend javascript developers (Ampersand JS, but moving to React)

- Machine Learning engineers

- Engineering managers

- QA engineers

https://www.ometria.com/careers/ (Not all jobs are on that page yet, feel free to contact me personally at al <at> our domain)

(Edited: Formatting)

al_james | 8 years ago | on: The State of Vacuum in Postgres

Sadly it's a bit more complex than that. Because other concurrently running transactions may hold a view of the database as it was before the transaction committed, what can actually be reclaimed depends on which transactions are currently active. Thus, upon commit, some tuples released by the committed transaction may still be visible to others.

Vacuum actually works by looking for tuples that no running transaction can see anymore. Postgres, in effect, records a minimum and maximum transaction ID between which each tuple is visible, and vacuum scans for tuples that have a maximum visible transaction ID set (suggesting they are candidates to be reclaimed) and where that ID is less than the minimum active transaction ID.
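As a rough sketch of that rule (a toy model, not Postgres internals: `TupleVersion`, `reclaimable` and `vacuum` are names made up here, and it ignores aborted transactions, hint bits, and transaction ID wraparound):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    value: str
    xmin: int                   # ID of the transaction that created this version
    xmax: Optional[int] = None  # ID of the transaction that deleted/replaced it, if any


def reclaimable(t: TupleVersion, oldest_active_xid: int) -> bool:
    # Dead to everyone only if it was deleted by a transaction older than
    # every transaction that is still running.
    return t.xmax is not None and t.xmax < oldest_active_xid


def vacuum(heap: List[TupleVersion], active_xids: List[int], next_xid: int) -> List[TupleVersion]:
    # With nothing running, anything deleted before the next ID is invisible to all.
    oldest = min(active_xids, default=next_xid)
    return [t for t in heap if not reclaimable(t, oldest)]


if __name__ == "__main__":
    heap = [
        TupleVersion("a v1", xmin=10, xmax=20),  # replaced by transaction 20
        TupleVersion("a v2", xmin=20),           # current version
        TupleVersion("b v1", xmin=12, xmax=30),  # deleted by transaction 30
    ]
    # Transaction 25 is still running: it may still need "b v1", but not "a v1".
    for t in vacuum(heap, active_xids=[25], next_xid=31):
        print(t.value)
```

The point is that "b v1" survives this pass even though its deleting transaction has committed, because transaction 25 started before the delete and may still see it.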

al_james | 9 years ago | on: Going Multi-Cloud with Google Cloud Endpoints and AWS Lambda

No, exactly right.

As Google seem to be getting more serious about attracting AWS converts and dual-cloud deployments (e.g. built-in AWS VPC peering), I wonder if they will offer some kind of cut-price data transfer to AWS networks. That would be an interesting move.

al_james | 9 years ago | on: Show HN: Typr.club, realtime gif-based chat rooms

Can you talk about the tech stack you are using to power this?

al_james | 11 years ago | on: Show HN: GreatDJ – Create and save playlists that sync across devices

Congrats on shipping! I was at a party the other day and we needed a way to all contribute to the same YouTube playlist. Now we have a solution!

al_james | 12 years ago | on: Svg.js

Can anyone give an overview of why this is different / better than raphaeljs [1]?

[1] http://raphaeljs.com/

al_james | 13 years ago | on: IOS 5 is dead

You mean, "IOS 5 is dead for services that use Mixpanel". These "trends" in no way tell the story of the wider internet or other device usage. Consider the "mobile v desktop" report [1]: supposedly mobile usage fell off over the first few weeks of March and then came back. Rubbish. What it says is that Mixpanel has a changing client base that skews their stats.

[1] https://mixpanel.com/trends/#report/desktop_vs_mobile

al_james | 13 years ago | on: Semantics3 (YC W13) Is A Massive Consumer Products Database To Rule Them All

If it's being used for competitive price analysis, I wonder if any retail sites will simply block their crawler? I am assuming that they are (correctly) announcing their crawler by its user agent, so it could be blocked via robots.txt.
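For illustration, this is roughly what such a block would look like and how a well-behaved crawler would interpret it, using Python's standard `urllib.robotparser`. The user agent `ExampleProductBot` and the `shop.example` URL are placeholders I made up; I don't know what user agent their crawler actually sends.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a retailer might serve to shut out one crawler.
ROBOTS_TXT = """\
User-agent: ExampleProductBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `user_agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://shop.example/products/123"
    print(is_allowed("ExampleProductBot/1.0", url))
    print(is_allowed("SomeOtherBot/2.0", url))
```

Of course this only works if the crawler chooses to honour robots.txt and identifies itself honestly.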

al_james | 13 years ago | on: Amazon's homepage was down

Lol, yes typo.