asksol's comments

asksol | 7 years ago | on: What Is Idempotence?

I'm not sure operators are a good example: they can be idempotent or not depending on the implementation. An operator is a function that takes two values and returns a result: result = op(left, right).
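A minimal sketch of that point, assuming "idempotent" for a binary operator means op(x, x) == x for every value tried. The helper name `is_idempotent` and the sample values are made up for illustration:

```python
import operator


def is_idempotent(op, samples):
    """Return True if op(x, x) == x for every sample value."""
    return all(op(x, x) == x for x in samples)


samples = [0, 1, 2, 7, -3]

# max(x, x) is always x, so max passes the check.
max_is_idempotent = is_idempotent(max, samples)

# 1 + 1 == 2, so addition fails the check.
add_is_idempotent = is_idempotent(operator.add, samples)

# x & x is always x, so bitwise AND passes the check.
and_is_idempotent = is_idempotent(operator.and_, samples)
```

All three have the same shape, result = op(left, right), and only the implementation decides whether the property holds.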

asksol | 7 years ago | on: Questions

But that safety net doesn't extend to startup founders, does it? AFAIR, in Norway you need to have worked in a company as a regular employee for at least one or two years before you can claim unemployment benefits. Health insurance is always there, and so are basic social benefits (but unemployment is the only one you can live on).

asksol | 7 years ago | on: Faust: Stream Processing for Python

In Faust we've had the ability to take a very different approach towards quality control. We have a cloud integration testing setup that actually runs a series of Faust apps in production. We do chaos testing: randomly terminating workers, randomly blocking network access to Kafka, and much more, all the while monitoring the health of the apps and the consistency of the results that they produce.

If you have a Faust app that depends on a particular feature we strongly suggest you submit it as an integration test for us to run.

Hopefully some day Celery will be able to take the same approach, but running cloud servers costs money that the project does not have.

asksol | 7 years ago | on: Faust: Stream Processing for Python

Yes, they do serve different purposes but they also share similarities. You could easily write a task queue on top of Faust.

It's important to remember that users had difficulty understanding the concepts behind Celery as well; perhaps it's more approachable now that you're used to it.

Using an asynchronous iterator for processing events enables us to maintain state. It's no longer just a callback that handles a single event: you could do things like "read 10 events at a time", or "wait for events on two separate topics and join them".
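To illustrate the "read 10 events at a time" idea, here is a rough sketch in plain asyncio. It does not use Faust itself; `events`, `take` and `collect_batches` are hypothetical names, and the event source is a stand-in for a real stream:

```python
import asyncio


async def events(n):
    """Stand-in event source: yields the integers 0..n-1."""
    for i in range(n):
        await asyncio.sleep(0)  # give other tasks a chance to run
        yield i


async def take(stream, size):
    """Group events from an async iterator into lists of up to `size`."""
    batch = []  # state kept between events, which a per-event callback lacks
    async for event in stream:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final, partial batch


async def collect_batches(n, size):
    """Consume the whole source and return every batch."""
    return [batch async for batch in take(events(n), size)]


batches = asyncio.run(collect_batches(25, 10))
```

Because the loop body lives inside one coroutine, the `batch` list survives from one event to the next without any external storage.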

asksol | 7 years ago | on: Faust: Stream Processing for Python

Faust is a library that you can import into your Python program, and all it requires is Kafka. Most other stream processing systems require additional infrastructure. Kafka Streams has similar goals, but Faust additionally enables you to use Python libraries and perform async I/O operations while processing the stream.

asksol | 7 years ago | on: Faust: Stream Processing for Python

We had support for it initially, but we ended up not using the schema registry server. We want to add support for this back in if it provides value.

asksol | 7 years ago | on: Faust: Stream Processing for Python

There's an examples/django project in the distribution. I think they removed the gevent bridge from PyPI for some reason, but you can still use the eventlet one. Gevent is production quality and the concept of bridging them is sound, so I hope someone will work on it.

asksol | 7 years ago | on: Faust: Stream Processing for Python

We added two levels of abstraction for this purpose:

- A Stream iterates over a channel

- A Channel implements: `channel.send` and `channel.__aiter__`.

- A topic is a "named" channel backed by a Kafka topic

- Further the topic is backed by a Transport

- Transport is very Kafka specific

To support AMQP, the idea is that you only need to implement a custom channel.

If you open an issue we can consider how to best implement it.
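As a rough sketch of the channel idea above: a channel only needs something like `send` and `__aiter__`. The class below is an in-memory stand-in built on `asyncio.Queue`, not Faust's actual channel class; `InMemoryChannel` and `demo` are hypothetical names, and a real AMQP channel would talk to a broker instead of a queue:

```python
import asyncio


class InMemoryChannel:
    """Hypothetical channel: supports send() and async iteration."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def send(self, value):
        """Publish a value to the channel."""
        await self._queue.put(value)

    def __aiter__(self):
        return self

    async def __anext__(self):
        # Waits until a value is available; this channel never ends.
        return await self._queue.get()


async def demo():
    # Create the channel inside the running event loop.
    channel = InMemoryChannel()
    for value in ("a", "b", "c"):
        await channel.send(value)

    received = []
    async for value in channel:
        received.append(value)
        if len(received) == 3:
            break  # stop here, since the channel itself never finishes
    return received


received = asyncio.run(demo())
```

Anything that iterates over a channel can consume this without knowing what sits behind it, which is the point of keeping the transport-specific parts lower in the stack.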

asksol | 7 years ago | on: Faust: Stream Processing for Python

Faust can serve the data over HTTP/websockets and other transports, so you can query it directly!

asksol | 7 years ago | on: Faust: Stream Processing for Python

I'm looking at Redis Streams and it seems to lack the partitioning component of Kafka. One of the properties that we rely on is the ability to manually assign partitions such that a worker assigned partition 0 of one topic will also be assigned partition 0 of the other topics that it consumes from.

Simplicity is of course a goal, but this may mean we have to sacrifice some features when Redis Streams is used as a backend.
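The partition property described above can be sketched like this. It is a simplified round-robin illustration and not Faust's actual assignor; `assign_copartitioned`, the worker names and the topic names are all made up, and every topic is assumed to have the same number of partitions:

```python
def assign_copartitioned(workers, topics, num_partitions):
    """Give each partition number to one worker across all topics."""
    assignment = {worker: [] for worker in workers}
    for partition in range(num_partitions):
        # Pick the owner once per partition number, not once per topic.
        worker = workers[partition % len(workers)]
        for topic in topics:
            assignment[worker].append((topic, partition))
    return assignment


assignment = assign_copartitioned(
    workers=["worker-0", "worker-1"],
    topics=["orders", "payments"],
    num_partitions=4,
)
```

With this assignment, whichever worker owns partition 0 of "orders" also owns partition 0 of "payments", so messages with the same key in both topics land on the same worker.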

I'm glad you like Celery; this project in many ways realizes what I wanted it to be.

asksol | 7 years ago | on: Faust: Stream Processing for Python

My name is Ask and I am one of the co-creators along with Vineet.

Thanks for pointing this out. Fixed the links. You can also find the docs here: http://faust.readthedocs.io/en/latest/

Faust uses Kafka for message passing. The new messages you create can be pushed to a new topic and you could have another agent consuming from this new topic. Check out the word count example here: https://github.com/robinhood/faust/blob/9fc9af9f213b75159a54...

Also note that the Table is persisted in a log-compacted Kafka topic. This means we are able to recover the state of the table in the case of a failure. However, you can always write to any other datastore while processing a stream within an agent. We do have some services that process streams and store state in Redis and Postgres.
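To make the recovery idea concrete, here is a small sketch in plain Python with no Kafka involved. It assumes the changelog is an ordered list of (key, value) pairs and that a value of None is a delete marker, as with tombstones in a compacted Kafka topic; `recover_table` and the sample data are hypothetical:

```python
def recover_table(changelog):
    """Rebuild table state by replaying changelog entries in order."""
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # tombstone: the key was deleted
        else:
            table[key] = value  # the latest value for a key wins
    return table


# Word counts written over time; only the last entry per key matters.
changelog = [
    ("hello", 1),
    ("world", 1),
    ("hello", 2),
    ("typo", 1),
    ("typo", None),
    ("hello", 3),
]

table = recover_table(changelog)
```

Log compaction keeps at least the most recent entry for each key, so replaying the compacted topic gives the same final table as replaying the full history.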

asksol | 7 years ago | on: Faust: Stream Processing for Python

My name is Ask and I'm the co-creator of this project, along with Vineet.

Yes, we're definitely interested in supporting Redis Streams!

Faust is designed to support different brokers, but Kafka is the only implementation as of now.

asksol | 7 years ago | on: Faust: Stream Processing for Python

My name is Ask and I'm a co-creator of this project, along with Vineet Goel.

This answer sums up why we wanted to use Python for this project: it's the most popular language for data science and people can learn it quickly.

Performance is not really a problem either: with Python 3 and asyncio we can process tens of thousands of events/s. I have seen 50k events/s, and there are still many optimizations that can be made.

That said, Faust is not just for data science. We use it to write backend services that serve WebSockets and HTTP from the same Faust worker instances that process the stream.

asksol | 8 years ago | on: Equifax terms of service may include binding arbitration clause

You don't think Equifax have lawyers? Of course they do, and they know that the clause exists. Just having the clause in the first place is a conspiracy to abuse consumers.

asksol | 9 years ago | on: Celery 4.0

Celery was never meant as a replacement for cron; it was simply a nice bonus that fits the messaging pattern well. Writing a task queue is actually very simple using, for example, Redis, but that doesn't necessarily mean Celery is over-engineered IMHO. It's very easy to forget the support required once your system is in production.

Disclaimer: I'm a contributor

asksol | 9 years ago | on: Celery 4.0

Thank you for the kind words, it's so very appreciated :)

I have merged many features, like broker transports, result backends, etc., and while the initial contributions were great, they end up being unmaintained, with issues that nobody fixes.

If there's any feature that you really want back, chances are the problems with that feature are not super difficult to fix, so please reach out!

asksol | 9 years ago | on: Association Between Cesarean Birth and Risk of Obesity

Definitely questionable, but maybe it's related to germs transferred during birth: http://well.blogs.nytimes.com/2016/02/01/post-cesarean-bacte...

asksol | 9 years ago | on: Writing an editor in less than 1000 lines of code, just for fun

I use hjkl consistently in vim and in the shell. Just think about how long it takes you to move your fingers to the cursor keys. You should spend some effort getting used to it, as using hjkl means you can keep your hands in the middle of the keyboard. I have also never had any problems with strain, and I suspect this is part of the reason.

asksol | 9 years ago | on: San Francisco man fights eviction after rent increase from $1,800 to $8,000

I agree it has some advantages, but citing commonality seems weird when it's increasingly unlikely for would-be homeowners to afford a home in cities like London, SF, etc. I doubt most people waste thousands of dollars each month for the ability to relocate quickly?

asksol | 9 years ago | on: Senate rejects FBI bid for warrantless access to internet browsing histories

Does it really apply to just browser histories? Does it mean they can take your computers without a warrant to get to this history, or does it mean ISPs will have to sniff traffic to extract URLs visited and keep a log of them (for how long?).

It doesn't make much sense to me, unless they are carefully wording this into something that seems reasonable to the public ("I'll just use private mode, no deal") when it really means monitoring all our internet traffic. But then why is the media repeating it?