bob_roboto | 4 years ago | on: Being on-call is working. FULL STOP
bob_roboto's comments
bob_roboto | 4 years ago | on: Being on-call is working. FULL STOP
- Being on-call is compensated, although at a much lower rate. The compensation is for the inconvenience of having to carry your computer with you, not being able to get drunk, not sleeping as well (if that applies to you), etc.
- When you get paged, you can generously compensate that time as TOIL. E.g. if you get paged and you realise it's a false positive and go back to bed, you can still compensate 2h as you need to fall back asleep and likely will not be as rested. If you have to do something for 4 hours, take off a day because you likely had to cancel some personal plans or have not had any meaningful sleep at night.
bob_roboto | 4 years ago | on: Ask HN: What's the most life-changing blog post you've ever read?
Also, there are undeniably some hard truths hidden in there. However, my experience in almost 20 years in the tech/software/product industry often paints a different picture. Yes, our brains keep us from changing and evolving and yes, obviously you need skills to be successful in life and your career. But in my industry in particular, hard and soft skills are not the dominant factor that keeps individuals from succeeding or progressing. I'm lucky enough to work with an abundance of talent and skill, and yet, one of the major sources of dissatisfaction is lack of "progression". One of the main causes is a lack of self-confidence and, in the more severe cases, even mental health issues. Some of the most skilled and knowledgeable engineers I worked with struggled to realise their potential because of it. If the leaders in your organisation think they can just shout at them to "learn self-confidence as a skill" and get over it, you're going to have a bad time. It will attract a certain type of employee that thrives in that environment and disengage everyone else. That wastes talent, skills and ultimately a lot of money. Creating an environment that teases the potential out of skilled and talented individuals, and learning how to do that, is not a "hippie/hipster" thing to do, it is good for business.
bob_roboto | 4 years ago | on: Show HN: Open-source A/B testing framework
bob_roboto | 4 years ago | on: Ask HN: Has anyone fully embraced an event-driven architecture?
However, I am an advocate of the pattern and have seen it used successfully repeatedly. The largest scale was as the data lead for a product maintained by 100-200 developers and handling several thousand transactions per second.
To answer your specific questions:
> handling breaking schema changes or failures in an elegant way, and keeping engineers and other data consumers happy enough?
We did not allow breaking schema changes. If there is a breaking change, it's a new event/topic. We used Kafka and every topic needed to have a compatibility scheme defined (see https://docs.confluent.io/platform/current/schema-registry/a...) to clarify what constitutes a breaking change. Even though some claim that producers and consumers can be fully decoupled, you will need to have a good idea of who your consumers are and the time horizon of the data they consume. Application engineers are usually easier to keep happy than machine learning practitioners and other data consumers that want to consume events emitted over a long time period, potentially years.
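To make "what constitutes a breaking change" a bit more concrete, here is a minimal sketch of a backward-compatibility check. It is a deliberately simplified model and not the Schema Registry's actual algorithm: the `Field` class, the `Schema` alias and the `backward_compatible` function are names I made up for illustration, and the only rules modelled are "a new field needs a default" and "an existing field must keep its type".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical, simplified field description (not a real Avro type)."""
    type: str
    has_default: bool = False


# A schema is modelled as a mapping from field name to field description.
Schema = dict[str, Field]


def backward_compatible(old: Schema, new: Schema) -> list[str]:
    """Return the reasons why a consumer using `new` could not read
    records written with `old`. An empty list means compatible."""
    problems = []
    for name, new_field in new.items():
        if name not in old:
            # Old records lack this field, so the reader needs a default.
            if not new_field.has_default:
                problems.append(f"new field '{name}' has no default")
        elif old[name].type != new_field.type:
            problems.append(
                f"field '{name}' changed type "
                f"{old[name].type} -> {new_field.type}"
            )
    # Fields dropped in `new` are simply ignored by the reader.
    return problems


v1 = {"order_id": Field("string"), "amount": Field("long")}
v2 = {**v1, "currency": Field("string", has_default=True)}
v3 = {"order_id": Field("string"), "amount": Field("string")}

print(backward_compatible(v1, v2))  # no problems: additive change with default
print(backward_compatible(v1, v3))  # one problem: type change, needs a new topic
```

With a rule like this enforced per topic, a change such as `v3` is rejected and has to be published as a new event/topic instead.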
> As a trivial example, everybody talks about dead-letter queues but nobody really explains how to handle messages that end up in one.
Dead letter queues are a tool you can use when the context demands it; applying them wholesale likely creates too much overhead. To give you a specific example: some emitted events will be revenue-impacting and, depending on your setup, you actually want to use the events for financial reporting (careful! some more info later). In this specific use-case, if you can't process a record, the last thing you want to do is throw the message away. Somebody will need to have a look at these records, fix the cause and then either re-emit the records based on what you know about them from the header or fix the records in the DLQ. So think about the guarantees you need to provide and decide whether a DLQ makes sense for your use-case.
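A rough sketch of that flow, using plain in-memory lists instead of a real Kafka client. Everything here is an assumption for illustration: the `Record` class, the `consume` and `replay` helpers and the `dlq.error` header name are hypothetical, not part of any library.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Record:
    key: str
    value: bytes
    headers: dict[str, str] = field(default_factory=dict)


def consume(records: list[Record],
            handler: Callable[[Record], None],
            dlq: list[Record]) -> None:
    """Process each record; on failure keep it in the DLQ instead of
    dropping it, and note the error in a (hypothetical) header."""
    for record in records:
        try:
            handler(record)
        except Exception as exc:
            headers = {**record.headers, "dlq.error": repr(exc)}
            dlq.append(Record(record.key, record.value, headers))


def replay(dlq: list[Record],
           handler: Callable[[Record], None]) -> list[Record]:
    """Re-process DLQ records once the cause is fixed.
    Returns the records that still fail."""
    still_failing: list[Record] = []
    consume(dlq, handler, still_failing)
    return still_failing


ledger: list[int] = []


def strict_handler(record: Record) -> None:
    ledger.append(int(record.value))  # fails on b"12.50"


def fixed_handler(record: Record) -> None:
    ledger.append(int(float(record.value)))  # the "fix" after investigation


dlq: list[Record] = []
consume([Record("a", b"10"), Record("b", b"12.50")], strict_handler, dlq)
remaining = replay(dlq, fixed_handler)
```

The point is only that the failed record survives together with enough context to investigate it; whether you replay from the DLQ or re-emit from the source depends on the guarantees you need.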
Some other thoughts and considerations.
- Topics more or less directly become analytics tables. This almost gives you a unified view of your application's data that is otherwise difficult to create.
- How are the messages emitted? Are they emitted from the application logic? If so, what guarantees do you need? What happens if the app crashes (e.g. after a DB transaction commits and before the event is emitted)? Depending on what you need, have a look at the transactional outbox pattern.
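For the crash-between-commit-and-emit problem in the last bullet, here is a minimal sketch of the transactional outbox pattern using SQLite from the Python standard library. The table names, the `order-placed` topic and the `publish` callback are assumptions for illustration; in a real setup the relay would hand the rows to your Kafka producer (or you would use change data capture on the outbox table).

```python
import json
import sqlite3
from typing import Callable


def init(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn: sqlite3.Connection, order_id: int, amount: int) -> None:
    """Write the business row and the event in ONE transaction, so
    either both are stored or neither is."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders (id, amount) VALUES (?, ?)",
                     (order_id, amount))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order-placed",
             json.dumps({"order_id": order_id, "amount": amount})),
        )


def relay(conn: sqlite3.Connection,
          publish: Callable[[str, str], None]) -> int:
    """Publish pending outbox rows in order and mark them as published.
    A crash between publish and the UPDATE causes a re-send on the next
    run, so delivery is at-least-once and consumers must be idempotent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox "
        "WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                         (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
init(conn)
place_order(conn, 1, 100)

sent: list[tuple[str, str]] = []
first_run = relay(conn, lambda topic, payload: sent.append((topic, payload)))
second_run = relay(conn, lambda topic, payload: sent.append((topic, payload)))
```

The application never talks to the broker directly inside the request path; it only writes to its own database, and the relay is the single place that has to deal with broker failures.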
bob_roboto | 4 years ago | on: EU bans Belarusian airlines from European skies
bob_roboto | 5 years ago | on: Signal community: Reminder: Please be nice
bob_roboto | 6 years ago | on: Google to Acquire Looker
bob_roboto | 7 years ago | on: New for AWS Lambda: Use Any Programming Language and Share Common Components
bob_roboto | 7 years ago | on: Healthcare.gov confirms hackers stole income, immigration and tax data
bob_roboto | 7 years ago | on: Quitting my job has been the best thing I've done for my career
bob_roboto | 7 years ago | on: Facebook Is Giving Advertisers Access To Your Shadow Contact Information
bob_roboto | 7 years ago | on: A year later, Equifax has faced little fallout from losing data
bob_roboto | 7 years ago | on: Query Parquet files in SQLite
bob_roboto | 8 years ago | on: Google Maps' Moat
bob_roboto | 8 years ago | on: How to Design a Scalable Rate Limiting Algorithm
bob_roboto | 8 years ago | on: Show HN: All Politicians in Switzerland
bob_roboto | 8 years ago | on: Ask HN: Writing cover letters for tech jobs
bob_roboto | 8 years ago | on: Microsoft Adds an OpenSSH Client to Windows 10
bob_roboto | 8 years ago | on: Microsoft Adds an OpenSSH Client to Windows 10
[0]https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.ht...
Asking for 7 days of on-call to count as the equivalent of 168h of work, i.e. going on vacation for 3 weeks after a week of being on-call, is just as unreasonable in most situations as companies asking their staff to be on-call without additional compensation.