avleenvig's comments

avleenvig | 7 years ago | on: Ops School Curriculum (2012)

Hi everyone! I really appreciate the feedback here.

It's true, the project is 5+ years old. The industry was very different back then; we were barely starting to think about containers at scale. The cloud was a thing for sure, but the main contributors were mostly working in places with physical infrastructure.

Over the years we've discussed how we can reboot the project. There is always interest and desire, but time is hard to find :) The project will stay up as it is because we still feel there is value in learning the basics.

Over time the understanding of what counts as "basics" has been evolving. This year I'm chairing the SREcon conference in Singapore. The program has a "Core Principles" track which covers things like deep dives on Linux memory management, database indices, how BGP works, etc. There is still a lot of desire for these super deep dives into underlying technology. I hope we can find a way to turn this into a stronger curriculum with labs and teaching exercises one day :)

- One of the original OpsSchool founders

avleenvig | 8 years ago | on: The Senior Engineer’s Guide to Helping Others Make Decisions

Indeed. As your business grows (it is growing, right - that's why you hired more engineers?) you need to scale yourself. If you don't, eventually you'll become the bottleneck.

Following this process has a definite cost up front. You have to give up doing some work in order to help someone else grow and learn the system. But once they do, your business now has an extra version of you. Not as experienced, but hopefully still pretty good. Let that person handle more work, trust them, and hire another person. Rinse and repeat.

The truth is that when you get more senior, you almost have to stop focusing solely on knocking out code or fixing problems yourself, and commit to helping others learn how to do it.

If you become the bottleneck, the business will start to work around you and then you will become irrelevant.

avleenvig | 8 years ago | on: The Senior Engineer’s Guide to Helping Others Make Decisions

Thanks, glad you liked it :-)

There's definitely a strong bias in our industry towards "greenfield" things - everyone wants to do the fun, exciting, initial work. But once the new shiny coating has worn off there's a ton of really hard work to do with resilience, robustness, scaling, etc. Too many people just give up at that point, get bored, and do something else.

avleenvig | 8 years ago | on: The Senior Engineer’s Guide to Helping Others Make Decisions

I completely agree, and in writing this post it was one of the pieces of feedback that came up. Ultimately we decided to go ahead with the verbiage because it would be the most easily understood, but the _message_ here is not about junior vs senior or organisational hierarchies. It's about peers and interactions between more and less experienced persons.

I interact with people daily who are far more "senior" to me on specific topics, and I to them on other topics.

avleenvig | 13 years ago | on: Bashttpd - An http server in bash

There are many benefits to projects like this. I would argue that this thread is the biggest benefit - people are sharing knowledge on shell scripting, and I'll bet you one hamburger that at least 5 people learn something new as a result.

That makes it worth it :-)

It also gets people thinking, being creative and doing something "fun". Sometimes you just have to do fun things and see where they lead.

avleenvig | 13 years ago | on: Bashttpd - An http server in bash

It depends where you draw the line. CGI scripts were passed many variables and an environment by the web server. In that respect, this is closer to a web server - the only thing it doesn't handle is the very lowest level network listening.
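To illustrate the distinction, here is a minimal sketch of what a CGI-style program typically relies on: the web server has already parsed the request and exposed it as environment variables. The variable names (REQUEST_METHOD, PATH_INFO, QUERY_STRING) come from the CGI specification; the function name `describe_request` is made up for this example and is not part of Bashttpd.

```python
import os


def describe_request(environ=os.environ):
    """Summarize a request from CGI-style variables set by the web server.

    A CGI program never parses the raw HTTP request line itself; it only
    reads what the server placed in its environment.
    """
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    # Re-assemble a readable request summary from the pre-parsed pieces.
    return f"{method} {path}" + (f"?{query}" if query else "")
```

A program that has to read and parse the request line and headers itself, as this project does, is doing the web server's side of that contract.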

avleenvig | 13 years ago | on: Bashttpd - An http server in bash

It's likely I'll want to do some more complex stuff in bash, rather than pure sh. My preferred shell is zsh but of the more advanced shells, bash is the most prevalent.

I've pushed the change to use /usr/bin/env bash :-)

avleenvig | 13 years ago | on: What powers Etsy

RO filesystems can be bad, but usually they're soft failures for us:

* Memcache can still work just fine
* Db servers stop responding (and the app handles that fairly gracefully)
* Web servers serve files from a RAM disk, so they keep working

No reason against 10G copper specifically - we haven't had to address the problem in detail yet. When we do, we might choose copper. Depends what happens when it happens :-)

We had a very nasty incident a few months ago, where the drive in one of the LDAP servers died. Well, it sort-of died. It started to time out a lot but didn't go fully offline. OpenLDAP kept running, but when you connected to it, the TCP connection would open and hang. This meant that all of our servers saw the LDAP server as "OK", but LDAP stopped working and caused all kinds of brilliant havoc :-)
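A rough sketch of why this kind of failure slips past simple monitoring, assuming the check only tests whether the port accepts a connection. The function names and the probe bytes are hypothetical and are not what we actually ran; a real LDAP check would perform a bind or search rather than send arbitrary bytes.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Shallow check: succeeds as soon as the TCP handshake completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_responds(host, port, probe, timeout=1.0):
    """Deeper check: send a probe and require at least one byte back in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(probe)
            # A hung service accepts the connection but never answers,
            # so recv() raises a timeout (a subclass of OSError).
            return len(conn.recv(1)) > 0
    except OSError:
        return False
```

Against a server in the state described above, the first check would keep reporting success while the second would fail, which is the signal you want for pulling the server out of rotation.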

avleenvig | 13 years ago | on: What powers Etsy

Honestly, we find it's much easier to spread the load between many switches, and have enough capacity that one switch failure is a non-event. Switches are expensive, it's true. After a certain point it does become less expensive and less complex to just add more servers and switches. There's a strong advantage to keeping things as simple as possible. Bonding isn't really complicated, but how many not-complicated things can you add before things become complicated? :-)

avleenvig | 13 years ago | on: What powers Etsy

That sounds really good in theory, but in practice it's less good. There are a number of different types of costs to consider:

1. The cost of server hardware
2. The cost of unused hardware capacity
3. The administration cost (people, skills, etc)
4. Opportunity cost

The Cloud(tm) excels at addressing some of these. You don't have to pay /as/ highly for staff to manage your machines and network (the provider does much of that for you). You can run smaller, cheaper instances much closer to their limits.

The downside is that you pay a premium to the provider (even if you are your own provider). Additionally you incur a significant opportunity cost. If you use the "excess" capacity on the weekend for Hadoop jobs, and need to stop them because you have a large burst of traffic, you've hurt yourself. Your non-production Hadoop jobs can also have unexpected and unintended impact on your production web servers, causing your users pain.

Given these things, if your focus is on keeping the best experience for users then you should split things apart.

The actual gain you get from cramming as much as you can onto one piece of hardware is much, much lower than you might expect. It ends up being easier just to get more hardware and dedicate that hardware to specific tasks.

- Avleen Vig, Staff Operations Engineer, Etsy
