davidstrauss | 12 years ago
It's hard to say whether a shared page cache is a good thing, even if it may be unfair. I say this because exhausting I/O bandwidth is a huge issue.
For example, using cgroups to limit memory allocations for groups of processes seemed like a great way to fairly distribute memory. But doing so forced such cgroups into swapping when they tried to exceed their limits, even when there was available memory on the host system. The swapping was so bad in terms of saturating disk I/O that we had two choices to maintain quality of service: (1) set a hard limit and OOM kill (or equivalent) within the cgroup when it gets exceeded, or (2) not treat it as a hard limit and monitor usage separately. We chose the latter.
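For context, the two options roughly correspond to different knobs in the cgroup v1 memory controller. This is just a sketch (the cgroup name is hypothetical, it needs root and a mounted memory controller, and exact behavior varies by kernel version):

```shell
# Hypothetical per-tenant cgroup under the v1 memory controller.
CG=/sys/fs/cgroup/memory/tenant42
mkdir -p "$CG"

# Option 1: hard limit, OOM kill on breach. Keeping per-cgroup
# swapping off (swappiness=0) means hitting the limit invokes the
# OOM killer inside the cgroup instead of thrashing the host's disks.
echo $((512 * 1024 * 1024)) > "$CG/memory.limit_in_bytes"
echo 0 > "$CG/memory.swappiness"    # don't swap this cgroup's pages
echo 0 > "$CG/memory.oom_control"   # 0 = OOM killing enabled (default)

# Option 2: no hard cap. Set only a soft limit (reclaim pressure
# under global memory contention) and monitor actual usage out-of-band.
echo -1 > "$CG/memory.limit_in_bytes"                 # effectively unlimited
echo $((512 * 1024 * 1024)) > "$CG/memory.soft_limit_in_bytes"
cat "$CG/memory.usage_in_bytes"     # poll this from a monitoring loop
```

Option 2 trades enforcement for visibility: nothing stops a tenant from exceeding the soft limit, but you can alert on `memory.usage_in_bytes` without ever forcing the cgroup into swap.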
So, I honestly wonder: is it better to enforce separate page caches in the cause of fairness, even if it results in less efficient disk I/O? Or is it better to have a unified page cache and dedicate system-wide resources to increasing effective disk I/O bandwidth? (Do we focus on slicing the pie more fairly, or on increasing the size of the pie while cutting sloppily?)
lsc | 12 years ago
In my case (well, really in the flat-rate multi-tenant case in general), fairness is what I care about, more than overall efficiency. From a business perspective, it's okay to not be all things to all people. If you are a cheap-ish flat-rate service, it's okay if your heavy users don't get as much performance as they'd like, as long as the reasonable-usage customers still feel like they are getting what they are paying for.
My core customer, the hobbyist who wants something like a shell for IRC idling, a personal mail server, DNS for personal domains, and a place to experiment or run a development project... generally buys enough RAM to cache all the disk they normally access. I mean, 512MiB goes a long way for your average unix sysadmin. So even if the disk is getting thrashed by heavy users, light users? once their pagecache is warm, they have a fairly responsive system for everyday sorts of things. Under Xen or something else that doesn't share pagecache? once the pagecache is warm, it stays warm. They can log in a week later (assuming they aren't running a bunch of background stuff that's reading/writing disk and churning pagecache) and their /etc/shadow is still in pagecache, and the login is pretty fast.
Back when I was using FreeBSD jails? if the user hadn't logged in for a few hours, the /etc/shadow in their jail had been flushed from cache, and they had to read it from disk again. They were getting terrible service.
From an economic standpoint? in general, if you have a flat-rate service, your light users are where the money is. They pay just as much as your heavy users (as it is flat rate) and they use fewer resources. And really, I think it's fair that if you are a light user paying just as much as the heavy user, you should promptly get your resources on the occasions that you do need them. Heavy users on flat-rate services are going to have less-than-perfect experiences, and this is fair too, I think, so long as expectations of service level were set ahead of time. Heavy users use flat-rate services because it's cheaper than pay-per-use services, usually... but the downside is that they then only get as much of the resources as they can without impacting the light users.
Now, in a pay per use model? the opposite is true. In a pay per use model (like, say, google app engine) the heavy users are the real customers. The light users are sales prospects. So yeah; in that case, you focus on the heavy users (and charge accordingly.)
Personally, I believe this is why shared hosting is seen as so far inferior to platform as a service. Shared hosting is usually billed flat-rate, so the service provider really only has incentive to look after the light users; they are better off if the heavy users go elsewhere. (And really, for five bucks a month, what do you want?) Whereas platform as a service usually means you get billed per use; in that case, it makes a lot of sense for the provider to focus almost all its energy on making things better for the heavy users. Of course, they also charge those heavy users accordingly.