top | item 24006780

(no title)

spacewander | 5 years ago

One of my friends pointed me to this post this morning. The post is awesome and inspirational (it sparked a discussion in our chat group), though I disagree on a few minor points.

> Nginx performance without stats collections is on par with Envoy, but our Lua stats collection slowed Nginx on the high-RPS test by a factor of 3. This was expected given our reliance on lua_shared_dict, which is synchronized across workers with a mutex.

`a factor of 3` seems quite large to me. Maybe you put all your stats in lua_shared_dict? You don't need to synchronize the stats on every request. Since collection typically happens at per-minute frequency, you can keep the stats in a per-worker Lua table and synchronize them to the shared dict once every 5 or 10 seconds.
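
A minimal sketch of that idea, assuming nginx.conf declares `lua_shared_dict stats 10m;` (the dict name `stats`, the key names, and the 5-second default are hypothetical, not anything from the post). Each worker accumulates counters in its own Lua table, so the hot path never touches the mutex, and a recurring timer pushes the accumulated deltas into the shared dict:

```lua
-- Sketch only: per-worker counters, flushed periodically to a lua_shared_dict.
-- Assumes `lua_shared_dict stats 10m;` in nginx.conf ("stats" is a made-up name).

local stats = {}

-- Worker-local accumulator. Each nginx worker runs its own Lua VM,
-- so updating this table needs no locking.
local pending = {}

-- Hot path (e.g. called from log_by_lua*): touches only worker-local memory.
function stats.incr(key, n)
    pending[key] = (pending[key] or 0) + (n or 1)
end

-- Push the accumulated deltas into the shared dict: one locked incr per key
-- per flush, instead of one per request.
function stats.flush(dict)
    local batch = pending
    pending = {}
    for key, delta in pairs(batch) do
        -- incr(key, value, init) creates the key with `init` if it is missing.
        local newval, err = dict:incr(key, delta, 0)
        if not newval then
            -- Keep the delta for the next flush rather than dropping it.
            pending[key] = (pending[key] or 0) + delta
        end
    end
end

-- Call once per worker (e.g. from init_worker_by_lua*).
function stats.start(interval)
    local dict = ngx.shared.stats
    return ngx.timer.every(interval or 5, function(premature)
        -- `premature`, if set, means the worker is exiting; flushing is still fine.
        stats.flush(dict)
    end)
end

-- When saved as a module file (e.g. a hypothetical stats.lua), end with: return stats
```

With this saved as a module, one would call `start()` from `init_worker_by_lua_block` and `incr("requests")` from `log_by_lua_block`. The trade-off is that the shared dict lags by up to one interval, and deltas that have not been flushed yet are lost if a worker crashes.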

It looks like the Nginx being compared is configured by a system that has survived for years and is not up to date. The company I worked for used a single virtual server to handle all traffic and routed it dynamically with Lua code. The upstream is chosen by Lua code too, so there is no need to reload Nginx when a new route or upstream is added. We even implemented an 'Access Log Service'-like feature so that each user can have her preferred access log (by modifying the Nginx core, of course).

However, I don't think this post is incorrect. Where Envoy surpasses Nginx is in having a more thriving developer community. More features have been added to Envoy than to Nginx in recent years. Not only that, open discussion of Nginx development is rare.

Nginx is an old, slow giant.

SaveTheRbtz|5 years ago

We've made a note about how inefficient our solution was and what the plan to fix it was. Sadly, to get proper stats in nginx we needed two things:

* A C interface for stats, so we would have access to them from C code.

* Instrument all `ngx_log_error` calls so we would have access not only to per-request stats but also various internal error conditions (w/o parsing logs.)

That said, we could indeed just improve our current stat collection in the short term (e.g. like you suggested, with per-worker collection and a periodic lua_shared_dict sync.) But that would not solve the long-term problem of lacking internal stats. We could even go further and pour all the resources that were used for the Envoy migration into nginx customizations, but that would be a road with no clear destination because we would be unlikely to succeed in upstreaming any of that work.

rolls-reus|5 years ago

> `a factor of 3` seems quite large to me. Maybe you put all your stats in lua_shared_dict? You don't need to synchronize the stats on every request. Since collection typically happens at per-minute frequency, you can keep the stats in a per-worker Lua table and synchronize them to the shared dict once every 5 or 10 seconds.

Any pointers on how to achieve this for someone just starting out with Lua and OpenResty? I have the exact same setup (lua_shared_dict) for stats collection and would love to learn a better way.

alinspired|5 years ago

nginx had a cold (by American standards) and conservative community to begin with; the commercial version and F5 ownership likely "closed" it even more

it's a pity that the community never evolved along with nginx's growth and success