brentroose|4 days ago
That's why I built a performance challenge for the PHP community.
The goal of this challenge is to parse 100 million rows of data with PHP as efficiently as possible. The challenge will run for about two weeks, and at the end there are some prizes for the best entries (among the prizes is the much sought-after PhpStorm Elephpant, of which we only have a handful left).
I hope people will have fun with it :)
Tade0|4 days ago
A WordPress instance will happily take over 20 seconds to fully load if you disable caching.
rectang|4 days ago
If you're talking about a WordPress instance with arbitrary plugins running an arbitrary theme, then sure — but that's an observation about those plugins and themes, not core.
As someone who has to work with WordPress, I have all kinds of issues with it, but "20 seconds to load core with caching disabled" isn't one of them.
embedding-shape|4 days ago
tracker1|4 days ago
It seems like something like Vercel or Cloudflare could host the published content side as a worker, serving mostly static content from a larger application, and that would be more beneficial and run better with less risk. Having the app's editing and auth served from the same location is just begging for the issues WP and its plugins have seen.
monkey_monkey|4 days ago
rkozik1989|4 days ago
The_President|3 days ago
gib444|4 days ago
That's a huge improvement! Out of curiosity, how much of it was low-hanging fruit unrelated to the PHP interpreter itself (e.g. parallelism, faster SQL queries, etc.)?
brentroose|4 days ago
A couple of things I did:
- Cursor-based pagination
- Combining insert statements
- Using database transactions to prevent fsync calls
- Moving calculations from the database to PHP
- Avoiding serialization where possible
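For readers who want to see the database-side items in that list in concrete form, here is a minimal sketch. It is not the code from the project discussed here: it uses Python's built-in `sqlite3` module rather than PHP, and the `visits` table, its columns, and both helper functions are hypothetical names chosen for illustration.

```python
import sqlite3

BATCH = 400  # hypothetical batch size; 2 params per row keeps us under SQLite's older 999-variable limit


def insert_batched(conn, rows):
    """Combine many rows into multi-row INSERTs, all inside one transaction."""
    with conn:  # a single commit at the end instead of one commit per statement
        for start in range(0, len(rows), BATCH):
            batch = rows[start:start + BATCH]
            placeholders = ",".join(["(?, ?)"] * len(batch))
            flat = [value for row in batch for value in row]
            conn.execute(
                f"INSERT INTO visits (id, path) VALUES {placeholders}", flat
            )


def iter_keyset(conn, page_size=1000):
    """Cursor-based (keyset) pagination: seek past the last id instead of using OFFSET."""
    last_id = 0  # assumes ids are positive integers
    while True:
        page = conn.execute(
            "SELECT id, path FROM visits WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            return
        yield from page
        last_id = page[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, path TEXT)")
insert_batched(conn, [(i, f"/page/{i % 7}") for i in range(1, 2501)])
total = sum(1 for _ in iter_keyset(conn))
```

The pagination query stays cheap on late pages because the primary key index lets the database jump straight to `id > last_id`, whereas an `OFFSET` query has to walk past all the skipped rows first.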
Joel_Mckay|4 days ago
Depending on the SQL engine, there are many cursor optimizations available in PHP that avoid moving large chunks of data around.
Clean, cached PHP can be fast for parsing REST transactional data, but it is also often used as a bodge language by amateurs. PHP is not slow by default, nor is it meant to run persistently (the low memory use is nice), but it still gets a lot of justified criticism.
Erlang and Elixir are much better for clients/host budgets, but less intuitive than PHP =3
NorwegianDude|3 days ago
contingencies|4 days ago
Leaving aside the obvious point that writing an enormous JSON file is a dubious goal in the first place (really): PHP can be very fast, but this is probably faster to implement in shell with sed/grep or, almost certainly better, by loading the data into SQLite and dumping it out from there. Your optimization path then likely becomes index specification and processing and, after the initial load, potentially query or instance parallelization.
The page confirms SQLite is available.
If the judges whinge and shell_exec() is unavailable, a more acceptable, whinge-tolerant path is to use PHP's SQLite support and then dump to JSON.
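A rough sketch of that load-then-dump route, under loud assumptions: the thread does not give the challenge's input format or the expected JSON shape, so the two-column `path,day` CSV and the nested hit counts below are invented for illustration. It also uses Python's `sqlite3` module rather than PHP's SQLite extension.

```python
import csv
import io
import json
import sqlite3


def csv_to_json(csv_text):
    """Load CSV rows into SQLite, aggregate with SQL, and dump the result as JSON."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per visit, with a path and a day.
    conn.execute("CREATE TABLE visits (path TEXT, day TEXT)")
    with conn:  # load everything in a single transaction
        conn.executemany(
            "INSERT INTO visits VALUES (?, ?)",
            csv.reader(io.StringIO(csv_text)),
        )
    result = {}
    query = (
        "SELECT path, day, COUNT(*) FROM visits "
        "GROUP BY path, day ORDER BY path, day"
    )
    for path, day, hits in conn.execute(query):
        result.setdefault(path, {})[day] = hits
    return json.dumps(result)


sample = "/blog/a,2024-01-01\n/blog/a,2024-01-01\n/blog/b,2024-01-02\n"
output = csv_to_json(sample)
```

Whether this beats a hand-written parser at 100 million rows is not something the sketch settles; it only shows how little code the approach needs.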
If I wanted to achieve this for some reason in reality, I'd have the file on a memory-backed blockstore before processing, which would yield further gains.
Frankly, this is not much of a programming problem, it's more a system problem, but it's not being specced as such. This shows, in my view, immaturity of conception of the real problem domain (likely IO bound). Right tool for the job.
ge96|4 days ago
What takes 5 days to run?
hosteur|4 days ago
slopinthebag|4 days ago
CyberDildonics|4 days ago
lofaszvanitt|4 days ago
user3939382|4 days ago
Where do I get my prize? ;)
brentroose|4 days ago
onion2k|4 days ago
When people say LeetCode interviews are pointless, I might share a link to this post. If that sort of optimization is possible, there is a data structures and algorithms problem in the background somewhere.
nicoburns|4 days ago
They tend to be about the implementation details of specific algorithms and data structures, whereas the important skill in most real-world scenarios is understanding the trade-offs between different algorithms and data structures so that you can pick an appropriate off-the-shelf implementation.
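A small illustration of that kind of trade-off, using nothing beyond Python's built-in containers (the data and the `count_hits` helper are made up for the example): the same membership check gives the same answer against a list or a set, but the list is scanned element by element while the set does a hash lookup.

```python
def count_hits(queries, known):
    """Count how many queries appear in `known`; the cost depends on the container type."""
    return sum(1 for q in queries if q in known)


known_list = list(range(0, 4000, 2))  # `in` scans the list: O(n) per lookup
known_set = set(known_list)           # `in` hashes the key: O(1) on average

queries = range(2000)
hits_list = count_hits(queries, known_list)
hits_set = count_hits(queries, known_set)
```

Neither structure had to be implemented by hand; the only decision was which off-the-shelf one fits the access pattern.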
tuetuopay|4 days ago
slopinthebag|4 days ago