Felk | 1 year ago

The slowdown wasn't due to a lot of permutations, but mostly because a) wget just takes a considerable amount of time to process large HTML files with lots of links, and b) MyBB has a "threaded mode", where each post of a thread gets a dedicated page with links to all other posts of that thread. The largest thread had around 16k posts, so that's 16k² links to parse.
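To put a rough number on that, here is a back-of-the-envelope sketch. It assumes exactly 16,000 posts (the real figure was only "around 16k") and one link per post on each threaded-mode page:

```shell
# Assumption: 16000 posts, each with its own threaded-mode page
posts=16000

# Each of those pages links to every post in the thread,
# so wget ends up parsing posts * posts links for this one thread.
links=$((posts * posts))
echo "$links"   # 256000000
```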

In terms of possible permutations, MyBB is thankfully pretty tame. Only the forums are sortable, and posts only have the regular view and the aforementioned threaded mode. Even the calendar widget only covers 1901 to 2030; otherwise wget might have crawled forever.

I originally considered excluding threaded mode using wget's `--reject-regex` and then just adding an nginx rule later to redirect any incoming threaded-mode links to the normal view mode. Basically just saying "fuck it, you only get this version". That might be worth a try for your case.
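A minimal sketch of what the wget side of that could look like. I never actually ran it this way, so treat it as untested. It assumes threaded-mode URLs carry `mode=threaded` in the query string (check what your install actually emits); `forum.example.com` and the `would_reject` helper are placeholders for illustration only:

```shell
# Assumption: threaded-mode URLs look like showthread.php?tid=1&pid=2&mode=threaded
REJECT_REGEX='[?&]mode=threaded'

# Hypothetical helper: exits 0 if the URL matches the reject pattern.
# wget's --reject-regex defaults to POSIX regexes, so grep -E is a
# reasonable way to sanity-check the pattern before a long crawl.
would_reject() {
  printf '%s\n' "$1" | grep -Eq "$REJECT_REGEX"
}

# The mirror run itself (left commented out; placeholder host):
# wget --mirror --page-requisites --convert-links --adjust-extension \
#      --reject-regex "$REJECT_REGEX" https://forum.example.com/
```

Checking a handful of real URLs from the forum with `would_reject` first is cheap compared to finding out halfway through the crawl that the pattern is too broad or too narrow.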
