The package with the most versions still listed on PyPI is spanishconjugator [2], which consistently published ~240 releases per month between 2020 and 2024.
Regarding spanishconjugator, commit ec4cb98 has the description "Remove automatic bumping of version".
Prior to that commit, a cron job ran the 'bumpVersion.yml' workflow four times a day, which in turn executed the bump2version Python module to increase the patch level. [0]
Tangential, but I've only heard about BigQuery from people surprised by gargantuan bills for running one query on a public dataset. Is there a "safe" way to use it with a cost limit, for example?
I decided my life could not possibly go on until I knew what "elvisgogo" does, so I downloaded the tarball and poked around. It's a pretty ordinary numpy + pandas + matplotlib project that makes graphs from CSV files. One line jumped out at me:
str_0 = ['refractive_index','Na','Mg','Al','Si','K','Ca','Ba','Fe','Type']
the university of st. andrews has a laser named "elvis" that goes on a remote controlled submarine: https://www.st-andrews.ac.uk/~bds2/elvislaser.htm
I was hoping it'd be about go-go dancing to elvis music, but physics experiments on light in seawater is pretty cool too.
> spanishconjugator [2], which consistently published ~240 releases per month between 2020 and 2024
They also stopped updating major and minor versions after hitting 2.3 in Sept 2020. Would be interesting to hear the rationale behind the versioning strategy. Feels like you might as well use a datetimestamp for the version.
The author has run into the same problem that anyone who wants to do analysis on the NPM registry runs into: there's just no good first-party API for this stuff anymore.
It seems this was their first time going down this rabbit hole, so for them and anyone else, I'd urge you to use the deps.dev Google BigQuery dataset [0] for this kind of analysis. It does indeed include NPM and would have made the author's work trivial.
I hate to deride the entire community, but many of the collective community decisions are smells. I think that the low barrier to entry means that the community has many inexperienced influential people.
The Julia General registry is locally stored as a tar.gz and has version info for all registered packages, so I tried this out for Julia packages. The top 5 are:
So, no crazy numbers or random unknown packages, all are major packages that have just had a lot of work and history to them. Out of the top 10, pretty much half were from the SciML ecosystem.
Caveats/constraints: Like the post, this ignores non-SemVer packages (which mostly used date-based versions) and also jll (binary wrapper) packages which just use their underlying C libraries' versions. Among jlls, the largest that isn't a date afaict is NEO_jll with 25.31.34666+0 as its version.
Incidentally I once ran into a mature package that had lived in the 0.0.x lane forever and treated every release as a patch, racking up a huge version number, and I had to remind the maintainer that users depending with caret ranges won't get those updates automatically. (In semver caret ranges never change the leftmost non-zero digit; in 0.0.x that digit is the patch version, so ^0.0.123 is just a hard pin to 0.0.123). There may occasionally be valid reasons to stay on 0.0.x though (e.g. @types/web).
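That caret rule can be checked with a minimal sketch — this reimplements the rule for illustration only and is not the npm `semver` package:

```javascript
// Does `version` satisfy the caret range `^base`? A caret range never
// changes the leftmost non-zero component of the base version.
function satisfiesCaret(version, base) {
  const v = version.split('.').map(Number);
  const b = base.split('.').map(Number);
  if (v.some(Number.isNaN) || b.some(Number.isNaN)) return false;
  // Index of the leftmost non-zero component (for 0.0.x it's the patch).
  const pivot = b.findIndex(x => x !== 0);
  const lock = pivot === -1 ? 2 : pivot;
  // Locked components must match exactly...
  for (let i = 0; i <= lock; i++) if (v[i] !== b[i]) return false;
  // ...and the remaining components must compare >= the base's.
  for (let i = lock + 1; i < 3; i++) {
    if (v[i] > b[i]) return true;
    if (v[i] < b[i]) return false;
  }
  return true;
}

console.log(satisfiesCaret('1.124.0', '1.123.0')); // true: minor bump allowed
console.log(satisfiesCaret('0.1.1', '0.1.0'));     // true: patch bump allowed
console.log(satisfiesCaret('0.0.124', '0.0.123')); // false: ^0.0.123 is a hard pin
```

So a package parked at 0.0.x never ships an automatic update to caret-range dependents, which is exactly the trap described above.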
Anthony Fu’s epoch versioning scheme (to differentiate breaking change majors from "marketing" majors) could yield easy winners here, at least on the raw version number alone (not the number of sequential versions released):
The "winner" just had its 3000th release on GitHub, already a few patch versions past the version referenced in this article (which was published today): https://github.com/wppconnect-team/wa-version
I made a fairly significant (dumb) mistake in the logic for extracting valid semver versions. I was doing a falsy check, so if any of major/minor/patch in the version was a 0, the whole package was ignored.
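A hedged sketch of the kind of bug described — the function and variable names here are assumptions, not the post's actual code:

```javascript
// Buggy: a falsy check treats a 0 component (e.g. the 0 in 3.888.0)
// as "missing", so the whole package gets dropped.
function isValidSemverBuggy(major, minor, patch) {
  return Boolean(major && minor && patch);
}

// Fixed: test each component explicitly as a non-negative integer.
function isValidSemverFixed(major, minor, patch) {
  return [major, minor, patch].every(n => Number.isInteger(n) && n >= 0);
}

console.log(isValidSemverBuggy(3, 888, 0)); // false: wrongly rejected
console.log(isValidSemverFixed(3, 888, 0)); // true
```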
Brief reminder/clarification that these tools circumvent the WhatsApp ToS, and that they are used to:
1- Spam
2- Scam
3- Avoid paying for the WhatsApp API (which is the only form of monetization)
And the reason this thing gets so many updates is probably a cat and mouse game: Meta updates their software continuously to block these types of hacks, and the maintainers keep pace, whether in automated or manual fashion.
> Time to fetch version data for each one of those packages: ~12 hours (yikes)
The author could improve the batching in fetchAllPackageData by not waiting for all 50 (BATCH_SIZE) promises to resolve at once. I just published a package for proper promise batching last week: https://www.npmjs.com/package/promises-batched
Just spin up a loop of 50 call chains. When one completes you just do the next on next tick. It's like 3 lines of code. No libraries needed. Then you're always doing 50 at a time. You can still use await.
async function work() { await thing(); process.nextTick(work); }
for (let i = 0; i < 50; i++) work();
then maybe a separate timer to check how many tasks are active I guess.
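The loop-of-50-chains idea can be written out as a small fixed-concurrency pool. This is a minimal sketch, not the author's script or any particular library; names like `pool` and `tasks` are illustrative:

```javascript
// Run `tasks` (an array of functions returning promises) with at most
// `limit` in flight. Each worker pulls the next task the moment its
// current one resolves, so slow requests never stall a whole batch.
async function pool(tasks, limit = 50) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++;            // claim an index; safe, JS is single-threaded
      results[i] = await tasks[i]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}
```

Usage would look like `await pool(names.map(name => () => fetchPackageData(name)), 50)`, where `fetchPackageData` is a hypothetical stand-in for the registry call.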
> I was recently working on a project that uses the AWS SDK for JavaScript. When updating the dependencies in said project, I noticed that the version of that dependency was v3.888.0. Eight hundred eighty eight. That’s a big number as far as versions go.
It also isn’t the first AWS SDK. A few of us in… 2012 IIRC… wrote the first one because AWS didn’t think node was worth an SDK.
There are plenty of larger ones and plenty of ones that used the date as the version, but I was mainly curious about packages that followed semver.
Any package version that didn't follow the x.y.z format was excluded, and any package that had fewer published versions than its largest version number was excluded (e.g. a package at version 1.123.0 should have at least 123 published versions).
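Those two filters could be sketched roughly like this — the names and the exact definition of "largest version number" are my assumptions, not the post's actual code:

```javascript
// Keep a package only if (a) it has versions in plain x.y.z form, and
// (b) its release count is plausible given its largest version component.
const SEMVER = /^(\d+)\.(\d+)\.(\d+)$/;

function keepPackage(versions) {
  const parsed = versions
    .map(v => v.match(SEMVER))
    .filter(Boolean)
    .map(m => m.slice(1, 4).map(Number));
  if (parsed.length === 0) return false;        // nothing in x.y.z form
  const largest = Math.max(...parsed.flat());   // largest single component seen
  return parsed.length >= largest;              // 1.123.0 needs >= 123 releases
}
```

This is what screens out squatters who publish a single 99999.99999.99999 release: one version on record can't justify a component that large.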
Well, we are looking at npm packages, where every package is supposed to follow semantic versioning. The fact that we don't see dates as version numbers means everyone is a good citizen.
Haha, good luck finding a real project that holds that title. It's always some squatted name, a dependency confusion experiment, or a troll publishing a package with version 99999.99999.99999 just to see what breaks. The "king" of that hill changes all the time. Just another day in the NPM circus.
`latentflip-test` is from the same fellow who did the "What the heck is the event loop anyway?" JSConf EU talk that many have seen. https://youtu.be/8aGhZQkoFbQ
stabbles|5 months ago
[1] https://console.cloud.google.com/bigquery?p=bigquery-public-...
[2] https://pypi.org/project/spanishconjugator/#history
Rygian|5 months ago
Edit: discussed here: https://github.com/Benedict-Carling/spanish-conjugator/issue...
[0] https://github.com/Benedict-Carling/spanish-conjugator/commi...
breakingcups|5 months ago
thesystemisbust|5 months ago
The underlying dataset is hosted at sql.clickhouse.com e.g. https://sql.clickhouse.com/?query=U0VMRUNUIGNvdW50KCkgICBGUk...
disclaimer: built this a while ago, but we maintain it at ClickHouse
oh and rubygems data is also there.
passivegains|5 months ago
n4r9|5 months ago
0x500x79|5 months ago
jonchurch_|5 months ago
Here's a gist with the query and the results https://gist.github.com/jonchurch/9f9283e77b4937c8879448582b...
[0] - https://docs.deps.dev/bigquery/v1/
bapak|5 months ago
This is insane
BobbyTables2|5 months ago
dotancohen|5 months ago
sundarurfriend|5 months ago
dotancohen|5 months ago
int_19h|5 months ago
aragonite|5 months ago
robin_reala|5 months ago
BobbyTables2|5 months ago
jve|5 months ago
franky47|5 months ago
https://antfu.me/posts/epoch-semver
bapak|5 months ago
I wonder why. Conventions that are being broken, maybe.
nosefurhairdo|5 months ago
genshii|5 months ago
The post has been updated to reflect this.
TZubiri|5 months ago
oconnore|5 months ago
whilenot-dev|5 months ago
winrid|5 months ago
1gn15|5 months ago
genshii|5 months ago
nailer|5 months ago
athrowaway3z|5 months ago
> carrot-scan -> 27708 total versions
> Command-line tool for detecting vulnerabilities in files and directories.
I can't help but feel there is something absurd about this.
Taek|5 months ago
EdSchouten|5 months ago
genshii|5 months ago
rs186|5 months ago
https://docs.npmjs.com/about-semantic-versioning
tedk-42|5 months ago
If this was an actual measurement of productivity, that bot deserves a raise!
bmn__|5 months ago
Bigliest, boomiest version is 3735928560 from https://metacpan.org/dist/Acme-Boom
geetee|5 months ago
tantalor|5 months ago
But what if it was "all-the-package-names-that-do-not-reference-themselves"?
joeyhage|5 months ago
AWS still made the top 50
zastai0day|5 months ago
paulirish|5 months ago
kubatyszko|5 months ago