Why does it not suffice to examine 'pip install Django' metrics from PyPI? That would be a reliable indicator of the package's popularity relative to other packages, on a level playing field.
While it would overcount the number of true installations of projects using Django, judging by the number of times I spin up a VM for testing, I would still argue that would be a better metric than a custom GA integration for which you'd have no relevant point of comparison. Even if they were to make this opt-out, what would they compare it to?
A: "Based on our custom GA developer tracking, we count 400,000 new Django projects this month."
B: "Django is the 4th most frequently installed third party Python package, based on the Python package index."
Personally I'd trust statement B more than A. No one can independently verify statement A.
As others note, the issue with PyPI and other download metrics is that there are tools which frequently download the full set of requirements for a project in order to run their tasks, and those downloads shouldn't be counted.
Before reading the article I was very much against it. After reading it, it seems a bit more reasonable, but I would strongly prefer two things if this ever gets implemented: 1) don't use Google. There must be services that will provide analytics pro-bono/really cheaply for open source/non-profits that aren't tied to a company with terrible privacy track records like Google. 2) make it abundantly clear that this will happen and give explicit opt-out instructions the first time (if it is indeed opt-out). As in, "We are enabling user tracking for better usage statistics. If you would like to opt out, please type <...>." I rarely read changelogs, so anything not presented to me at installation time would probably sneak in, whether or not that is the Django team's fault. My worry is that notifying users directly isn't easily possible through pip installations/setup.py.
However, would it be very useful statistically compared to the PyPI installation numbers? Sure, Python differs from npm in that npm almost always installs packages locally whereas Python installs globally by default, but the numbers must still be high, as Django is likely one of the most-installed packages on PyPI and in Python-land in general. And as czep points out, because they would only be tracking themselves, it would be hard to compare the numbers to anything. It would be useful as a total count, but of no use for comparison with other packages because the kind of data would be different: Django would have usage statistics whereas PyPI has installation/download statistics.
I'm also surprised this is even necessary, since the main purpose is supposedly to be able to talk to potential investors for the DSF with concrete numbers. Is Django being familiar to basically every Python developer not enough? Before I'm more open to the proposal, I'd really want to know whether investors have explicitly said they want usage data, rather than relying on the nebulous idea that it may make it easier to raise money.
> 1) don't use Google. There must be services that will provide analytics pro-bono/really cheaply for open source/non-profits that aren't tied to a company with terrible privacy track records like Google.
As an occasional Django user -- 100% on this. It's nothing difficult to store and persist some key-value pairs from a POST request, certainly doesn't require Google Analytics.
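To make that concrete, a rough sketch of such a collector, using only the Python standard library, could look like the following. The names `open_store`, `store_pairs`, `CollectHandler` and the `events` table are made up for illustration; none of this is part of Django or of any proposal discussed here.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with one key/value table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (key TEXT NOT NULL, value TEXT NOT NULL)"
    )
    return conn


def store_pairs(conn, pairs):
    """Persist every key/value pair of a dict; return the number of rows written."""
    rows = [(str(key), str(value)) for key, value in pairs.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO events (key, value) VALUES (?, ?)", rows)
    return len(rows)


class CollectHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: accepts a JSON object in a POST body and stores it."""

    conn = None  # assign a connection from open_store() before serving

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            pairs = json.loads(self.rfile.read(length) or b"{}")
            if not isinstance(pairs, dict):
                raise ValueError("expected a JSON object")
        except ValueError:  # also covers malformed JSON
            self.send_response(400)
            self.end_headers()
            return
        store_pairs(self.conn, pairs)
        self.send_response(204)
        self.end_headers()
```

To try it locally, one would set `CollectHandler.conn = open_store("events.db")` and serve the handler with `http.server.HTTPServer`. A real deployment would also need rate limiting and a decision about what not to log (IP addresses, for instance), which this sketch ignores.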
I use Django professionally, and if tracking usage helps guide development or attract sponsors to achieve higher quality -- I'm all for it.
There is a problem to be solved (how to make OSS sustainable), and I'm both interested in solving that problem and trying different approaches to solve it.
(edited for less use of the phrase "I'm all for it")
Personally I'm not opposed to a popcon-style thing that just lets us estimate "X million people use Django". But it's increasingly looking like it's impossible to put together such a thing in a way that's both A) useful and B) not going to cause privacy issues.
So even if we do have an accurate usage count, say 10 million, so what? What's the Foundation's plan to get funding?
I think they should run an annual campaign like Mozilla and Wikipedia. The spending should be 100% transparent. I am not really sure why we need a Foundation. I get the hosting cost, and rewarding people to work on very difficult features and enhancements, but what else? Conference costs and scholarships? What else?
I use Django and don't mind being tracked if it helps development. However, the proposed tracking sounds like hit tracking, which doesn't give you any meaningful numbers, only trends. So I think tracking pip installs would give you the same trends.
I agree though. The Django (and Python) community in my experience has been good at actually debating issues on their merits, and at trying to keep their own feelings and opinions out of it when there are no facts to back them up. Of course this doesn't always work, and there are always going to be some comments that don't follow those principles, especially on more controversial topics.
jezdez's proposal seems rather reasonable: just force the user to explicitly select yes or no. That gets over the objection that people will be too lazy to opt in, since the effort is the same. And it removes another source of bias, which is the disabling of the tracking by redistributors like Debian, since the user does provide explicit permission.
If forced on screen with an honest message, people will just opt out in droves and make the numbers as useless as the PyPI-download ones.
This seems such a huge waste of time and effort. If they can't get funding by showing massive PyPI numbers, they won't get funding by showing massive startapp numbers.
I allow both Eclipse and Firefox DE to collect usage and bug information during my use of those systems ... I feel there are a few keys to making this decision for both platforms:
- I can opt out if I want to
- I can see what's sent if I want to
- The information is anonymized and aggregated
I would assume that Django developers would feel the same way as I do if there were these guarantees - that it's also in my interest for the software to improve.
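Those three guarantees could be sketched roughly as follows. Everything here is an assumption for illustration: the `DJANGO_NO_TRACKING` environment variable, the function names and the payload fields are hypothetical and not part of Django. The command list is the one quoted from the article elsewhere in this thread. Aggregation would happen on the receiving side and is not shown.

```python
import json
import os

# Commands the article says would be tracked; anything else is never reported.
TRACKED_COMMANDS = {"startproject", "startapp", "runserver"}


def build_report(command, framework_version, environ=None):
    """Return the payload that would be sent, or None if nothing may be sent."""
    environ = os.environ if environ is None else environ
    # Hypothetical opt-out switch: a single environment variable.
    if environ.get("DJANGO_NO_TRACKING") == "1":
        return None
    if command not in TRACKED_COMMANDS:
        return None
    # Anonymized: no hostname, username, path or project name in the payload.
    return {"command": command, "version": framework_version}


def preview(report):
    """Render exactly what would be sent, so the user can inspect it first."""
    return json.dumps(report, indent=2, sort_keys=True)
```

Keeping the payload construction separate from the network call means the "see what's sent" guarantee is just a matter of printing `preview(report)` before (or instead of) sending it.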
What if, instead of tracking, they added micropayments? Have a very simple way to donate $1 every time you run startapp or something like that, and boom, profit.
I use Django quite a bit, and would immediately disable any such tracking mechanism, even going to the extent of maintaining my own fork if necessary.
Having this on tools (like brew) is sort of OK because you can disable it and not risk having it deployed to production. Having it on a library is senseless, risky in many regards and likely to get it banned from, say, public contracts.
It is also a likely hook for exploitation, but I'll need to see an implementation first. Which I sure hope won't happen.
The problem is that Django itself as a framework, and Python as a slow infrastructure for it, are showing their age. I love Django, but it grows too restrictive as projects get more complicated (the ORM and template rendering, for example), not to mention the slow performance compared to newer languages like Go and Elixir, which is actually Python's responsibility, not Django's.
Django is a monolithic framework that wants to do everything while there are good and even superior alternatives (SQLAlchemy, Jinja2, WTForms), which makes things harder for its developers.
I don't know what that has to do with embedding tracking in code.
If you're trying to make the point that funding is made more difficult and therefore exacerbates the analytics problem, sure. But isn't that out of scope?
czep|9 years ago
ubernostrum|9 years ago
folz|9 years ago
kdeldycke|9 years ago
yladiz|9 years ago
forgotpwtomain|9 years ago
was_boring|9 years ago
ubernostrum|9 years ago
rtpg|9 years ago
The information doesn't seem valuable given the context of this project.
It being GA is a bit bothersome, though it does extract a lot of useful info
msane|9 years ago
edit: why downvote? that's what it says:
> the developer commands: startproject, startapp, runserver
yeukhon|9 years ago
rokosbasilisk|9 years ago
cyberpanther|9 years ago
Walkman|9 years ago
"It is encouraging to see that a community can discuss such issues without heating up too much and shows great maturity for the Django project."
daenney|9 years ago
unknown|9 years ago
[deleted]
icebraining|9 years ago
toyg|9 years ago
twsted|9 years ago
Lazare|9 years ago
Certainly it seems more practical than any of the proposed alternatives suggested here. (Eg, micropayments. Come on, that's not even plausible...)
smoyer|9 years ago
Rondom|9 years ago
toyg|9 years ago
cauterized|9 years ago
How many $10 frameworks would you be willing to pay for if you didn't know you were going to use them?
Would you pay $10 to install Django to spin up a new env to build a pluggable library for it that you intend to open source?
What about $10 to populate your environment each time you run a build on CircleCI?
pryelluw|9 years ago
rantanplan|9 years ago
The very next minute, a free fork would appear.
In an era when almost all similar frameworks are free, charging for it seems like a really bad idea for its future.
JupiterMoon|9 years ago
ris|9 years ago
DanBC|9 years ago
> FWIW I (as the editor of LWN and the author of the article) do not mind the posting of this link. It has brought in 16,000 people (at last count), many of whom are probably unfamiliar with LWN. Some subscriptions have been sold in the process.
> Certainly I don't want large amounts of our content to be distributed this way, but an occasional posting that puts an LWN article at #1 on HN is going to do us far more good than harm.
> (That said, I do appreciate your concern!)
https://news.ycombinator.com/item?id=3793183#3793448
cx1000|9 years ago
icebraining|9 years ago
rcarmo|9 years ago
myf01d|9 years ago
kirkdouglas|9 years ago
wheelerwj|9 years ago