The folks doing the ripping off here probably don't even see it as such. While it's easy to attribute something like this to malice, it's saner to attribute it to staggering incompetence and lack of introspection. Pity the poor fools who are so unable to do something original that they must stoop to claiming someone else's originality as their own.
I've seen exactly this sort of thing before in all sorts of walks of life, from school work to academia, from corporate environments to international politics - a bunch of mediocre, confused, out of their depth but don't know it types take someone (singular or plural) else's idea, barely repackage it, claim it as their own, wholeheartedly convince themselves that it is their own, and run with it - far too often somehow winning the battle of hearts and minds, as when the original author comes along going "Oi!", their readership or allies go "sore loser".
I've learned by this point in life to take people using my work, even if they do so spitefully or in ignorance, as a compliment. Fabricators usually get shown up sooner or later.
They are probably also the same people who complained when their teachers in college gave them "C"s because they didn't properly attribute. I guess some people don't ever learn to be nice citizens.
(Note: Yes, I know the MIT license doesn't require it. I can read. But reading the Medium article, they KNEW they were replacing minimaxir, etc with "The Author". Doing this is really showing their ass.)
> The folks doing the ripping off here probably don't even see it as such.
But I'm pretty sure many of them would if they saw it happening to their original works.
A fair number of times I've seen people accused of copying stuff that someone else copied from elsewhere (certain petty contacts I have on Facebook who find jokes, repeat them without attribution, and don't like them being repeated again similarly without attribution). It often amuses me more than the original joke...
Originally there was the concept of reblogging where one site would publish something original and others would publish a small excerpt with a clearly defined link to them.
From there came the idea of the [via] link where another site published its own article based on information from the original publisher and attributed a small link.
And from there it seems like there is just so much republished content that sites think of the information as just public information and glean from wherever without any sort of attribution.
One thing I've noticed is that those who come out of the woodwork to mitigate the impact of thieving tend to be of the same mentality as the thief.
Yes, a person can normally defend many matters without being attacked as identifying with any one matter. Yet this is one of those subjects, like plagiarism, where you would have to be ignorant or a thug to justify it.
I worked at Broderbund many years ago. In the interview my future boss asked me if I had any game ideas. I mentioned I'd not seen a good game with Smokey the Bear. When I started a couple of weeks later, as I was introducing myself to the other programmers, one mentioned that he was working on a Smokey the Bear pitch that came out of the blue from the boss. I mentioned it was my idea from the interview, and this programmer insisted it couldn't have been because the boss had brought it as his idea.
Fast forward a few months: I had presented 23 game concepts and they were all turned down as "not good game ideas". Two weeks later I stumbled into a meeting where one of those who had turned down my ideas, an artist, was presenting 3 of my ideas as his own. I complained to the boss, who said: "Ideas are free". My retort was: "Sure, but the credit isn't".
That was many years ago and I still find those two people to be disgusting in their attitude.
> I mentioned I'd not seen a good game with Smokey the Bear. When I started a couple of weeks later, as I was introducing myself to the other programmers, one mentioned that he was working on a Smokey the Bear pitch that came out of the blue from the boss.
You could have turned the whole thing into a game. If he's willing to steal a Smokey the Bear idea, just how ridiculous of an idea is he willing to steal?
Maybe this is a lesson for the future for us all: Don't give away ideas unless you've been hired, or someone has signed an NDA. Otherwise he's kinda right: They stole your idea because you had no legal claim to it. Morally they're dicks but hopefully you've learned from this.
Maybe in this situation, if you're going to divulge an idea to someone, do it very publicly (Twitter, company Slack, etc) so all can see you as the source and judge the idea-thieves for themselves.
> I saw no mention of Max Woolf, minimaxir, or any mention of the original visualization by the post author.
I agree that it is good practice to show attribution, and not directly lie saying you wrote something yourself. However the MIT license doesn't require that.
There are a lot of open source licenses that give more protections to the original author. I have a theory that too many people choose MIT just because it is simple, but don't think through what they actually want from a license.
I think that's his point - he doesn't want a license to protect this, at the end he even explicitly states in spite of this he won't change his process - he just wishes people wouldn't be jerks about this kind of thing. It felt, to me, like he doesn't think he should need a license to enforce this kind of behavior, and instead is going to just call out that the guy is being a jerk.
It's also kind of stupid to rip someone off. Assuming all the information here is accurate, this could really come back to bite the guy who ripped him off - someone is going to google what you've done and will end up with this blog post slamming you. If it was an honest mistake, the author should probably take the time to fix the posts and apologize.
Well, actually, the MIT license requires you not strip the attribution if it was present:
"The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."
In this case, it was present in the LICENSE file in the github repository he links to.
Not preserving it is a clear license violation in this case, since they took substantial portions of the software.
That's why I preface the post title with "please." I know that attribution isn't required and I can't enforce it, but pretending the original analysis doesn't exist is spiteful.
He acknowledges that MIT doesn't prevent this in the conclusion:
> As my code is MIT-licensed, there’s nothing legally preventing the use of my code this way. However, I’m not going to change my workflow; I will still open-source my data visualization code, and it will have the same licensing. It wouldn’t be fair to punish others who have made very good use of my code.
> But outright using the code, without any attribution, and claiming it as your own when it blatantly isn’t, is a jerk move.
I've noticed there is a sub-population within open source who believe there is an etiquette to be followed that goes above and beyond the strict license terms. Of course there is no legal obligation to behave according to the oft-unwritten polite rules of behavior around things like attribution and not just making a near carbon copy for personal gain or aggrandizement when these terms are not in the license, and so many ignore them, perhaps even literally being ignorant of these social norms they are violating to the irritation of the original author.
> I agree that it is good practice to show attribution, and not directly lie saying you wrote something yourself. However the MIT license doesn't require that.
False! This is the license:
> The MIT License (MIT)
> Copyright (c) 2015 Max Woolf
> Permission is hereby granted [blabla do whatever you want] subject to the following conditions:
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
That copyright notice must be included and it wasn't.
In meteorology and oceanography we call this type of plot a Hovmöller diagram [0], and I'm sure it goes by many other names, because it's hardly a difficult idea to arrive at. Perhaps you should credit one of those prior authors?
I would normally try to avoid the snark, but, setting the code-duplication issue aside, the rest of your complaints seem a bit thin-skinned, put in the light of the kinds of criticisms you find in normal scientific literature.
See aakilfernandes' comment here [1], for the appropriate way to look at this other person's comments on your analysis.
That was my thought as well. It's a very typical calendar chart. I'm sure there are many implementations in d3 and ggplot. The author is being a bit too dramatic about it.
Author of the Medium post here. First I want to apologize to Max for not linking back to his blog. As he said, I referenced his work and image multiple times and did not provide his name or blog in my post. I was in no way trying to pass this off as a novel idea, as I said multiple times in the write-up.
As for what was perceived to be critiques of his work I was merely trying to emphasize what I was doing differently in my analysis and project. His project is great, I just used a slightly different approach and wanted to highlight the differences. I definitely did not mean to come off as malicious or mocking, but upon rereading I definitely see how it did come off that way.
I stumbled across the code to make the day of week and times here https://github.com/minimaxir/hn-heatmaps/blob/master/hn_heat... after I had started the project. I had posted on StackOverflow http://stackoverflow.com/questions/33263015/converting-day-o... and had gotten an answer that seemed close but not quite perfect. So I researched it a bit, and happened to find that GitHub code. I figured that a couple of lines of code probably weren't worth attributing, but I can see now that I should have been more thorough in my attribution. I have since taken down the article.
I've no intention to address the credit issue, but I think you're getting a little worked up over not much, regarding the mockery part. For example, in /u/dleybzon's comment,
"Since this is a link to my blog post explaining how the visualization was created, I'm not sure if it's necessary to include a comment, but here goes:
"I used BigQuery to query this dataset, and then made the visualization in R using ggplot2. I normalized the data by dividing the total score for that hour of the week by the number of posts posted in that hour. For more info check out the article, or the commented code at the bottom."
I'm unable to see anything that calls for your comment, "As shown above, the quotes from the article itself were written with just as much unnecessary ego."
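For anyone following along, the normalization described in that quote amounts to a per-slot average. Here is a minimal Python sketch of the idea. It is not the Medium author's code (that was written in R); the function name and the `(weekday, hour, score)` tuple layout are assumptions made up purely for illustration.

```python
from collections import defaultdict


def average_score_by_hour_of_week(posts):
    """Return {(weekday, hour): mean score} for (weekday, hour, score) tuples.

    Hypothetical helper: the tuple layout is an assumption, not the real schema.
    """
    total_score = defaultdict(int)
    post_count = defaultdict(int)
    for weekday, hour, score in posts:
        total_score[(weekday, hour)] += score
        post_count[(weekday, hour)] += 1
    # Divide the total score for each slot by the number of posts in that slot.
    return {slot: total_score[slot] / post_count[slot] for slot in total_score}


# Made-up sample data: two posts on Monday 09:00, one on Wednesday 14:00.
sample_posts = [(0, 9, 10), (0, 9, 30), (2, 14, 5)]
averages = average_score_by_hour_of_week(sample_posts)
```

The point of dividing by the post count is that a busy hour does not look "better" just because more posts land in it.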
Yeah, I actually read that quote twice to be sure I wasn't missing anything and didn't detect any of the alleged ego. I had trouble taking the article seriously after that.
I have to agree on this point. If everyone thought the original analysis was the final word, then there would be no need to release the source. Other people are going to think that there was a better way to do things, and that was the point of releasing the code, to facilitate that exploration.
On an unrelated note, some comments on the visualization...
By visualizing the temporal position of posts matching particular criteria, I suspect the author has accidentally just visualized the temporal position of all posts. There is no correlation between the criteria and the visualization.
It's a variant of the heatmap visualization error explained by xkcd [1].
Just to throw another perspective in here: I once wrote a tool for a teacher to detect copied code in CS assignments, for some extra credit. Every step I took to dig deeper just led to more similarity -- things like normalizing variable names. At one point it looked like 80% of the class had turned in the same assignment. Granted, these were toy problems.
The heatmap layout you're saying people copied has been done over and over again by almost everyone who's done data visualization on time series data. Isn't it possible you looked at the same StackOverflow post?
Even if they were copying you, I'd be ecstatic - you've inspired a bunch of people to pick up a dataset and go play around with it. That's a wonderful thing, because the world needs more people doing this.
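To make the variable-name point concrete, here is a rough Python sketch of that kind of normalization. It is not the tool described above; `normalize_identifiers`, `similarity`, and the two sample snippets are hypothetical, and it relies only on the standard library (`tokenize`, `keyword`, `difflib`).

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry layout or comments rather than program logic.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalize_identifiers(source):
    """Tokenize `source`, replacing each non-keyword name with v0, v1, ..."""
    placeholders = {}
    normalized = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # The same name always maps to the same placeholder.
            name = placeholders.setdefault(tok.string, "v%d" % len(placeholders))
            normalized.append(name)
        else:
            normalized.append(tok.string)
    return normalized


def similarity(source_a, source_b):
    """Similarity ratio in [0, 1] between the normalized token streams."""
    matcher = difflib.SequenceMatcher(
        None, normalize_identifiers(source_a), normalize_identifiers(source_b)
    )
    return matcher.ratio()


# Two made-up "submissions" that differ only in naming.
SNIPPET_A = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
SNIPPET_B = "def add_up(items):\n    acc = 0\n    for item in items:\n        acc += item\n    return acc\n"
SNIPPET_C = "def f(xs):\n    return len(xs)\n"
```

With names stripped out, `SNIPPET_A` and `SNIPPET_B` become identical token streams, which is exactly why toy assignments start to look copied: there are only so many ways to write a ten-line loop.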
1) The Medium author fucked up and should have credited you. Even if not legally required, it's the decent thing to do.
2) Complaining about your analysis being 'insulted' comes off as petty. Ideas/analysis aren't sacred and they should be insulted if someone disagrees with them.
I didn't get the feeling he was insulted that the author disagreed with him, but because the author dismissed his analysis as "useless" while then using all his work as a base. Clearly it was not useless and saying so was an unnecessary insult.
> 2) Complaining about your analysis being 'insulted' comes off as petty. Ideas/analysis aren't sacred and they should be insulted if someone disagrees with them.
I have zero problem with criticizing my analysis with contradicting information, but the manner it comes across is relevant.
You can say something is wrong without calling it "a key mistake."
They absolutely should not be insulted if someone disagrees with them. Presumably the people involved are adults, and should be able to disagree in a civil manner.
Max Woolf is not the kind of person I would want to work with, and his complaints are almost incomprehensible. He is very emotional over something trivial, and his accusation that there was "ego" involved in this "incident" has no basis.
I'm surprised and disappointed - Woolf's victim complex is truly something to behold, and I hope it's not a sign of larger changes in our culture.
It is tough to decide whether or not to call out someone when they blatantly steal your code.
On one hand, you want to take the high road and not have to squeal. But on the other hand, if you don't speak up, that person will just keep on stealing from others, in theory.
Personally, I think the best thing to do is to call them out in a public but quiet way that doesn't make it sound like you are offended, e.g. posting a link to your work in the comments with a "Glad you could use my code: (link)". That way they know that you know that they did it, and that anyone who looks at the comments will see but maybe not make the connection that they stole it. This way you get the point across without having to publicly beat them over the head with it.
Think about the type of person motivated to do something like this. They probably aren't happy, they probably lack self esteem. This is a way to validate themselves. Not saying anything about this is right or moral, just saying that maybe you don't put the full name of the guy out there for anyone to blast. I really doubt this came from a malicious place.
edit: guess this is an unpopular opinion. I don't understand why HN lately is so fond of public shaming, as seen in this thread.
I have to be missing something here. Someone mind helping me out?
The code posted in the "infringing" article does not appear to be a copy of the code in the OP's repository. Nor do the images (OPs are green, the "infringing" articles are blue).
The OP's code also does not have the header in the file, which makes innocent infringement a lot easier to do.
It really does look like the "infringing" author was inspired by the OP... but there's nothing being stolen that I can see here.
>This is off-topic, but wait, what? #1138 is a joke about subsets of populations which look the same as the general population because there is no statistically-significant difference between geographies. It’s not a normalization joke. Normalizing per capita would still make the maps look same.
Wait, wouldn't normalization make the map look uniform, and not a population map?
Normalising "subscribers to Martha Stewart Living" and "consumers of furry pornography" by population would give you per-capita values, showing you places where those properties had high incidence per-person (to which you'd have to apply significance tests). I'm pretty sure it is a normalisation joke.
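A tiny Python sketch of what per-capita normalization does to such a map, under made-up numbers: the slot labels, counts, and the `per_capita` helper are all assumptions for illustration, not data from either article.

```python
def per_capita(matching_counts, total_counts):
    """Divide each slot's matching count by that slot's total count."""
    return {
        slot: matching_counts.get(slot, 0) / total
        for slot, total in total_counts.items()
        if total > 0  # skip empty slots to avoid dividing by zero
    }


# Hypothetical data: a busy slot and a quiet slot with the same underlying rate.
total_counts = {"Mon 09:00": 200, "Sat 03:00": 20}
matching_counts = {"Mon 09:00": 20, "Sat 03:00": 2}
rates = per_capita(matching_counts, total_counts)
```

The raw counts differ by a factor of ten, so an unnormalized heatmap would just show where the traffic is. After dividing, both slots have the same rate and the map looks uniform, unless the property really is more common somewhere, which is the part you would then test for significance.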
This reminds me of the argument among comics about joke stealing. Stealing code, ideas or jokes from working programmers or comics can negatively impact the original author, sometimes even causing them to be accused of theft! For a fascinating discussion about the psychology of a thief see Marc Maron's WTF podcast interview with Carlos Mencia (part 1 and 2 - 2 is where it gets interesting.) It's a pathological condition.
A good friend of mine is a comic, and through him I've met other comics. They'll often riff off of each other. Upon hearing something really funny, you'll often hear, "Hey, that's good! Can I steal that?" This is totally accepted and expected behavior among this group of comics. So stealing jokes, per se, is not the problem.
The troublesome part is that there's not a good way for other people outside this group of comics to join in and ask for permission. All they can do is listen to the material live, and then kinda take it if they think they can get away with it.
I'm not sure if this is a problem that needs to be solved, but it's easy to see where these misunderstandings might arise. Everyone copies and steals. Sometimes it's unclear or uncool to ask for permission.
This is all separate from the main article here, where I think it's clear the copier is a huge asshole. I also need to listen to that podcast.
Good read. This has made me realize I should probably reevaluate the license of my open source work and instead of using MIT I think I may transition to use Apache License 2.0 (which sounds like a better fit for the author...I think; I am not a licensing expert).
The offending Medium post has been taken down, and the "copy" author removed the post from his Facebook (or at least made the post private). Looks like he may have deleted his Reddit account too.
And from one comment on Medium, the guy was also trying to get help covering his tracks on Stack Overflow by cleaning up some code [0]. Seems like he outsourced pretty much every part of this ripoff.
[+] [-] madaxe_again|10 years ago|reply
[+] [-] Terr_|10 years ago|reply
[0] http://nedroidcomics.tumblr.com/post/41879001445/the-interne...
[+] [-] JustSomeNobody|10 years ago|reply
[+] [-] dbbolton|10 years ago|reply
I.e. Hanlon's razor: https://en.wikipedia.org/wiki/Hanlon%27s_razor
[+] [-] dspillett|10 years ago|reply
[+] [-] stevesearer|10 years ago|reply
[+] [-] hitekker|10 years ago|reply
[+] [-] FLengyel|10 years ago|reply
[+] [-] Macsenour|10 years ago|reply
[+] [-] djloche|10 years ago|reply
[+] [-] Overtonwindow|10 years ago|reply
[+] [-] prawn|10 years ago|reply
[+] [-] vinceyuan|10 years ago|reply
[+] [-] thomasahle|10 years ago|reply
[+] [-] lexicalscope|10 years ago|reply
[+] [-] DannyBee|10 years ago|reply
[+] [-] mintplant|10 years ago|reply
That's not true.
https://github.com/minimaxir/reddit-bigquery/blob/master/LIC...
> Copyright (c) 2015 Max Woolf
[...]
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
[+] [-] minimaxir|10 years ago|reply
[+] [-] ohitsdom|10 years ago|reply
[+] [-] patja|10 years ago|reply
[+] [-] madisp|10 years ago|reply
[+] [-] IkmoIkmo|10 years ago|reply
[+] [-] mrow84|10 years ago|reply
[0] https://en.wikipedia.org/wiki/Hovm%C3%B6ller_diagram
[1] https://news.ycombinator.com/item?id=10459559
[+] [-] rodionos|10 years ago|reply
[+] [-] apologizer|10 years ago|reply
[+] [-] mcguire|10 years ago|reply
[+] [-] davesque|10 years ago|reply
[+] [-] GhotiFish|10 years ago|reply
[+] [-] minimaxir|10 years ago|reply
[+] [-] IanDrake|10 years ago|reply
Do we really need more public shaming without at least a civil discussion between the belligerents? Remember Adria Richards anyone?
[+] [-] cjensen|10 years ago|reply
[1] https://xkcd.com/1138/
[+] [-] hharnisch|10 years ago|reply
[+] [-] aakilfernandes|10 years ago|reply
[+] [-] ViViDboarder|10 years ago|reply
[+] [-] minimaxir|10 years ago|reply
[+] [-] s73v3r|10 years ago|reply
[+] [-] lazzlazzlazz|10 years ago|reply
[+] [-] Gigablah|10 years ago|reply
There, I paraphrased your post and threw it back at you. How do you feel now?
[+] [-] lighthawk|10 years ago|reply
[+] [-] thieving_magpie|10 years ago|reply
[+] [-] falcolas|10 years ago|reply
[+] [-] iamwil|10 years ago|reply
[+] [-] ikeboy|10 years ago|reply
[+] [-] mrow84|10 years ago|reply
[+] [-] michaelwww|10 years ago|reply
[+] [-] verisimilidude|10 years ago|reply
[+] [-] bachmeier|10 years ago|reply
[+] [-] BinaryIdiot|10 years ago|reply
[+] [-] mayoff|10 years ago|reply
[+] [-] ohitsdom|10 years ago|reply
[0] http://stackoverflow.com/questions/33263015/converting-day-o...