top | item 17540393

Show HN: I ran sentiment analysis on Show HN comments and got the meanest ones

78 points| walz | 7 years ago |hn.walzr.com | reply

64 comments

[+] Nasrudith|7 years ago|reply
It is funny how polite mentions of crashes are regarded so negatively by it compared to messages. Now I wonder what could be said to get an overwhelmingly positive sentiment while being not just rude but downright horrifying. Say, "I hope you enjoy eating your family."

Reminds me of the one account of an essay question grading program's unique flaw where it scored higher with every mention of orangutans with no regard to relevance or possibly even grammar.

[+] tahw|7 years ago|reply
Do you have a source on the orangutan grader? That sounds like a hilarious read.
[+] eruci|7 years ago|reply
I wrote another sentiment analysis tool. https://goo.gl/UoVo51

Ran it across your comment with these results:

Sentiment Analysis: This text is: positive (+0.5)

funny, regarded, crashes, polite, wonder, overwhelmingly, horrifying, rude, positive, enjoy, flaw, regard

The full range of sentiments in this text is: positive 0.25, trust 0.178571428571429, surprise 0.142857142857143, anticipation 0.107142857142857, joy 0.107142857142857, sadness 0.0714285714285714, negative 0.0714285714285714, fear 0.0357142857142857, anger 0, disgust 0

[+] Crespyl|7 years ago|reply
The grading program may have been written by a librarian.
[+] walz|7 years ago|reply
When people copy and paste error messages into the comments, they usually contain stuff like "kill", "fatal" or "stopped" which the analysis thinks is negative
[+] brudgers|7 years ago|reply
Looks like a cool project, but I can't scroll very far down the site before my browser crashes. I've reproduced this several times, here's the terminal output if it helps: $ conkeror https://alpha.trycarbide.com ...... JavaScript strict warning: https://alpha.trycarbide.com/, line 603: SyntaxError: test for equality (==) mistyped as assignment (=)? fault....

To me, this is a good ShowHN comment...on the other hand, it might say something about software error messages.

[+] dsfyu404ed|7 years ago|reply
The further down you scroll the more the negativity becomes hit or miss.
[+] NathanKP|7 years ago|reply
It's not a good comment if you consider that the browser crash they were experiencing wasn't even the website's fault, it was the browser being buggy. The JavaScript error that they copied from the console was an error from inside the Conkeror browser itself (that browser is written mainly in JS). So the OP was complaining to the Show HN creator about the failings of a buggy, non-maintained (in the last 6 years) non-mainstream browser that they choose to use to view the website.

Unfortunately I have seen that type of comment quite a bit on some of the Show HN threads: someone complains that the site doesn't work without JavaScript, or doesn't work with their bizarre non standard web browsing setup.

[+] CodeWriter23|7 years ago|reply
Looks like your algorithm is classifying some neutral-toned, problem-solving type feedback as “mean”. Personally, that’s exactly why I would do a Show HN. Examples:

“Looks like a cool project, but I can't scroll very far down the site before my browser crashes. I've reproduced this several times, here's the terminal output if it helps: $ conkeror https://alpha.trycarbide.com ...... JavaScript strict warning: https://alpha.trycarbide.com/, line 603: SyntaxError: test for equality (==) mistyped as assignment (=)? fault....” -16 sentiment, chriswarbo 2 years ago in reply to "Show HN: Carbide – A New Programming Environment"

“just filed an issue - but the error message is pretty obnoxious for a catch all- bound to the $(window) error event is a catch all error that blames me for not having enough data (56 public repos not enough?) This means that anyone who knows this url and decides to look me up will see a message accusing me of being a non-producer if anything goes wrong with the resume” -14 sentiment, beezee 6 years ago in reply to "Show HN: My Github résumé"

...those were within the first 10. If this similarly neutral-toned problem solving type report makes me mean in your algorithm’s view, that is a label I shall wear with pride.

[+] jancsika|7 years ago|reply
> The most negative ones are shown below.

The message that is the third most negative by this metric is the following:

> Looks like a cool project, but I can't scroll very far down the site before my browser crashes. I've reproduced this several times, here's the terminal output if it helps: $ conkeror https://alpha.trycarbide.com ...... JavaScript strict warning: https://alpha.trycarbide.com/, line 603: SyntaxError: test for equality (==) mistyped as assignment (=)? fault....

That is clearly a false positive.

[+] craftyguy|7 years ago|reply
My favorite is #5 from the top, where a user DMCA'd themselves to get yahoo to delete some website they created previously:

> I used to have a Geocities containing weird bad poetry I wrote when I was a teenager. I forgot about it, until years later I stumbled upon it again. I was embarrassed. I asked Yahoo to delete it. But I'd forgotten the password, and I'd used fake personal details (wrong date of birth) to create the account, and I couldn't remember what the fake info was, so they refused to delete it because I couldn't verify that I was who I said I was. What do I do? I hit on a solution. I decided to DMCA myself. I sent Yahoo a DMCA takedown request for my old Geocities, and straight away it disappeared. Mission accomplished.

Again, not negative at all, IMHO.

[+] repolfx|7 years ago|reply
It's probably picking up on words like "Looks ... cool ... but", "crashes", "I can't", "strict", "warning", "mistyped", "fault".
[+] philipodonnell|7 years ago|reply
My understanding is that the scores from sentiment analysis are indicating the confidence that the message is positive or negative, not the degree of negativity. Can anyone with experience with this particular method comment?

That _is_ a negative comment, the site is crashing the browser.

[+] kunimi|7 years ago|reply
Quick question. How does your sentiment analysis treat the following two sentences?

I fucking hate this thing.

VS

I fucking love this thing.

[+] walz|7 years ago|reply
I just ran this through and "I fucking hate this thing" has a -7 score, and "I fucking love this thing" is -1. "Fucking" and "hate" are negative, but "love" is positive and adds to the score. It would be improved if it could tell the difference: "fuck" itself definitely sounds negative, but "fucking" can mean almost anything.
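
A minimal sketch of how a plain word-list scorer arrives at those two numbers. The weights in `WEIGHTS` and the `score` helper are assumptions for illustration (AFINN-style values chosen to reproduce the scores quoted above), not the site's actual code:

```python
import re

# Hypothetical AFINN-style word weights, assumed for illustration only.
WEIGHTS = {"fucking": -4, "hate": -3, "love": 3}

def score(text: str) -> int:
    """Sum the weight of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(WEIGHTS.get(word, 0) for word in words)

print(score("I fucking hate this thing."))  # -7
print(score("I fucking love this thing."))  # -1
```

Because each word is scored on its own, the intensifier drags the "love" sentence below zero even though the sentence is plainly positive.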
[+] qbrass|7 years ago|reply
>this shit is wayyy too fucking cool....next snapchat!

>-8 sentiment, gailees 5 years ago in reply to "Show HN: Vinepeek - watch the world in realtime in 6 second snippets"

Like that?

[+] anoncoward111|7 years ago|reply
It would have to somehow know that "fucking" in this case is being used as an adverb similar to "really", e.g. "I really love dogs".

Another tough one would be "I don't fucking hate dogs", which actually means you like them. The sentence needs to be parsed together, not word for word :)
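
One common workaround, short of real parsing, is to flip the weight of the word that follows a negator and to treat intensifiers as neutral. A rough sketch, assuming made-up word lists (`WEIGHTS`, `NEGATORS`, `INTENSIFIERS` and `score_with_negation` are all hypothetical names and values):

```python
import re

# Hypothetical lexicon and word lists, assumed for illustration only.
WEIGHTS = {"hate": -3, "love": 3}
NEGATORS = {"don't", "not", "never"}
INTENSIFIERS = {"fucking", "really"}

def score_with_negation(text: str) -> int:
    """Word-list score where a negator flips the next non-intensifier word."""
    # Normalise curly apostrophes so "don’t" matches the NEGATORS entry.
    words = re.findall(r"[a-z']+", text.lower().replace("\u2019", "'"))
    total = 0
    negate = False
    for word in words:
        if word in NEGATORS:
            negate = True
            continue
        if word in INTENSIFIERS:
            continue  # no polarity of its own; keep any pending negation
        weight = WEIGHTS.get(word, 0)
        total += -weight if negate else weight
        negate = False  # the negation only reaches one word
    return total

print(score_with_negation("I don't fucking hate dogs"))  # 3
print(score_with_negation("I really love dogs"))         # 3
```

This is still word-by-word bookkeeping, so it breaks on longer-range negation ("I don't think I hate dogs"); handling that properly needs the sentence to be parsed as a whole.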

[+] osrec|7 years ago|reply
When I do my next Show HN, and the negativity gets too much to take, this will be a good resource to turn to, in order to feel a little better about myself (unless, of course, I end up at the top of your list)!
[+] stcredzero|7 years ago|reply
The site is down now. I was going to see if I was on yet another HN leaderboard!
[+] noobermin|7 years ago|reply
Despite the false positives pointed out in other comments, I'm delightfully surprised that the worst it gets is around <15% negative comments. Sometimes HN seems too cynical to me; at least the comments that float up to the top do. What would be interesting is negative comments weighted by their place in the comment section (since you can't see upvote scores).
[+] jbob2000|7 years ago|reply
The site is blocked for me at work, but if he didn't include shadow banned comments, then he's missing the biggest pool of potentially negative comments.
[+] anonytrary|7 years ago|reply
> how are you going to avoid head hunters' spam, either as fake candidates to discover new clients or with fake offers for CV mining?

Was given a sentiment of -9, but I'd say the sentiment is closer to 0. Anyway, it's clear there are a ton of false positives, but overall, this was a really neat idea and it would definitely be interesting to further index the posts.

[+] misterbowfinger|7 years ago|reply
I'm confused how this comment got rated "-10":

I use Backblaze now and once I get my NAS, I’ll probably end up using a B2 based backup. But let’s make an honest comparison. Backblaze does not replicate your data across data centers. The standard S3 storage class does (0.23/gb). The comparable storage class for S3 is one zone infrequent access (.01/gb). B2 still comes out ahead, but I wouldn’t use either one for primary storage. For their suggested “3-2-1” backup strategy, sure. Then again, just for backup, I could use S3 glacier for $.004/gb. That’s cheaper than B2 and I get multiple AZ storage. The data charges would be higher - but it’s backup. If catastrophe struck and I lost my primary and my local backups, getting my data fast is the last thing I would worry about.

https://news.ycombinator.com/item?id=17407275

[+] soared|7 years ago|reply
> does not

> I wouldn't use

> then again

> catastrophe

> struck

> worry

I can see it. If you bag-of-words'd it there are a lot of negative words used and effectively no positive words.

[+] scarface74|7 years ago|reply
I was about to mention that comment. Especially since it is mine. I actually said that I’m a happy Backblaze customer, and couldn’t see why it was considered so negative.
[+] CM30|7 years ago|reply
Huh, so apparently the day with the highest percentage of mean comments is Sunday.

Anyone want to have a guess as to why that may be the case? Personally, I'd expect people to be least happy on Monday morning or something, not the second day they usually get a rest that week.

Similarly confused as to why everyone is supposedly so positive on Wednesday...

[+] joshuak|7 years ago|reply
My guess would be that a larger percentage of respondents are not professionals in the field. The pros are afk.
[+] lylecubed|7 years ago|reply
The difference between the most negative day (14.18%) and the least negative day (10.66%) is very small.
[+] everdev|7 years ago|reply
A few guesses:

1. People who got into family fights over the weekend

2. People who logged on for a minute and got an upsetting work email and are dreading the coming work week.

3. People who feel isolated or bored and are online instead of enjoying their weekend.

Would be curious to see if this is random noise, though, or if it's consistent year to year.

[+] abjorn|7 years ago|reply
I'm wondering how this compares to the frequency of any comments by day. It could be that more comments are posted on Sunday in general, and the least on Wednesday.
[+] corobo|7 years ago|reply
Smonday - the period of time where you realise your weekend is coming to an end and you have to go back to work tomorrow
[+] scarface74|7 years ago|reply
I’m no expert in the field - I’ve only watched a few videos - but the example I’ve seen is where they use a person’s rating of a movie (1-5) and their comments to train a model, and then use the model to determine sentiment. Unfortunately, since AFAIK there isn’t a way to determine how many points a post earned except for your own, he couldn’t do that.
[+] blattimwind|7 years ago|reply
For some reason there is a .nobreak class that's actually enabling word breaks. Weird! And it even goes one step further and enables "word-break: break-all", so that the renderer will break all your nice words apart anywhere. That's not nice.

(Yes, this was a half-arsed attempt to match the "negative sentiment tone" without actually being mean ;)

[+] walz|7 years ago|reply
Haha, thanks for sounding nice :)

I added that because some comments had really long URLs, so I had to enable breaks so the page wouldn't be really wide, more so on phones. Didn't realize that it added hyphens to words, thanks for pointing that out, it's fixed now

[+] shove|7 years ago|reply
I consider it a personal failure of character not to have at least gotten an honorable mention ;)
[+] crsv|7 years ago|reply
This is pretty cool - I'd love to see similar analysis for truly "contentious" comments, wherein there were an almost equal but large amount of upvoting and downvoting, controlled for accounts that have the ability to do either.
[+] bumholio|7 years ago|reply
I hope you realize there is a built in sentiment analyzer on HN, based on highly advanced, natural intelligence algorithms.

Your filter actually seems to dig strongly opinionated posts. They are not automatically bad, and they can be quite good.

[+] sam0x17|7 years ago|reply
I think it's really funny how most of the comments in this list are genuine critiques and concerns, that get downvoted by toxic users. I went through and upvoted about half of them.