mlmilleratmit | 11 years ago

In our spare time, we're researching this dataset in detail. Here are some questions that we're interested in. Would love to hear other ideas and to have folks dig into the data. I think this dataset may be of interest to hackers, researchers and marketers.

1. Are the trajectories (e.g. rank vs time) for all popular posts of the same shape? They look ~logarithmic.

2. Are there identifiable clusters when you look in 4d space for rank vs points vs comments?

3. How does the impact of a post depend quantitatively on its cohort? I.e., what's a good model for normalizing performance against what else was happening that day?

4. What fraction of posts have comment threads that are "hijacked" by the first comment? Is there a quantitative way to find this, perhaps by looking at (2) above?

5. What are better metrics for collapsing the "performance" of a post into a single number?

6. How does performance on HN compare to reddit, etc?

7. How is the HN community different from other communities, if at all?

8. Given the time-dependent data, can we create a good estimator for the number of active HN users per day? Or can we at least create a relative ranking of the number of unique users between different days?
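
As a starting point for (3), here is a minimal sketch of one possible normalization: score each post by its percentile among posts from the same day. The `(post_id, day, points)` tuple layout and the function name `cohort_percentiles` are assumptions for illustration, not the dataset's actual schema, and a percentile is only one of many ways to define a cohort-relative score.

```python
from collections import defaultdict

def cohort_percentiles(posts):
    """Map each post_id to its points percentile within its same-day cohort.

    posts: iterable of (post_id, day, points) tuples (hypothetical layout).
    Ties count as half, so a post alone in its cohort scores 0.5.
    """
    posts = list(posts)

    # Group point totals by day so each post is compared only to its cohort.
    by_day = defaultdict(list)
    for _post_id, day, points in posts:
        by_day[day].append(points)

    result = {}
    for post_id, day, points in posts:
        cohort = by_day[day]
        below = sum(1 for p in cohort if p < points)
        equal = sum(1 for p in cohort if p == points)  # includes the post itself
        result[post_id] = (below + 0.5 * equal) / len(cohort)
    return result

sample = [("a", "d1", 10), ("b", "d1", 30), ("c", "d1", 20), ("d", "d2", 10)]
scores = cohort_percentiles(sample)
```

In this sample, posts "a" and "d" both have 10 points, but "d" scores higher because it had no competition on its day. Whether that is the right behavior for very quiet days is an open modeling choice.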

anigbrowl|11 years ago

# of comments vs. points would be interesting, and if you are willing to crawl the comments, then the diversity or complexity of commentary would also be interesting, e.g. several multi-threaded discussions vs. a string of 'this is awesome' comments on some popular but shallow topic (e.g. Hubble space telescope imagery or somesuch, which tends to attract much admiration but not necessarily a lot of discussion).

kordless|11 years ago

9. Sentiment of comments, via comment downvotes and/or contextual analysis.

Also, frequent violators of 'hijacking' the most popular comment by commenting on it. :)

mlmilleratmit|11 years ago

Good thought. I've been trying to think of how to do that without crawling the comments, too.