There are a lot of good ideas and sentiments in this article. And then at the end I find out the trail-blazing product he left academia for is "a newsfeed based on your interests."
How many startups like this are there now? 300? That is probably a low estimate. I'm sure that everyone living in the Valley personally knows at least one founder working on the exact same product. And none of them are better than my Hacker News/ Facebook/ Reddit/ RSS feed combo.
I've heard some of the many Prismatic competitors describe themselves as "a Pandora for news" which is an apt description, since Rdio, SoundCloud, and Spotify are better than Pandora. Self, social and community sourcing for digital entertainment tends to be better than algorithms. While I believe the "problem" domain is severely overworked, if you're going to stick with it then my bet is that social algorithms like collaborative filtering a la Netflix do a better job than topic modeling.
On the optimistic side, a good company is more than a single product. Maybe the author's considerable expertise and experience from building Prismatic will lead to something cool down the road.
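For readers unfamiliar with the term, here is a minimal sketch of user-based collaborative filtering, the "social algorithm" the comment above bets on. It is only an illustration: the users, article names, and the `recommend` helper are hypothetical, and it is not a description of how Netflix or Prismatic actually work.

```python
from math import sqrt

# Hypothetical reading history: user -> {article: rating}; 1.0 means "liked".
ratings = {
    "alice": {"nlp-intro": 1.0, "startup-funding": 1.0},
    "bob": {"nlp-intro": 1.0, "startup-funding": 1.0, "topic-models": 1.0},
    "carol": {"gardening": 1.0},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, ratings):
    """Rank articles the user has not seen by similarity-weighted votes of other users."""
    seen = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item, rating in theirs.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice", ratings))
```

Note that nothing here looks at article text: the recommendation comes entirely from overlap in what people liked, which is the contrast with topic modeling drawn above.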
As a Prismatic user, I think the product he's built is pretty far ahead of any competitors I've seen.
I don't think the fact that others have tried with limited success is an indictment of Prismatic at all. If anything, that strengthens the argument that this project required a lot of NLP skill.
My feelings exactly. It's made worse by the fact that, out of all the academic departments out there, AI and NLP research has perhaps the greatest potential to transform society in so many ways.
He leaves the field trying to create intelligent machines in order to write a Google News clone? I truly don't get it.
Prismatic has a much larger vision of bringing together elements from my AI and machine learning research background along with modern interaction design to build smart everyday consumer products.
The current app is just the first step in a longer roadmap of building smart products. Currently, we're great at discovering new articles, but in the future we will include discovering relevant apps, music, movies, and local events. Don’t be too surprised if we’re on your TV soon.
Keep in mind that if it happened that you had an AI that could quickly scope people's desires out, the application you would want to be selling would be:
"The 'it gives you what you want' thing"
Rather than any kind of fancy description. It wouldn't be the description that sold customers; it would be the fact that it really, actually gave them what they want effortlessly that would sell them. And of course, news gives you a lot of information to easily parse. So from the "I have a program that's unique in its language comprehension" perspective, this makes perfect sense. If the product was more specialized, it quite possibly would not have as much scope to demonstrate that it could choose for you.
Of course, whether AI could possibly work for this is another question but I can understand why he'd want to try this route.
To be fair, PG is the one advocating that engineers satisfy their immediate, and often extraordinarily boring, needs.
There are 300 startups that are funded doing newsfeeds, even while those startups might be a tiny minority of exciting ideas.
On the other hand, the role of business is not to realize and finance science fiction. Become a writer (or perhaps a Researcher) rather than a programmer if you genuinely have a great imagination.
Great post. As someone who works in a different field of science, one bit got me thinking:
>Like any academic community, the work within NLP had become largely an internal dialogue about approaches to problems the community had itself reified into importance.
I would argue that this is an issue of goals. If you're motivated by the application of research results to solve practical problems (as the author is), then this is a valid criticism. But for me science is also self evidently valuable — understanding the principles that govern the universe is a noble goal irrespective of finding opportunities to apply them.
Perhaps the field plays a role as well. In all sciences one's goal is to discover the rules and facts that govern a particular system. In the natural sciences, this system happens to be the world in which we live, whereas in the formal sciences (math / CS), the system is often an artificial one of human construction. In the natural sciences, the importance of the rules you discover is self-evident (they govern our own lives and capabilities!), whereas in the formal sciences, it's more necessary to justify the importance of your discoveries (with, say, practical applications) because the importance of the system you're studying isn't as self-evident.
This isn't to say that the natural sciences are in some way superior; just a speculation about the attitudes/motivations of academics in different fields.
>In the natural sciences, this system happens to be the world in which we live
Not to say I disagree with your post, but isn't it more accurate to say that the natural sciences today explore models for the world in which we live? The models tend to be fairly obvious for things on the human scale, but when you talk about systems on the atomic scale, or on the cosmic scale, the immediacy of the models tends to break down. The result is that, just as for the abstract sciences (math, CS, etc), the usefulness of the models has to be justified. So, I feel that the line between purely abstract sciences and "natural sciences" is not quite as fine and well-defined as your post makes it out to be.
Great article, and agreed on most everything you said. Regarding Prismatic itself, some constructive feedback (feel free to ignore).
My immediate first impression was that this was another app that would waste my time. I want fewer of those, not more. The absolute best news feed app I've used is Flipboard for the iPad, and even then I don't use it too much as I feel too much like I'm wasting time with it (like HN!).
Secondarily, the homepage doesn't grab you enough. The text in the graphics is too blurry and the pictures are generic. The homepage below the icons doesn't have the production values to explain why it's cool.
I'm not trying to be negative here. Some of the points in your article really hit home (email summarization would be awesome and paradigm shifting). So I wonder if you might focus more on making something that will save people time and solve a pain point vs. "another web-based time-wasting thing" (that may not be fair, but that was a first impression).
For example, can you scrape an inbox and list of Facebook/Github/Tumblr/RSS/Twitter feeds to get a single high-sensitivity greatest hits from all monitored news feeds? This way you can check a single page in five minutes on your phone and feel reasonably content that you saw the top headlines for that week. Kind of like news.ycombinator.com/best.
Hi Aria. I have a question regarding the use of automatic sentiment analysis and its relationship to choice. It seems to me that there is a fundamental divergence between the program of "give me what I will enjoy" and "give me what I need". It seems that your startup is focused entirely on the former question, and there's no doubt that if you are successful you will indeed be consuming more attention, which is of course the primary commodity in the attention economy.
But wouldn't it be better to use NLP to emancipate people from the attention economy altogether, to help identify not what people want, but what they need to be happier, and more engaged in the real world doing real things for real people?
How does Prismatic actually use NLP for finding articles you like? Does it use sentiment analysis of reviews from people it thinks are similar to you?
Btw I think "subreddits on steroids" is an incredible testimonial catchphrase - it would have been enough for me to buy your app, but I don't own a smartphone.
Hi. I'm trying to use natural language generation (NLG) software, and found lots of long-dead projects, two large Lisp-based packages (FUF/SURGE, KPML) that have not been updated in years and are not user friendly at all, and a simple Java package (SimpleNLG) that is indeed simple but not very sophisticated. Is this representative of the NLP/NLG software?
- Do you think you would have the opportunity to avail yourself of the fruits of "academia" if the current approach was fundamentally broken?
- What would happen if viability of academic research was tied to one being a successful business person? (Asking in light of your comment elsewhere in this page re. "tenure system".)
Loved your post. Very articulate and nicely written.
I've been amidst the academia -> industrial fun transition myself this past year, absolutely the best decision I've made in the past half decade.
One point you only lightly touched on, but which I think is worth restating, is this: in academia there's this frustrating sense of "if you're not narrow, you can't be deep", whereas I see a lot of the more sophisticated bits of industry really valuing folks who strive for broad depth.
Would you agree with that assessment?
(admission: I'm presently having a go at building some tech products/tools that tie into a whole range of my researchy interests)
Really interesting article, especially since I'm an assistant prof idly considering the same sort of move. How much of a "business model" did you have before commitment -- did you have enough connections that you were confident you'd get funding once you had the idea mapped out in detail? Was UMass still a viable safety net, or were you fully committed?
A really fantastic article overall. I was never personally at risk of academia, so I don't resonate with that part of the article. But I particularly liked these two sentences for their practical insight:
>that process of taking qualitative ideas and struggling to represent them computationally is the core of artificial intelligence (AI).
and
>that path from research to product rarely works, and when it does it's because a company is built with research at its core
Thanks! It seems like an obvious thing, but thinking about "operationalizing" intuitions computationally really helped me home in on what I should be focusing on with research. My advisor Dan Klein was really awesome at teaching that.
Awesome post. I -- along with many of my colleagues from grad school -- have had pretty similar experiences with academia (minus the outlook for a promising academic career ;) ). This sort of thing (very smart people abandoning academia) is going to continue to happen unless academia figures out a way to make itself more relevant.
academic |ˌakəˈdemik|
adjective
2 not of practical relevance; of only theoretical interest : the debate has been largely academic.
My experience in academia thus far: for every five smart people abandoning academia for industry, one truly brilliant person remains behind. That's all that is required for the model to sustain. (And even then, there are not enough academic jobs to go around.)
Well, I think the number one thing that would correct the balance between academia and industry is if people could freely go from one to the other. Someone could take insights from one arena to the other, enriching both.
There are many reasons why this is difficult, but the primary obstacle is the tenure system.
Agree with the author that academia sometimes focusses on a very narrow band of research topics, which might or might not improve end user experience. Also, agree with the author that personalized news has very interesting NLP+Machine Learning problems.
However, I am not convinced that personalized news is what users want; users may prefer to discover popular news and articles by serendipity and socially, rather than through personalization by algorithms. Further, I think personalized news has very limited opportunity for generating significant revenue.
I faced a similar problem in academia, but it was in economics. I started my undergrad career as a computer science major, later ending up in economics, and I went on for my PhD in economics after finishing my BA. For those that aren't familiar, a PhD in economics is a lot like a PhD in mathematics. If you don't already respect economists' mathematical ability, you really should, but that's not my point here because math is just a tool, and contains no absolute truths for the social scientist.
In an effort to combine solid theoretical research with well built empirical models, I learned how to program Python to scrape data from web sites. Since nobody around me knew anything about computer science, I found myself having to teach myself everything when it came to "how would I write a program that collects X things from Y products in Z Internet markets". In the beginning, when I was learning to use Python libraries for doing HTTP requests (eventually converging to using 'requests'), how to parse that HTML (started with BeautifulSoup and then converged to 'lxml'), and how to aggregate and analyze that data (eventually converging to 'pandas'), I had to spend a lot of time learning the ins and outs of Python. To this day, I still think it was a great choice because Python has changed the way I approach any kind of business question that could be answered with a well defined empirical model.
As a research assistant, I would spend 12 hours writing scripts to automate the collection and analysis of data for some project I discussed with a professor. The day after writing that script, I would go and talk to the professor and want to discuss some of the computational issues with the script (say, encoding issues, or even the use of computers on EC2 to help collect massive amounts of data every day), but the economics professors would not care at all to hear about any of that. "You need to be studying the economics, not the computer science" they'd tell me. This would infuriate me, because in my mind, if you wanted to be able to ask all these interesting questions that rely on the data, then you need to spend time to make sure you've done the computation correctly and in a way that is reliable. Maybe I dug my own grave by showing my excitement as I began to become more fluent with Python, and thus more confident in my abilities.
Nonetheless, I eventually decided to leave my PhD program because my interests in topics that lie at the intersection of computer science and economics were a bad fit for my program. It was so disheartening to me because I truly loved economics, and was confident that what I intended to do with my PhD research was going to be unique and lay a framework for the field of industrial organization. However, the experience I received in my program was the lack of an adviser that could truly help me achieve what I wanted to do, and a regular negative response to anything I'd bring up that was out of the realm of what my professors were used to talking to graduate students about.
Overall, as my tone may signal, graduate school has left me very, very bitter.
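The parse-then-aggregate workflow described in this comment can be sketched in a few lines. To keep the example runnable offline and dependency-free, this sketch swaps in the standard library's `html.parser` and `statistics` where the comment used `requests`, `lxml`, and `pandas`; the HTML snippet, the `product`/`price` class names, and the `data-market` attribute are all made up for illustration.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical listing page, inlined so the sketch runs without a network call.
HTML = """
<ul>
  <li class="product" data-market="A"><span class="price">10.50</span></li>
  <li class="product" data-market="A"><span class="price">12.00</span></li>
  <li class="product" data-market="B"><span class="price">8.25</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collects (market, price) pairs from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._market = None
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self._market = attrs.get("data-market")
        elif tag == "span" and attrs.get("class") == "price":
            self._in_price = True

    def handle_data(self, data):
        # Only text inside a price span that sits inside a product item counts.
        if self._in_price and self._market is not None:
            self.rows.append((self._market, float(data)))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False
        elif tag == "li":
            self._market = None


def mean_price_by_market(html):
    """Parse the page, then aggregate prices per market."""
    parser = PriceParser()
    parser.feed(html)
    by_market = {}
    for market, price in parser.rows:
        by_market.setdefault(market, []).append(price)
    return {market: mean(prices) for market, prices in by_market.items()}


print(mean_price_by_market(HTML))
```

In a real collection job the HTML would come from an HTTP request, and the "computational issues" the comment mentions (encodings, scheduling, scale) would live around this core rather than inside it.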
I've talked with many friends in a similar situation to yours (econ PhD) and they all faced the same issues. Some advisors don't even care if you are coding in Python or on stone tablets, as long as you "have three stars in your results table". Your comments about the quality of the computations also reminded me a lot of Ken Judd's (often ignored) arguments about paying attention to all these implementation details.
Good luck with your current goals, and I really hope you are more successful than within the limitations and issues of academia.
PS: Just out of curiosity, what were you interested in, regarding your IO research?
I think part of your problem was that PhD economics programs are so narrow, and this is part of the problem with economics generally.
It has ignored too much of what is occurring outside its discipline, and turned in on itself and neat models that do not actually describe how the world works.
I found it funny that your background was the opposite of mine. I was in love with Economics in high school and undergrad. But when I took my first programming course in undergrad, I slowly shifted away from a B.A. to B.Sc. Btw, my primary attraction to CS was that I could create things in software. In grad school, I tried to merge my love for econ with CS .. it made for good CS work but no contributions to Economics. Funny how life turns out :) Best of luck to you!
What exactly do people get out of all these news aggregators? I'd imagine the front page of reddit, or at least a localised version, is more interesting than a personalised version.
When does it go from news to infotainment?
And if that's not the case, why would someone think something beyond simple Facebook like or Twitter comment extractions is a big deal? I've seen lots of papers where representing a web page and a user in a vector space based on TFIDF performs well.
I've even seen Yahoo research post slides about using some form of random bucket for news recommendation because the novelty of newer / stranger articles improves ad click throughs.
Then again, did this article come out at the same time as the PR release about funding for their company? If that's the case, is this just a wide scale PR initiative?
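As a rough illustration of the TFIDF approach mentioned in this comment, the sketch below puts pages and a user in the same vector space and ranks unseen pages by cosine similarity. The page texts, the whitespace tokenizer, and the choice to build the user vector by summing the vectors of liked pages are all assumptions made for the example, not a description of any particular paper or product.

```python
import math
from collections import Counter

# Hypothetical page texts keyed by a short name.
pages = {
    "nlp": "parsing natural language with statistical models",
    "ml": "statistical models for language translation",
    "garden": "growing tomatoes in small gardens",
    "funding": "seed funding for small startups",
}


def tfidf_vectors(docs):
    """Return {name: {term: tf * idf}} using a naive whitespace tokenizer."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_for_user(liked, vectors):
    """Build a user vector from liked pages, then rank the remaining pages."""
    user = {}
    for name in liked:
        for term, weight in vectors[name].items():
            user[term] = user.get(term, 0.0) + weight
    candidates = [name for name in vectors if name not in liked]
    return sorted(candidates, key=lambda name: cosine(user, vectors[name]), reverse=True)


vectors = tfidf_vectors(pages)
print(rank_for_user(["nlp"], vectors))
```

A reader who liked the "nlp" page gets "ml" ranked first because the two share weighted terms; the other pages share no vocabulary with it and score zero.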
I signed up to see how accurate it could be - the interests outlined in the e-mail I received seem pretty accurate but the stories that are being displayed have nothing to do with those topics. There was a dearth of interesting reading material.
Having spent several years doing academic research before finally leaving to work on real-world problems in machine learning and NLP, I can relate to this article.
If any of you have problems in this domain, I am interested to chat.
Building Prismatic is way more than just about having a formal ML background. It's that along with large-scale systems skills and having a sixth sense for working with data (text especially) and knowing what simple ideas will work and what needs to be complicated.
So to answer your question: yes, you need a formal ML background, but you need a lot else. Luckily, you can pick up all these skills from online courses, real world building, and a lot of self study and improvement.
[+] [-] jacoblyles|13 years ago|reply
[+] [-] dbecker|13 years ago|reply
[+] [-] xaa|13 years ago|reply
[+] [-] aria|13 years ago|reply
[+] [-] joe_the_user|13 years ago|reply
[+] [-] doctorpangloss|13 years ago|reply
[+] [-] IheartApplesDix|13 years ago|reply
[deleted]
[+] [-] jamesjporter|13 years ago|reply
[+] [-] iyulaev|13 years ago|reply
[+] [-] aria|13 years ago|reply
[+] [-] temphn|13 years ago|reply
Just some thoughts, FWIW.
[+] [-] javajosh|13 years ago|reply
[+] [-] mikedmiked|13 years ago|reply
[+] [-] rom16384|13 years ago|reply
[+] [-] eternalban|13 years ago|reply
[+] [-] carterschonwald|13 years ago|reply
[+] [-] pseut|13 years ago|reply
[+] [-] HalcyonicStorm|13 years ago|reply
[+] [-] nobody-special|13 years ago|reply
http://cdn.getprismatic.com/cdn/img/people/aria-bio__60cac85...
[+] [-] marshray|13 years ago|reply
[+] [-] aria|13 years ago|reply
[+] [-] iamtheneal|13 years ago|reply
[+] [-] hyperbovine|13 years ago|reply
[+] [-] betula82|13 years ago|reply
[+] [-] aria|13 years ago|reply
[+] [-] mitultiwari|13 years ago|reply
[+] [-] jacoblyles|13 years ago|reply
[+] [-] junktest|13 years ago|reply
[+] [-] mikeklaas|13 years ago|reply
[+] [-] zissou|13 years ago|reply
[+] [-] zzleeper|13 years ago|reply
[+] [-] thmcmahon|13 years ago|reply
[+] [-] throwaway1979|13 years ago|reply
[+] [-] pseut|13 years ago|reply
[+] [-] Irishsteve|13 years ago|reply
[+] [-] unknown|13 years ago|reply
[deleted]
[+] [-] yawgmoth|13 years ago|reply
[+] [-] unknown|13 years ago|reply
[deleted]
[+] [-] sbashyal|13 years ago|reply
[+] [-] galois198|13 years ago|reply
[+] [-] aria|13 years ago|reply
[+] [-] hna0002|13 years ago|reply
[+] [-] HalcyonicStorm|13 years ago|reply