I didn't submit this, and I didn't know anyone was going to, or I would have written a post about why it is different from other search engines. Whether it is better is still up for grabs, because it is new.
Samuru doesn't use link authority. It analyzes pages, matches what you queried to types of pages, and picks the best matches.
Let me give you an example.
You search for "How to Make cupcakes".
Google says, "give me the pages with the most inbound links (an oversimplification) that contain all those words."
The winner is Brandon's Cupcakes (not really but play along for a minute) because it says, "We know how to make the best cupcakes, because we have been doing it for 25 years"
That is not a useful result. Samuru on the other hand says "how to make cupcakes is a search for instructions" and it looks for pages that match the words, and are written as instructions.
We weigh other factors too, like whether there is an author associated with the article and whether they routinely write about the topic.
We do this for reviews, products and other things as well.
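To make the cupcake example concrete, here is a rough sketch of the idea in Python. It is not Samuru's code: the regex cues, the function names, and the scoring formula are all assumptions made up for illustration. It only shows how a word match can be boosted when the page is written the way the query's intent expects.

```python
import re

# Hypothetical cues; a real system would learn these rather than hard-code them.
INSTRUCTION_QUERY = re.compile(r"^\s*how\s+(to|do|can)\b", re.IGNORECASE)
STEP_LINE = re.compile(r"^\s*(\d+[.)]|step\s+\d+)", re.IGNORECASE)


def query_intent(query):
    """Return 'instructions' for how-to style queries, else 'general'."""
    return "instructions" if INSTRUCTION_QUERY.search(query) else "general"


def term_overlap(query, text):
    """Fraction of query words that appear in the page text."""
    words = set(re.findall(r"\w+", query.lower()))
    page_words = set(re.findall(r"\w+", text.lower()))
    return len(words & page_words) / len(words) if words else 0.0


def looks_like_instructions(text):
    """Fraction of non-empty lines that start like a numbered step."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(1 for ln in lines if STEP_LINE.match(ln)) / len(lines)


def score(query, text):
    """Word match, boosted when the page type fits the query intent."""
    base = term_overlap(query, text)
    if query_intent(query) == "instructions":
        base *= 1.0 + looks_like_instructions(text)
    return base


bakery = "We know how to make the best cupcakes, because we have been doing it for 25 years"
recipe = "How to make cupcakes\n1. Preheat the oven\n2. Mix the batter\n3. Bake for 20 minutes"

# Both pages contain every query word; only the recipe is written as steps.
ranked = sorted([("bakery", bakery), ("recipe", recipe)],
                key=lambda item: score("How to Make cupcakes", item[1]),
                reverse=True)
```

Both pages match all the query words, so a pure word match ties them; the instruction boost is what puts the recipe ahead of the bakery page.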
To be a full replacement for Google we need driving directions, image search, and a lot of other things. But in order to do all the other things we are doing (related content, analysis, speed testing, building a corpus of words) we needed a search engine.
Responses get better if you search something someone else has searched or do a second search 30 seconds later. This is because we haven't deep indexed the entire Internet yet, and so we don't have all the deep data.
Re: your portrayal of how Google works... An "oversimplification"? It's just plain wrong. Google has not, for quite a while, depended on sites containing all the words of a query, and natural language processing plays a huge part in analyzing the intent of a query.
I applaud this ambitious project but I'm skeptical you'll achieve what you aim for if you're way off the mark in understanding how Google is so successful...I mean, to even talk of replacing Google at this stage -- and saying it's just a matter of providing rich snippets and other ancillary features as if that was your engine's main deficiency compared to Google -- is quite bold and a little cart before horse, IMO
-
Edit: an example...I did a search for my own name, something I do habitually because I'm locked in an eternal struggle with a younger, better looking, more talented namesake for the top Google result. However, your search engine returns neither me nor my singing rival as the top result...instead you return the domain that is my first and last name with a hyphen, which is exactly the superficial result that Google was designed to avoid.
Quick comment on the interface. Looks like the initial page is optimized for 1024x768. I'm on a netbook at 1024x600 (even less, since I'm not at fullscreen). While I realize that I'm in the minority, it looks like the issue is that everything is positioned absolutely. The only reason that the bottom of the page is cut off is because there is a bunch of empty space between the top of the page and the logo: http://imgur.com/vDFwY8b
I realize that this is a bit of a nitpick, but I felt the need to mention it.
This is really strange. I just searched for "how to ride a bike" and the first links from Samuru are completely useless whereas the first link from Google is exactly what I wanted, instructions from wikihow. How do you explain that?
Pre-Google, this is pretty much how search engines worked... by analysing page content and weighting that rather than the network of links around the page.
Having just played with it, it feels both backwards and refreshing to go back to that. The results are different enough to feel good for the terms I used.
Other features I should have mentioned:
Threaded results. If a result is cited by other results, they will be grouped so that you can see the conversation across sites.
Better Social Media integration. We show authors from Facebook, Twitter, and Google Plus, not just Google Plus.
Voice Input if you are on Chrome 25 or higher.
Results are returned with Summaries not Snippets.
With that I am falling asleep. I have enjoyed answering questions on this and the https://news.ycombinator.com/item?id=5579336 thread, but 5 hours of it has worn me out. If you leave comments I promise to get back to them.
Hi, congratulations. I really like Samuru, even if it's not perfect. I wanted to ask you two questions:
1) Are you sure that giving a "bonus" to domains containing part of a query is a good idea? I understand the reason behind it, and I know that you need time before you can turn this "bonus" off, but until that moment comes, are you really sure it is a good idea?
From the third position on, though, the web pages seem to be great.
2) How does the search suggest work?
I'm a French user, and in our language we have a lot of accents like "é è ù à". While typing a search query many people do not use them. When I correctly type a query with the accents, Samuru suggests the same query but without accents. This is wrong, and that's why I'm wondering about the provenance of the data the search engine uses to provide these query suggestions.
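For what it's worth, one common way to handle this is to compare queries with the accents folded away, but always show the stored, correctly accented form. The sketch below is only an illustration in Python: `fold_accents`, `suggest`, and the tiny corpus are made-up names and data, and nothing here describes how Samuru's suggest actually works.

```python
import unicodedata


def fold_accents(text):
    """Lowercase and strip combining marks, so 'é' and 'e' compare equal."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def suggest(prefix, known_queries):
    """Match on the folded form but return the stored, accented query."""
    folded = fold_accents(prefix)
    return [q for q in known_queries if fold_accents(q).startswith(folded)]


# Hypothetical suggestion corpus, stored with proper accents.
corpus = ["crème brûlée recette", "création d'entreprise", "cupcakes"]

without_accents = suggest("creme", corpus)  # user skipped the accents
with_accents = suggest("crè", corpus)       # user typed them correctly
```

Typing "creme" still finds the accented suggestion, and typing "crè" never gets rewritten into an unaccented query, because only the comparison key is folded, not the text that is displayed.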
It is now very hard to get many Google results that contain all your search terms, which is why I'm starting to dislike it. For example, it gives you "synonyms", or the terms are completely missing. I sure will give yours a try.
OK. My first try was to search for "plato dialogue concerning friendship". Google gave me the result I expected (a reference to the Lysis dialogue) through wikipedia and a bunch of articles about it (the most helpful being a link to the Stanford Encyclopedia of Philosophy, ranked third). It didn't link the text though in the first page (it only appears in the third page of results, with a copy at the MIT Classics archive). Samuru gives me a bunch of general articles on Plato first (oddly enough, the first results are articles from the SEP, but not the article on "Plato on Friendship and Eros".), some noise and then information on the Lysis. The text itself appeared at 25th place.
Something I find interesting is that one of the snippets samuru gave me (on the 5th result) has a pretty good description of the lysis as the item most likely to be the "plato dialogue concerning friendship": "the dramatically later Lysis presents Plato's more developed understanding of love and friendship than the dramatically earlier Symposium and Phaedrus". From this description of the Lysis one could gather that the text of the Lysis itself should be a very relevant result to the query; at the very least, that information about it should be weighted as more relevant to the query than info on the Symposium or the Phaedrus, and then info on those over all else. From this, I think, one could build a better representation of a good answer to the query than in google or samuru.
I think natural language analysis is very promising here. I hope work on this area yields good results, but it seems like a hard problem.
Counterpoint. I was surprised by it. A couple weeks ago I decided to start recording any search phrases which I felt were tricky or required good language modelling.
"baby features kept in adulthood" is the only one I've thought worth recording so far. You can compare the results in Google, Bing, DDG. Only Samuru and Google have it on page 1. Samuru has it as the first result. But this is just one example so I can't draw any conclusions. Curious to see how well it performs in general.
About 1 month ago I switched from Google to Bing. There are different queries that I use to measure 'better'.
For simple queries 'strncmp', 'giraffe', 'sound transit schedule' ...
Google, Bing and Samuru perform pretty well. But Samuru is extremely slow.
For more complex queries like, 'seattle dumpling restaurant that is famous in singapore' or 'how to zip a list in ruby'. I find that Google always comes out on top, bing lacks the previous search history to personalize my searches and often thinks I mean (zip as in zipfile)... But samuru gave me relevant results for all three which is rather surprising.
Another type is one for people/social related searches... Bing's facebook/twitter/linkedin/yelp integration actually makes it better than google because the 'snapshot' bar it has is super helpful. However Samuru results are on par with Google and Bing results here (minus the snapshot bar).
Overall I was skeptical, but other than it being unbearably slow (Google spoilt us with speed), Samuru does have very good search results for what I assume is not a multibillion dollar product.
Do your search a second time. Our index isn't exhaustive yet, and we are slow if we haven't seen enough of the results pages before. We generate the summaries and a bunch of other things after something has appeared in a search result. This is because we aren't a billion dollar company and have to be efficient in our indexing.
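A rough sketch of why the second search is faster, in Python. The class and function names here are hypothetical and the "summarizer" is a trivial stand-in; the point is only that the expensive work runs the first time a page appears in a result and is reused afterwards.

```python
class LazySummaries:
    """Generate a page summary only after the page has shown up in a result."""

    def __init__(self, summarize):
        self._summarize = summarize   # expensive function: page text -> summary
        self._cache = {}              # url -> summary
        self.generated = 0            # how many times the expensive work ran

    def get(self, url, page_text):
        if url not in self._cache:
            self._cache[url] = self._summarize(page_text)
            self.generated += 1
        return self._cache[url]


def first_sentence(text):
    """Stand-in summarizer: keep the first sentence only."""
    return text.split(".")[0].strip() + "."


summaries = LazySummaries(first_sentence)
page = "Cupcakes bake in 20 minutes. Frosting comes later."
first = summaries.get("http://example.com/cupcakes", page)   # does the work
second = summaries.get("http://example.com/cupcakes", page)  # served from cache
```

The trade-off is the one described above: nothing is spent on pages nobody has searched for, at the cost of a slow first request.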
I don't understand. Is this spam? There is no context or accompanying article for the claim. I searched my name and the results weren't nearly as good. One data point, sure, but first impression is everything.
Edit for context: original title read: "This search engine is better than google."
Nope it doesn't. We decided that it was hard enough getting advertising without having "adult" search. We focus on text analysis so we aren't very good at porn searches.
Google is excellent. Bing is also excellent (with minor differences). DDG and Blekko are adding interesting and useful features.
But they all feel a bit like they're a mono-culture, and thus vulnerable to gaming. Black-hat seo seems to be something that Google is pretty good[1] at dealing with. White hat SEO and ads have changed the web drastically from what I remember.
So it's really nice to have an alternative method of search that searches in a different way. Your post (https://news.ycombinator.com/item?id=5580321) highlights a few things I find frustrating in search at the moment.
[1] It's odd that all the work they do isn't noticed.
I'm willing to have an open mind about this, but I think some sort of explanation on what samuru is hoping to achieve in distinction from other search engines would be helpful.
Exactly. Doing an MVP of a search engine is hard, so it is okay to lack quality of results initially when you launch, on HN at least. Even DDG is only trying to catch up.
But to keep the engine running, and keep hackers interested, you should say what distinction Samuru is trying to achieve with its search engine.
And perhaps this query http://www.samuru.com/?q=porn should not be blocked by default, rather provide tools for safe search. Heard of the porn cookie guy? Just copy his footsteps, I'd say.
Selling your own ads is hard. Especially at low volumes getting started. So your choices are basically Google and Microsoft. (Chitika doesn't pay anything)
In what way? Writing something that meets our qualifications for "what is a review" is much harder to game than link spamming. You can game the system only by writing content that is useful to the user.
The only easy-to-game part is that we give brands a pretty big bonus for themselves. Sony.com/playstation will always be the top hit for Sony PlayStation, even if we should favor a .gov result that says they are recalled for bursting into flames. But as that rarely becomes an issue, we are ok with that being number 2.
Doesn't seem to tailor results to your location so might not be as useful for people outside of the US? Or did I just try a stupid search? I performed a vanity search and it was listing different names before there was anything about me. Same search in Australia on Google has me in four of the top six spots.
Google has 100 people searching every thing that can be searched. We have to do the work when you do the search. We get faster the more people use us. Exponentially.
How can they trademark the words "Liquid Helium"? The first search I did on Samuru was for Liquid Helium and it brought back about a half million results, all of which I assume are violating its purported trademark.
drakaal | 13 years ago
danso | 13 years ago
Matt_Cutts | 13 years ago
pyre | 13 years ago
mrknmc | 13 years ago
buro9 | 13 years ago
drakaal | 13 years ago
sashagim | 13 years ago
lightonseo | 13 years ago
When I type "How to rank well on Google" the first result is www.google.com => http://www.samuru.com/?q=How+to+rank+well+in+Google
I really hope you succeed with this project.
marcioaguiar | 13 years ago
P.S.: I mistakenly typed HOT to make cupcakes
nano111 | 13 years ago
raulonkar | 13 years ago
fmoralesc | 13 years ago
Dn_Ab | 13 years ago
samirahmed | 13 years ago
jggonz | 13 years ago
We actually have a /programming slashtag that is very useful for these kind of queries.
http://blekko.com/ws/?q=how+to+zip+a+list+in+ruby+%2Fruby
http://blekko.com/ws/?q=seattle+dumpling+restaurant+that+is+...
Just for fun... results in tablet-friendly format:
http://izik.com/?q=seattle%20dumpling%20restaurant%20that%20...
drakaal | 13 years ago
ok_craig | 13 years ago
furyofantares | 13 years ago
drakaal | 13 years ago
drakaal | 13 years ago
[deleted]
valtron | 13 years ago
drakaal | 13 years ago
DanBC | 13 years ago
nilkn | 13 years ago
monsterix | 13 years ago
drakaal | 13 years ago
unknown | 13 years ago
[deleted]
D9u | 13 years ago
orangethirty | 13 years ago
Disclaimer: I'm the guy behind Nuuton (a search engine).
saejox | 13 years ago
drakaal | 13 years ago
prawn | 13 years ago
matiasb | 13 years ago
p1mrx | 13 years ago
http://support.google.com/a/bin/answer.py?hl=en&answer=2...
xaviel | 13 years ago
drakaal | 13 years ago
arcatek | 13 years ago
It's interesting, results are not so far from what I want. I'll give it a look for my next searches.
drakaal | 13 years ago
tokenadult | 13 years ago
prawn | 13 years ago
kludu | 13 years ago
No results.
WTF is this shit?
nu2ycombinator | 13 years ago
drakaal | 13 years ago
saintx | 13 years ago
drakaal | 13 years ago
kephra | 13 years ago
- you need a favicon, so it's possible to pull your site into an icon bar for bookmarking.
- you need a search engine registration, so it's possible to use it from the search engine tab in the browser
aw3c2 | 13 years ago
drakaal | 13 years ago