A new search engine

105 points | trishume | 13 years ago | samuru.com

103 comments

[+] drakaal|13 years ago|reply
I didn't submit this. I didn't know anyone was submitting it. Or I would have written a post about why it is different than other search engines. Better is still up for grabs because it is new.

Samuru doesn't use link authority, it analyzes pages and matches what you queried to the types of pages and picks the best matches.

Let me give you an example.

You search for "How to Make cupcakes". Google says give me the pages that have the most inbound links (over simplification) that contain all those words. The winner is Brandon's Cupcakes (not really but play along for a minute) because it says, "We know how to make the best cupcakes, because we have been doing it for 25 years"

That is not a useful result. Samuru on the other hand says "how to make cupcakes is a search for instructions" and it looks for pages that match the words, and are written as instructions.

We weigh other factors, like is there an author associated with the article. Do they routinely write about the topic?

We do this for reviews, products and other things as well.
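
To make the idea above concrete, here is a minimal sketch of intent-aware ranking. It is not Samuru's actual code: the cue patterns, the function names (`classify_query`, `instruction_score`, `rank`) and the scoring weights are all assumptions made up for illustration, and a real system would presumably use something far richer than regular expressions.

```python
import re

# Hypothetical cue patterns (assumption): map query wording to an intent.
INTENT_PATTERNS = {
    "instructions": re.compile(r"^\s*(how to|how do i|steps to)\b", re.I),
    "review": re.compile(r"\b(review|reviews)\b", re.I),
}

# A line that starts like "1." / "2)" / "Step 3" is treated as a step.
STEP_LINE = re.compile(r"^\s*(\d+[.)]|step\s+\d+)", re.I)


def classify_query(query):
    """Return the first intent whose pattern matches, else 'general'."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(query):
            return intent
    return "general"


def instruction_score(page_text):
    """Fraction of non-empty lines that look like numbered steps."""
    lines = [ln for ln in page_text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(1 for ln in lines if STEP_LINE.match(ln)) / len(lines)


def rank(query, pages):
    """Sort pages by word overlap plus a bonus for matching the intent."""
    intent = classify_query(query)
    terms = set(re.findall(r"\w+", query.lower()))

    def score(page):
        words = set(re.findall(r"\w+", page.lower()))
        overlap = len(terms & words) / len(terms) if terms else 0.0
        bonus = instruction_score(page) if intent == "instructions" else 0.0
        return overlap + bonus

    return sorted(pages, key=score, reverse=True)
```

With the cupcake example, both pages contain every query word, so plain word matching ties them; the page written as numbered steps wins only because of the instruction bonus.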

To be a full replacement for Google we need Driving directions, and image search and a lot of things. But in order to do all the other things we are doing we needed a search engine. (related content, analysis, speed testing, building a corpus of words)

Responses get better if you search something someone else has searched or do a second search 30 seconds later. This is because we haven't deep indexed the entire Internet yet, and so we don't have all the deep data.

[+] danso|13 years ago|reply
Re: your portrayal of how Google works...An "over simplification"? It's just plain wrong. Google has, for quite a while, not depended on sites containing all the words of a query...and natural language processing plays a huge part in analyzing intent of a query.

I applaud this ambitious project but I'm skeptical you'll achieve what you aim for if you're way off the mark in understanding how Google is so successful...I mean, to even talk of replacing Google at this stage -- and saying it's just a matter of providing rich snippets and other ancillary features as if that was your engine's main deficiency compared to Google -- is quite bold and a little cart before horse, IMO

-

Edit: an example...I did a search for my own name, something I do habitually because I'm locked in an eternal struggle with a younger, better looking, more talented namesake for the top Google result. However, your search engine returns neither me nor my singing rival as the top result...instead you return the domain that is my first and last name with a hyphen, which is exactly the superficial result that Google was designed to avoid.

[+] Matt_Cutts|13 years ago|reply
Hey Brandon, congrats on launching Samuru! I'll be curious what you think of running a search engine after being an SEO for so many years.
[+] pyre|13 years ago|reply
Quick comment on the interface. Looks like the initial page is optimized for 1024x768. I'm on a netbook at 1024x600 (even less so since I'm not at fullscreen). While I realize that I'm in the minority, it looks like the issue is that everything is positioned absolutely. The only reason that the bottom of the page is cut off is because there is a bunch of empty space between the top of the page and the logo: http://imgur.com/vDFwY8b

I realize that this is a bit of a nitpick, but I felt the need to mention it.

[+] mrknmc|13 years ago|reply
This is really strange. I just searched for "how to ride a bike" and the first links from Samuru are completely useless whereas the first link from Google is exactly what I wanted, instructions from wikihow. How do you explain that?
[+] buro9|13 years ago|reply
Pre-Google, this is pretty much how search engines worked... by analysing page content and weighting that rather than the network of links around the page.

Having just played with it, it feels both backwards and refreshing to go back to that. The results are different enough to feel good for the terms I used.

[+] drakaal|13 years ago|reply
Other features I should have mentioned: Threaded results. If a result is cited by other results, they will be grouped so that you can see the conversation across sites.

Better Social Media integration. We do Facebook, Twitter, Google Plus not just Google Plus for showing authors.

Voice Input if you are on Chrome 25 or higher.

Results are returned with Summaries not Snippets.

With that I am falling asleep. I have enjoyed answering questions on this and the https://news.ycombinator.com/item?id=5579336 thread but 5 hours of it has worn me out. If you leave comments I promise to get back to them.

[+] sashagim|13 years ago|reply
So how exactly do you get better results 30 seconds later? How do you index more relevant pages? Do you... Google it?
[+] lightonseo|13 years ago|reply
Hi, congratulations, I really like Samuru even if it's not perfect. I wanted to ask you two questions:

1) Are you sure that giving a "bonus" to domains containing a part of a query is a good idea? I understand the reason behind it, and I know that you need time before you can turn off this "bonus", but until that moment are you really sure it is a good idea?

When I type "How to rank well on Google" the first result is www.google.com => http://www.samuru.com/?q=How+to+rank+well+in+Google

From the third position onward, however, the web pages seem to be great.

2) How does the search suggestion work?

I'm a French user and in our language we have a lot of accents like "é è ù à". While typing a search query many people do not use them. When I correctly type a query with the accents, Samuru suggests the same query but without accents. This is wrong, and that's why I'm wondering about the source of the data used by the search engine to provide these query suggestions.
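
One common way to handle this, sketched below, is to compare queries on an accent-folded key while always displaying the correctly accented stored form. This is only an illustration of the general technique, not a description of how Samuru builds its suggestions; `fold_accents`, `suggest` and the sample query list are hypothetical.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks: 'é' -> 'e', 'ù' -> 'u'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def suggest(typed, known_queries):
    """Match on the folded key, but return the stored (accented) form."""
    key = fold_accents(typed).casefold()
    return [
        query
        for query in known_queries
        if fold_accents(query).casefold().startswith(key)
    ]
```

With this approach, typing either "crem" or "crèm" against a stored "crème brûlée" would suggest "crème brûlée" with its accents intact, instead of an unaccented variant.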

I really hope you succeed with this project.

[+] marcioaguiar|13 years ago|reply
I just searched for How to make cupcakes and the first three results were instructions of how to make a cupcake, including a video on youtube.

P.S.: I mistakenly typed HOT to make cupcakes

[+] nano111|13 years ago|reply
It is now very hard to get many Google results that contain all your search terms, which is why I am starting to dislike it... for example it gives you "synonyms" or the terms are completely missing.... I sure will give yours a try
[+] raulonkar|13 years ago|reply
I am more curious about traffic. Does your search engine send traffic to sites like Google does?
[+] fmoralesc|13 years ago|reply
OK. My first try was to search for "plato dialogue concerning friendship". Google gave me the result I expected (a reference to the Lysis dialogue) through wikipedia and a bunch of articles about it (the most helpful being a link to the Stanford Encyclopedia of Philosophy, ranked third). It didn't link the text though in the first page (it only appears in the third page of results, with a copy at the MIT Classics archive). Samuru gives me a bunch of general articles on Plato first (oddly enough, the first results are articles from the SEP, but not the article on "Plato on Friendship and Eros".), some noise and then information on the Lysis. The text itself appeared at 25th place.

Something I find interesting is that one of the snippets samuru gave me (on the 5th result) has a pretty good description of the lysis as the item most likely to be the "plato dialogue concerning friendship": "the dramatically later Lysis presents Plato's more developed understanding of love and friendship than the dramatically earlier Symposium and Phaedrus". From this description of the Lysis one could gather that the text of the Lysis itself should be a very relevant result to the query; at the very least, that information about it should be weighted as more relevant to the query than info on the Symposium or the Phaedrus, and then info on those over all else. From this, I think, one could build a better representation of a good answer to the query than in google or samuru.

I think natural language analysis is very promising here. I hope work on this area yields good results, but it seems like a hard problem.

[+] Dn_Ab|13 years ago|reply
Counterpoint. I was surprised by it. A couple weeks ago I decided to start recording any search phrases which I felt were tricky or required good language modelling.

"baby features kept in adulthood" is the only one I've thought worth recording so far. You can compare the results in Google, Bing, DDG. Only Samuru and Google have it on page 1. Samuru has it as the first result. But this is just one example so I can't draw any conclusions. Curious to see how well it performs in general.

[+] samirahmed|13 years ago|reply
About 1 month ago I switched from Google to Bing. There are different queries that I use to measure 'better'.

For simple queries 'strncmp', 'giraffe', 'sound transit schedule' ... Google, Bing and Samuru perform pretty well. But Samuru is extremely slow.

For more complex queries like 'seattle dumpling restaurant that is famous in singapore' or 'how to zip a list in ruby', I find that Google always comes out on top. Bing lacks the previous search history to personalize my searches and often thinks I mean zip as in zipfile. But Samuru gave me relevant results for all three, which is rather surprising.

Another type is one for people/social related searches... Bing's facebook/twitter/linkedin/yelp integration actually makes it better than google because the 'snapshot' bar it has is super helpful. However Samuru results are on par with Google and Bing results here (minus the snapshot bar).

Overall I was skeptical, but other than it being unbearably slow (Google spoilt us with speed), Samuru does have very good search results for what I assume is not a multibillion dollar product.

[+] drakaal|13 years ago|reply
Do your search a second time. Our index isn't exhaustive yet, and we are slow if we haven't seen enough of the results pages before. We generate the summaries and a bunch of other things after something has appeared in a search result. This is because we aren't a billion dollar company and have to be efficient in our indexing.
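
A rough sketch of the "second search is better" behaviour described above: the first search returns results without summaries and queues those pages, and a background pass fills the summaries in before the next search. The class and method names (`LazyIndex`, `process_pending`) and the `summarize` callback are assumptions for illustration; the comment only says that summaries are generated after a page has appeared in a result.

```python
class LazyIndex:
    """Generate expensive per-page data only for pages that were shown."""

    def __init__(self, summarize):
        self._summarize = summarize  # hypothetical expensive function
        self._summaries = {}         # url -> summary, filled in lazily
        self._pending = set()        # urls seen in results, not yet summarized

    def search(self, urls):
        """Return (url, summary) pairs; summary is None on first sight."""
        results = []
        for url in urls:
            summary = self._summaries.get(url)
            if summary is None:
                self._pending.add(url)
            results.append((url, summary))
        return results

    def process_pending(self):
        """Background pass: summarize every page queued by earlier searches."""
        while self._pending:
            url = self._pending.pop()
            self._summaries[url] = self._summarize(url)
```

The trade-off is the one visible in this thread: indexing cost is paid only for pages people actually see, at the price of a slow or thin first result for every new query.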
[+] ok_craig|13 years ago|reply
I don't understand. Is this spam? There is no context or accompanying article for the claim. I searched my name and the results weren't nearly as good. One data point, sure, but first impression is everything.

Edit for context: original title read: "This search engine is better than google."

[+] drakaal|13 years ago|reply
Search your name a second time. It will do better. The rankings require us to have looked at all the pages a second time.
[+] valtron|13 years ago|reply
[+] drakaal|13 years ago|reply
Nope it doesn't. We decided that it was hard enough getting advertising without having "adult" search. We focus on text analysis so we aren't very good at porn searches.
[+] DanBC|13 years ago|reply
I wish you luck with this!

Google is excellent. Bing is also excellent (with minor differences). DDG and Blekko are adding interesting and useful features.

But they all feel a bit like they're a mono-culture, and thus vulnerable to gaming. Black-hat seo seems to be something that Google is pretty good[1] at dealing with. White hat SEO and ads have changed the web drastically from what I remember.

So it's really nice to have an alternative method of search that searches in a different way. Your post (https://news.ycombinator.com/item?id=5580321) highlights a few things I find frustrating in search at the moment.

[1] It's odd that all the work they do isn't noticed.

[+] nilkn|13 years ago|reply
I'm willing to have an open mind about this, but I think some sort of explanation on what samuru is hoping to achieve in distinction from other search engines would be helpful.
[+] monsterix|13 years ago|reply
Exactly. Doing an MVP of a search engine is hard, so it is okay to lack quality in the results when you first launch. On HN, probably. Even DDG is only trying to catch up.

But to keep the engine running, and keep the hacker interested you should tell what distinction samuru is trying to achieve with its search engine.

And perhaps this query http://www.samuru.com/?q=porn should not be blocked by default, rather provide tools for safe search. Heard of the porn cookie guy? Just copy his footsteps, I'd say.

[+] drakaal|13 years ago|reply
I posted an explanation in the comments. I didn't post this, or expect it to be posted so I didn't write anything up in advance.
[+] D9u|13 years ago|reply
I got fairly good results using decidedly esoteric queries, and although I'm on a very s-l-o-w connection I didn't notice a great speed discrepancy.
[+] orangethirty|13 years ago|reply
Aside from the different processing on the back-end, what else does samuru do? I'm curious.

Disclaimer: I'm the guy behind Nuuton (a search engine).

[+] saejox|13 years ago|reply
I find it funny that a Google competitor search engine is using Google Analytics and AdSense.
[+] drakaal|13 years ago|reply
Selling your own ads is hard. Especially at low volumes getting started. So your choices are basically Google and Microsoft. (Chitika doesn't pay anything)
[+] xaviel|13 years ago|reply
Their SEO engine is easy to game
[+] drakaal|13 years ago|reply
In what way? Writing something that meets our qualifications for "what is a review" is much harder to game than link spamming. You can game the system only by writing content that is useful to the user.

The only easy-to-game part is that we give brands a pretty big bonus for themselves. Sony.com/playstation will always be the top hit for Sony PlayStation. Even if we should favor a .gov result that says they are recalled for bursting into flames. But as that rarely becomes an issue we are ok with that being number 2.

[+] arcatek|13 years ago|reply
"Why Samuru" => Spelling suggestion : "Why Samurai"

It's interesting, results are not so far from what I want. I'll give it a look for my next searches.

[+] drakaal|13 years ago|reply
Samuru was a samurai. Several Japanese anime characters are named Samuru and are samurai. It is also Turkish for otter.
[+] tokenadult|13 years ago|reply
Ghostery reminds me that this site runs Google Analytics, so the site founders apparently do trust Google for some services.
[+] prawn|13 years ago|reply
Doesn't seem to tailor results to your location so might not be as useful for people outside of the US? Or did I just try a stupid search? I performed a vanity search and it was listing different names before there was anything about me. Same search in Australia on Google has me in four of the top six spots.
[+] kludu|13 years ago|reply
I searched for "pussy", "sex" and "porn".

No results.

WTF is this shit?

[+] nu2ycombinator|13 years ago|reply
Better in what sense? It's not better with respect to the speed of returning results.
[+] drakaal|13 years ago|reply
Google has 100 people searching everything that can be searched. We have to do the work when you do the search. We get faster the more people use us. Exponentially.
[+] saintx|13 years ago|reply
How can they trademark the words "Liquid Helium"? The first search I did on Samuru was for Liquid Helium and it brought back about a half million results, all of which I assume are violating its purported trademark.
[+] drakaal|13 years ago|reply
We can trademark it in the context of software. We don't sell frigid gas.
[+] kephra|13 years ago|reply
two suggestions:

- you need a favicon, so it's possible to pull your site into an icon bar for bookmarking.

- you need a search engine registration, so it's possible to use it from the search engine tab in the browser

[+] aw3c2|13 years ago|reply
At least in Opera and Chromium you can simply right-click the search form to add it.