Show HN: Deep search of all ML papers

109 points | tomhartke | 2 years ago | app.undermind.ai

Built an automated system to run a deep search of ArXiv and carefully find all the precise papers that exist on a complex topic.

It's different from simple RAG because it searches, classifies, and adapts based on relevant papers it uncovers, and then continues until it finds every paper on a topic (trying to mimic the human research process). Benchmarked 10x higher accuracy and total retrieval compared to Google Scholar for a median search (whitepaper on website). Also knows when it is complete, and misses virtually nothing (< 3% or so, once it's converged).

Website has a free trial and a bunch of example search reports. Want feedback and suggestions.

25 comments

tomhartke|2 years ago

Here's an example report on: tokenization-free large language model architectures, which have been shown to achieve compute/accuracy tradeoffs comparable to or better than traditional token-based models https://app.undermind.ai/query_app/display_one_search/05f0b8...

krohling|2 years ago

"Our AI agent finds precisely what you ask for, 10-50x better than Google Scholar"

I was curious how this was measured since benchmarking accuracy for LLMs is tough. Found this in the paper: "This classification accuracy was benchmarked by manually analyzing over 400 papers across a range of representative searches, and comparing the human evaluation to the language model’s judgment"

I'm skeptical that their dataset of 400 papers with 3 classification labels (highly relevant, closely related, or ignorable) is large enough to represent the diversity of queries they're going to get from users. To be clear, I don't think this undermines (haha) the value of what they've built; it's still very cool.
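For reference, a minimal sketch of how a benchmark like this could be scored, assuming each paper has one human label and one model label from the three classes above. The function names and the toy labels are invented for illustration; this is not the method from the whitepaper.

```python
import math

LABELS = ("highly relevant", "closely related", "ignorable")

def agreement(human, model):
    """Fraction of papers where the model's label matches the human's."""
    if len(human) != len(model) or not human:
        raise ValueError("need two non-empty label lists of equal length")
    if any(label not in LABELS for label in list(human) + list(model)):
        raise ValueError("unknown label")
    return sum(h == m for h, m in zip(human, model)) / len(human)

def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Toy data, invented for illustration only (not from the whitepaper).
human = ["highly relevant", "ignorable", "closely related", "ignorable"]
model = ["highly relevant", "ignorable", "ignorable", "ignorable"]
acc = agreement(human, model)
low, high = wilson_interval(acc, len(human))
```

With n = 400 and a hypothetical 90% agreement, `wilson_interval(0.9, 400)` gives roughly 87% to 93%. That only bounds sampling error; it says nothing about whether the 400 papers cover the range of queries real users will send.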

espadrine|2 years ago

It is certainly interesting, and I would love to try it for my hobbyist use-cases. I don’t do much research at work, but a fair bit on the weekends.

Are you filtering users, though? I cannot sign up in a personal capacity with a Gmail address. The page raises this error: “Please use a valid institutional or company email address.”

axg11|2 years ago

I would change the main CTA to "Try it now" and then use a different style for "Read the stats". It currently looks like there are two equally important CTAs.

If you can find a way to make the results closer to real-time, this will be a really popular product.

tomhartke|2 years ago

Appreciate the advice. Re: timing, it's bottlenecked by the sequential nature of the search. To be comprehensive, we discover a few papers, and use that info to choose where to look more closely next.
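A rough sketch of why that loop is sequential, assuming a toy corpus where papers link to each other and a yes/no relevance check. The `neighbors` and `is_relevant` callables and the corpus are hypothetical stand-ins, not the actual Undermind implementation.

```python
from collections import deque

def adaptive_search(seed_ids, neighbors, is_relevant, max_steps=1000):
    """Expand outward from seed papers, following links only from relevant hits.

    neighbors:   hypothetical callable, paper id -> iterable of linked paper ids
    is_relevant: hypothetical callable, paper id -> bool (stand-in for a classifier)
    """
    seen = set(seed_ids)
    frontier = deque(seed_ids)
    relevant = []
    steps = 0
    while frontier and steps < max_steps:
        paper = frontier.popleft()
        steps += 1
        if not is_relevant(paper):
            continue  # do not explore around irrelevant papers
        relevant.append(paper)
        # What gets queued next depends on what was just classified,
        # which is why the steps cannot all run in parallel.
        for nxt in neighbors(paper):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return relevant

# Toy corpus, invented for illustration only.
links = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": ["f"], "f": []}
on_topic = {"a", "b", "d", "e"}
found = adaptive_search(["a"], lambda p: links[p], lambda p: p in on_topic)
```

In this toy run, paper "e" is on topic but is only reachable through the irrelevant paper "c", so this naive version never reaches it. A real system would need other ways of reaching such papers, which adds more rounds.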

basb77|2 years ago

How does this compare to a system like Elicit? Seems to be very similar at first glance.

axpy906|2 years ago

How does it compare to ArXiv Sanity Preserver?

tomhartke|2 years ago

FYI system is a bit delayed because of traffic levels. May take a bit longer to generate results at the moment (usually takes ~10 min).

Yenrabbit|2 years ago

I tested this out on a topic I'd been discussing with some fellow researchers, and it pulled in the papers we'd chatted about plus a bunch of related ones that look very relevant and interesting. Congrats on a cool project!

kingkongjaffa|2 years ago

It's a shame the research publishing industry is a bunch of walled gardens, since this only supports arXiv and not paper repositories from other industries.

jakderrida|2 years ago

Maybe the model will get smart enough to go to SciHub and Libgen? If IP holders and distributors come after me with evidence, I'll just pull out my belt and tell them I gotta go teach some naughty GPUs another lesson.

htrp|2 years ago

what is pricing?

WhitneyLand|2 years ago

I don’t want to be negative but you asked for feedback, so I’ll give you a few impressions including the superficial and quite subjective fwiw:

1. The hyperbolic claims are going to be off-putting to some. You’ve “solved” ML search? 50x better than Google Scholar on a metric no one’s been benchmarking against? Consider your audience and what they would find credible.

2. The UX needs work. To give one aesthetic example, in the results there are large, brightly colored, red and green circles that are used inconsistently, and they clash with the palette. This stuff can affect how sticky your service is.

3. Don’t restrict signup by email domain. This is nuts. Never add friction to gaining customer relationships. If you’re capacity constrained limit the trial. If you’re trying to segment the market there are better ways.

4. The name “Undermind” is not working to my ear. It’s worth changing. At least find a product person whose opinion you respect and ask their take.

5. I think a lot of people here would be willing to give you useful technical feedback on the architecture and approach if more information were shared about how the service works, but I didn’t notice that was available.

tremarley|2 years ago

This is very effective criticism.

If OP wants to grow their project, those 4 have to be the first things to target.

xcdzvyn|2 years ago

undermine: "lessen the effectiveness, power, or ability of, especially gradually or insidiously"

hm.

wackget|2 years ago

There's something disagreeable about charging a subscription to search freely-available scientific papers.

Yeah I get you're technically paying for the "advanced" search but it still leaves a bad taste in the mouth because this service's entire existence depends on open source knowledge.

P.S. hiding pricing behind registration isn't cool

bbsz|2 years ago

I think that full-text search queries over long text data are already kind of expensive server side. Users are paying specifically for this, not for a better UI or the simple direct-match search available in free-to-use projects. I would say it's very reasonable to charge for costs incurred here.

frogamel|2 years ago

IMO if you're going to profit off open research, you should at least make your own work available for other researchers. The white paper has 10 pages of performance benchmarks but 5 sentences on methodology.

knicholes|2 years ago

Immediately bailed once it required I provide my email address.

huqedato|2 years ago

Nice but way too pricey.

https://chat.openai.com/g/g-dGz4aw9iA-research-refiner - the free version (just ChatGPT Plus subscription needed)

tomhartke|2 years ago

The goal is to be systematic and handle complex topics. ChatGPT + keyword search can't handle complex topics at all, and isn't systematic either.

callalex|2 years ago

Free, just needs a subscription?

3abiton|2 years ago

I wonder how many such services ChatGPT already undercuts?

canadiantim|2 years ago

Awesome, had no idea this existed. Very useful, thanks!

notso411|2 years ago

[deleted]