Bayard: a full-text search and indexing server written in Rust

283 points| jinqueeny | 6 years ago |github.com | reply

107 comments

[+] xvilka|6 years ago|reply
It would be nice to integrate all Rust alternatives to ELK stack:

1. Toshi[1] - alternative to Elasticsearch

2. Sonic[2] - alternative to Elasticsearch

3. Vector[3][4] - alternative to Logstash

4. native_spark[5] - alternative to Apache Spark

[1] https://github.com/toshi-search/Toshi

[2] https://github.com/valeriansaliou/sonic

[3] https://vector.dev/

[4] https://github.com/timberio/vector

[5] https://github.com/rajasekarv/native_spark

[+] MuffinFlavored|6 years ago|reply
As far as I can tell, none of these are actual drop-in replacements for Elasticsearch

aka you can't point your Kibana instance to these "nodes" and have it speak the Elasticsearch API

[+] CameronNemo|6 years ago|reply
One of logstash's main draws is as a data transformation pipeline. You can do lookups via dns or a json or csv file, for example. From what I can tell vector is just a simple log shipper.
[+] jinqueeny|6 years ago|reply
It's built on top of Tantivy (https://github.com/tantivy-search/tantivy), and implements the Raft consensus algorithm (https://raft.github.io/) using raft-rs (https://github.com/tikv/raft-rs) and gRPC (HTTP/2 + Protocol Buffers) using grpc-rs (https://github.com/tikv/grpc-rs) and rust-protobuf (https://github.com/stepancheg/rust-protobuf).
[+] mrec|6 years ago|reply
So would it be roughly accurate to say that Bayard is to Tantivy what Elasticsearch is to Lucene?
[+] karterk|6 years ago|reply
Elasticsearch is notoriously hard to roll out and develop against (for smaller companies especially), and so I am happy to see smaller projects in this space.

I've also been working on a light, fast, typo-tolerant search engine: https://github.com/typesense/typesense

It's been around for a couple of years now, and has a few happy customers who have had great success in replacing $X0,000/year popular hosted search with Typesense!

[+] atheiste|6 years ago|reply
I see the author did the same search engine in Go a while ago. So I suppose the project is a side project to learn a new language. Or is there a different reason?
[+] devy|6 years ago|reply
> So I suppose the project is a side project to learn a new language.

Perhaps. But the author Minoru Osuka ain't nobody[1]. He is

- Engineer at Mercari, Inc.

- Committer at Apache Software Foundation

- Co-author of an Apache Solr book in Japanese

- Ex-Yahoo! JAPAN

- Ex-Rakuten

So yeah, I think he knows what he's doing.

[1] https://twitter.com/minoru_osuka/

[+] mark_l_watson|6 years ago|reply
That is a good observation. The author might also need flexible search options at work. In any case, I have some interest in Rust but don’t actively use it. I found reading through the main server.rs file interesting as example code.
[+] NPMaxwell|6 years ago|reply
FYI: Who was Bayard Rustin? https://en.wikipedia.org/wiki/Bayard_Rustin It's a silly play on words celebrating one of the very great heroes of 20th Century America
[+] seanc|6 years ago|reply
I assumed the same thing. Quite the coincidence.

And yes, Bayard Rustin is absolutely one of the great heroes of the 20th Century.

[+] mmoez|6 years ago|reply
"X written in Rust" is becoming a tiring clickbait pattern on tech boards.

Why not simply announce "X" in the title?

[+] reacharavindh|6 years ago|reply
It’s not useless. I for one associate languages with their runtime properties.

Written in Python - easy to understand, but lacks performance. Probably can't use more than 1 CPU core. Needs a lot of memory.

Written in Go - fast enough for most cases, all CPU cores, but possibly high mem usage because of GC. I need to plan for it.

Written in Rust - possibly new and maturing, uses memory effectively, likely to use all cores. Easy to deploy (single binary)

Written in JS - probably not for me - personal taste and hate of npm ecosystem.

Written in C - probably the best performing, but less robust, no memory safety.

So, “written in” helps in judging whether to care for that project or not to some extent.

[+] codetrotter|6 years ago|reply
As someone writing software in Rust myself I am always interested in knowing about projects using Rust for multiple reasons.

1) In the case of libraries (crates), it might be something I can make use of in the future.

2) I can look at how they solved the problem they are solving and compare that with how I'd do it and maybe learn something new that can be useful to me in my future projects.

3) I want Rust to thrive and I want people to be aware of projects using Rust because the more people that are aware of Rust, the higher the probability that I can work for more companies in the future writing software for them in Rust.

[+] staticassertion|6 years ago|reply
"In Rust" signals something very important to me, especially with the context of "full text search".

ES is the current free text search engine out there, and it's famously painful to manage. Resource consumption and GC pain can be really significant.

I see 'rust' and I know immediately that at least some pains I've experienced will be eliminated.

[+] giancarlostoro|6 years ago|reply
I like knowing what language something is coded in. It makes me more likely to look into the project. If it's written in something I'm not interested in I may click through, but not be as thorough, and some languages I save the link for later because I have no interest in them professionally or on my time off. I like looking at all projects eventually because some people come up with amazing pieces of software in all types of languages, but others might not care to look at a Ruby, PHP, NodeJS, Python, C, C++, Rust etc project.
[+] Hendrikto|6 years ago|reply
I see this just as often for Python, C, JavaScript, etc.

The pattern is more like "X written in Y", which is totally fine imo.

[+] acdha|6 years ago|reply
Rust is an interesting language for its technical characteristics, which are a direct appeal as other commenters have noted, but it is also worth noting because Rust interoperates almost as well as C. If I announce a cool Python module, someone who primarily uses Ruby is probably going to ignore it because the level of effort to use it would be more than it's worth. If I announce a cool Rust module, they might think “you know, it's pretty easy to build a wrapper…”.
[+] gpm|6 years ago|reply
What codetrotter and ccccc0 said, also Rust as a language and a community has a strong focus on correctness, which makes me more interested in actually using the project.

In general I think the title of a project on a news aggregator should basically be an 80-character sales pitch, and " in rust" is 8 characters that signal a lot more than most 8 characters could (to me).

[+] ccccc0|6 years ago|reply
Rust is young enough that you can read this as an ad for the language, not the project. "Rust is a language in which people write full-text search and indexing"
[+] opencl|6 years ago|reply
I think it's relevant for open source projects because people might want to contribute to them or just read through the code to see how things work. And of course people will be more interested in doing those things with languages they have experience using.

The same developer also happens to have written a similar server in Python a while back: https://github.com/mosuka/cockatrice

[+] hobofan|6 years ago|reply
Tell the mods that. I've seen multiple titles where they edited the original title to include "in X".
[+] bishala|6 years ago|reply
was about to comment the same before seeing your comment.
[+] jinqueeny|6 years ago|reply
FYI: this is just a PoC and is at a very early stage :)
[+] fnord123|6 years ago|reply
1 commit.

Is this an opening of a mature project that has been coded in private somewhere? Is this just a code drop on the community?

Note: this comes from a developer in Japan. Tantivy's main developer is also based in Japan. @fulmicoton, is there any interaction between the projects?

[+] jsd1982|6 years ago|reply
Not all projects are birthed in public. It may have been extracted from a larger private project which may have issues sharing its exact history of development.
[+] chopraaa|6 years ago|reply
The creator of Bayard is apparently the co-creator of Solr.

Bayard looks like a search-in-rust PoC.

[+] fulmicoton|6 years ago|reply
I met the author of Bayard a couple of times and had beers with him. Does that count as interactions?
[+] wil421|6 years ago|reply
I was looking at Raspberry Pi projects for Rust. There were similar complaints on the Pi forum. Looks like I’ll be using Python for my projects.
[+] jhancock|6 years ago|reply
I'm looking for an easy to use typeahead/autocomplete search solution. javascript lib for frontend paired with easy to manage, lightweight server. something modern.

The dataset isn't huge. e.g. 1 million strings of no more than 512 utf-8 chars each and not reindexed more than once a day or week. clusters, sharding etc unnecessary.

I keep hoping to stumble on a fully baked solution...any ideas?

[+] snitch182|6 years ago|reply
Interesting. Since the underlying engine (Tantivy) is faster than Lucene - at least in their benchmarks - it should be faster than Solr. Seems like the author is exploring a faster alternative to Solr. I never got around to exploring Elasticsearch since our Solr instances are running so smoothly.
[+] rapsey|6 years ago|reply
Raft storage in-memory only. Not exactly safe replication.
[+] lolive|6 years ago|reply
Is the query language less obscure than the query language of Elasticsearch?
[+] _pgmf|6 years ago|reply
The proof that rust is a meme language is evidenced by the need to include "written in rust" every time a rust project is mentioned.
[+] coldtea|6 years ago|reply
That makes no sense.

Devs write "written in rust" because:

- it's interesting to other developers (which are HN's main audience) to see.

They don't sell some shrink wrapped software, where the language doesn't matter. Nor some already established package you just download and use as is like Postgres or Bash, or whatever.

- it matters for those looking for compatible stuff for their own projects (for libraries, reusable packages, etc.)

- it offers certain guarantees other languages do not (e.g. memory safety, native binaries) which can be an important criterion for those looking for a project

- it's important for possible collaborators to know the language (the project being Open Source and everything).

- in a field where a Java based project (Lucene/Elastic Search) dominates, it is important to advertise that you offer a non-Java alternative for people who want to avoid Java/Oracle/etc.

- Rust is also currently on the rise (!= meme), and thus gets new programmers, and new greenfield projects. And since those people are trying the language, they want to advertise their involvement to the community, talk about how they found the experience, etc.

[+] pengstrom|6 years ago|reply
How do you define a "meme language" and why does that follow for rust?
[+] bishala|6 years ago|reply
"meme language". lol that's hilarious.