One of Logstash's main draws is as a data transformation pipeline. You can do lookups via DNS or a JSON or CSV file, for example. From what I can tell, Vector is just a simple log shipper.
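To make the lookup point concrete, here is a minimal sketch of a CSV-backed enrichment step. It is not Logstash's implementation or configuration syntax; the table contents, the field names (`ip`, `hostname`, `team`) and the helper names `load_lookup` and `enrich` are all made up for illustration.

```python
import csv
import io

# Hypothetical lookup table; in practice this would be a CSV file on disk.
LOOKUP_CSV = """ip,hostname,team
10.0.0.1,web-1,frontend
10.0.0.2,db-1,storage
"""


def load_lookup(csv_text, key):
    """Index the CSV rows by the value of the given key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def enrich(event, table, key):
    """Return a copy of the event with the matching row's columns merged in.

    Events with no matching row come back unchanged (as a copy).
    """
    extra = table.get(event.get(key), {})
    return {**extra, **event}  # fields already on the event win


table = load_lookup(LOOKUP_CSV, key="ip")
events = [
    {"ip": "10.0.0.1", "msg": "GET /"},
    {"ip": "10.9.9.9", "msg": "GET /health"},
]
enriched = [enrich(e, table, "ip") for e in events]
```

A plain shipper would forward `events` as they are; a transformation pipeline is the part that turns them into `enriched` before forwarding.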
Elasticsearch is notoriously hard to roll out and develop against (for smaller companies especially), and so I am happy to see smaller projects in this space.
I see the author wrote the same search engine in Go a while ago. So I suppose this project is a side project to learn a new language. Or is there a different reason?
That is a good observation. The author might also need flexible search options at work. In any case, I have some interest in Rust but don’t actively use it. I found reading through the main server.rs file interesting as example code.
As someone writing software in Rust myself I am always interested in knowing about projects using Rust for multiple reasons.
1) In the case of libraries (crates), it might be something I can make use of in the future.
2) I can look at how they solved the problem they are solving and compare that with how I'd do it and maybe learn something new that can be useful to me in my future projects.
3) I want Rust to thrive and I want people to be aware of projects using Rust because the more people that are aware of Rust the bigger is the probability that I can work for more companies in the future writing software for them in Rust.
I like knowing what language something is coded in. It makes me more likely to look into the project. If it's written in something I'm not interested in I may click through, but not be as thorough, and some languages I save the link for later because I have no interest in them professionally or on my time off. I like looking at all projects eventually because some people come up with amazing pieces of software in all types of languages, but others might not care to look at a Ruby, PHP, NodeJS, Python, C, C++, Rust etc project.
Rust is interesting for its technical characteristics, which is a direct appeal as other commenters have noted, but it is also worth noting because Rust interoperates almost as well as C. If I announce a cool Python module, someone who primarily uses Ruby is probably going to ignore it because the level of effort to use it would be more than it's worth. If I announce a cool Rust module, they might think “you know, it's pretty easy to build a wrapper…”.
What codetrotter and ccccc0 said, also Rust as a language and a community has a strong focus on correctness, which makes me more interested in actually using the project.
Rust is young enough that you can read this as an ad for the language, not the project: "Rust is a language in which people write full-text search and indexing."
I think it's relevant for open source projects because people might want to contribute to them or just read through the code to see how things work. And of course people will be more interested in doing those things with languages they have experience using.
Not all projects are birthed in public. It may have been extracted from a larger private project which may have issues sharing its exact history of development.
I'm looking for an easy-to-use typeahead/autocomplete search solution: a JavaScript lib for the frontend paired with an easy-to-manage, lightweight server. Something modern.
The dataset isn't huge: e.g. 1 million strings of no more than 512 UTF-8 chars each, reindexed no more than once a day or week. Clusters, sharding, etc. are unnecessary.
I keep hoping to stumble on a fully baked solution...any ideas?
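For a dataset of that size, the server-side lookup can be as simple as a sorted in-memory list searched by prefix. The following is only a sketch under that assumption, not a recommendation of a specific product: `PrefixIndex` and `suggest` are hypothetical names, matching is case-insensitive prefix matching only (no typo tolerance or ranking), and the index is rebuilt from scratch on each reindex.

```python
import bisect


class PrefixIndex:
    """In-memory prefix index over a fixed list of strings."""

    def __init__(self, strings):
        # Sort by case-folded form so all matches for a prefix are contiguous.
        self._entries = sorted((s.casefold(), s) for s in strings)
        self._keys = [key for key, _ in self._entries]

    def suggest(self, prefix, limit=10):
        """Return up to `limit` original strings starting with `prefix`."""
        p = prefix.casefold()
        # Binary search for the first key that is >= the prefix.
        start = bisect.bisect_left(self._keys, p)
        results = []
        for key, original in self._entries[start:start + limit]:
            if not key.startswith(p):
                break  # past the contiguous block of matches
            results.append(original)
        return results


index = PrefixIndex(["Rust", "Ruby", "rucene", "Python", "Postgres"])
suggestions = index.suggest("ru")
```

Each lookup costs one binary search plus at most `limit` comparisons, so a daily or weekly rebuild over a million short strings is the only expensive step. A small HTTP endpoint in front of `suggest` would be the piece the frontend library talks to.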
Interesting. Since the underlying engine (Tantivy) is faster than Lucene, at least in their benchmarks, it should be faster than Solr. Seems like the author is exploring a faster alternative to Solr. I never got around to exploring Elasticsearch since our Solr instances are running so smoothly.
[+] [-] xvilka|6 years ago|reply
1. Toshi[1] - alternative to Elasticsearch
2. Sonic[2] - alternative to Elasticsearch
3. Vector[3][4] - alternative to Logstash
4. native_spark[5] - alternative to Apache Spark
[1] https://github.com/toshi-search/Toshi
[2] https://github.com/valeriansaliou/sonic
[3] https://vector.dev/
[4] https://github.com/timberio/vector
[5] https://github.com/rajasekarv/native_spark
[+] [-] MuffinFlavored|6 years ago|reply
aka you can't point your Kibana instance to these "nodes" and have it speak the Elasticsearch API
[+] [-] snikolaev|6 years ago|reply
[+] [-] CameronNemo|6 years ago|reply
[+] [-] jinqueeny|6 years ago|reply
[+] [-] mrec|6 years ago|reply
[+] [-] clemParis|6 years ago|reply
[+] [-] karterk|6 years ago|reply
I've also been working on a light, fast, typo-tolerant search engine: https://github.com/typesense/typesense
It's been around for a couple of years now, and we have a few happy customers who have had great success in replacing $X0,000/year popular hosted search with Typesense!
[+] [-] elephantum|6 years ago|reply
It has built-in fulltext search: https://www.postgresql.org/docs/12/textsearch.html
[+] [-] atheiste|6 years ago|reply
[+] [-] devy|6 years ago|reply
Perhaps. But the author Minoru Osuka ain't nobody[1]. He is
- Engineer at Mercari, Inc.
- Committer at Apache Software Foundation
- Co-author of an Apache Solr book in Japanese
- Ex-Yahoo! JAPAN
- Ex-Rakuten
So yeah, I think he knows what he's doing.
[1] https://twitter.com/minoru_osuka/
[+] [-] mark_l_watson|6 years ago|reply
[+] [-] NPMaxwell|6 years ago|reply
[+] [-] MS90|6 years ago|reply
https://en.wikipedia.org/wiki/Pierre_Terrail,_seigneur_de_Ba...
[+] [-] devy|6 years ago|reply
[+] [-] seanc|6 years ago|reply
And yes, Bayard Rustin is absolutely one of the great heroes of the 20th Century.
[+] [-] mmoez|6 years ago|reply
Why not simply announce "X" in the title?
[+] [-] reacharavindh|6 years ago|reply
Written in Python - easy to understand, but lacks performance. Probably can't use more than 1 CPU core. Needs a lot of memory.
Written in Go - fast enough for most cases, all CPU cores, but possibly high memory usage because of GC. I need to plan for it.
Written in Rust - possibly new and maturing, uses memory effectively, likely to use all cores. Easy to deploy (single binary)
Written in JS - probably not for me - personal taste and hate of npm ecosystem.
Written in C - probably the best performing, but less robust, no memory safety.
So, “written in” helps in judging whether to care for that project or not to some extent.
[+] [-] codetrotter|6 years ago|reply
[+] [-] staticassertion|6 years ago|reply
ES is the current free text search engine out there, and it's famously painful to manage. Resource consumption and GC pain can be really significant.
I see 'rust' and I know immediately that at least some pains I've experienced will be eliminated.
[+] [-] giancarlostoro|6 years ago|reply
[+] [-] Hendrikto|6 years ago|reply
The pattern is more like "X written in Y", which is totally fine imo.
[+] [-] acdha|6 years ago|reply
[+] [-] gpm|6 years ago|reply
In general I think the title of a project on a news aggregator should basically be an 80-character sales pitch, and " in rust" is 8 characters that signal a lot more than most 8 characters could (to me).
[+] [-] ccccc0|6 years ago|reply
[+] [-] opencl|6 years ago|reply
The same developer also happens to have written a similar server in Python a while back: https://github.com/mosuka/cockatrice
[+] [-] hobofan|6 years ago|reply
[+] [-] bishala|6 years ago|reply
[+] [-] jedisct1|6 years ago|reply
[+] [-] jinqueeny|6 years ago|reply
[+] [-] wiradikusuma|6 years ago|reply
[+] [-] fnord123|6 years ago|reply
Is this an opening of a mature project that has been coded in private somewhere? Is this just a code drop on the community?
Note: this comes from a developer in Japan. Tantivy's main developer is also based in Japan. @fulmicoton, is there any interaction between the projects?
[+] [-] jsd1982|6 years ago|reply
[+] [-] chopraaa|6 years ago|reply
Bayard looks like a search-in-rust PoC.
[+] [-] fulmicoton|6 years ago|reply
[+] [-] wil421|6 years ago|reply
[+] [-] atombender|6 years ago|reply
[1] https://github.com/zhihu/rucene
[+] [-] jhancock|6 years ago|reply
[+] [-] snitch182|6 years ago|reply
[+] [-] manigandham|6 years ago|reply
[+] [-] rapsey|6 years ago|reply
[+] [-] lolive|6 years ago|reply
[+] [-] unknown|6 years ago|reply
[deleted]
[+] [-] Kinnard|6 years ago|reply
[+] [-] fulmicoton|6 years ago|reply
[+] [-] _pgmf|6 years ago|reply
[+] [-] coldtea|6 years ago|reply
Devs write "written in rust" because:
- it's interesting to other developers (which are HN's main audience) to see.
They don't sell some shrink-wrapped software, where the language doesn't matter. Nor some already established package you just download and use as is, like Postgres or Bash, or whatever.
- it matters for those looking for compatible stuff for their own projects (for libraries, reusable packages, etc.)
- it offers certain guarantees other languages do not (e.g. memory safety, native binaries), which can be an important criterion for those looking for a project
- it's important for possible collaborators to know the language (the project being Open Source and everything).
- in a field where a Java based project (Lucene/Elastic Search) dominates, it is important to advertise that you offer a non-Java alternative for people who want to avoid Java/Oracle/etc.
- Rust is also currently on the rise (!= meme), and thus gets new programmers, and new greenfield projects. And since those people are trying the language, they want to advertise their involvement to the community, talk about how they found the experience, etc.
[+] [-] pengstrom|6 years ago|reply
[+] [-] w-j-w|6 years ago|reply
[deleted]
[+] [-] bishala|6 years ago|reply