arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
https://qz.com/1145669/googles-true-origin-partly-lies-in-ci...
Enjoy! It's a great story.
(Plus, for those who might not know: DARPA is the US defense research agency, heavily influenced by the needs of the intelligence services. Which is not necessarily bad! It's just good to understand where and how Google originated. And DARPA also funded the creation of the internet itself, for what it's worth.
In Europe, things often go slightly differently. The Web came out of CERN, which is also a project partner of OpenWebSearch.EU. Why? Well, better search can also benefit science, not just end users wanting to find their way or buy something.)
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
Yes, that is an advantage.
Another reason: you can also integrate search results from sources you cannot index yourself, like social media APIs.
You could also mix and match search results from various topic-oriented indices. Whether that is really better than building one unified index is a research question. But we think it is the way to bring index fragments to the edge, with the obvious privacy advantages.
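To make "mix and match" a bit more concrete, here is a minimal sketch of merging ranked lists from two topic-oriented indices. It uses reciprocal rank fusion, which is just one well-known merging heuristic and not necessarily what the project will use. The function name, the `k=60` constant, and the example URLs are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    the scores are summed and documents are sorted by total score.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two topic-oriented indices.
science_hits = ["cern.ch/a", "arxiv.org/b", "example.org/c"]
news_hits = ["example.org/c", "cern.ch/a", "news.example/d"]

merged = reciprocal_rank_fusion([science_hits, news_hits])
print(merged)
```

Documents that appear in both lists float to the top, which is the basic intuition behind combining several indices instead of querying just one.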
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
The project is just starting, so not all of your questions can be answered today, but we will definitely produce an open web index by the end of the first year, with improvements in years two and three.
We will also deliver components for building search engines on top of this index. The project vision is that there will be many different search engines, not just four worldwide. Hoping to lead the way!
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
This is not another Gaia-X; it is an exploratory project, showing a possible way forward and taking the first steps.
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
Google started out of a CIA-funded Stanford project.
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
We like federated search, we like decentralized search, and even P2P search; we are trying to find a good mix, and decided to get started rather than wait! Exciting times.
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
For starters: the objective is to create the index, not the engine. That's quite a different ambition.
We are very aware of the Quaero/Theseus history :-)
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
That cannot be true, as the project has yet to start. But anyone can start a crawler, so you may have encountered other people's software. We wouldn't be so unknowledgeable as to ignore robots.txt ;-)
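For readers unfamiliar with what respecting robots.txt involves, a minimal sketch using Python's standard `urllib.robotparser` follows. This is not the project's crawler; the rules, the `ExampleBot` user agent, the `allowed` helper, and the URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("ExampleBot", "https://example.org/public/page"))
print(allowed("ExampleBot", "https://example.org/private/page"))
```

A polite crawler runs a check like this before every request and skips any URL the site owner has disallowed.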
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
Isn't it lovely?!
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
That's the marketing story. I think it's because they didn't clutter their homepage like AltaVista did.
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
Slovenia, Czech Republic. But yes, I think there was a competing proposal from Italy/Spain. Not enough budget for two projects in this area, unfortunately, as they were good too.
arjenpdevries
|
3 years ago
|
on: EU Open Web Search project kicked off
We will explore that idea in the project; I also think it may help (though it is vulnerable to Web index spam from adversarial parties).
arjenpdevries
|
4 years ago
|
on: Comparing SQLite, DuckDB and Arrow with UN trade data
(All the possible extensions you mention would be beneficial for the solution using DuckDB.)
arjenpdevries
|
4 years ago
|
on: Comparing SQLite, DuckDB and Arrow with UN trade data
Arrow is meant to share data as-is instead of requiring a copy and, often, serialization/deserialization.
(This requires both ends to be able to handle the Arrow representation.)
E.g., it has the potential to speed up query processing in PySpark considerably, because of its Java/Python interoperability.
arjenpdevries
|
4 years ago
|
on: Efficient SQL on Pandas with DuckDB
Great exposition of the benefits of DuckDB to augment Pandas for data analysis.