It's built from scratch. I'm doing the crawling and indexing. Look through my comments and you'll find a few outlines of the stack and the index design.
I don't mean this to come across as pressure or a demand, but have you thought about uploading the source code for your search engine?
Something like this has the potential to be used in university courses to teach how to build a search engine and/or teach 'advanced' programming concepts and ideas. It's a real program showing what you need to do to optimize your database and software to work on consumer desktops (even if your specs are higher than what other people would have; 128 GiB for example is quite a lot of RAM for most consumers) and how to handle malicious data that you will come across (for example, link farms).
In addition, I read all the posts on your site that were listed on the page you linked to. To me, those posts would be more useful as an explanation that people can read side by side with the code, rather than as the only way to learn how you implemented your algorithms and search engine.
I guess what I'm saying is, having an explanation in words of the algorithms and code along with the actual code can be a very powerful combination for teaching and learning.
So, again, would it be alright to upload a copy of the source code for people (including myself) to look at? I don't care whether it's released under an open-source license, whether you just put a zip file on your site or make a repository on GitHub, or even whether you ever update the code you release. I (and most likely others) just want to peek at one version of what you wrote to see how something like this works under the hood, if that's alright with you.
Also, I'm not asking you to share the database(s) behind this, especially since they're giant and downloading them would likely use more of your site's traffic than the search engine itself does.
marginalia_nu|4 years ago
Here are my blog entries relating to this:
https://memex.marginalia.nu/topic/astrolabe.gmi
CreCre|4 years ago