Hey, creator of the Bible Semantic Search app here. I 100% agree with you. I hacked together this prototype for the purposes of learning the Pinecone API rather than fine-tuning a language model, but I'm still pleasantly surprised by the quality of the search results despite all the shortcomings you mentioned. The results aren't great, but they're decent. I'm surprised how well SBERT works out of the box considering I haven't done any fine-tuning whatsoever, and like you said, it doesn't fully understand the KJV's archaic writing style. Switching to a version that uses more modern English like the NIV is trivial so maybe I'll do that.
chrismorgan|3 years ago
mikeytown2|3 years ago
jordanmoconnor|3 years ago
hushpuppy|3 years ago
One of the things that makes the KJV useful for scholars is that, since it's been the "standard" for hundreds of years, a great deal of other work references its structure. People use it because of this. It has less to do with it being a fabulously accurate translation or whatever. It's just that newer Bibles don't have this wealth of history and documentation referencing them, and many of them are pretty expensive to license, while the KJV is public domain.
For example, "Strong's Exhaustive Concordance". If you get a version of the KJV with "Strong's Numbers" you can cross-reference words and phrases in their original languages (Greek/Hebrew/etc.). This way students can understand some of the original meanings that go into difficult or disputed passages.
Also there is a large number of commentaries of all sorts of different types that reference specific passages.
Besides that, the KJV is mostly valued in Protestant denominations of Christianity. Other Christian traditions, such as the various Eastern Orthodox churches, have different numbers of books or arrange things in different orders. There have been various attempts by past scholars to arrange things in more chronological order, too.
This makes the Bible fairly unique when it comes to literature. Each verse of text can have dozens of different "back links" and "references".
So if somebody searches for the subject "Homosexuality" it will get hits in various commentaries. Those commentaries all directly reference verses in the Bible.
So you could show the version found and why it was selected. That way a reader would be shown "These authors think this verse has to do with homosexuality" and they could click through and find out the justification for this, different translations, what those translations are likely based on, what other Christian sects feel this verse means, and so on and so forth.
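A rough sketch of that back-link idea, in case it helps. Everything here is an assumption for illustration: the `COMMENTARY` entries are made-up placeholders rather than data from SWORD or any real commentary, and `build_backlinks` and `explain` are hypothetical helper names. It only shows how a topic hit could be traced back to the verses that commentaries cite for it, along with which sources cite them.

```python
from collections import defaultdict

# Made-up placeholder entries: (source, topic, verses the source cites for that topic).
COMMENTARY = [
    ("Commentary A", "forgiveness", ["Matthew 6:14", "Ephesians 4:32"]),
    ("Commentary B", "forgiveness", ["Matthew 6:14"]),
    ("Commentary B", "creation", ["Genesis 1:1"]),
]


def build_backlinks(entries):
    """Map topic -> verse -> list of sources that cite the verse for that topic."""
    index = defaultdict(lambda: defaultdict(list))
    for source, topic, verses in entries:
        for verse in verses:
            index[topic][verse].append(source)
    return index


def explain(index, topic):
    """Return (verse, sources) pairs for a topic, most-cited verse first."""
    hits = index.get(topic, {})
    # Sort by citation count (descending), then by verse name for stable output.
    return sorted(hits.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    idx = build_backlinks(COMMENTARY)
    for verse, sources in explain(idx, "forgiveness"):
        print(f"{verse}: cited by {', '.join(sources)}")
```

The "why it was selected" part is just the list of sources attached to each verse, so a UI could link each one to the commentary passage that makes the connection.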
I don't know if it would be useful for you, but there is a "Sword Project" that collects and cross-references different Bibles and Bible resources as well as tools.
https://www.crosswire.org/sword/index.jsp
Syonyk|3 years ago
ESV or CSB, please...
How hard would it be to train it on the range of common modern translations? It would be an interesting stress test of the models to see how close searches in different translations are - they're theoretically all communicating the same thing, with different styles and emphasis, but I'd expect a lot of that to fall out in the semantic search (at least, if it were working properly).
You could grab a range of English translations ranging from "very literal" to "thought for thought" (The Message applies here, and I'm not even sure it's thought for thought), do various searches, and see what the overlap in results is.
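One way to measure that overlap, as a minimal sketch: take the top-k verse references each translation's index returns for the same query and compute the Jaccard overlap for every pair of translations. The function names and the sample results below are hypothetical placeholders, not output from the actual app; in practice the lists would come from whatever search backend is in use.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections of verse references."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty result sets are treated as identical
    return len(a & b) / len(a | b)


def pairwise_overlap(results_by_translation, k=5):
    """Compare the top-k results of every pair of translations.

    results_by_translation maps a translation name to a ranked list of
    verse references returned for one query.
    """
    names = sorted(results_by_translation)
    return {
        (t1, t2): jaccard(results_by_translation[t1][:k],
                          results_by_translation[t2][:k])
        for t1, t2 in combinations(names, 2)
    }


if __name__ == "__main__":
    # Placeholder results for a single query, not real search output.
    sample = {
        "KJV": ["John 3:16", "Romans 5:8", "1 John 4:8"],
        "WEB": ["John 3:16", "1 John 4:8", "Psalm 136:1"],
    }
    for pair, score in pairwise_overlap(sample, k=3).items():
        print(pair, round(score, 2))
```

Comparing by verse reference rather than by text sidesteps the wording differences between translations, which is the point of the test: if the embeddings capture meaning, the same references should surface regardless of style.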
In any case, very neat project... concept. :/ It appears to have fallen over, all I get is "Please wait..." when I try to access it. Even without my usual web filters interfering. I think.
xhrpost|3 years ago
I haven't personally read much of ESV or CSB, just curious, why the preference for those?
[1]: https://en.wikipedia.org/wiki/Dynamic_and_formal_equivalence
eckza|3 years ago
This is awesome. Thank you for building it.
xhrpost|3 years ago
yjftsjthsd-h|3 years ago
Edit: Actually, better yet: https://crosswire.org/sword/modules/ModDisp.jsp?modType=Bibl... has licensing info for each listed item
gibspaulding|3 years ago
As someone else pointed out, the WEB translation is public domain so it might be a better option.
swasheck|3 years ago
paxcoder|3 years ago
[deleted]