top | item 32412803

The semantic web is dead – Long live the semantic web

254 points| LukeEF | 3 years ago |github.com

130 comments

[+] cyocum|3 years ago|reply
The author of this post mentions the Humanities at the end of their post and TerminusDB. I work on a Humanities based project which uses the Semantic Web (https://github.com/cyocum/irish-gen) and I have looked at TerminusDB a couple of times.

The main factor in my choice of technologies for my project was the ability to reason data from other data. OWL was the defining solution for my project. This is mainly because I am only one person so I needed the computer to extrapolate data that was logically implied but I would be forced to encode by hand otherwise. OWL actually allowed my project to be tractable for a single person (or a couple of people) to work on.

The author brings up several points that I have also run into myself. The Open World Assumption makes things difficult to reason about and makes understanding what is meant by a URL hard. Another problem I have run into is that debugging OWL is a nightmare. I have no way to hold the reasoner to account, so when I run a SPARQL query I cannot tell whether what is presented is sane. I cannot ask the reasoner "how did you come up with this inference?" and have it tell me. That means that if I run a query, I must go back to the manuscript (MS) sources to double check that something has not gone wrong and fix the database if it has.
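A toy illustration of the missing "explain yourself" capability: a short Python sketch (all class and individual names are invented, not from the actual project) that forward-chains rdfs:subClassOf inferences and records, for each derived triple, the rule and the premises that produced it. Real OWL reasoners do far more than this, but typically without exposing such a trail.

```python
# Toy forward-chainer that keeps provenance for each derived triple,
# sketching the kind of "how did you infer this?" support the comment
# wishes reasoners had. Names like "Niall" and "King" are invented.
def chase_subclass(facts):
    """Derive rdf:type triples via rdfs:subClassOf, with provenance."""
    derived = {}            # derived triple -> (rule name, premises)
    triples = set(facts)
    changed = True
    while changed:          # repeat until no new triples appear
        changed = False
        for (s, p, o) in list(triples):
            if p != "rdf:type":
                continue
            for (c, p2, sup) in list(triples):
                if p2 == "rdfs:subClassOf" and c == o:
                    new = (s, "rdf:type", sup)
                    if new not in triples:
                        triples.add(new)
                        derived[new] = ("subClassOf",
                                        [(s, p, o), (c, p2, sup)])
                        changed = True
    return triples, derived

facts = [
    ("Niall", "rdf:type", "King"),
    ("King", "rdfs:subClassOf", "Noble"),
    ("Noble", "rdfs:subClassOf", "Person"),
]
triples, why = chase_subclass(facts)
t = ("Niall", "rdf:type", "Person")
print(t in triples)   # the inferred triple is present
print(why[t])         # and we can ask which rule and premises produced it
```

With provenance attached, a bad inference can be traced back to the exact asserted triples responsible, which is exactly the debugging step the black-box reasoner denies you.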

Another problem the author discusses is what I call "Academic Abandonware": there are things out there, but only the academic who worked on them knows how to make them work. The documentation is usually non-existent and trying to figure things out can take a lot of precious time.

I will probably have another look at TerminusDB in due course but it will need to have a reasoner as powerful as the OWL ones and an ease of use factor to entice me to shift my entire project at this point.

[+] closewith|3 years ago|reply
> I work on a Humanities based project which uses the Semantic Web (https://github.com/cyocum/irish-gen) and I have looked at TerminusDB a couple of times.

I had never come across anything like this before, but this is a wonderful project.

[+] zozbot234|3 years ago|reply
"Reasoning" capability can be added to any conventional database via the use of views, and sometimes custom indexes. The real problem is that it's computationally expensive for non-trivial cases.
[+] lmeyerov|3 years ago|reply
Very cool topic... and not the article I was expecting!

I actively work with teams making sense of their massive global supply chains, manufacturing processes, sprawling IT/IoT infra behavior, etc., and I personally bailed from RDF to Bayesian models ~15 years ago... so I'm coming from a pretty different perspective:

* The killer apps for the semantic web were historically paired with painfully manual taxonomization efforts. In industry, that's made RDF and friends useful... but mostly in specific niches like the above, and coming alongside pricey ontology experts. That's why I initially bailed years ago: outside of these important but niche domains, Google search is way more automatic, general, and easy to use!

* Except now the tables have turned: Knowledge graphs for grounding AI. We're seeing a lot of projects where the idea is transformer/gnn/... <> knowledge graph. The publicly visible camp is folks sitting on curated systems like Wikidata and OSM, which have a nice back-and-forth. IMO the bigger iceberg is from AI tools getting easier colliding with companies having massive internal curated knowledge bases. I've been seeing them go the knowledge graph <> AI route for areas like chemicals, people/companies/locations, equipment, ... . It's not easy to get teams to talk about it, but this stuff is going on all the way from big tech co's (Google, Uber, ...) to otherwise stodgy megacorps (chemicals, manufacturing, ..).

We're more on the viz (JS, GPU) + AI (GNN) side of these projects, and for use cases like the above + cyber/fraud/misinfo. If you're into it, we're definitely hiring; it's an important time for these problems.

[+] strangattractor|3 years ago|reply
Generally agree. There is a lot of discussion concerning the technical difficulties, RDF flaws and roadblocks, but little acknowledgement of other, non-technical impracticalities. Making something technically feasible does not ensure adoption. Changing a bunch of code over time will always be preferable to redefining ontologies and reprocessing the data.
[+] bawolff|3 years ago|reply
Funnily enough, the "why the semantic web is good" section is the section that actually identifies why it failed.

We are going to have an ultra flexible data model that everyone can just participate in?

That never works. Protocols work by restricting possibilities, not allowing everything. The more possibilities you allow, the more room for subtle incompatibilities and the more effort you have to spend massaging everything into compatibility.

[+] ggleason|3 years ago|reply
That's discussed in the article though. The open world assumption is untenable. Having shareable, interoperable schemata that can refer to each other safely would be a godsend, however. And that's what is currently very hard, but needn't be.
[+] jerf|3 years ago|reply
The reason why the semantic web failed is even more fundamental: you can't get everyone to agree on one schema. Period. Even if everyone is motivated to, they can't agree, and if there is even a hint of a reason to try to distinguish oneself or strategically fail to label data or label it incorrectly, it becomes even more impossible.

(I mean, the "semantic web" has foundered so completely and utterly on the problem of even barely working at all that it has hardly had to face up to the simplest spam attacks of the early 2000s, and it's not even remotely capable of playing in the 2022 space.)

Agreement here includes not just abstract agreement in a meeting about what a schema is, but complete agreement when the rubber hits the road such that one can rely on the data coming from multiple providers as if they all came from one.

Nothing else matters. It doesn't matter what the serialization of the schema that can't exist is. It doesn't matter what inference you can do on the data that doesn't exist. It doesn't matter what constraints the schema that can't exist specifies. None of that matters.

Next in line would be the economic impracticality of expecting everyone to label their data out of the goodness of their hearts with this perfectly-agreed-upon schema, but the Semantic Web can't even get far enough for this to be its biggest problem!

Semantic web is a whole bunch of clouds and wishes and dreams built on a foundation that not only does not exist, but can not exist. If you want to rehabilitate it, go get people to agree (even in principle!) on a single schema. You won't rehabilitate it. But you'll understand what I'm saying a lot more. And you'll get to save all the time you were planning on spending building up the higher levels.

[+] leoxv|3 years ago|reply
Wikidata is already providing a nearly globally accepted store of concept IDs. Wikipedia adds a lot of depth to this knowledge graph too.

Schema.org has become very popular and Google is backing this project. Wordpress and others are already using it.

Governments are requiring not just "open data", but also "open linked-data" (which can then be ingested into a SPARQL engine), because they want this data to be usable across organizations.

The financial industry is moving to the FIBO ontology, and on and on...

[+] lyxsus|3 years ago|reply
There're a lot of wrong perspectives on the topic in this thread, but this one I like the most. When someone starts to talk about "agreeing on a single schema/ontology", it's a solid indicator that that someone needs to get back to rtfm (which, I agree, is a bit too cryptic).

The point here is that in the semantic web there are supposed to be lots and lots of different ontologies/schemas by design, often describing the same data. The SW spec stack has many well-separated layers. To address that problem, OWL/RDFS were created.

[+] jeremiem|3 years ago|reply
I don't think it's impossible to agree on one schema but it's very expensive to do so and requires tools from the study of philosophy.

While I don't work in the domain, the ontologies in the OBO Foundry and all the ones deriving from Basic Formal Ontology[0] have some level of compatibility that makes their integration possible. Still far from "one schema to rule them all", but it shows that agreement can be achieved.

There are other initiatives that I'm aware of that could also qualify as a step in the right direction: "e-Government Core Vocabularies" and the "European Materials Modelling Ontology".

I hope and want to believe that, sooner than later, we will have formalized definitions for most practical aspects of our lives.

[0] https://basic-formal-ontology.org/users.html

[+] leoxv|3 years ago|reply
I'm building a front end app for Wikipedia & Wikidata called Conzept encyclopedia (https://conze.pt) based on semantic web pillars (SPARQL, URIs, various ontologies, etc.) and loving it so far.

The semantic web is not dead, it's just slowly evolving and growing. Last week I implemented JSON-LD (RDF embedded in HTML with a schema.org ontology); it was super easy, and now any HTTP client can comprehend what any page is about automatically.

See https://twitter.com/conzept__ for many examples of what Conzept can already do. You won't see many other apps do these things, and certainly not in a non-semantic-web way!

The future of the semantic web is in: much more open data, good schemas and ontologies for various domains, better web extensions understanding JSON-LD, more SPARQL-enabled tools, better and more lightweight/accessible NLP/AI/vector compute (preferably embedded in the client also), dynamic computing using category theory foundations (highly interactive and dynamic code paths, let the computer write logic for you), ...
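The JSON-LD mentioned above really is just plain JSON plus an `@context` that maps keys onto a vocabulary such as schema.org. A minimal sketch (the headline and author values are invented for illustration):

```python
import json

# A minimal JSON-LD document: ordinary JSON whose @context ties the
# keys to the schema.org vocabulary. Values here are made up.
doc = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "The semantic web is dead - long live the semantic web",
    "author": {"@type": "Person", "name": "Example Author"},
}

# Embedded in a page, this would sit inside a
# <script type="application/ld+json"> element in the HTML <head>.
print(json.dumps(doc, indent=2))
```

Because the payload is valid JSON on its own, ordinary clients can ignore the `@`-keys while RDF-aware clients can expand them into triples.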

[+] lolive|3 years ago|reply
The future of the semantic web is in big companies. Where handling data exchanges at scale is becoming a massive waste of time, resources and sanity.
[+] strangattractor|3 years ago|reply
Having worked for an Academic Publisher that had intense interest in this, I finally came to the following conclusions as to why this is DOA.

1. Producers of content are unwilling to pay for it (and neither are consumers, BTW).

2. It is impossible to predict how the ontology will change over time, so going back and reclassifying documents to make them useful is expensive.

3. Most pieces of info have a shelf life, so it is not worth the expense of doing it.

4. Search is good enough and much easier.

5. Much of what is published is incorrect or partial.

In the end I decided this is akin to discussing why everybody should use Lisp to program, but the world has a different opinion.

[+] ternaryoperator|3 years ago|reply
Not sure I understand the comparison with Lisp. You list five reasons against the semantic web that mostly involve cost.
[+] pornel|3 years ago|reply
Semantic Web lost itself in fine details of machine-readable formats, but never solved the problem of getting correctly marked up data from humans.

In the current web and apps people mostly produce information for other people, and this can work even with plain text. Documents may lack semantic markup, or may even have invalid markup, and have totally incorrect invisible metadata, and still be perfectly usable for humans reading them. This is a systemic problem, and won't get better by inventing a nicer RDF syntax.

In language translation, attempts at building rigid formal grammar-based models have failed, and throwing lots of text at machine learning has succeeded. The Semantic Web is most likely doomed in the same way. GPT-3 already seems to have more awareness of the world than anything you can scrape from any semantic database.

[+] mwelt|3 years ago|reply
Comparing "rigid formal grammar-based models" (whatever that might actually mean for now) to machine learning is like comparing apples to bananas. The former is a rigorous syntactical formalization, aimed at being readable by machines and humans alike. The latter is a learned interpolation of a probability distribution function. I do not see a single way to compare these two "things".

Nevertheless, I may guess what you are actually trying to say: annotating data by hand (the syntax is completely irrelevant) is inferior to annotating data by machine learning. And this claim is at least debatable and domain-dependent. There are domains where even a 3% false-positive rate translates to "death of a human being in 3 out of 100 identified cases", and there are domains where it's too much work to formalize all the bits and pieces of the domain and extracting (i.e. learning) knowledge is a feasible endeavor. I have experience in both fields, and I dare say that extracting concepts and relations out of text in a way that can be further processed and used for some kind of decision process is way more complicated than you might imagine, and GPT-3 et al. do not achieve that.
[+] pphysch|3 years ago|reply
Sure, but there are still a lot of decisions being made behind the curtain, when it comes to producing a model like GPT-3. How was the training data ontologized? Where did it come from? To some extent, these are the same problems facing manual curation.
[+] iamwil|3 years ago|reply
On our podcast, The Technium, we covered the Semantic Web as a retro-future episode [0]. It was a neat trip back to the early 2000s. It wasn't a bad idea, per se, but it depended on humans doing the right thing for markup and on the assumption that classifying things is easy. Turns out neither is true. In addition, the complexity of the spec really didn't help those that wanted to adopt its practices. However, there are bits and pieces of good ideas in there, and some of them live on in the web today. You just have to dig a little to see them. Metadata on websites for fb/twitter/google cards, RDF triples for database storage in Datomic, and knowledge-base-powered searches all come to mind.

[0] https://youtu.be/bjn5jSemPws

[+] staplung|3 years ago|reply
Clay Shirky nailed it in 2003:

https://deathray.us/no_crawl/others/semantic-web.html

I'll just excerpt the conclusion:

``` The systems that have succeeded at scale have made simple implementation the core virtue, up the stack from Ethernet over Token Ring to the web over gopher and WAIS. The most widely adopted digital descriptor in history, the URL, regards semantics as a side conversation between consenting adults, and makes no requirements in this regard whatsoever: sports.yahoo.com/nfl/ is a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL itself doesn’t have to mean anything is essential – the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the Semantic Web. The Semantic Web’s philosophical argument – the world should make more sense than it does – is hard to argue with. The Semantic Web, with its neat ontologies and its syllogistic logic, is a nice vision. However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.

Much of the proposed value of the Semantic Web is coming, but it is not coming because of the Semantic Web. The amount of meta-data we generate is increasing dramatically, and it is being exposed for consumption by machines as well as, or instead of, people. But it is being designed a bit at a time, out of self-interest and without regard for global ontology. It is also being adopted piecemeal, and it will bring with it all the incompatibilities and complexities that implies. There are significant disadvantages to this process relative to the shining vision of the Semantic Web, but the big advantage of this bottom-up design and adoption is that it is actually working now. ```

[+] leoxv|3 years ago|reply
"However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world" ... Wikipedia, Wikidata, OpenStreetMaps, Archive.org, ORCID science-journal stores, and the thousands of other open linked-data platforms are proofing Clay wrong each day. He has not been relevant for a long time IMHO. Semweb > tag-taxonomies.
[+] asplake|3 years ago|reply
Seems to miss the obvious double whammy:

1) Because it burdens producers to no obvious benefit, a problem forever

2) Because progress over time in language processing makes it less and less necessary

[+] jll29|3 years ago|reply
If natural language processing (NLP) can indeed understand unstructured text, then, according to (2), the "Semantic Web" is not needed, except perhaps for caching NLP outputs in machine-readable form.

(1) is more fundamental: a lot of value-add annotation (in RDF or other forms) would be valuable, but because there is work involved, those that have it don't give it away for free. This part was not sufficiently addressed in the OP: the Incentive Problem. Either there needs to be a way for people to pay for the value-add metadata, or there has to be some other benefit to the provider for giving it away. Most technical articles focus on the format, or on some specific ontologies (typically without an application).

A third issue is trust. In Berners-Lee's original paper, trust is shown as an extra box, suggesting it is a component. That's a grave misunderstanding: trust is a property of the whole system/ecosystem; you can't just take a prototype and say "now let's add a trust module to it!" In the absence of trust guarantees, who ensures that the metadata that does exist is correct? It may just be spam (annotation spam may be the counterpart of Web spam in the unstructured world).

No Semantic Web until the Incentive Problem and the Trust Problem are solved.

[+] leoxv|3 years ago|reply
1)

- SPARQL is _a lot better_ than the many different forms of SQL.

- Adding some JSON-LD can be done through simple JSON metadata, something people using WordPress are already able to do. All this will be more and more automated.

- The benefit is ontological cohesion across the whole web. Please take a look at the https://conze.pt project and see what this can bring you. The benefit is huge. Simple integration with many different stores of information in a semantically precise way.

2) AI/NLP is never completely precise and requires huge resources (which require centralization). The basics of the semantic web will be based on RDF (whether created through some AI or not), SPARQL, and ontologies, extended/improved by AI/NLP. It's a combination of the two that is already being used for Wikipedia and Wikidata search results.

[+] oofbey|3 years ago|reply
Exactly.

A refinement on your second point is that the groups who would have benefited the most from the semantic web were the Googles of the world, but they were also the ones who needed it the least, because they were well ahead of everybody else at building the NLP to extract structure from the existing www. In fact, the existence of the semantic web would have eroded their key advantage. So the ones in a position to encourage this and make it happen didn't want it at all. So it was always DOA.

[+] boxslof|3 years ago|reply
keeping it short because on phone.

working for a company, 100 % semantic web, integrating many, many parties for many years now, all of it rdf.

- you get used to turtle. one file can describe your db and be ingested as such. handy.

- interoperability is really possible. (distributed apps)

- hardest part is getting everyone to agree on the model, but often these discussions are more about resolving ambiguities surrounding the business than about translating it to the model. (it gets things sharp)

- agree on a minimum model, open world means you can extend in your app

- don't overthink your owl descriptions

- no, please no reasoners. data is never perfect.

- tooling is there

- triple stores are not the fastest

pls, not another standard to fix the semantic web. Everything is there. More maturity in tooling might be welcome, but this is a function of the number of people using it.

[+] 0xbadcafebee|3 years ago|reply
Very well written introduction to some of the problems with semantic web dev.

Personally I think the reason it died was that there were no obvious commercial applications. There are of course commercial applications, but not in a way that people realize what they're using is semantic web. Of all the 'note keepers' and 'knowledge bases' out there, none of them are semantic web. Thus it has languished in academia and a few niche industries, in backend products or as hidden layers, e.g. Wikipedia. Because there wasn't something we could stare at and go "I am using the semantic web right now", there was no hype, and no hype means no development.

[+] k8si|3 years ago|reply
Very hard to make a business case for the reasons you mentioned, plus the costs are very front-loaded because ontologies are so damn hard to build, even for very well-contained problems. Without a clear payoff, why bother?
[+] PaulHoule|3 years ago|reply
Semweb people got burned out by the stress of making new standards which means that standards haven't been updated. We've needed a SPARQL 2 for a long time but we're never going to get it.

One thing I find interesting is that description logics (OWL) seem to have stayed a backwater in a time when progress in SAT and SMT solvers has been explosive.

[+] jrochkind1|3 years ago|reply
> Semweb people got burned out by the stress of making new standards which means that standards haven't been updated.

True. But also, web standards seem to have mostly been abandoned/died beyond just the semantic web. I am not sure how to explain it, but there was a golden age of making interoperable higher-level data and protocol standards, and... it's over. There's much less standards-making going on. It's not just SPARQL that could use a new version but has no standards-making activity going on.

I can't totally explain it, and would love to read someone who thinks they can.

[+] zozbot234|3 years ago|reply
A recent paper connects SHACL (mentioned in OP) to description logic and OWL: https://arxiv.org/abs/2108.06096 . This is a surprising link which seems to have been missed by SemWeb practitioners when SHACL was proposed.
[+] ggleason|3 years ago|reply
That's a very good point re SAT/SMT. F* (https://www.fstar-lang.org/) has done truly amazing things by making use of them, and it's great to be able to get sophisticated correctness checks while doing basically none of the work.

I'm going to have to go away and think about how one could effectively leverage this in a data setting, but I'd love to hear ideas.

[+] blablabla123|3 years ago|reply
Wikidata is quite usable though, with SPARQL through REST. To me the biggest problem seems to be lack of documentation, but for small-scale experiments interesting stuff can be done with it (with enough caching, probably with SQL). Running my own triple store seems like a lot of work though, starting with just choosing which one to use.
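Hitting Wikidata's SPARQL endpoint over REST is a one-liner in most languages; a hedged sketch with the Python standard library (the query, capitals of countries, is illustrative, and the actual network fetch is left commented out so nothing here depends on connectivity):

```python
import urllib.parse

# Sketch of querying Wikidata's public SPARQL endpoint over HTTP GET.
# P31 = "instance of", Q6256 = "country", P36 = "capital".
ENDPOINT = "https://query.wikidata.org/sparql"
query = """
SELECT ?country ?capital WHERE {
  ?country wdt:P31 wd:Q6256 ;   # instance of: country
           wdt:P36 ?capital .  # capital
} LIMIT 3
"""
# Build the GET URL; format=json asks for JSON result bindings.
url = ENDPOINT + "?" + urllib.parse.urlencode(
    {"query": query, "format": "json"})

# To actually run it (network required):
# import urllib.request
# data = urllib.request.urlopen(url).read()
print(url[:60])
```

Caching the JSON responses locally, as suggested above, keeps you well within the endpoint's rate limits for small experiments.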
[+] throwaway0asd|3 years ago|reply
Semantic web is data science for the browser. Most people can’t even figure out how to architect HTML/JS without a colossal tool to do it for them, so figuring out data science architecture in the browser is a huge ask.
[+] z3t4|3 years ago|reply
There are two camps,

one that thinks you should use tools to generate HTML/JS, and those tools should generate strict XML and any extra semantic data. The problem is that the actual users of these tools either don't care about, or don't know about, semantic HTML or semantic data.

Then the other camp that thinks HTML should be written by hand, which makes it small, simple and semantic (layout and design separated into CSS), without any div elements. Hand-writing the semantic data in addition to the semantic HTML becomes too burdensome.

[+] asiachick|3 years ago|reply
I only skimmed the article so maybe I missed it, but at a glance it seemed to completely miss the biggest issue: people will intentionally mislabel things. If chocolate is trending, people will add "chocolate" to their tags for bitcoin.

You can see this all over the net. One example is the tags on SoundCloud.

Another issue is agreeing on categories, say women vs men or male vs female. For the purpose of identity the fluidity makes sense, but less so for search. To put it another way: if I search for brunettes, I'd better not see any blondes. If I search for dogs, I'd better not see any cats. And what to do about ambiguous stuff? What's a sandwich? A hamburger? A hotdog? A gyro? A taco?

[+] bpiche|3 years ago|reply
Looks like it's been temporarily suspended, but worth mentioning: The Cambridge Semantic Web meetup, which I attended frequently around 2010-2013. It was cofounded by Tim Berners-Lee, and I got to meet him there a couple times. In fact, I think its earliest iteration was Berners-Lee and Aaron Swartz.

Met once a month in the STAR room at MIT. The best part was staying after to schmooze and drink with older programmers at the Stata Center bar down the hall from the STAR room. What a cool building, the Stata Center! And what cool topics we would discuss every week. Since Cambridge has so many pharma companies, a lot of the talks were regarding practical ontologies for pharmacology.

edit, a spandrel: Isn't the W3C based out of MIT? And Swartz and Berners-Lee were in Boston at the same time.

https://www.meetup.com/The-Cambridge-Semantic-Web-Meetup-Gro...

[+] gibsonf1|3 years ago|reply
The semantic web has been reintroduced as part of "Solid" by Tim Berners-Lee (and Inrupt) and is growing very fast: https://solidproject.org/

The opposite of dead in fact.

[+] rch|3 years ago|reply
JSON-LD has some traction, but the author seems to prefer a slightly different syntax.

I don't see a material difference, but I'm curious to know what others think.

-- https://w3c.github.io/json-ld-bp/#contexts

-- https://w3c.github.io/json-ld-bp/#example-example-typed-rela...

-- https://terminusdb.com/docs/index/terminusx-db/reference-gui...

[+] ggleason|3 years ago|reply
Well, in one sense they are directly interconvertible. The documents in TerminusDB are elaborated to JSON-LD internally during type-checking and inference.

However, it's not just a question of whether one can be made into the other. The use of contexts is very cumbersome, since you need to specify different contexts at different properties for different types. It makes far more sense to simply have a schema and perform the elaboration from there. Plus, without an infrastructure for keys, IDs become extremely cumbersome. So beyond just type decorations on the leaves, it's the difference between:

  {
    "general_variables": {
      "alternative_name": ["Sadozai Kingdom", "Last Afghan Empire" ],
      "language":"latin"
    },
    "name":"AfDurrn",
    "social_complexity_variables": {
      "hierarchical_complexity": {"admin_levels":"five"},
      "information": {"articles":"present"}
    },
    "warfare_variables": {
      "military_technologies": {
        "atlatl":"present",
        "battle_axes":"present",
        "breastplates":"present"
      }
    }
  }
And

  {
    "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11",
    "@type":"Polity",
    "general_variables": {
      "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/general_variables/GeneralVariables/e4360ee3766c2863f06a34ffcdd9869d41b03d04c6f6af5f94b0a14a47e8e704",
      "@type":"GeneralVariables",
      "alternative_name": ["Last Afghan Empire", "Sadozai Kingdom" ],
      "language":"latin"
    },
    "name":"AfDurrn",
    "social_complexity_variables": {
      "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c",
      "@type":"SocialComplexityVariables",
      "hierarchical_complexity": {
        "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c/hierarchical_complexity/HierarchicalComplexity/d6a772c5c6919cc511a24ab89f908032aa32b1e3e939d2e0c32044b3a5d9151d",
        "@type":"HierarchicalComplexity",
        "admin_levels":"five"
      },
      "information": {
        "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c/information/Information/2f557c1016552f30b8d8bb1bdd9a8584791dd06d32f25bded86a7eb59788ea7f",
        "@type":"Information",
        "articles":"present"
      }
    },
    "warfare_variables": {
      "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/WarfareVariables/704a2c1854a2fe80616fbea0ef0dcd6ce47f5174529ca191617e42397108c437",
      "@type":"WarfareVariables",
      "military_technologies": {
        "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/WarfareVariables/704a2c1854a2fe80616fbea0ef0dcd6ce47f5174529ca191617e42397108c437/military_technologies/MilitaryTechnologies/80a91b3e5381154387bde4afc66fdd38834de16c671c49c769f5244475cbbb1b",
        "@type":"MilitaryTechnologies",
        "atlatl":"present",
        "battle_axes":"present",
        "breastplates":"present"
      }
    }
  }
[+] neilv|3 years ago|reply
In the late 1990s, I worked on lowercase-semantic Web problems.

I used descriptions like "the Web as distributed machine-accessible knowledgebase".

Some of the problems I identified were already familiar or hinted at from other domains (e.g., getting different parties to use the same terms or ontology, motivating the work involved, the incentive to lie (initially thinking mostly about how marketers stretch the facts about products, though propaganda etc. was also in mind), provenance and trust of information, mitigations of shortcomings, mitigating the mitigations, etc.).

One problem I didn't tackle... I got into distributing computation among huge numbers of humans, and probably stopped thinking about commercial organization incentives. I don't recall at that time asking "what happens if a group of some kind invests lots of effort into a knowledge representation, and some company freeloads off of that, without giving back?". But we had seen examples of that in various aspects of pre-Web Internet and computing. Maybe I was thinking something akin to compilation copyright, or that the same power that generated the value could continue to surprise and outperform hypothetical exploiters. Also, in the late 1990s, every crazy idea without traditional business merit was getting funded, and it was all about usefulness (or stickiness) and what potential/inspiration you could show.

[+] tconfrey|3 years ago|reply
I think the general message here is that complex and complete architectures tend to fail in favor of simpler solutions that people can understand and use to get things done in the here and now.

It's interesting to me that the recent uptick in the personal knowledge management space (aka tools for thought)[0] is all around the bi-directional graph, which is basically a 2-tuple simplified version of the RDF 3-tuple. You lose the semantics of a labelled edge, but it's easier for people to understand.

[0] See Roam Research, Obsidian, LogSeq, Dendron et al.
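The 2-tuple vs 3-tuple contrast can be sketched in a few lines (node and predicate names are invented for illustration):

```python
# A tools-for-thought link graph stores bare 2-tuples; RDF keeps a
# labelled predicate in the middle. Names here are made up.
links = {("Note:Rome", "Note:Caesar")}           # bi-directional link
triples = {("Rome", "birthplaceOf", "Caesar")}   # RDF-style triple

# Backlinks fall out of the 2-tuple model for free:
backlinks = {(b, a) for (a, b) in links}

# ...but only the triple records *why* the two nodes are related:
labels = {p for (_, p, _) in triples}
print(backlinks, labels)
```

The trade-off in one sentence: the 2-tuple buys effortless linking and backlinking, while the 3-tuple buys a queryable edge label at the cost of asking the author to name the relationship.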