Mathematicians prove the triviality of English

[+] johncolanduoni|10 years ago|reply

Click-bait title much? I don't know about anyone else, but when I read that title I thought it would have something to do with syntax or semantics, not a monoid word problem[1]. Also, why did they call it "semigroup with identity" instead of monoid?

[1]: https://en.wikipedia.org/wiki/Word_problem_for_groups

[+] igravious|10 years ago|reply

Not click-baity but seriously lacking in information. The work relates to the relationship of orthography to pronunciation in English. From the title I was expecting some syntactical or grammatical result. At the very least the title should read, "Mathematicians prove the triviality of English pronunciation" but even then that misses the mark, doesn't it?

Not only that, but I think the conclusions are frankly incorrect. I would not say that because LAM=LAMB (when spoken) that B=1. I would say that LA=LA and M=MB. B is not silent elsewhere, only in combination with M so it is false to discount the B (setting it to 1) without contextualising that B as being alongside an M when this happens, if you see what I mean. This to me seems so obvious that I fear I am missing something huge here because I can't think how otherwise they can assert what they are asserting.

Don't like to be harsh or snarky on HN but in this instance...?

[+] CJefferson|10 years ago|reply

With regard to the 'semigroup with identity', it depends what area of research you are in.

Semigroup researchers would naturally talk about many extensions to semigroups, such as 'with identity'.

[+] good_gnu|10 years ago|reply

This is merely a different version of the old linguist joke that "In English, all letters are silent". Leaving this here for the interested: https://en.wikipedia.org/wiki/Silent_English_alphabet

[+] anon4|10 years ago|reply

"Queue is just Q followed by four silent vowels"

[+] mannykannot|10 years ago|reply

Thanks for the link. One of the examples is the 'e' in 'like'. In this case, though it is silent, it affects the pronunciation of the word - of the 'i', specifically.

[+] jameshart|10 years ago|reply

Seems to prove the decided non-triviality of English, really. In fact, it could read as part of a proof that English words can't be read, or written, at all, because essentially it shows that English orthography has basically no rules - all sequences of letters can be pronounced in any way.

[+] andreasvc|10 years ago|reply

Uh that doesn't follow at all. Just means that the rules are a bit more complicated than "this letter is always pronounced that way". Neither does it mean that "all sequences of letters can be pronounced in any way".

Look up Chomsky & Halle's The Sound Pattern of English for an attempt at completely describing English phonology with rules.

[+] nkrisc|10 years ago|reply

Oh, there are rules. There's just little relating them. You must learn them all separately.

[+] phkahler|10 years ago|reply

Proof by contradiction. They are using pronunciation to define algebraic equality. So the minute B=1 and C=1, we can write B=C which is not true under the pronunciation rule. We have a contradiction which indicates the premise being false. In other words, using English pronunciation in that way is wrong. That's good because it seemed pretty stupid on first reading it. Glad my intuition on that was right.

[+] judk|10 years ago|reply

The whole point of the article is to show that the premise is false, in every single case: there is no letter in English that always has an unambiguous pronunciation.

[+] grabcocque|10 years ago|reply

I think this is why you don't let linguists do maths.

[+] andreasvc|10 years ago|reply

Why is that? This was actually mathematicians doing "linguistics".

[+] anunderachiever|10 years ago|reply

That's a joke ... maybe to show how uncritically journalists will publish anything that comes across as "mathematical".

[+] judk|10 years ago|reply

People are misunderstanding "trivial". Here what they proved is that there is no way to encode any of the English pronunciation rules as a function purely of context-free spelling of phonemes in a way that is consistent across the language. The data shows that all spellings must yield identical pronunciations, unless English is inconsistent and/or context-sensitive. Obviously, the latter is true.

[+] dahart|10 years ago|reply

Which people? The writers or the readers? 'Cause the context-free interpretation of the headline suggests the writers intended to make the content of an article that fits the definition of trivial sound profound and mathematical in a way that it isn't.

[+] laotzu|10 years ago|reply

>"By phonemic transformation into visual terms, the alphabet became a universal, abstract, static container of meaningless sounds"

-McLuhan

[+] dvh|10 years ago|reply

But there are at least 6 ways to write "f" in English (for, off, photon, enough, wife, mazeltov).

[+] mannykannot|10 years ago|reply

I don't see the difference between 'for' and 'wife' unless you are considering the 'fe' in the latter to be equivalent to the 'f' in the former. If so, then I wonder if the difference might alternatively be attributed to a silent and possibly 'magic' e (something I just learned about via good_gnu's post: https://en.wikipedia.org/wiki/Silent_e )

[+] elthran|10 years ago|reply

I don't think you can include the mazeltov type there - I cannot think of a native English word that uses that pronunciation. Happy to be proven wrong though.

[+] pessimizer|10 years ago|reply

My favorite page on the internet about the regularity (or lack thereof) of English orthography; oldie but a goodie: Hou tu pranownse Inglish.

http://zompist.com/spell.html

[+] wodenokoto|10 years ago|reply

Well, if we set all letters to equal 1, then what is stopping this from working out?

Haven't they just shown that the product of any combination of 1's is equal to the product of any other combination of ones?

[+] johncolanduoni|10 years ago|reply

The whole point is that they didn't immediately set all letters equal to one. The only relations they added (on top of cancellation) were that identically sounding words are equal within the monoid. From this, they were able to prove that the monoid must be the trivial one, no matter how you try to come up with a multiplication table.
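To make the mechanism concrete, here is a minimal Python sketch of how cancellation turns homophone pairs into "letter = 1" results. It is not the article's actual proof or word list: the function names `cancel` and `trivial_letters` are made up for illustration, and the pairs are just a handful of homophones (LAM/LAMB and AISLE/ISLE are mentioned in this thread, the other two are my own picks). Like the article, it treats letters as context-free symbols.

```python
def cancel(x, y):
    """Apply left and right cancellation: strip the common prefix and suffix."""
    while x and y and x[0] == y[0]:
        x, y = x[1:], y[1:]
    while x and y and x[-1] == y[-1]:
        x, y = x[:-1], y[:-1]
    return x, y


def trivial_letters(pairs):
    """Return the letters forced to equal the identity by the given homophone pairs."""
    identity = set()
    changed = True
    while changed:  # repeat until a full pass finds nothing new
        changed = False
        for a, b in pairs:
            # Letters already known to equal 1 can be dropped from both words.
            a = "".join(c for c in a if c not in identity)
            b = "".join(c for c in b if c not in identity)
            a, b = cancel(a, b)
            # If one side cancels to the empty word and the other to a single
            # letter, that letter must equal 1.
            for short, long_ in ((a, b), (b, a)):
                if short == "" and len(long_) == 1 and long_ not in identity:
                    identity.add(long_)
                    changed = True
    return identity


# Illustrative pairs only (assumed to be homophones for the sake of the sketch).
pairs = [("know", "no"), ("lam", "lamb"), ("aisle", "isle"), ("knight", "night")]
print(sorted(trivial_letters(pairs)))
```

Note that KNOW/NO yields nothing on the first pass; it only gives W = 1 once KNIGHT/NIGHT has established K = 1, which is why the loop repeats.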

[+] chriswarbo|10 years ago|reply

> Well, if we set all letters to equal 1, then what is stopping this from working out?

> Haven't they just shown that the product of any combination of 1's is equal to the product of any other combination of ones?

tl;dr they weren't trying to find a solution, they were trying to characterise the behaviour of all possible solutions; but it turns out that there are no "non-trivial" solutions (i.e. setting all letters to equal 1 is the only way to solve it).

Longer version:

Technically, yes. However, that's not really the goal of this kind of algebra.

In school, we only tend to do algebra where all of the constants and variables are numbers; e.g. 2 * x = 4. Our goal was usually to find a particular value for x which is consistent with the equations, e.g. "solve for x" to get x = 2. Your solution of "set all letters to equal 1" is a perfectly valid way of achieving this kind of goal.

However, for the kind of algebra this article is about, we don't restrict ourselves to working with numbers. Instead, we explicitly avoid talking about "concrete" representations at all. We only focus on the equations we've been given. Some sets of equations are so common that they're given names, like "group (laws)", "semigroup (laws)", "field (laws)", etc.

The quote in the article tells us what the equations are:

> Regard English as a left-cancellative and right-cancellative multiplicative semigroup with identity, i.e. obeying the relations XY=ZY or YZ=YX implies X=Z, and having an element “1” such that 1X=X1=X.

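Lots of things satisfy the semigroup-with-identity part of these equations. Some HN-relevant examples (note that not all of them are also cancellative; the Boolean and set examples are not, since e.g. True OR True = True OR False):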

- Positive integers, where "1" is the number one and multiplication (written as juxtaposition "xy") is integer multiplication. Notice that we can't allow zero, since 1 * 0 = 2 * 0 does not imply that 1 = 2.

- Integers, where "1" is the number zero and multiplication is integer addition.

- Booleans, where "1" is False and multiplication is OR.

- Booleans, where "1" is True and multiplication is AND.

- Sets, where "1" is the empty set and multiplication is set union.

- Lists, where "1" is the empty list and multiplication is concatenation.

- Functions, where "1" is the identity function (i.e. "identity = function(arg) { return arg; }") and multiplication is function composition (i.e. "compose(x, y) = function(arg) { return x(y(arg)); }")

- Commands, where "1" is the no-op command (i.e. it performs no actions) and multiplication is sequencing (i.e. "xy = x; y")
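A few of the examples above can be spot-checked in Python. This is only a sketch over small hand-picked samples, so passing is evidence rather than proof, and the helper names `is_monoid` and `is_cancellative` are made up for illustration. It checks the identity and associativity laws separately from cancellation, because on these samples the two do not always go together: booleans under OR and sets under union pass the first check but fail the second.

```python
from itertools import product


def is_monoid(elems, op, one):
    """Check the identity and associativity laws on a finite sample."""
    identity_ok = all(op(one, x) == x == op(x, one) for x in elems)
    assoc_ok = all(
        op(op(x, y), z) == op(x, op(y, z))
        for x, y, z in product(elems, repeat=3)
    )
    return identity_ok and assoc_ok


def is_cancellative(elems, op):
    """Check XY=ZY => X=Z and YX=YZ => X=Z on a finite sample."""
    for x, y, z in product(elems, repeat=3):
        if x != z and (op(x, y) == op(z, y) or op(y, x) == op(y, z)):
            return False
    return True


# (name, sample elements, operation, identity element)
examples = [
    ("positive ints, *", [1, 2, 3, 5], lambda a, b: a * b, 1),
    ("ints, +", [-2, 0, 1, 3], lambda a, b: a + b, 0),
    ("bools, OR", [False, True], lambda a, b: a or b, False),
    ("lists, concat", [(), (1,), (2,), (1, 2)], lambda a, b: a + b, ()),
    ("sets, union", [frozenset(), frozenset({1}), frozenset({1, 2})],
     lambda a, b: a | b, frozenset()),
]

for name, elems, op, one in examples:
    print(name, is_monoid(elems, op, one), is_cancellative(elems, op))
```

Lists are modelled as tuples and sets as frozensets here purely so that equality comparisons behave predictably.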

By focusing on the equations and ignoring any particular representation, our results will apply to all representations; making this "universal algebra" a very powerful method.

What these mathematicians have done is impose a whole load of extra equations on top of the semigroup laws, of the form "AISLE = ISLE", etc. They've then shown that this large set of equations is equivalent to the single equation "x = 1".
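As a worked instance of how one such extra equation collapses a letter, using only the identity and cancellation laws quoted above (this is my reading of the mechanism, not a quote from the article):

```latex
\begin{align*}
\mathrm{A}\cdot\mathrm{ISLE} &= \mathrm{ISLE}
  && \text{(imposed equation: the two words are homophones)}\\
\mathrm{A}\cdot\mathrm{ISLE} &= 1\cdot\mathrm{ISLE}
  && \text{(identity law, } 1X = X\text{)}\\
\mathrm{A} &= 1
  && \text{(right cancellation, } XY = ZY \Rightarrow X = Z\text{)}
\end{align*}
```

Repeating this kind of step with enough homophone pairs is what drives every letter to 1.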

In other words, they've shown that your solution (AKA the "trivial" solution) is the only solution. In other words, by imposing these extra rules, we've gone from a relatively interesting system which can describe functions, commands, lists, etc. to a relatively boring one, where representations might include:

- The set {one}, where multiplication is integer multiplication.

- The set {zero}, where multiplication is integer addition.

- The set {False}, where multiplication is OR.

- The set {True}, where multiplication is AND.

- The set {{}}, where multiplication is set union.

- The set {[]}, where multiplication is concatenation.

- The set {identity}, where multiplication is function composition.

- The set {no-op}, where multiplication is sequencing.

[+] raverbashing|10 years ago|reply

What they proved is that, for certain words, all letters can work as a multiplicative identity element (which, if we were talking about numbers, would be 1).

Or, https://en.wikipedia.org/wiki/Ghoti

[+] eru|10 years ago|reply

This wouldn't work in eg German, where there's no way to cancel out pronunciation like that.

[+] nickpsecurity|10 years ago|reply

The title is ironic for me given that I've seen mathematical types push formal specification of software (eg Z, VDM) for over a decade straight with a consistent, English-related justification. They say that English is too imprecise and ambiguous to be sure you can understand what specs mean. It was true in practice enough that a combined formal (math) and informal (English) specification became a requirement for any correctness and security argument for highly assured systems.

And now some are saying English is trivial. Haha.

[+] poelzi|10 years ago|reply

This wouldn't happen with lojban ;)

[+] andreasvc|10 years ago|reply

No but then again there's barely anyone to talk Lojban with.

[+] valine|10 years ago|reply

All it really proves is that English spelling rules are ridiculously inconsistent. It almost seems like a strange cousin of numerology. Where numerology applies significance to arbitrary numbers, this applies significance to arbitrarily spelled English words.

[+] kefka|10 years ago|reply

Wow. And here I thought they were going to discuss vectorization of the English language, akin to how Word2Vec does it.

Instead it's first-grader-style writing with fake subtraction of letters and such. That's a tremendous letdown.

[+] tempodox|10 years ago|reply

That English is trivial is nothing new. That the average US-ian doesn't even know their own language should be more surprising (but isn't). Mark Twain still has the best rant about it, and he knew English.

[+] r-w|10 years ago|reply

Seems kind of obvious. When you can set so many products equal to each other, what could each of the terms equal but one?

62 comments