Sabotaged by Polish orthography

123 points| gbacon | 8 years ago |blog.plover.com

132 comments

[+] idlewords|8 years ago|reply
While we're in this thread: notice that pączki is a plural. Please remember this and don't say 'pierogies'—it's just pierogi.

There's no need to know the singular for pierogi, because no one has ever eaten just one.

In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".

[+] pavel_lishin|8 years ago|reply
But at a certain point, a loan word becomes "adopted" by its language, right? And at that point, it doesn't make sense to carry over the foreign rules for pluralization, etc.

I'm not sure if "pierogi" is common enough yet for "pierogies" to become the valid way to pluralize it in English, but as a Russian speaker, "чипсы" sounds perfectly normal to me.

After all, if I were speaking Russian I wouldn't say "компьютерз" - I would say "компьютеры", using the correct pluralization in the language I'm speaking in.

[+] peteretep|8 years ago|reply

> I promise to continue working to stop

A general campaign to stop non-natively-Anglophone Europeans from using "funny" to mean "lots of fun" would also be appreciated.
[+] Sharlin|8 years ago|reply
> In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".

Finns are great at this as well. Some examples:

  chips - sipsit

  shorts - shortsit

  donut, donuts - donitsi, donitsit

  ribs - ribsit (as in the food)

  wings - wingsit (likewise)
and so on. These are all basically established loans by now.

Then there's the recent abomination that seems to be getting popular as there's no established word for a mobile app yet:

  app, apps - äpsi, äpsit
[+] jonnycomputer|8 years ago|reply
> There's no need to know the singular for pierogi, because no one has ever eaten just one.

That, friend, is hilarious, and so true.

[+] oz|8 years ago|reply
> There's no need to know the singular for pierogi, because no one has ever eaten just one.

Ain't that the truth! My Polish friend also introduced me to 'pierogi rooski'? Apparently, a Russian take on pierogi, with more meat. Do I have the spelling correct?

> In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".

As a Jamaican, where banana chips are like a national snack, it was amusing to see "chipsy bananowe" for sale. Fun times; I plan to go back.

[+] mootothemax|8 years ago|reply
I see your chipsy and raise you tipsy.

Not because there's anything wrong with it, that is.

It just always puts a smile on my face when I walk past a tipsy nail salon with a sign that, to my British ears, euphemistically means being on the path to getting properly drunk.

[+] ggambetta|8 years ago|reply
Similar to Italian plurals - one "panino", many "panini", and "paninis" is not a thing.
[+] owenversteeg|8 years ago|reply
Ok, so out of curiosity I googled "singular for pierogi", and it seems to be "pieróg". But there are a number of (seemingly) Polish people scattered around the Google results saying that this is never used, or incorrect, or refers to only the general form for filled dumplings.

Now I know that in real-world usage it would be unlikely to refer to one, but what if there was a need to? Let's say someone made a statue to pierogi, but due to budgetary problems only one of the several pierogi planned was constructed. A tourist asks you how many pierogi make up the statue. How would you respond? Would you use the singular, like "only one pieróg", or would you rework your response so there was no need for the singular, like "sadly only one of the several pierogi was built"? Which seems more natural?

[+] lkozma|8 years ago|reply
I recently came across some "burekasim" in Israel. The double plural there has a taste of the entire Mediterranean, from Turkish/Greek/Arabic/Ladino/Spanish/Hebrew :) I guess it could be dialed up one more notch if an English-speaking tourist asked for some burekasims.
[+] mjd|8 years ago|reply
Another Polish-speaker who read this article said she was delighted by my use of “ogoneks” instead of “ogoneki”.
[+] szatkus|8 years ago|reply
> In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".

So, like "chipy"? I don't think it's a good idea :)

[+] ScottBurson|8 years ago|reply
You never know. I have an old friend from college who has, in his entire life, eaten exactly one potato chip.
[+] waqf|8 years ago|reply
Don't forget panini!
[+] idlewords|8 years ago|reply
Polish orthography is a disaster partly because of the choice of Latin over Cyrillic. The latter would have been a much better fit for the sounds in the language.

Making matters worse, there are homophones (ż and rz, u and ó, ch and h) that depend on when the word entered the language.

Like so many things wrong with the country, you can comfortably blame the Catholic Church for this orthographic train wreck.

[+] adamzochowski|8 years ago|reply
Polish letters on top of Latin are no worse than German (ä, ö, ü, ß), or French (é, à, è, ù, â, ê, î, ô, û, ë, ï, ü, ÿ, ç). Cyrillic is not the answer, as each Cyrillic country has own variations, own letters. See https://en.wikipedia.org/wiki/List_of_Cyrillic_letters

Regarding homophones, this is not 'when word entered the language' but rather, the change of pronunciation no longer matching the phonetic writing.

The 'ó' vs 'u' is homophone because the sound now is the same. However, historically it wasn't. Additionally, since historically it was different sound, it also had different rules for how it morphed during declension.

Similar situation is with 'ż' and 'rz'. There are even cool words that due to declension we can see where they originated from. For example two declensioned words have same pronunciation: 'każe' and 'karze'. The first comes from root word 'kazać' as in tell people what to do. The second comes from root word 'karać' meaning punish.

I would wager that 'ch' and 'h' is the most recent homophone. Some people still alive were taught to use correct hard 'H' vs soft 'CH' when pronouncing words.

[+] mootothemax|8 years ago|reply
And yet - somewhat hilariously, Polish is one of the easiest languages out there in terms of knowing how to pronounce the written word.

Spend a week or two practising with a pronunciation guide and that's pretty much it.

Reading something out loud for the first time? Native level accuracy 95% of the time, easy.

Compared with the absolute crap shoot which is English or - my nightmare - French, reading Polish out loud is an absolute walk in the park.

[+] spatulon|8 years ago|reply
Who do we blame for Vietnamese adopting the Latin alphabet? That makes Polish seem pretty straightforward.
[+] waqf|8 years ago|reply
Could have done what Serbian (and formerly Serbo-Croat) does: freely added letters to the Latin alphabet to match the required set of phonemes, so that there's a one-to-one correspondence between the Latin orthography and the Cyrillic orthography.

(It doesn't solve the problem of missing letters and missing diacritics when trying to write your language using English letters only — but Serbs, Ukrainians etc. can't write their language correctly using only Russian letters either.)

[+] Mediterraneo10|8 years ago|reply
"The latter would have been a much better fit for the sounds in the language."

The Cyrillic alphabet as originally developed for Old Church Slavonic lacks several sounds of Polish, namely the velarized /l/ (which has now become a semivowel /w/ except in peripheral dialects), and the palatalized affricates /ź/ and /ć/.

If you are a Russian speaker, you might think that Cyrillic could represent palatalized sounds by use of the soft sign, but that is not what the soft sign was actually used for originally. It was meant to represent the front reduced vowel /ĭ/ before the fall of the yers. So, Russian chose to represent its phonology by extending the Cyrillic alphabet in a way that was originally unintended, while Polish chose to represent its phonology by extending the Latin alphabet. How is either of these choices better than the other?

[+] int_19h|8 years ago|reply
Polish orthography is weird (subjective opinion!) mostly because of its weird digraphs, IMO. Czech and Serbian also use Latin, but they are easier to parse.

I'm not a linguist, but looking at the differences between the corresponding alphabet, it feels like Polish one had a stronger German influence. The use of W rather than V stands out in particular (and makes no sense in an alphabet that doesn't use V at all!). But also, Germans love their digraphs and trigraphs.

[+] jonnycomputer|8 years ago|reply
This may be true, but may I suggest two things:

1. Any suitable choice of orthography will become unsuitable eventually. Pronunciation is not static.

2. There is, and has been, quite a bit of dialectal variation within Poland and its diaspora. What may be suitable for one group may not be for another.

[+] idlewords|8 years ago|reply
A small correction to the OP: paczki means packages and not boxes (pudełka). It used to be a much more common sign in Polish neighborhoods in the US before Poland leveled up to the first world.
[+] joering2|8 years ago|reply
... and then of course it depends on how you want to use the object "boxes" (pudełka).

I will throw you some pudełka. I am busy with those pudełkami. There is something wrong with those pudełkom. What's the size of these pudełek?

And it's funny: the newest iOS still doesn't have "ą"

[+] danielam|8 years ago|reply
"Parcels" is a better translation.
[+] mirekrusin|8 years ago|reply
In comments Poles often omit those squiggles, which leads to interesting nuances:

"Laske mi robi." means "to condescend/deign to do something for somebody" (ł) or "to do a blowjob" (l).

[+] nathancahill|8 years ago|reply
Similar to how Spanish will not accent capital letters. The México passport for example says MEXICO on the front.
[+] eesmith|8 years ago|reply
How strong a rule is this? Or is it regional?

http://buscon.rae.es/dpd/srv/search?id=BapzSnotjD6n0vZiTp is from the Diccionario panhispánico de dudas, 2005. It shows several examples of accented capital letters, like LA NACIÓN in the headline of a newspaper, and the phrase "ESTÁ PROHIBIDO FUMAR DENTRO DE LAS DEPENDENCIAS DE LA EMPRESA."

That said, La Nación itself doesn't use accents when capitalized. On the other hand, you can see "EL PAÍS" several times at https://elpais.com/ . On the third hand, the El País in Uruguay doesn't use the accent http://www.elpais.com.uy/ . But it does use ALCANZÓ in a headline http://a2010.kiosko.net/02/06/uy/uy_elpais.750.jpg .

http://procedimientospolicialesargentina.blogspot.de/2016/04... has an interesting mix with the blog title "DIA DE LA POLICIA DE LA NACION ARGENTINA" and a poster image saying "DIÁ NACIONAL DEL POLICÍA".

[+] jmiserez|8 years ago|reply
Isn’t that just historically due to a lack of space on the line when typesetting?
[+] urethrafranklin|8 years ago|reply
French, or, if you will, FRANCAIS (note the lack of cedilla) is the same.
[+] pavlov|8 years ago|reply
For the benefit of French speakers, the ą in Polish seems to be pronounced like the "on" sound.

So "pączki" is pronounced "pon-tch-qui" while "paczki" is "patch-qui".

[+] ninly|8 years ago|reply
I lived in a Polish neighborhood in Brooklyn for a few years, and absorbed as much orthography and pronunciation as I could during that time. I ate a few pączki, too.

While I learned to speak a little bit, it was not enough to be functional beyond basic greetings and ordering food. It was always a lot of interesting fun, though.

The main outcome of all this is that for ten years, whenever I see a Toyota Camry, my weird brain thinks "hmm... tsahm-rih..." and I roll my eyes at it.

[+] mszcz|8 years ago|reply
Ugh, this reminds me of character encoding issues... I hope I never meet the guy who invented ISO-8859-2 or CP1250 or other such nonsense for that guy's sake...
[+] changs|8 years ago|reply
This brings the memory of Internet chats in the 90s when almost nobody used Polish letters with diacritics (because it is faster to type without them and there were a lot of different incompatible encodings). Usually you can understand the meaning from the context but sometimes the same kind of funny misunderstanding would happen as there are quite a few words that without diacritics become completely different but valid words.
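
To make the "incompatible encodings" point concrete, here is a minimal Python sketch of what happens when text is written with one legacy code page and read with another. The helper name `decode_as` is made up for illustration; the codec names are the ones Python's standard library ships. ISO-8859-2 and CP1250 agree on some Polish letters but not on "ą", so the word comes back damaged while plain ASCII passes through untouched.

```python
def decode_as(text: str, written_with: str, read_with: str) -> str:
    """Encode text with one codec and decode the bytes with another
    (hypothetical helper, only for this illustration)."""
    return text.encode(written_with).decode(read_with)


word = "pączki"

# The two encodings use different byte values for "ą".
print(word.encode("iso-8859-2") != word.encode("cp1250"))

# Reading ISO-8859-2 bytes as CP1250 no longer yields the original word.
print(decode_as(word, "iso-8859-2", "cp1250") == word)

# Without diacritics there is nothing to get wrong, which is one reason
# people simply dropped them in chats.
print(decode_as("paczki", "iso-8859-2", "cp1250") == "paczki")
```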
[+] Symbiote|8 years ago|reply
Wikipedia gives "Zażółć gęślą jaźń" as an example containing all the Polish diacritic letters.

Is there a standard way to write this when limited to ASCII?

For example, Danish/Norwegian replace æ, ø, å with ae, oe, aa. It seems less likely that something could exist and be reasonably readable for Polish.

https://en.wikipedia.org/wiki/Polish_orthography

[+] warpech|8 years ago|reply
It is quite common to remove the diacritics if you are lazy or don't have access to the diacritics. That phrase becomes "Zazolc gesla jazn".

Most search engines find "Zazolc" and "Zażółć" equal because of that. This becomes a problem in case of the words like "paczki" (boxes) and "pączki" (donuts), which have their own separate meaning - as explained in the article.

In contrast to most European countries, in Poland we use American keyboard layout with "Polish (programmers) layout" keyboard setting in OS.

You press ALT+A, ALT+E, ALT+L, ALT+S, ALT+C, ALT+Z, ALT+X to write "ą", "ę", "ł", "ś", "ć", "ż", "ź", respectively.
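
A rough sketch of the kind of folding that makes "Zazolc" and "Zażółć" compare equal, in Python. This is only an assumption about how such matching might be done, not how any particular search engine works, and `fold_diacritics` is a made-up name. Most Polish letters decompose into a base letter plus a combining mark, but "ł" has no Unicode decomposition, so it needs an explicit mapping.

```python
import unicodedata

# "ł"/"Ł" have no Unicode decomposition, so map them by hand
# (assumption: these are the only such letters we care about here).
NO_DECOMPOSITION = str.maketrans({"ł": "l", "Ł": "L"})


def fold_diacritics(text: str) -> str:
    """Return text with Polish diacritics replaced by plain ASCII letters."""
    decomposed = unicodedata.normalize("NFD", text.translate(NO_DECOMPOSITION))
    # Drop the combining marks (ogonek, acute, dot above) that NFD split off.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_diacritics("Zażółć gęślą jaźń"))  # Zazolc gesla jazn
print(fold_diacritics("pączki") == fold_diacritics("paczki"))  # True
```

The second line printed is exactly the problem: after folding, donuts and packages are the same string.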

[+] kornakiewicz|8 years ago|reply
Mostly no.

Some words written without Polish characters can become ambiguous without context. For example: word "łaska" - "mercy", written without Polish letter "ł" is "laska" - "stick".

Although, in old good times some people used British pound character in texts to express this letter, since '£' is visually similar to 'Ł' and more often available.

For other characters (ś,ź,ć,ż,ó,ą,ę) nothing like that was widely adopted. Workaround would be possible for 'ż' and 'ó' since they are (almost - see below) phonetically identical with 'rz' and 'u' respectively, but it wasn't popular, since most probably would be perceived as sign of very, very bad orthography.

*Almost, since some people claim they can distinguish these, but it's not popular ability.

[+] wlk|8 years ago|reply
When required, people just omit the diacritic signs and replace the letter with a regular ASCII letter, for example: ż->z, ł->l, ó->o, etc.

When used in (computer) writing this is very readable, but no one would do this in handwriting. I think the most common cases nowadays would be SMS messages (especially on dumb phones) or some weird displays that are unable to properly render diacritic letters.

[+] zokier|8 years ago|reply
It's curious how we are simultaneously holding stubbornly onto diacritics (and current orthography in general) while also continuously having lots of issues actually reproducing/handling them. One would think that we would either have gotten better at dealing with them or dropped them altogether. Especially considering that modern fixed orthographies are generally a relatively recent phenomenon.
[+] schoen|8 years ago|reply
Usually the diacritics exist because of something that is contrastive in the language's phonology. This post is mainly about a concrete example of this, where pączki and paczki refer to two different things that a shop could sell. So it's super-helpful that shops can make that distinction in writing!

Or for example

https://en.wikipedia.org/wiki/Minimal_pair#Stress

In Portuguese, my strongest foreign language, I can think of examples like

a 'the (feminine)' / à 'at the (feminine)'

nó 'knot' / no 'in the'

dá 'gives' / da 'of the'

nós 'we' / nos 'in the (plural)'

sê 'I should be' / se 'oneself' / sé 'see (Catholic)'

pode 'can' / pôde 'could (past)'

avô 'grandfather' / avó 'grandmother'

tem 'he/she has' / têm 'they have'

among many others.

At least a couple of these reflect vowel differences, although some would be homophones in speech. Every language that holds on to diacritics would have pairs like this where the diacritics make a difference to the meaning. So people may really appreciate having writing systems that can reflect these differences in order to avoid confusions that would otherwise occur.

[+] avaer|8 years ago|reply
Nit: it's not pawnch-kee if you're speaking American English, it's more like pown-chkee. But still not quite.
[+] jdmichal|8 years ago|reply
Or, for those who are versed in IPA: /pɔ̃.tʂki/.

* There is no individually-pronounced "n" sound; it is built entirely into the nasalized vowel.

* The /ʂ/ sound is actually a laminal retroflex. This is best described as making a "sh" sound, but with the blade of the tongue.

[+] mtrycz|8 years ago|reply
Yeah, we Poles are as obsessed with our pronunciation as Italians are with their food.

People from other countries just can't get it right :)

[+] lukasm|8 years ago|reply
The French `bON ton` is the closest to ą you can get.
[+] ghomrassen|8 years ago|reply
Yeah, I think you're closer, as long as you pronounce the own like owned rather than pow-n, as I read it at first.
[+] yawgmoth|8 years ago|reply
In Michigan people will unabashedly say poonchkee. I've stopped arguing about it with people :).
[+] ChrisArchitect|8 years ago|reply
This reminds me of what seems like a never-ending debate about the actual pronunciation of paczki. Is it 'pawnch-kee' or 'poonch-kee'? Haha.

Which led me to this interpretation https://www.youtube.com/watch?v=zdNsFOPYMzE

[+] jdmichal|8 years ago|reply
It should be a vowel sound similar to the English words "thought", "dawn", "fall", and "straw", except nasalized. So "pawn" is probably closer than "poon". (If any one of those have a different vowel sound than the others in your dialect, use the more popular version.)