onestay42 | 5 months ago

"Cattle labeling meat labeling supervision task transfer act" is just as bad as Rinderkennzeichnungsfleischetikettierungsüberwachungsaufgabenübertragungsgesetz, English just gets to use spaces where German doesn't. The underlying construction is the same. (I definitively got that translation wrong)

magarnicle|5 months ago

Usually English will try to come up with a single, Latin-or-Greek-derived word for compound ideas like this, which is another bad habit.

So surgery is full of -ectomies instead of -cut-outs.

1718627440|5 months ago

Medical terms in German also use Latin or Greek, since that is the language of the field, so this is a bad example.

arkensaw|5 months ago

English gets to use a sentence. It can be reworded any number of ways. I did a bit of quick googling and the clearest English I came up with for `Regulation (EC) No 1760/2000` is "Requirements for the Labelling of Minced Beef" which is a lot easier to process than Rinderkennzeichnungsfleischetikettierungsüberwachungsaufgabenübertragungsgesetz. The reason we split code over lines is the same reason we split sentences into words. Easier for the brain to parse.

I wonder do German brains work on a much longer context window because of the language?

1718627440|5 months ago

> I wonder do German brains work on a much longer context window because of the language?

Maybe, but more due to the spelling of numbers and long sentences. Compound words are not an example of this, since Germans can parse these words just fine as separate parts. It just means that the lowest level of "tokenization" in everyday use is not the word, but its subcomponents.

Do native English speakers "tokenize" expressions into words? Do you see it as '(labelling) (of) (minced)' or '(label)l(ing) (of) (minc)(ed)'?

I can't speak for most Germans, but the algorithm I think I use is just greedy from left to right. This is also consistent with how mistokenization in common puns works, so I think this is common.
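As a rough sketch of that greedy left-to-right idea: the code below assumes "greedy" means "take the longest known component first" and uses a tiny made-up lexicon (`LEXICON` and `greedy_split` are hypothetical names, not from any library). It is only an illustration, not a claim about how anyone's brain actually does it.

```python
# Hypothetical mini-lexicon of word components, including linking "s" forms.
LEXICON = {
    "rinder", "kennzeichnungs", "fleisch", "etikettierungs",
    "überwachungs", "aufgaben", "übertragungs", "gesetz",
    "stau", "staub", "becken", "ecken",
}


def greedy_split(word, lexicon):
    """Split a compound left to right, always taking the longest known prefix.

    Returns a list of components, or None if some position has no known
    component. There is deliberately no backtracking.
    """
    word = word.lower()
    parts = []
    pos = 0
    while pos < len(word):
        # Try the longest candidate first, then shrink it one letter at a time.
        for end in range(len(word), pos, -1):
            if word[pos:end] in lexicon:
                parts.append(word[pos:end])
                pos = end
                break
        else:
            # Nothing in the lexicon starts here, so the greedy pass gives up.
            return None
    return parts


law = "Rinderkennzeichnungsfleischetikettierungsüberwachungsaufgabenübertragungsgesetz"
print(greedy_split(law, LEXICON))
print(greedy_split("Staubecken", LEXICON))  # ['staub', 'ecken']
```

The second call shows the pun case: the intended reading of "Staubecken" is Stau + Becken (reservoir), but a longest-first greedy pass grabs "Staub" and lands on Staub + Ecken (dusty corners). Because there is no backtracking, the same approach can also fail outright on a valid compound if an early greedy choice leaves an unknown remainder.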

In primary school we were trained to recognize syllable boundaries. Is that just a German thing, or is this common in other countries? You need to know these for spelling, and once you know them, separating word components becomes trivial.

detaro|5 months ago

a) the title of the regulation is not equivalent to the title of the law (unsurprisingly); onestay42's translation is clunky but a lot closer

b) the official title of the law was "Gesetz zur Übertragung der Aufgaben für die Überwachung der Rinderkennzeichnung und Rindfleischetikettierung", so how again is it that English "gets to use a sentence" and German doesn't? German has the choice depending on context, sometimes having one word is convenient.

bmacho|5 months ago

Maybe in speech they are similar, but not in writing. The underlying construction is as different as it can be. English puts " " between words, and German does not.