
Most of the World Can't Code

10 points| jayathra | 11 months ago

Programming is only accessible to those who understand English and the Latin alphabet.

If you don’t, your chances of becoming a programmer drop drastically - not because you lack intelligence, but because everything from syntax to documentation to debugging tools is built in English.

Why is coding still tied to a single language? Shouldn’t anyone, regardless of their native script, be able to write Python in Japanese, Arabic, Sinhala, or Hindi - while keeping full compatibility with existing ecosystems?

Has anyone here faced or thought about this problem? What do you think the biggest challenges would be?

41 comments


yshklarov|11 months ago

This is by no means unique to programming. Many areas of knowledge are less accessible to those who don't speak English, and much more so to those who don't speak any of the dozen major languages. Because of this, many people will simply learn (enough) English. It's not such a big deal.

In my view, having a single lingua franca is nice. It better facilitates knowledge transfer. I wouldn't want to see a fracturing where each area of knowledge (or, say, every specialization or application of programming) is best treated in a distinct linguistic community. That would be bad for everyone.

jayathra|11 months ago

I see your point - having a global language certainly helps with knowledge transfer, and English has become that standard for programming. But is learning English really 'not a big deal' for everyone? For someone with limited access to quality English education, it could take years before they’re comfortable enough to learn programming effectively. Meanwhile, a fluent English speaker can start coding right away.

Rather than fragmenting knowledge, what if we had a system that let people write and learn code in their native script, while still maintaining full compatibility with the existing programming ecosystem? Similar to how Unicode enables multiple languages on the web without breaking global communication. Do you think that could work?
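Part of this already exists today: Python 3 accepts non-ASCII identifiers, although its keywords stay English. A minimal sketch, where the function and parameter names are made up purely for illustration:

```python
import keyword

# Identifiers (function and parameter names) may use non-Latin scripts.
# The names below are hypothetical examples, not from any real codebase.
def 挨拶(名前):
    return f"こんにちは、{名前}"

print(挨拶("世界"))

# The reserved words themselves (def, return, if, ...) remain ASCII English.
print(all(kw.isascii() for kw in keyword.kwlist))
```

So the gap is mainly in keywords, standard library names, error messages, and documentation rather than in what the parser accepts for user-chosen names.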

throwaway798214|11 months ago

There are loads of programming languages which have nothing to do with the English language.

Assembly:

   LDA #$01
   STA $0200
   LDA #$05
   STA $0201
   LDA #$08
   STA $0202

Brainfuck:

   >>,[>>,]<<[
   [<<]>>>>[
   <<[>+<<+>-]
   >>[>+<<<<[->]>[<]>>-]
   <<<[[-]>>[>+<-]>>[<<<+>>>-]]
   >>[[<+>-]>>]<
   ]<<[>>+<<-]<<
   ]>>>>[.>>]

Oh, you meant easy-to-learn programming languages based on a real language? Yeah, English just happens to be one of the easier languages to learn, and if you need to learn a programming language you can just as well learn English on the side. I did.

jayathra|11 months ago

I’m curious: do you think learning English is equally easy for everyone? Many programmers come from regions where English education is either poor or expensive. If someone is highly logical but struggles with English, should that be a barrier to learning how to code? Also, what access to English education did you have before you learned it? Did you start from scratch? Did you already know a language that shares a root with English?

jotux|11 months ago

Assembly instructions are English mnemonics. LDA->LoaD Accumulator, STA->STore Accumulator, ADD, SUB, JMP, MOV, etc.

Tomte|11 months ago

We've had localized programming languages: Visual Basic for Applications in Germany had "prüfe" instead of "select" (yes, with Umlaut): https://de.wikipedia.org/wiki/Visual_Basic_for_Applications?...

Today German Excel still doesn't accept "XLOOKUP", but insists on "XVERWEIS". On input, that is; it silently converts languages when opening .xlsx files.
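To make the name swapping concrete, here is a toy sketch of mapping function names between languages. It is not how Excel does it internally; `localize_formula` and the table are assumptions for illustration. The XLOOKUP/XVERWEIS pair is from the comment above, and the other pairs are the usual German Excel names.

```python
import re

# Illustrative English -> German function-name table (deliberately tiny).
EN_TO_DE = {
    "XLOOKUP": "XVERWEIS",
    "VLOOKUP": "SVERWEIS",
    "SUM": "SUMME",
    "IF": "WENN",
}

def localize_formula(formula: str, table: dict[str, str]) -> str:
    """Swap function names that appear directly before an opening parenthesis."""
    def swap(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are passed through unchanged.
        return table.get(name.upper(), name) + "("
    return re.sub(r"([A-Za-z][A-Za-z0-9.]*)\(", swap, formula)

print(localize_formula("=IF(SUM(A1:A3)>10,1,0)", EN_TO_DE))
```

The sketch ignores string literals and locale-specific argument separators (German Excel uses `;` where English uses `,`), which is exactly the kind of detail that makes silent conversion fragile.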

jayathra|11 months ago

But do you think they failed because they were localized per country, rather than designed for global compatibility from the start?

Silent conversions could definitely be messy, but what if there were a standardized system that allowed programming in any script while keeping everything interoperable? Could that avoid the pitfalls of past localized languages?

nextts|11 months ago

> silently converts languages when opening .xlsx files.

Sounds like a bug waiting to happen

yubblegum|11 months ago

One option is to adopt a front end in your language for existing (legacy) languages, which have overwhelmingly been developed by people who write using Latin letters: you type in your language and the front end maps it to Latin letters. There are various issues here, however, depending on the written form of your language. Latin letters are effectively block type glyphs, which lends itself to programming.
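A rough sketch of such a front end for Python, assuming a hypothetical Japanese keyword table (the three entries below are illustrative, not an established localization). It uses the standard `tokenize` module so that only keyword tokens are swapped, while strings, comments, and user-chosen names are left alone:

```python
import io
import tokenize

# Hypothetical keyword table (Japanese -> Python), for illustration only.
KEYWORDS = {"関数": "def", "もし": "if", "返す": "return"}

def to_python(source: str) -> str:
    """Replace localized keywords with Python keywords, token by token."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only NAME tokens are candidates; STRING and COMMENT tokens are skipped.
        if tok.type == tokenize.NAME and tok.string in KEYWORDS:
            tok = tok._replace(string=KEYWORDS[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)

localized = "関数 二倍(x):\n    返す x * 2\n"

# The translated source is ordinary Python, so it runs on the stock interpreter.
namespace = {}
exec(to_python(localized), namespace)
print(namespace["二倍"](21))
```

This only covers keywords. Built-in names, library APIs, and error messages would still be in English, and right-to-left or unsegmented scripts raise editor and input issues that a token mapping does not address.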

Another option is for future languages to be formally specified in a globally adopted IL, with your local-area geeks responsible for writing a front end that transpiles to that IL.

Or we could design and adopt a universal (~visual) glyph for programming. Various structural elements (think [ ], { }, < >, etc.) are pretty much that already. Then we have the (pseudo) mathematical elements (+, -, /, =) which are again universal. That leaves us with named elements which remain somewhat problematic.

In any event, all this seems to be a transitional period's grief. Very soon, you will interact in your native language with some AI and that thing will write the actual code. :)

Regardless (thinking of music notation here) programming notation is ultimately a specialized form of notation. Are you bothered by the fact that a musician in x-land has to learn the notation invented by some Europeans way back when?

jayathra|11 months ago

I love the idea of a universal intermediate language (IL) with region-specific front-ends—that could be a great way to make programming more accessible without fragmenting the ecosystem.

But with AI handling more code generation, how important will it be for people to truly understand the underlying code? Do you think AI will make coding more of a black box, or will there always be value in knowing how things work under the hood?

Music is a great comparison—eastern music notation exists in native scripts, and western pieces can be translated into it. Could programming work the same way, where the structure remains universal, but the notation adapts to different languages?

tacostakohashi|11 months ago

There are, of course, programming languages that have non-English keywords, they're just not very popular.

I guess if you're learning all of C/C++/Java/Python/etc... the "English" keyword meanings are a tiny/trivial part of what you need to learn anyway.

Also, using English means you only need ASCII and a US keyboard layout, which allows easy entry of the printable ASCII characters. For Japanese, Arabic, etc., you need Unicode, input methods, UTF-8 / UTF-16, and so on. All of a sudden there's a whole lot more to go wrong than if you use English in ASCII.
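Two small examples of what can go wrong once you leave ASCII, sketched in Python: visually identical strings that compare unequal until normalized, and character counts that no longer match byte counts.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# They render the same but are different code point sequences.
print(composed == decomposed)

# Normalizing to NFC makes them comparable.
print(unicodedata.normalize("NFC", decomposed) == composed)

# Length depends on whether you count characters or encoded bytes.
word = "日本語"
print(len(word), len(word.encode("utf-8")), len(word.encode("utf-16-le")))
```

The first comparison is `False`, the second is `True`, and the last line prints `3 9 6`. None of this arises with plain ASCII, where one character is always one byte.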

jayathra|11 months ago

ASCII and a US keyboard do keep things simple. But given that modern systems already support Unicode everywhere (from websites to filenames), do you think the complexity argument still holds today?

beardyw|11 months ago

When I was very young in the 1950s I remember my older brother opting to learn German because so many scientific papers were written in it.

That said, I later opted for Latin for reasons neither I nor the examiners could explain.

jayathra|11 months ago

Do you think we’ll always be stuck in this cycle, or is it possible to design programming languages that aren’t tied to any one language, preventing this issue in the future?

taylodl|11 months ago

APL and J aren't based on English. Several countries have developed local language-based programming languages that have simply stayed local. Global adoption of programming languages will be affected by global language, amongst other factors. English is currently the global language. Your guess is as good as mine for how long that will remain the case, but so long as English is the global lingua franca, then programming will largely be done in English.

jayathra|11 months ago

English is the global standard now, and that’s shaped programming as well. But what happens if, in the future, another language (like Hindi or Mandarin) becomes dominant? Would the programming world have to shift again?

If programming were script-agnostic from the start, we wouldn’t have to constantly adapt to shifting global languages. Instead of relying on English or any single language, shouldn’t we explore ways to make programming more accessible to all scripts from the ground up?

unfixed|11 months ago

I think having a lingua franca has way more advantages than drawbacks. As a native Spanish speaker, I know it is easier for me than for someone whose first language doesn't come from Latin, but it is really not much of a hassle.

Even coming from one of the most used languages in the world (and having just enough English), it is very rare that I search for or read something programming-related in a language other than English.

jayathra|11 months ago

For Latin-based language speakers like Spanish, Italian, or French, learning programming in English might not be a huge hurdle. But for people whose native languages use completely different scripts (Chinese, Arabic, Sinhala, etc.), the challenge is much greater.

You mentioned that you rarely search for programming-related content in Spanish. Do you think that’s because English is simply better suited for programming, or is it more about the lack of high-quality programming resources in Spanish?

If programming had built-in support for multiple scripts while keeping a universal structure, do you think more people would use resources in their native language?

romanhn|11 months ago

I was programming BASIC and Pascal as a non-English-speaking kid and the Latin-based syntax was not a problem at all. I think you underestimate how far sheer drive can get you. That said, I do agree that basic understanding of English is more or less a requirement these days if you want to stay up to date on modern developments. If not, well, many languages have enough translated materials to get by.

jayathra|11 months ago

That’s amazing! I have visited certain parts of the world where access to resources for learning English is minimal. I don't know the specifics of your background, but especially when the native language doesn't involve the Latin alphabet at all, the learning curve is steeper. A fluent English speaker can begin coding immediately, while a non-English speaker has to learn two languages at once.

Even if basic materials exist in other languages, most advanced documentation, debugging tools, and libraries remain in English. Do you think that creates a significant disadvantage for non-English speakers?

disqard|11 months ago

You could also remove the text from the programming "language", like in this research:

https://dl.acm.org/doi/10.1145/3173574.3174196

jayathra|11 months ago

Thanks for sharing this! The idea of removing text from programming languages is really interesting, especially for making coding more accessible to people with different language backgrounds.

The paper you linked explores text-free programming through visual programming tools, which is one possible solution. But what do you think about making existing text-based languages, like Python or JavaScript, script-agnostic instead of moving entirely to visual coding?

Would love to hear your thoughts on whether a hybrid approach — where code can be written in any script but still remains text-based — could work.

slowtrek|11 months ago

I don't think the language itself being English is an issue at all, because there are only 50-100 keywords per language, maybe. It's all the code comments and documentation/manuals/discussion that can be the issue.

jayathra|11 months ago

What do you think is a more realistic solution? Should we focus on improving translations for documentation and learning materials, or is there a way to make programming itself less dependent on a single language?