Launch HN: Lang (YC S19) – Internationalization Built for Devs
Previously, we all worked on building internationalization and localization tooling for companies. In our experience, companies don’t think about translation until it’s too late, and the tech debt builds up very fast. It’s a nightmare to receive a task that says “translate the app into Spanish.” Choosing the right open-source framework, refactoring the entire codebase, and integrating with human translators is a massive effort. As engineers, we wanted to work on features, not on moving every string in our codebase into a translations.json file. In our months of internationalization work, we couldn’t find a good all-in-one toolkit. So we built Lang.
Like other internationalization libraries, Lang gives you a tr() function. Wrap your strings with tr(), and we’ll show your users translations that correspond to their language settings at run-time. But how do you actually get the translations? Open-source frameworks like Polyglot.js stop here, but Lang doesn’t. Run “push,” and our command-line tool will parse your code files, find tr() calls, collect newly added strings, and send them to human translators for you. For JavaScript, we use Babel to construct an Abstract Syntax Tree (AST) of your code, and traverse the tree to find tr()’d strings. For a developer, this makes it simple to add/remove/update strings: just run “push” in your terminal. You can track the status of your translations on our dashboard, and when they’re done just run “pull.” We’ll generate a translation file for you, and connect it with our tr() function. You own the file - Lang doesn’t make any network requests for translations at run-time, and your translations always load, even if our service is down.
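To make the extraction step concrete, here is a simplified sketch of what a “push”-style scan could do. This is our illustration, not Lang’s actual CLI: the real tool walks a Babel AST rather than matching text, and `extractTrStrings` is a hypothetical name.

```javascript
// Simplified sketch of tr() string extraction. Lang's CLI builds a
// Babel AST and traverses it; a regex over the source text illustrates
// the same idea for simple cases (it will miss dynamic expressions).
function extractTrStrings(source) {
  // Match tr("..."), tr('...'), or tr(`...`) with a backreferenced quote.
  const pattern = /\btr\(\s*(['"`])((?:\\.|(?!\1).)*)\1\s*\)/g;
  const strings = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    strings.push(match[2]);
  }
  return strings;
}

const source = `
  const greeting = tr("Welcome to Lang");
  const cta = tr('Sign up now');
`;
console.log(JSON.stringify(extractTrStrings(source)));
// ["Welcome to Lang","Sign up now"]
```

A real AST-based pass avoids regex pitfalls like `tr(` appearing inside comments or strings, which is presumably why the CLI uses Babel.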
This works for static strings in the code, but what about dynamic content in the backend or database? We expose a function called liveTr(), which takes a string argument. The first time liveTr() sees an untranslated string, it will make a request to Lang to translate it and return the string in its original language. On subsequent calls, it returns the fetched translation. We’ve shipped liveTr() with built-in caching to reduce the number of network requests. We also have self-hosted solutions for users with high uptime requirements. This is a common in-house feature companies build for internationalization, and we want to make it available to all devs.
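The behavior described above can be sketched roughly as follows. This is our illustration of the caching pattern, not Lang’s implementation; `requestTranslation` is a hypothetical stand-in for the network call.

```javascript
// Sketch of liveTr()'s described behavior (illustration only).
// First call: return the source string and populate the cache.
// Later calls: serve the translation from the local cache.
const cache = new Map();

// Stand-in for a request to a translation service; a real client
// would fetch asynchronously and handle failures.
function requestTranslation(text, locale) {
  return `[${locale}] ${text}`;
}

function liveTr(text, locale) {
  const key = `${locale}:${text}`;
  if (cache.has(key)) {
    return cache.get(key); // cache hit: no network request needed
  }
  // Cache miss: fetch the translation, but fall back to the
  // original string for this first render.
  cache.set(key, requestTranslation(text, locale));
  return text;
}

console.log(liveTr("Hello", "es")); // "Hello" (untranslated on first call)
console.log(liveTr("Hello", "es")); // "[es] Hello" (served from cache)
```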
Lang currently supports JavaScript and TypeScript apps (React, React Native, Vue, etc.), with closed betas for Django, Android, and iOS. Give us a try at https://www.langapi.co/signup - machine translations are free, so you can see your app in another language in minutes. If you use human translations, we charge $99/month for our tooling, plus 6-8 cents per word translated. A lot of our work is inspired by open source, and we want to give back - if you’re building an open-source project or non-profit, ping us at [email protected] and we’ll drop the monthly fee :)
The HN community builds amazing products, and we’re sure there are plenty of people here who have translated their apps - we’d love to hear your experiences in this area and your feedback on how we can improve!
codingdave | 6 years ago
What would be compelling is if you could proactively call out the bigger gotchas in translation - grammatical differences that make you change word order, different mechanisms for handling plurals, etc. If you could preemptively warn us, even before a "push", that we may hit a problem, I'd take a closer look. For example, flagging a line saying, "Hey, it looks like you are using phrasing that will be problematic in <Italian/Hindi/Russian/etc.> Here's why..."
abhisiv | 6 years ago
We would love to explore more ways we could help proactively solve these problems. Ping me at [email protected] if you want to talk more!
mkycl | 6 years ago
[1] https://projectfluent.org/
davidzweig | 6 years ago
Here's our approach.
1. Move all strings into a Google doc. Takes about 8 hours.
2. Organise strings into groups with screenshots, think carefully, split strings, look for reused strings, reword things to make them simpler and easier to translate, add notes to some strings to make the meaning more explicit. Very tedious part of job, 2-3 days work.
3. Put the doc, editable via link, onto Upwork with a fixed payment, somewhat generous for the translation word count. Check that the translator is a native speaker of the target language, has some good feedback, and ideally some IT/programming experience. Order translations for the languages we can check ourselves (5-6 languages).
4. Check the translations received for issues. Translators typically misinterpret the same things, as the source was not clear enough. Fix these issues. Maybe 4 hours work.
5. Now send to 10+ other translators for the languages we don't understand. Cross fingers that these will be ok.
6. Check translations of labels for homogeneous usage of semicolons, capitals, full stops, etc. Struggle with zh/ja/ko.
7. Use a small JS script to transform CSV output from sheets to JSON for chrome.i18n.
8. Cycle through all locales and check for overflowing text or other issues.
9. For any extra strings that we might need later, can try Microsoft UI translations database, or else, Google translate (which is mostly ok, can check the reverse translation).
Honestly this all was quite a lot of boring work, but we probably ended up with reasonable translations at a good price, and managed to pay translators reasonable money.
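The CSV-to-JSON transform in step 7 can be sketched in a few lines. This is our illustration, not the commenter's actual script; it assumes a simple two-column export (key, translation) with no quoted commas, and targets chrome.i18n's messages.json shape of `{ key: { message: "..." } }`.

```javascript
// Rough sketch of step 7: turn a two-column CSV export from Sheets
// into chrome.i18n's messages.json format. A real script should use
// a proper CSV parser to handle quoting and escaped fields.
function csvToChromeI18n(csv) {
  const messages = {};
  for (const line of csv.trim().split("\n")) {
    // Split on the first comma; keep any later commas in the value.
    const [key, ...rest] = line.split(",");
    messages[key.trim()] = { message: rest.join(",").trim() };
  }
  return messages;
}

const csv = `
appName,Mi Aplicación
saveButton,Guardar
`;
// Produces a chrome.i18n-style messages object, one per locale.
console.log(JSON.stringify(csvToChromeI18n(csv)));
```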
earenndil | 6 years ago
FYI, these are frequently called CJK (Chinese, Japanese, Korean). Although that has more to do with the fact that each character takes multiple keystrokes to type, and is thus harder to support, than with the locale.
jedberg | 6 years ago
Do your clients do local caching, or is my uptime dependent on your uptime (unless I code in my own caching I suppose)?
cyrieu | 6 years ago
For dynamic strings in the database or generated by users, our servers need to be up to receive and handle the translation request, but the result is cached, so the next time you look that translation up it's served locally.
osrec | 6 years ago
Instead of saying "Lang me aapka swaagat hai", you've got "me aapka swaagat hai Lang".
It's like saying "Lang Welcome to".
Nice idea though, if you can iron stuff like the above out.
kgodey | 6 years ago
abhisiv | 6 years ago
Sending translations to the agencies we've partnered with is completely optional.
scrollaway | 6 years ago
The good:
The library's API seems pretty well-thought-out. A good i18n API in JS/TS is highly needed, even more so one that works well with React. I use i18next in my projects but it's mediocre, although I don't know that the difficulties I end up facing with it wouldn't show up here.
The bad:
Pricing. Sorry but translation services are extremely competitive, and players that have been around a long time such as Crowdin, Transifex and Weblate have the benefit of being already trusted by name by a huge community of devs and translators.
You also talk about open source a lot, but I'm disappointed your web tooling doesn't seem to be open source and self-hostable. This is one point where you really could differentiate yourself.
The ugly:
It looks like you've pretty concretely tied your i18n API and your translation UI together. I can't see your UI or whether it's any good, but I'm likely to want to use your API with a different translation service, or your translation service with a different API.
Also, please, Google oauth is basically a requirement for any b2b service.
(I'm happy to give more thoughts on a video/screenshare chat if you like, feel free to reach out, email in my profile; always want to help new players in the i18n space)
khalilravanna | 6 years ago
1. Am I correct in understanding this is meant as a client-side only solution for now? Right now we have a pretty complicated translations process that needs to support translations that are spread across the client and server. Would this support a hybrid approach like that?
2. Another question I have is where does the `translations.json` file come from and where is that stored? Is that just generated by the CLI and then we have to deal with serving that however we want?
3. Is there one `translations.json` file per language? One with all of them? Are there performance concerns with sending large files like that over to the client? This is a general question for me to other developers of large sites: how do you deal with tons of translations?
4. Any plans to support existing translations? E.g. if I have an existing set of translations keys and values can I plug those in somewhere? I know y'all are bootstrapping so it wouldn't surprise me if that's a Future feature.
Again, love the idea of this, and it would be super cool if this solves the problems due to complexity we currently have with supporting a ton of languages.
--
For some background, our current solution involves generating a rather large (> 1 MB) `translations.json` file for each language that we serve to the client via a CDN. Typical map of keys to values.
We create the keys ourselves as we go along, something like `dashboard.salesCard-helpText`. Then we have a kludgy Drupal instance to populate the key and value, and add some tagging to show it needs to be translated. Translations get entered into that Drupal instance. All of this is entered manually. Then that gets used to generate the `translations.json` file I mentioned earlier.
We have plans in the future to overhaul the process.
abhisiv | 6 years ago
2. The 'translations.json' file is autogenerated by the CLI and updated each time you pull translations. It's automatically bundled in the deploy process so you don't need to do any extra work.
3. Currently we have a single translations.json file. Space hasn't been an issue yet, but we plan to add splitting to reduce its size. For dynamic content, which can be large, we have solutions where we can serve the content via a CDN. We could also give clients a microservice if they would like to self-host, or directly update a cache on disk on the deployed machines. We're still experimenting with the best/easiest way to do this.
4. Sure thing, we have a file upload in our dashboard right now, but we want to add this to our CLI to make it more accessible.
I understand your frustration and one of our core philosophies is to do away with keys completely. Our keys are auto-generated and not touched by the user. The code can have the actual text which is a lot more readable. Ping me at [email protected] and I'd love to help solve your problems.
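To illustrate the key-free idea, here is one plausible shape for a generated translations file and lookup, keyed directly by source text. This is our assumption for illustration only; Lang's real file format and tr() internals may differ.

```javascript
// Illustration of a translations table keyed by the source string
// itself, matching the "no hand-written keys" philosophy. The shape
// of Lang's actual generated file is not documented here.
const translations = {
  es: {
    "Welcome back": "Bienvenido de nuevo",
    "Sign out": "Cerrar sesión",
  },
};

// Minimal lookup: fall back to the source string when the locale
// or the specific translation is missing.
function tr(text, locale) {
  const table = translations[locale];
  return (table && table[text]) || text;
}

console.log(tr("Sign out", "es")); // "Cerrar sesión"
console.log(tr("Sign out", "de")); // "Sign out" (fallback to source)
```

A nice property of this scheme is that the code stays readable (the actual English text appears inline) and missing translations degrade gracefully to the source language.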
felipap | 6 years ago
cyrieu | 6 years ago
1) JavaScript-based apps these days have complex rendering logic that makes the HTML-parser method of finding and translating strings infeasible. Every company we've worked at has needed to extract each string and wrap it with a special `translate` function in their codebase.
2) We make heavy use of the Babel and TypeScript compilers to work with JavaScript ASTs, and there's been huge progress on those recently.
We've thought about NLP, but quality is a huge concern of ours, and we're not quite ready yet to roll that out to companies. If that's something you're interested in, would love to chat, send me an email at HN username + gmail!
mandatory | 6 years ago
abhisiv | 6 years ago
Ping me at [email protected] if you want a demo of this beta tool or if you want it for another framework.
mping | 6 years ago
A few questions:
- Some translations are a bit context-dependent; what happens if I don't agree with the translation?
- Sometimes we do some kind of media localization, e.g. users in France will see a different image than users in Portugal. Are you a translation-only shop, or do you plan to do some kind of l10n?
Best of luck!
abhisiv | 6 years ago
We currently plan to handle localization of text (regular, dates, currency, time, gender, plurals, etc.). Handling images is interesting though; I would love to explore that more if it's a pain point for you. Ping me at [email protected] - I'd love to talk!