Created by Jan Hebnes 12 Jun 2015, 23:02:51 Updated by Shannon Deminick 26 Jun 2017, 07:11:51
Since this is a long rant, I’ll help you along a bit. The summary is -> http://poedit.net/wordpress <- so Umbraco should adopt this too and get in on this translation ecosystem…
Branch: https://github.com/janhebnes/Umbraco-CMS/blob/dev-v7-localization-using-gettext/ The interesting part is build/Translation.cmd and the resulting pot file at https://github.com/janhebnes/Umbraco-CMS/blob/dev-v7-localization-using-gettext/src/Umbraco.Web.UI/Umbraco/config/lang/messages.pot
This was an open space topic presented at Codegarden 15. I would like the Umbraco core project to refactor the internal text translation ecosystem and adopt GNU Gettext and its po + pot file translation strategy. The purpose is to remove the developer pain of translating strings for the internal GUI, and to open up for a larger set of tools and communities, perhaps on a package level as well as in Umbraco core.
Translation, i18n or localization in ASP.NET using resource files is broken and is unusable in an open source space. Many fine blog posts argue why: http://www.expatsoftware.com/articles/2010/03/why-internationalization-is-hopelessly.html http://manas.com.ar/blog/2009/10/01/using-gnu-gettext-for-i18n-in-c-and-asp-net.html
The current implementation in Umbraco uses a simple and clean custom xml file based system with id references to text arrays. A developer must create the string as an id together with the initial translation, e.g. in English. The system is easily expandable for package creators, but it requires “specialized” translation handling. A missing “id” could be injected at runtime, but if it belongs to an edge-case error message it might never find its way into the main translation xml file. The biggest problem right now, I believe, is that the package ecosystem runs its own translation subsystem in each package, leaving the Umbraco translations fragmented and tedious for developers to handle.
The GNU Project has another strategy that has been in use since 1995: https://en.wikipedia.org/wiki/Gettext/ https://www.gnu.org/software/gettext/ https://www.gnu.org/software/gettext/manual/html_node/index.html It is big in Python (https://docs.python.org/2/library/gettext.html#gettext), see also https://developer.mozilla.org/en-US/docs/gettext, and it is big in PHP.
The Gettext strategy is to scan the source files and generate a pot (po template) file containing the message “id”s. The default keywords for localized strings are _("") or gettext(""). The po template is then merged into the localized po files, updating them so that e.g. outdated translation messages are commented out.
Notice that the translation mechanics in base gettext also handle plural forms: https://github.com/neris/NGettext/blob/master/src/NGettext.Tests/BaseCatalogTest.cs
There is a large ecosystem built around the translation process for these file formats, and the tools can be used to cooperate on a global scale. http://poedit.net/ is one of the main tools, and different implementations can be found on Linux and Mac systems.
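To make the scanning step concrete, here is a minimal sketch in Python of what a scanner does: find literal strings passed to _("") or gettext("") and write them out as pot entries. It is an illustration only, not how the real gettext tools are implemented. The file names and strings in `sources` are made up, and the merge into existing po files is not covered.

```python
import re

# Assumption: only calls with a literal, double-quoted first argument are detectable.
CALL = re.compile(r'\b(?:_|gettext)\(\s*"((?:[^"\\]|\\.)*)"')


def extract_msgids(sources):
    """Return {msgid: [(filename, line_number), ...]} in first-seen order."""
    found = {}
    for filename, text in sources.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in CALL.finditer(line):
                found.setdefault(match.group(1), []).append((filename, line_number))
    return found


def render_pot(found):
    """Render the collected msgids as pot entries with empty translations."""
    blocks = []
    for msgid, places in found.items():
        refs = " ".join(f"{name}:{line}" for name, line in places)
        blocks.append(f'#: {refs}\nmsgid "{msgid}"\nmsgstr ""\n')
    return "\n".join(blocks)


# Hypothetical source files, for illustration only.
sources = {
    "Views/Edit.cshtml": '<button>@_("Save")</button>\n<a>@_("Cancel")</a>',
    "Controllers/EditController.cs": 'var title = gettext("Save");',
}

found = extract_msgids(sources)
print(render_pot(found))
```

A string used in several places ends up as one entry with several source references, which is what keeps the translator's workload proportional to the number of distinct messages.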
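As a small illustration of the plural handling, the sketch below uses the gettext module from the Python standard library rather than NGettext, on the assumption that the idea carries over. With no catalog loaded, `NullTranslations.ngettext` simply picks the singular form when the count is 1 and the plural form otherwise. A real catalog would apply the plural rule of the target language. The message texts are made up.

```python
import gettext

# No catalog loaded: NullTranslations falls back to the English-style rule.
translations = gettext.NullTranslations()

for count in (0, 1, 2):
    template = translations.ngettext("%d node selected", "%d nodes selected", count)
    print(template % count)
```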
One of the main arguments to why Umbraco should change the strategy is this: http://poedit.net/wordpress
= How could we work with this. =
I have tried this out on my pet project https://github.com/janhebnes/startlist.club and I am using https://github.com/vslavik/gettext-tools-windows for the main “xgettext” scanner and running a batch file for updating the translation files: https://github.com/janhebnes/startlist.club/blob/master/Translation.cmd
For working with _ and gettext in ASP.NET I based my implementation on http://www.fairtutor.com/fairlylocal and tweaked it a bit to introduce LocalizedDisplayNames on models and to have _ on the PageView. Everything is available for review here: https://github.com/janhebnes/startlist.club/tree/master/FlightJournal.Web/Translations
http://www.fairtutor.com/fairlylocal uses a project build step; I have currently chosen a separate batch file instead.
= How do we go about and handle the change. =
== C# sources ==
Most methods are based in \src\umbraco.businesslogic\ui.cs: methods like GetText and Text, and a TextService.Localize. So maybe someone had already given thought to GNU Gettext when designing this (nobody came forward about it at Codegarden 15). All of them reference TextService.Localize. The main issue is to get the “messageid” to be detectable by code scanning up front. Changing TextService.Localize to use po sources is the easy part.
Umbraco.ui contains: GetText (used 65 times) and Text (used 53+14+378+20+134+345+10+2120+36 = 3110 times).
Any call that does not pass a literal message id is unusable to the scanner, e.g. ui.Text(action.Alias, u) in \src\umbraco.cms\businesslogic\workflow\Notification.cs, and should be made detectable in some other way to get into the pot file.
Localize is used 136 times
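To get a feel for how many calls a scanner could not pick up, something like the following rough Python sketch could be run over the C# sources. It only looks at the first argument of ui.Text and ui.GetText and sorts calls into literal and non-literal ones. The regex is an assumption and deliberately naive (a literal containing a comma or parenthesis would be misclassified). Apart from ui.Text(action.Alias, u), the sample lines are made up.

```python
import re

# Assumption: the first argument ends at the first comma or closing parenthesis.
CALL = re.compile(r'\bui\.(?:Text|GetText)\(\s*([^,)]*)')


def classify(source):
    """Split calls into those with a literal first argument and those without."""
    literal, dynamic = [], []
    for match in CALL.finditer(source):
        first_arg = match.group(1).strip()
        if re.fullmatch(r'"[^"]*"', first_arg):
            literal.append(first_arg.strip('"'))
        else:
            dynamic.append(first_arg)
    return literal, dynamic


# Hypothetical sample; only the action.Alias call is taken from the text above.
sample = '''
var a = ui.Text("general", "save");
var b = ui.Text(action.Alias, u);
var c = ui.GetText("cancel");
'''

literal, dynamic = classify(sample)
print("scannable:", literal)
print("needs another way into the pot file:", dynamic)
```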
== Changing TextService.Localize to use po or mo sources ==
Using e.g. https://github.com/neris/NGettext or http://www.fairtutor.com/fairlylocal in the form of e.g. https://github.com/janhebnes/startlist.club/tree/master/FlightJournal.Web/Translations could be one way.
= Branch Created for demonstration =
https://github.com/janhebnes/Umbraco-CMS/blob/dev-v7-localization-using-gettext/
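As a rough idea of what a po-backed lookup involves, here is a minimal Python sketch. It is not the NGettext or FairlyLocal API, and `load_po` and `localize` are hypothetical names. It only understands single-line msgid/msgstr pairs, and the Danish catalog is made up for illustration.

```python
import re

# Assumption: every entry is a single-line msgid directly followed by a single-line msgstr.
ENTRY = re.compile(r'msgid "((?:[^"\\]|\\.)*)"\s*\nmsgstr "((?:[^"\\]|\\.)*)"')


def load_po(text):
    """Return {msgid: msgstr}, skipping the header and untranslated entries."""
    return {msgid: msgstr for msgid, msgstr in ENTRY.findall(text) if msgid and msgstr}


def localize(catalog, msgid):
    """Look up a translation, falling back to the msgid itself."""
    return catalog.get(msgid, msgid)


# Hypothetical po content, for illustration only.
DANISH_PO = '''
msgid ""
msgstr ""

#: Views/Edit.cshtml:1
msgid "Save"
msgstr "Gem"

msgid "Cancel"
msgstr ""
'''

catalog = load_po(DANISH_PO)
print(localize(catalog, "Save"))
print(localize(catalog, "Cancel"))
```

An untranslated or unknown message falls back to the msgid, so a half-finished po file still gives a usable GUI.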
Resulting pot based on current source https://github.com/janhebnes/Umbraco-CMS/blob/dev-v7-localization-using-gettext/src/Umbraco.Web.UI/Umbraco/config/lang/messages.pot
= messageid as english text =
To remove a step for the developers and make the translation process simpler, the messageid or key must be changed to the actual English text instead of a system key. But this can be a later step.
= Potential roadmap =
I would like to get some community feedback on this feature.
An important update to the lang files has been committed on http://issues.umbraco.org/issue/U4-5777#comment=67-21081: the core language files are now enriched with the plugin language files.
Closing issue due to inactivity - see blog post for details https://umbraco.com/blog/issue-tracker-cleanup/
Type: Feature (request)
Backwards Compatible: True
Affected versions: 7.2.6
Due in version: