Adopt Translation & Localization Management Platform for docs #45175
Comments
/area localization |
For Transifex: if Transifex commits a change, is there any license or copyright asserted by Transifex? (if not: I think we could make a tool that allows contributors to adopt those changes and confirm, through the tool, that their CLA applies to the contribution(s) in that commit). |
Consider integrating Crowdin (https://support.crowdin.com/enterprise/authentication-settings/) along with Transifex for translation purposes. Crowdin offers a comprehensive set of authentication settings, and it provides a versatile range of translation tools that make contributors more efficient and engaged in the translation process. Encouraging the use of Computer Assisted Translation (CAT) tools like Transifex and Crowdin can significantly streamline the translation process and boost overall contributor engagement. It would be beneficial for CNCF to explore and embrace these tools for a more efficient and collaborative translation experience. |
I like it. Translation memory is always an asset! How about introducing ChatGPT, which can provide better results than traditional machine translation? |
Let's have one issue per change we need to make; if we'd like to have a wider discussion, see: as repo discussions |
@MaxymVlasov & colleagues: do you have a reply for #45175 (comment) ? |
Hello @sftim .
According to https://www.transifex.com/legal/terms/, the answer to the quoted question is that Transifex does not assert any ownership rights over the content a person submits, including text published via git repositories for localization. The terms specify that while using Transifex services, you retain full ownership of your content ("your words"), and Transifex only requires limited rights to perform the services requested by the localizer or customer. This includes actions like hosting, sharing, or otherwise processing your content as directed by you, without claiming any copyright or license over it. |
@sftim The conclusion is pretty similar for Crowdin, mentioned by @Andygol: https://support.crowdin.com/terms/ Clients are responsible for ensuring they have the necessary rights to their content. Crowdin asserts no ownership over client data, which includes text submitted for translation or any other purpose. cc @MaxymVlasov |
OK, so how about we document a workflow (I'm thinking Does that sound OK? |
The main blocker for a reasonable flow with Transifex is that EasyCLA is unable to process commits that include co-authors. |
Broadly, here's what I suggest:
If we make a script, it would be a script to do the equivalent steps. |
Why is the question of signing the CLA for translation work inside Transifex/Crowdin/etc… off the radar? This covers the work of not only translators, but also reviewers and approvers, who don't have to worry about the complexities of managing content using |
@sftim when Transifex (or any other LMP) suggests a commit to pull into the main branch, that commit may already contain the result of teamwork, not the work of one author, and may already be reviewed and approved by owners on the localization management platform side. Managing work this way saves a lot of time and effort and increases translation quality dramatically, so our plan is something like
UPD: This describes an approach where all authors have signed the CLA with the EasyCLA bot, not in the localization management platform; it's closer to the current situation. If Legal and management agree that localizers could sign the CNCF CLA on the platform (Transifex allows it, for example), it would unblock the process even more. |
I try to rephrase my question. ❓ Must we track the original contributors to translations on GitHub, or is it possible to accomplish the same task using a different platform, such as a localization platform? |
When we localize manually, we don't list the original contributors as co-authors (and that's fine). No need to change when automating more of the process. |
I'd say we take it as implied that localized text has the upstream (English) contributors as co-authors. |
Although you're welcome to work in Beyond that, change one more detail, and it could work:
-amend all related to the commit as co-author (as commit is already owned by Transifex bot, which should be approved by EasyCLA bot)
+change that combined commit to have the pull request submitter as the primary author, and list all other human coauthors as co-authors |
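To make the suggested change concrete, here is a minimal sketch of how a combined commit message could list the human contributors as co-authors while the pull request submitter stays the primary author. The function name `build_commit_message` and the `(name, email)` tuple shape are assumptions made for illustration; only the `Co-authored-by:` trailer format is the convention GitHub recognizes.

```python
from typing import Iterable, Tuple

Person = Tuple[str, str]  # (display name, email); hypothetical shape for this sketch


def build_commit_message(subject: str, submitter: Person,
                         contributors: Iterable[Person]) -> str:
    """Return a commit message with one Co-authored-by trailer per contributor.

    The submitter is expected to be the commit's primary author, so they are
    skipped in the trailers. Duplicate emails are listed only once.
    """
    seen = {submitter[1].lower()}
    trailers = []
    for name, email in contributors:
        key = email.lower()
        if key in seen:
            continue  # already the author, or already listed
        seen.add(key)
        trailers.append(f"Co-authored-by: {name} <{email}>")
    if not trailers:
        return subject
    # Trailers go in the last paragraph, separated from the subject by a blank line.
    return subject + "\n\n" + "\n".join(trailers)
```

The resulting text could then be used when amending the combined commit, for example with `git commit --amend --author="Name <email>"` and the message above, so that the submitter becomes the primary author.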
I'm pretty sure we'd need people to sign with the Linux Foundation, or for their employers to sign (again, via LF). Using Transifex for that signing does not sound feasible; the CNCF CLA signing and tracking is something that applies across the Linux Foundation. |
As another take on it: does Transifex provide a way for us to only accept translations from users who have signed the CLA at the Linux Foundation, and to reject work from anyone who doesn't have a current CLA? |
We need to watch out for tainting: if a commit that adds or updates a localized document has any change that isn't covered by a CLA, we can't accept that work. Even if most of the work was done by other people. |
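As a rough illustration of the tainting rule, the sketch below collects the primary author plus every `Co-authored-by:` trailer from a commit message and reports anyone who is not in a set of CLA-covered emails. The set `signed` stands in for whatever the real CLA lookup would be; that lookup, and the function names, are assumptions and not part of EasyCLA.

```python
import re
from typing import List, Set

# Matches trailers such as "Co-authored-by: Name <email>" (case-insensitive).
_TRAILER = re.compile(
    r"^Co-authored-by:[ \t]*(.+?)[ \t]*<([^<>\s]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def commit_emails(author_email: str, message: str) -> List[str]:
    """Return the author's email followed by each distinct co-author email."""
    emails = [author_email.lower()]
    for _name, email in _TRAILER.findall(message):
        email = email.lower()
        if email not in emails:
            emails.append(email)
    return emails


def uncovered_contributors(author_email: str, message: str,
                           signed: Set[str]) -> List[str]:
    """Return contributors with no CLA on record; an empty list means acceptable."""
    signed_lower = {e.lower() for e in signed}
    return [e for e in commit_emails(author_email, message)
            if e not in signed_lower]
```

A single uncovered email is enough to reject the whole commit, which matches the point that most of the work being covered does not help.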
It provides the ability to require signing a CLA provided by the org owner before any work starts. If EasyCLA is integrated with the CLA in Transifex/etc., then the answer is yes. Then we can avoid any CLA verification and commit co-author manipulation on the git side, as it will be done during the sign-up procedure in the chosen TLMP. |
The best we can do then is to ask people to confirm that they have signed the CLA and require this confirmation. We can't rely on the signature made within Transifex, and I don't expect that an EasyCLA integration is going to happen. Folks are welcome to ask the CNCF for this, but let's build something that'll work even if we don't get it. In other words, we don't ask people to use Transifex to sign the CLA. We ask people to use Transifex to formally confirm that, if we check their CLA status, it will show up as signed.
Although it's a nice idea, I also think it's safe to assume that it won't happen; I can't picture a source of money or resources to make it work how we'd like. What we can do is start with a manual process like I've outlined, and then automate that more. |
Thank you @natalisucks for doing the heavy lifting! I just have a couple of requests for the effort that @sftim plans to collaborate on:
|
☕ Okay, here are my thoughts/questions:
|
It's a follow up in that the English content has merged, but the (eg) Ukrainian content may well not have (or there may be a substantial update). We've so far treated localization work as copyrightable. |
Authors maintain the copyright since this repo's licensed under CC 4.0. Which makes sense. And with Transifex, they clearly and definitively say that the author retains the rights. To that end, I would move forward with getting a proof of concept in place (with Transifex or something else) that submits PRs and ensure actual-humans still review the content. But, it cannot be put into "production" or have any of said-PRs merged in yet (If I could color this red I would)
Before we merge any of the automated-translations, those questions need to be answered by the legal committee. I've escalated it up (See this reference) but fair warning we're in the shadow of KubeCon, the legal committee won't likely get to this till April. Just setting an expectation there. Hopefully this unblocks y'all a bit. |
@natalisucks easy-peasy! Without EasyCLA compliance confirmation, it could take practically no time to start translating a source stored in git. The only point that concerns me is that the trial session is limited in time (after the trial we need to switch to a paid plan or comply with the platform's "agreement" for open source), so it would be great to have all volunteers ready to work on localization via Transifex (or any other TLMP) before the start. That way we could go through the test and collect pros and cons as fast as possible. Let's set a deadline and collect a list of volunteers from different localization teams. PS
The Ukrainian Localization team personally reached out to Robert Reeves before raising this question AGAIN with SIG Docs leadership, to confirm that this time we have some support on the CNCF side for changes to improve, modernize, and streamline the localization process. @MaxymVlasov I think it could be a great moment to summarize what we've discussed, what we have on the table at the moment, etc. |
@jeefy @sftim once again in this story we have three types of bots:
From day one of this discussion those three have been mixed up, and that's trouble. It's HUGE trouble, because the whole discussion gets muddled again and again, from what I can see. We, the Ukrainian Localization Team, started this topic as "let's integrate modern localization MANAGEMENT tools into the process". Most of them, if not all, have PR bots in their flow, as that was a logical and understandable decision on the platform developers' side. These bots do not create text or code. They are only parts of a CI-like process. The easiest way to exclude them from the process is to create our own TLMP, which doesn't look reasonable.
We haven't found a platform that has the EasyCLA bot in mind as a source of CLA signature verification. However, some of these platforms have their own CLA signing and signature verification mechanisms. Integrating them would save tons of time and effort in the localization process. Localization automation bots could be a game changer if we are talking about giving a kick-start to a community of speakers who don't have a localization team yet, or for those who are short on hands. However, it's not the main priority for us (at least for the Ukrainian localization team). We are looking for tools that will make it easier for new people to get on board and reduce the technical effort to start working on localization. I am grateful for the patience of all who are going through these thoughts again, and hope that they clarify the case if you are reading them for the first time. |
A bunch of those solutions look pretty cool IMO. I don't really have a say in the approach, but my two cents would be to try and do as much as possible without the need to install a tool locally (looking at the suggestion of some desktop app)
I can poke at this internally but I have a feeling it will not get done any time soon. Please do not do anything that requires it.
We (CNCF) fully support projects looking to do this sort of automation, but that's it. If there's some resource we can provide to streamline things, we're happy to. But we avoid prescribing any single solution. We prefer to leave it up to the projects to pick what works best for them. Robert works to build partnerships so once a solution is settled on, he'll hop on board to try and work with the vendor. He's not going to help y'all decide.
I might be out of place but that's probably best decided, prioritized, and left to the Chairs/TLs of this SIG :) They've got a lot going on (some of it because of me, sorry!) so patience would be welcome here. |
From my perspective, the best thing we could do to avoid relying on the EasyCLA bot's CLA verification (because the bot would need to be modified to do verification with extra steps, i.e. the commit-with-co-authors scenario) is to move to a scenario where CLA verification is done on the TLMP side 🤷 We will get one extra CLA-related tool, but will not have to do extra steps. The only con I see: this is a less agile scenario. |
Here is a quick summary of the current situation:
graph TD
subgraph C["CNCF/LF — SIG Docs"]
CNCF[CNCF/LF] -->
|"We fully support projects<br />looking to do this sort of<br />automation.<br />We avoid prescribing any<br />single solution"| SD[SIG Docs] -->
|"Are CNCF willing to do<br />a foundation policy change<br />where external tools can do<br />CLA signing, bypassing<br />EasyCLA?<br /><br /><br />We have yet to receive<br />any decision or<br />confirmation on<br />the above ask, and<br /><br />if the policy will<br />not be changed…"| CNCF
SD --> SD
end
subgraph LT["Localization Teams"]
M((("We need modern<br />Localization Platform")))
end
LT --> C
|
How about: we set up a trial to use Crowdin to localize some Ukrainian pages, and as part of that we make a process to get the content published. It's likely that people would need to sign the CLA once but that there would be more than one place where our systems verify this. Once we have that prototyped, we can look at adapting the process so it also works with Transifex. Does that work? |
If there's no communication with SIG Docs about being blocked on something, then we won't know 🙂 Our first try was to have the Ukrainian team raise a ticket with LF about the CLA bypass, which happened. To my knowledge, we've only heard about this not being resolved for the last ten months in these fresh conversations (in Slack and on this issue), hence our next attempt to escalate. Please work with us here, we are all on the same team 👍 |
Do any of these platforms allow webhook / HTTP callout authz? If we find one, we can link that to CLA checking - that's engineering, but that's good because implementing software has lower barriers than changing copyright policy. The outcome would be to help contributors discover as early as possible that they (or their employer) must sign the CLA. |
It would be great to try it out. I'm in! Here are 760+ docs already translated into Ukrainian, in case we need something to test on: https://github.com/Andygol/k8s-website/tree/main-uk-wip
✅ https://support.crowdin.com/webhooks/ and ✅ https://developers.transifex.com/docs/webhooks |
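A minimal sketch of what linking a webhook to CLA checking might look like, under loud assumptions: the payload field `contributor_email`, the HMAC-SHA256 hex signature scheme, and the `has_signed_cla` callback are all hypothetical. The real payload and signing details would have to come from the Crowdin or Transifex webhook documentation, and the CLA lookup from whatever the Linux Foundation exposes.

```python
import hashlib
import hmac
import json
from typing import Callable


def handle_translation_event(body: bytes, signature_hex: str, secret: bytes,
                             has_signed_cla: Callable[[str], bool]) -> dict:
    """Decide whether a translation event may proceed.

    Assumptions (not taken from any platform's docs): the sender signs the raw
    body with HMAC-SHA256 and a shared secret, and the JSON payload carries the
    translator's email under "contributor_email".
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        return {"allow": False, "reason": "bad signature"}

    event = json.loads(body)
    email = event.get("contributor_email", "")
    if not email:
        return {"allow": False, "reason": "no contributor email in event"}

    # Hypothetical callback; in practice this would query the CLA records.
    if not has_signed_cla(email):
        return {"allow": False, "reason": f"no CLA on file for {email}"}
    return {"allow": True, "reason": "CLA on file"}
```

The point of the design is early feedback: a contributor would learn that a CLA is missing when they start translating, not when a pull request is rejected later.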
@natalisucks please stop referring to the "If there's no communication, then we don't know" stuff. Let's just not touch it for some time, until things really start rolling 🤝 |
@sftim could you please tell us more about what you are looking for specifically? The platforms can work with hooks, so we could at least try to go your way; however, I am not sure that we are looking for similar workarounds. From what @Andygol, @MaxymVlasov and I have looked through previously, the EasyCLA bot limitations are in the way. But anyway: maybe you have something in mind that we haven't looked into yet. Please share it with us. |
That's a specific detail of using a translation and localization management platform. How about tracking that part in its own issue? |
@rolfedh proposed a testing plan on the discussion about which tools to consider. I'll copy it here:
|
Also: Here's a proposed list of Kanban tasks for our testing plan: Preparation Phase
Setup Phase
Translation Phase
Evaluation Phase
Decision Phase
Ensure each task is assigned to a team member with clear responsibilities and deadlines. This structured approach allows for continuous monitoring, adjustment, and evaluation of the platforms' effectiveness in streamlining the Kubernetes documentation translation process. |
I've created an issue that covers testing for these tools, #45756. If you are interested in volunteering to help test the tools, please reach out on the issue. |
@MaxymVlasov Can you please give us an update on the progress of your Service Desk ticket? I'd like to make sure we're following up with the CNCF accordingly. Furthermore, for folks still following this issue, we're still accepting people to help out with #45756 so that we can get some testing started, without CLA checking or integration, on some possible tooling. @jeefy If you wouldn't mind letting us know if there's been an update from the Legal meeting with LF folks on the topic of bypassing the CLA bot (see my comment here) that would be great. |
@natalisucks LF helpdesk escalation doesn't work... |
@MaxymVlasov Thanks for the update. Since we're specifically looking for a decision about bypassing CLA without a specific tool in mind, I think it's worthwhile for us to pursue this question separately with @jeefy and the LF Legal team. As you know, tooling has not been selected yet, so we don't want this to apply to Transifex alone, if the decision is that a bypass can happen. Again, given that this is a change that affects the whole foundation, and is a legal question, it could take time to get an answer, but we are trying so that the project can move forward. In the meantime, please do not let this stop yours, @OleksaBaida's, and @Andygol's desires to test tooling for translation workflows – regardless of the CLA answer, an integration could still be really useful. We'll wait to hear back from Jeff about the Legal team's response. |
Good news - EasyCLA now checks CLA for everyone in commit:
|
Btw, https://hosted.weblate.org/ provides the ability to get suggestions for similar text from other projects, which simplifies localization and its standardization across the whole language. Usually it costs money, but for OSS projects they can provide a free license. Just in case other solutions don't work well. |
Hello from Weblate! Yes, we are the libre localization tool for many communities, like openSUSE, Fedora, LibreOffice, Mattermost, Godot Engine, and many more. We also provide services to commercial companies. Suggestions based on TM or MT are available and can be used for fully automatic translation. Free hosting and support for open projects is what we like to do. For giant projects like those above, we provide dedicated instances for a good price. |
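For readers unfamiliar with translation memory (TM) suggestions, here is a toy sketch of the idea using only the Python standard library. It is not how Weblate, Crowdin, or Transifex implement TM; the `memory` dictionary, the `threshold` default, and the function name `suggest` are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def suggest(source: str, memory: Dict[str, str],
            threshold: float = 0.75) -> List[Tuple[float, str, str]]:
    """Return (similarity, known source, stored translation), best match first.

    `memory` maps previously translated source strings to their translations.
    Only entries at least `threshold` similar to `source` are returned.
    """
    scored = []
    for known_source, translation in memory.items():
        ratio = SequenceMatcher(None, source.lower(), known_source.lower()).ratio()
        if ratio >= threshold:
            scored.append((ratio, known_source, translation))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored
```

An exact match scores 1.0 and can be reused as is, while a near match gives the translator a starting point to edit.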
This is a Feature Request
Support of Translation & Localization Management Platform (TLMP) for docs
To effectively make and maintain translations, we need to adopt tools that were created for it.
There are a bunch of solutions, like Transifex, Crowdin, and others.
After some quick research at the start of 2023, we chose Transifex for a PoC as a free and reliable solution, but in the end it doesn't matter which tool sig-docs chooses; it will be much better than the workflow that we have now via git and GitHub.
What would you like to be added
There are 3 ways, from better to worse:
1. … k/websites and move localization work fully to the chosen TLMP, including docs reviews and CLA signing.
2. Add TLMP GitHub integration and auto-merge changes sent by the TLMP integration user (provided by the TLMP, or you can set up your own).
3. Manually approve and merge such PRs in GitHub, if the EasyCLA check passes.
Why is this needed
When you write docs or code in 1 language, you work only with the current state of docs - everything that you need to track is tracked by git.
When you translate something, you deal with 2 sources of truth: the original and the existing translation, which can be partial, outdated, or placed in locations that have already been moved or removed in the original, plus other edge cases. There is no easy way in git to check which changes in the original should also be revisited in the translation, so nearly every translated doc becomes unmaintained from the moment it is merged, as it is sometimes simpler to redo a translation from scratch than to figure out what changes are needed.
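To illustrate the two-sources-of-truth problem, here is a minimal sketch of the kind of bookkeeping a TLMP does and plain git does not: it records a fingerprint of the source text each segment was translated from, then compares it with the current source. The segment IDs, the status labels, and the function names are assumptions for this example, not the data model of any particular platform.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace, so reflowed lines don't count as changes."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def segment_status(source_text: str, recorded_fingerprint: Optional[str]) -> str:
    """Classify one segment from the current source and the recorded fingerprint."""
    if recorded_fingerprint is None:
        return "not translated"
    if recorded_fingerprint != fingerprint(source_text):
        return "outdated"  # the original changed after it was translated
    return "up to date"


def audit(source_segments: Dict[str, str],
          translated_from: Dict[str, str]) -> Dict[str, str]:
    """Report a status for every segment, including translations whose source is gone."""
    report = {seg_id: segment_status(text, translated_from.get(seg_id))
              for seg_id, text in source_segments.items()}
    for seg_id in translated_from:
        if seg_id not in source_segments:
            report[seg_id] = "orphaned"  # translated, but removed from the original
    return report
```

With this record in place, an update to the English page flags only the affected segments for review, so the rest of the translation doesn't need to be redone.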
Long story short: git is just the wrong tool for translations. It is as bad for this as using .zips for VCS, or trying to send a letter by pigeon mail to another continent and hoping for an answer within 3 working days.
Also, tech writers, students, or newcomers who'd like to contribute in most cases have little or no knowledge of how git and GitHub work, don't have a GitHub account, and so on. Those are just not intuitive tools for non-techie folks.
So, what a good enough TLMP will provide:
not translated, and not reviewed strings/files
Comments
@sftim asked to add it as an issue here, to be able to track work on it.
Related Linux Foundation issue
Screenshots of the LF issue in case you can't see it
Read msgs from bottom to top
P.S. That started as a Ukrainian localization team initiative, but we were blocked, from a legal perspective, from merging that kind of change back from the TLMP.