
Aws::Translate::Client - translate_text Returns Random Do-Not-Translate IDs or Skips Translations Depending on < and > Character Usage #3191

@chadrschroeder


Describe the bug

Sorry if this is the wrong place to report this, since the issue is with the API and not the Ruby client. I noticed it while testing the do-not-translate syntax, and I have Ruby examples that illustrate the problems. There are two issues, which are probably related.

A. Internal IDs Are Returned

# Good.  Skips translating the text.
Aws::Translate::Client.new.translate_text(text: '<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
 => "<span translate=\"no\">Skip me</span>"

# Good.  Extra ">" on the right side doesn't cause problems.
Aws::Translate::Client.new.translate_text(text: '<span translate="no">Skip me</span>>', source_language_code: 'en', target_language_code: 'fr').translated_text
 => "<span translate=\"no\">Skip me</span> >"

# Bad.  Extra "<" on the left side returns gibberish.
Aws::Translate::Client.new.translate_text(text: '<<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
 => "<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD"

This appears to return a unique random ID for some kind of "Do-Not-Translate" element: the ID in the response always starts with "DNT_" or "dnt_", followed by 32 random characters.

This isn't a huge issue because the example is contrived and the input isn't valid HTML, but it suggests a problem somewhere in the do-not-translate parsing that exposes internal values.
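Until this is fixed on the service side, a caller could check responses for leaked placeholders. The following is a minimal sketch: `leaked_dnt_placeholder?` is a hypothetical helper (not part of aws-sdk-translate), and the pattern is only an assumption inferred from the single sample above, "DNT_" or "dnt_" followed by 32 letters.

```ruby
# Hypothetical helper, not part of aws-sdk-translate.
# Assumption: the placeholder is "DNT_" or "dnt_" followed by 32 letters,
# as inferred from the sample output in this report.
DNT_PLACEHOLDER = /DNT_[A-Za-z]{32}/i

# Returns true when the translated text appears to contain an internal
# do-not-translate placeholder instead of the original content.
def leaked_dnt_placeholder?(translated_text)
  translated_text.match?(DNT_PLACEHOLDER)
end

# Sample output from this report: true
leaked_dnt_placeholder?('<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD')

# Correctly preserved do-not-translate span: false
leaked_dnt_placeholder?('<span translate="no">Skip me</span>')
```

This only flags the symptom so a caller can retry or fall back. It does not recover the text that was replaced.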

B. Text Between Less Than and Greater Than Characters Can Remain Untranslated

Probably as a result of how the do-not-translate parsing works, there's also a separate issue where text between < and > characters sometimes isn't translated, depending on the other punctuation involved. I'm not sure how to avoid this, because some text can remain untranslated even when the characters are escaped as &lt; and &gt;.

# Good.  Translates everything when there's a "." between the "<" and ">"
text = "If expenses < revenue then we have a profit. But if expenses > revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
 => "Si les dépenses sont inférieures aux recettes, nous avons un bénéfice. Mais si dépenses > recettes, nous avons une perte."

# Bad.  Skips translating text between "<" and ">" when there's a "," instead of a "." between them.
text = "If expenses < revenue then we have a profit, but if expenses > revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
 => "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."

# Bad.  Also skips translating text between "&lt;" and "&gt;" when the characters are escaped.
text = "If expenses &lt; revenue then we have a profit, but if expenses &gt; revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
 => "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."
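A caller could detect this case by checking whether the source text between the first "<" and the next ">" comes back verbatim in the translation. This is a rough sketch only: `untranslated_span?` is a hypothetical helper, it assumes plain text without real HTML tags (a genuine tag would also come back verbatim), and it only inspects the first such span.

```ruby
# Hypothetical helper, not part of aws-sdk-translate.
# Assumption: the input is plain text, so any "<" ... ">" pair is a pair of
# comparison operators rather than an HTML tag.
def untranslated_span?(source, translated)
  # Treat escaped and unescaped operators the same way.
  plain = source.gsub('&lt;', '<').gsub('&gt;', '>')

  # Text between the first "<" and the next ">" (empty string if none).
  span = plain[/<([^<>]*)>/, 1].to_s.strip

  !span.empty? && translated.include?(span)
end

source     = 'If expenses &lt; revenue then we have a profit, but if expenses &gt; revenue then we have a loss.'
translated = 'Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte.'

# The source span comes back verbatim in the output from this report: true
untranslated_span?(source, translated)
```

The check is a heuristic: short spans made up of words that are the same in both languages could trigger it even when the translation is correct.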

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

A. Do not return DNT ID values when there are extra < or > characters in the text.
B. Do not skip translating text between &lt; and &gt; characters.

Current Behavior

A. Returns a "<DNT_..." value, which wipes out some of the input text.
B. Skips translating text between &lt; and &gt; characters.

Reproduction Steps

# Returns internal DNT ID
Aws::Translate::Client.new.translate_text(text: '<<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
 => "<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD"

# Doesn't translate text between "&lt;" and "&gt;" characters
text = "If expenses &lt; revenue then we have a profit, but if expenses &gt; revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
 => "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."

Possible Solution

No response

Additional Information/Context

No response

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version

aws-sdk-translate (1.79.0)

Environment details (Version of Ruby, OS environment)

Ruby 3.2.6, macOS Sonoma 14.7.3


Labels

service-api (General API label for AWS Services.)
