#33318: Truncator class recognizes different length of ellipsis(...) depending on the LANGUAGE_CODE.
-------------------------------------+-------------------------------------
Reporter: YoungJoo Kim               | Owner: nobody
Type: Bug                            | Status: closed
Component: Internationalization      | Version: 3.2
Severity: Normal                     | Resolution: invalid
Keywords: truncatechars Truncator    | Triage Stage: Unreviewed
  ellipsis LANGUAGE_CODE             |
Has patch: 0                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by YoungJoo Kim:
Old description:
> First, I am so sorry about my bad english skill..;;
>
> There is something strange about the `truncatechars` filter of the Django
> Template Language (DTL).
> In the `add_truncation_text` method of the Truncator class, the length of
> the `truncate` variable depends on the value of LANGUAGE_CODE (e.g.
> 'en-us', 'ko-kr') in settings.py.
>
> [[br]]
>
> The `add_truncation_text` method of the Truncator class is as follows.
> {{{
> # django/utils/text.py
>
> def add_truncation_text(self, text, truncate=None):
>     if truncate is None:
>         truncate = pgettext(
>             'String to return when truncating text',
>             '%(truncated_text)s…')
>     if '%(truncated_text)s' in truncate:
>         return truncate % {'truncated_text': text}
>     # The truncation text didn't contain the %(truncated_text)s string
>     # replacement argument so just append it to the text.
>     if text.endswith(truncate):
>         # But don't append the truncation text if the current text already
>         # ends in this.
>         return text
>     return '%s%s' % (text, truncate)
> }}}
>
> When no `truncate` argument is passed, `pgettext` assigns the translation
> of `'%(truncated_text)s…'` to the `truncate` variable.
>
> With LANGUAGE_CODE `'en-us'` (the default), the ellipsis is a single
> character (…) of length `1`.
> With LANGUAGE_CODE `'ko-kr'`, the ellipsis is three dots (...) of
> length `3`.
>
> I think the `pgettext` call is the cause.
>
> [[br]]
>
> So the same code produces different output depending on the language.
>
> In the `chars` method of the same Truncator class, the number of
> characters to keep before the ellipsis is calculated in a `for` loop.
>
> The number of iterations of this `for` loop is determined by the return
> value of the `add_truncation_text` method, i.e. the resolved `truncate`
> text.
>
> {{{
> # django/utils/text.py
>
> def chars(self, num, truncate=None, html=False):
>     """
>     Return the text truncated to be no longer than the specified number
>     of characters.
>
>     `truncate` specifies what should be used to notify that the string has
>     been truncated, defaulting to a translatable string of an ellipsis.
>     """
>     self._setup()
>     length = int(num)
>     text = unicodedata.normalize('NFC', self._wrapped)
>
>     # Calculate the length to truncate to (max length - end_text length)
>     truncate_len = length
>     for char in self.add_truncation_text('', truncate):
>         if not unicodedata.combining(char):
>             truncate_len -= 1
>             if truncate_len == 0:
>                 break
>     if html:
>         return self._truncate_html(length, truncate, text, truncate_len,
>                                    False)
>     return self._text_chars(length, truncate, text, truncate_len)
> }}}
>
> [[br]]
>
> In conclusion, the output of the `truncatechars` filter in HTML is as
> follows.
> {{{ <p>{{ fruit|truncatechars:6 }}</p> }}}
> 1. LANGUAGE_CODE = 'en-us' (default)
> {{{
> straw...
> pinea...
> }}}
> 2. LANGUAGE_CODE = 'ko-kr'
> {{{
> str...
> pin...
> }}}
>
> [[br]]
>
> Even if the language is different, the ellipsis should be treated as a
> string of length `1`.
>
> Otherwise, users from other countries may misunderstand how the
> `truncatechars` filter works.
>
> Thank you!
New description:
There is something strange about the `truncatechars` filter of the Django
Template Language (DTL).
In the `add_truncation_text` method of the Truncator class, the length of
the `truncate` variable depends on the value of LANGUAGE_CODE (e.g. 'en-us',
'ko-kr') in settings.py.
[[br]]
The `add_truncation_text` method of the Truncator class is as follows.
{{{
# django/utils/text.py
def add_truncation_text(self, text, truncate=None):
    if truncate is None:
        truncate = pgettext(
            'String to return when truncating text',
            '%(truncated_text)s…')
    if '%(truncated_text)s' in truncate:
        return truncate % {'truncated_text': text}
    # The truncation text didn't contain the %(truncated_text)s string
    # replacement argument so just append it to the text.
    if text.endswith(truncate):
        # But don't append the truncation text if the current text already
        # ends in this.
        return text
    return '%s%s' % (text, truncate)
}}}
When no `truncate` argument is passed, `pgettext` assigns the translation of
`'%(truncated_text)s…'` to the `truncate` variable.
With LANGUAGE_CODE `'en-us'` (the default), the ellipsis is a single
character (…) of length `1`.
With LANGUAGE_CODE `'ko-kr'`, the ellipsis is three dots (...) of length `3`.
I think the `pgettext` call is the cause.
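To illustrate the point above, here is a minimal sketch that runs without Django. The function is a standalone copy of the fallback logic in `add_truncation_text`, and the `TEMPLATES` dictionary is a hypothetical stand-in for what `pgettext` returns per language. The `'ko-kr'` entry is an assumption inferred from the behavior I observed, not copied from the actual translation catalog.

```python
def add_truncation_text(text, template):
    """Standalone stand-in for Truncator.add_truncation_text (no Django)."""
    if "%(truncated_text)s" in template:
        return template % {"truncated_text": text}
    if text.endswith(template):
        return text
    return "%s%s" % (text, template)


# Hypothetical per-language results of pgettext(); the 'ko-kr' value is an
# assumption based on the observed output, not the real catalog entry.
TEMPLATES = {
    "en-us": "%(truncated_text)s\u2026",  # single ellipsis character
    "ko-kr": "%(truncated_text)s...",     # three ASCII dots
}

# Length of the text that is appended to an empty string for each language.
suffix_lengths = {
    lang: len(add_truncation_text("", template))
    for lang, template in TEMPLATES.items()
}
print(suffix_lengths)  # {'en-us': 1, 'ko-kr': 3}
```

Under these assumptions, the suffix consumes one character of the budget in English and three in Korean.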
[[br]]
So the same code produces different output depending on the language.
In the `chars` method of the same Truncator class, the number of characters
to keep before the ellipsis is calculated in a `for` loop.
The number of iterations of this `for` loop is determined by the return value
of the `add_truncation_text` method, i.e. the resolved `truncate` text.
{{{
# django/utils/text.py
def chars(self, num, truncate=None, html=False):
    """
    Return the text truncated to be no longer than the specified number
    of characters.
    `truncate` specifies what should be used to notify that the string has
    been truncated, defaulting to a translatable string of an ellipsis.
    """
    self._setup()
    length = int(num)
    text = unicodedata.normalize('NFC', self._wrapped)
    # Calculate the length to truncate to (max length - end_text length)
    truncate_len = length
    for char in self.add_truncation_text('', truncate):
        if not unicodedata.combining(char):
            truncate_len -= 1
            if truncate_len == 0:
                break
    if html:
        return self._truncate_html(length, truncate, text, truncate_len,
                                   False)
    return self._text_chars(length, truncate, text, truncate_len)
}}}
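The length calculation in `chars` can be sketched in isolation as follows. This is a simplified illustration, not Django's implementation: `truncate_len` and `truncate_chars` are hypothetical helper names, the suffix is passed in directly instead of coming from `pgettext`, and combining characters in the text itself are not given special treatment.

```python
import unicodedata


def truncate_len(limit, suffix):
    """Characters left for the text once the suffix is accounted for."""
    remaining = limit
    for char in suffix:
        # Combining characters do not take up a visible position.
        if not unicodedata.combining(char):
            remaining -= 1
            if remaining == 0:
                break
    return remaining


def truncate_chars(text, limit, suffix):
    """Simplified truncation: plain slicing, no HTML handling."""
    text = unicodedata.normalize("NFC", text)
    if len(text) <= limit:
        return text
    return text[:truncate_len(limit, suffix)] + suffix


print(truncate_chars("strawberry", 6, "\u2026"))  # straw…
print(truncate_chars("strawberry", 6, "..."))     # str...
```

With a limit of 6, a one-character suffix leaves five characters of text, while a three-character suffix leaves only three.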
[[br]]
In conclusion, the output of the `truncatechars` filter in HTML is as follows.
{{{ <p>{{ fruit|truncatechars:6 }}</p> }}}
1. LANGUAGE_CODE = 'en-us' (default)
{{{
straw...
pinea...
}}}
2. LANGUAGE_CODE = 'ko-kr'
{{{
str...
pin...
}}}
[[br]]
Even if the language is different, the ellipsis should be treated as a
string of length `1`.
Otherwise, users from other countries may misunderstand how the
`truncatechars` filter works.
Thank you!
--
Ticket URL: <https://code.djangoproject.com/ticket/33318#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.