[Bug 19084] New: Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

            Bug ID: 19084
           Summary: Spine label with BN_IN UTF8 data rendered incorrectly
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: major
          Priority: P5 - low
         Component: Label/patron card printing
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

Created attachment 65878
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=65878&action=edit
View of expected output vs output rendered

I was trying to generate spine labels with call numbers in my mother tongue,
Bengali, which is spoken by 250 million people, making it the 7th most widely
spoken language in the world.

In the attachment, the text with the green outline shows the expected output;
what actually appears in the PDF is the text within the red border on the
left. FWIW, the koha-conf.xml ttf settings for the Bengali font are in place,
and I checked with different fonts, including Lohit Bengali, which is the
default BN_IN font on RHEL, Fedora, Ubuntu etc., as well as Noto Sans Bengali
from Google.

Neither the conjunct clusters nor the matraas that go before a letter are
rendered in the correct order. Unless this is fixed, Koha cannot be used to
correctly generate spine labels for Bengali books.

For reference, here is discussion about a similar problem with iText -
http://palashray.com/making-itext-work-with-indic-scripts/

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

Amit Gupta <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

Marc Véron <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #1 from Chris Nighswonger <[hidden email]> ---
Quoting from the mailing list:

This problem seems to be present for most Indian languages whenever
they have conjunct clusters in their call numbers (represented as
grapheme clusters in a Unicode string).

To describe the problem simply: the order of characters rendered is
incorrect in the output. For example, the string "শেখর" is
represented by the following code points:
\x{09B6}\x{09C7}\x{0996}\x{09B0}.

Now here is the catch: \x{09B6} represents the Bengali letter SHA,
whereas \x{09C7} represents the Bengali vowel sign E; however, in the
correct visual presentation, the vowel sign E sits before the SHA,
which is not how the code points are arranged in the Unicode
string.
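
The code point sequence quoted above can be inspected with a short Python
sketch (Python is used here only for illustration; it is not the Perl code
path Koha goes through, and the helper name describe() is hypothetical). It
lists each code point of the example string with its Unicode name and general
category, which shows that the vowel sign E is stored second, after SHA:

```python
import unicodedata

# "শেখর" in logical (stored) order, as quoted in the message above.
text = "\u09B6\u09C7\u0996\u09B0"

def describe(s):
    """Return (code point, Unicode name, general category) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in s
    ]

if __name__ == "__main__":
    for code_point, name, category in describe(text):
        print(code_point, name, category)
```

The second entry is BENGALI VOWEL SIGN E, a combining mark that a renderer is
expected to draw to the left of the preceding consonant.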

I looked around the PDF::Reuse, Text::PDF::TTFont etc. modules. What seems
to me to be the root of this problem is the unpacku() method, which
pushes the Unicode characters into an array in order to introduce
them into the PDF content stream with the correct font information.
I think the fact that they are pushed in that order may be the cause of
this problem, which would make this an upstream issue rather than a
Koha bug.

cheers
indranil

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #2 from Chris Nighswonger <[hidden email]> ---
I wonder if this problem also occurs in other abugida writing systems?

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #3 from Indranil Das Gupta <[hidden email]> ---
Yes! In fact, I was perhaps hasty in blaming the unpacku() method. The root of
the trouble is the out_text() method, where the actual glyphs are written into
the PDF content stream. What is happening here is that the individual
code points pushed into @clist by unpacku() are being emitted one at a time
into the PDF content stream as glyphs, *without* the necessary glyph reordering
taking place.

So I would expect every single abugida writing system to be impacted.

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #4 from Chris Nighswonger <[hidden email]> ---
(In reply to Indranil Das Gupta from comment #3)
> What is happening here is that the individual
> codepoints pushed into @clist by unpacku() are being listed out one at a
> time into the PDF content stream as glyphs, *without* the necessary glyph
> reordering taking place.

It would seem that the glyph order should never be "changed" in the first
place, i.e. the order in which they are supplied should be preserved throughout
the entire process of PDF generation.

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #5 from Indranil Das Gupta <[hidden email]> ---
Hi Chris,

The example I referred to on the m/l has the following code point order:
\x{09B6}\x{09C7}\x{0996}\x{09B0}, and that's exactly how PDF::Reuse and
PDF::API2 are pushing it out.

However, as per the glyph reordering rules that Bengali requires, the actual
ordering of glyphs (as opposed to the code point order) should be
\x{09C7}\x{09B6}\x{0996}\x{09B0}.
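
The reordering described above can be illustrated with a minimal Python
sketch. This is not the fix for PDF::Reuse or PDF::API2, and the function name
to_visual_order() and its simplified cluster rule are assumptions made for
illustration. It only moves the pre-base vowel signs I, E and AI in front of
the consonant (or virama-joined consonant cluster) they follow. A real shaping
engine works on glyph IDs through the font's GSUB lookups and also handles
conjunct formation, reph and split vowels such as O and AU, none of which is
attempted here:

```python
# Bengali vowel signs that are drawn before the consonant they follow.
PRE_BASE_MATRAS = {"\u09BF", "\u09C7", "\u09C8"}  # vowel signs I, E, AI
VIRAMA = "\u09CD"  # hasant, joins consonants into a conjunct

def to_visual_order(text):
    """Move pre-base vowel signs ahead of their consonant cluster (simplified)."""
    out = []
    for ch in text:
        if ch in PRE_BASE_MATRAS and out:
            # Walk back over "consonant + virama" pairs to the cluster start.
            i = len(out) - 1
            while i >= 2 and out[i - 1] == VIRAMA:
                i -= 2
            out.insert(i, ch)
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    logical = "\u09B6\u09C7\u0996\u09B0"  # the example from this thread
    visual = to_visual_order(logical)
    print(" ".join(f"U+{ord(c):04X}" for c in visual))
```

For the example string, the sketch yields the order U+09C7 U+09B6 U+0996
U+09B0, which matches the glyph order given above.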

LibreOffice, which uses the ICU rules, handles this perfectly within ODF as well
as during the PDF export, as does any software that uses Pango as the rendering
backend.

Basically, calls need to be made to pick up the correct information from the
GSUB and GPOS tables of the font being embedded, which these two Perl libs
apparently (from my limited reading so far) do not do.
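
As a rough way to confirm that a given font actually carries these tables,
the sfnt table directory at the start of a TrueType/OpenType file can be read
directly. The sketch below is in Python with only the standard library; the
function names are hypothetical, it does not handle .ttc collections, and the
presence of GSUB/GPOS says nothing about whether a PDF library makes use of
them:

```python
import struct

def list_sfnt_tables(data):
    """Return the table tags listed in an sfnt table directory."""
    # Offset table: sfntVersion (uint32), numTables (uint16), followed by
    # three uint16 search hints, 12 bytes in total.
    _version, num_tables = struct.unpack_from(">IH", data, 0)
    tags = []
    for i in range(num_tables):
        # Each 16-byte record: tag (4 bytes), checksum, offset, length.
        tag, _checksum, _offset, _length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i
        )
        tags.append(tag.decode("latin-1"))
    return tags

def has_layout_tables(data):
    """True if the font lists both the GSUB and GPOS layout tables."""
    tags = set(list_sfnt_tables(data))
    return "GSUB" in tags and "GPOS" in tags
```

Usage would be along the lines of reading the font file in binary mode and
passing its bytes to has_layout_tables(); the path to the font depends on the
local installation.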

[Bug 19084] Spine label with BN_IN UTF8 data rendered incorrectly

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19084

--- Comment #6 from Chris Nighswonger <[hidden email]> ---
This Koha bug depends on this CPAN bug:

https://rt.cpan.org/Ticket/Display.html?id=122778
