[Bug 23542] New: SRU import ending issue

classic Classic list List threaded Threaded
35 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] New: SRU import ending issue

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

            Bug ID: 23542
           Summary: SRU import ending issue
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5 - low
         Component: Z39.50 / SRU / OpenSearch Servers
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]
                CC: [hidden email]

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.

--
You are receiving this mail because:
You are the assignee for the bug.
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import ending issue

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|[hidden email]-commun |[hidden email]
                   |ity.org                     |m

--
You are receiving this mail because:
You are the assignee for the bug.
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import ending issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import ending issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #1 from Fridolin SOMERS <[hidden email]> ---
Created attachment 92606
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=92606&action=edit
Bug 23542: fix SRU import ending issue

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.
Looks like this does not affect autorities import for SRU server.

In C4::Breeding::_handle_one_result(), in case of a Z39.50 server, when
creating MARC::Record MarcToUTF8Record() is called which calls
SetMarcUnicodeFlag(). This ensures that leader indicates UTF-8 encoding.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to SetMarcUnicodeFlag() in cas of a SRU server in
C4::Breeding::_handle_one_result().

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check SRU connexion on a MARC21 database is still OK

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import ending issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |Needs Signoff
   Patch complexity|---                         |Trivial patch

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import ending issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #2 from Marcel de Rooy <[hidden email]> ---
Not sure if this is the right solution.
Did you check if you pass new_from_xml the correct parameters in case of
UNIMARC ?

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SRU import ending issue     |SRU import encoding issue

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #3 from Fridolin SOMERS <[hidden email]> ---
(In reply to Marcel de Rooy from comment #2)
> Not sure if this is the right solution.
> Did you check if you pass new_from_xml the correct parameters in case of
> UNIMARC ?

This is used in some other places of code, its a trick

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

[hidden email] <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
             Status|Needs Signoff               |Signed Off

--- Comment #4 from [hidden email] <[hidden email]> ---
Patch tested with a sandbox, by Séverine QUEUNE <[hidden email]>

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

[hidden email] <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #92606|0                           |1
        is obsolete|                            |

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #5 from [hidden email] <[hidden email]> ---
Created attachment 93588
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=93588&action=edit
Bug 23542: fix SRU import ending issue

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.
Looks like this does not affect autorities import for SRU server.

In C4::Breeding::_handle_one_result(), in case of a Z39.50 server, when
creating MARC::Record MarcToUTF8Record() is called which calls
SetMarcUnicodeFlag(). This ensures that leader indicates UTF-8 encoding.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to SetMarcUnicodeFlag() in cas of a SRU server in
C4::Breeding::_handle_one_result().

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check SRU connexion on a MARC21 database is still OK

Signed-off-by: Séverine QUEUNE <[hidden email]>
Signed-off-by: Séverine QUEUNE <[hidden email]>

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Marcel de Rooy <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Signed Off                  |BLOCKED
         QA Contact|[hidden email]-communit |[hidden email]
                   |y.org                       |

--- Comment #6 from Marcel de Rooy <[hidden email]> ---
QA: Looking here

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Marcel de Rooy <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|BLOCKED                     |Failed QA

--- Comment #7 from Marcel de Rooy <[hidden email]> ---
QA Comment:
When testing with LOC SRU, I am having two records producing the cryptic error
code '500' on the search title=vermeer with your patch. While without your
patch, I do not get this error.
I found some ZOOM error codes in the range 10000 - 10018 or so; i can't place
the returned 500.
The dialog alert contains:
    LIBRARY OF CONGRESS SRU record 19: 500
    LIBRARY OF CONGRESS SRU record 20: 500

Also please provide more feedback as to your statement: "This is used in some
other places of code, its a trick."
What kind of trick is this? Please give the other code locations?

Changed status.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Failed QA                   |In Discussion

--- Comment #8 from Fridolin SOMERS <[hidden email]> ---
(In reply to Marcel de Rooy from comment #7)
>
> Also please provide more feedback as to your statement: "This is used in
> some other places of code, its a trick."
> What kind of trick is this? Please give the other code locations?

Indeed I cant find it now.
And I think this issue also exists with SRU for authorities.
Needs work.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #93588|0                           |1
        is obsolete|                            |

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

didier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
                   |                            |om

--- Comment #9 from didier <[hidden email]> ---
(In reply to Marcel de Rooy from comment #7)

> QA Comment:
> When testing with LOC SRU, I am having two records producing the cryptic
> error code '500' on the search title=vermeer with your patch. While without
> your patch, I do not get this error.
> I found some ZOOM error codes in the range 10000 - 10018 or so; i can't
> place the returned 500.
> The dialog alert contains:
>     LIBRARY OF CONGRESS SRU record 19: 500
>     LIBRARY OF CONGRESS SRU record 20: 500
>
> Also please provide more feedback as to your statement: "This is used in
> some other places of code, its a trick."
> What kind of trick is this? Please give the other code locations?
>
> Changed status.

I get LIBRARY OF CONGRESS SRU error without the patch, sometime.

I captured the network traffic and it's an http server error 500, way before
Koha. Maybe an overloaded server?

As far I can tell there's no difference in requests to server in error and in
no error cases.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Paul Poulain <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #10 from Paul Poulain <[hidden email]> ---
Suspecting the patch misses something. IIRC, Authorities does not handle
unicode like biblios. Thus, we have UNIMARC and UNIMARCAUTH as available
parameters. I suspect (without having checked) that the patch won't work for
BNF unimarc authorities.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.koha-community
                   |                            |.org/bugzilla3/show_bug.cgi
                   |                            |?id=18153

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #11 from Fridolin SOMERS <[hidden email]> ---
Ah I think I found the correct same code :

In Koha::MetadataRecord::Authority :
get_from_breeding {
    my $record=eval
{MARC::Record->new_from_xml(StripNonXmlChars($marcxml),'UTF-8',
        (C4::Context->preference("marcflavour") eq
"UNIMARC"?"UNIMARCAUTH":C4::Context->preference("marcflavour")))};
    return if ($@);
    $record->encoding('UTF-8');

So it is more "$record->encoding('UTF-8')" that is missing. It sets 'a' in
leader position 9.
This is called in SetMarcUnicodeFlag().

See also Bug 18153.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|In Discussion               |Needs Signoff

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #12 from Fridolin SOMERS <[hidden email]> ---
Created attachment 106397
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=106397&action=edit
Bug 23542: Fix SRU import encoding

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.

Tests show that difference between Z39.50 server and SRU is that leader
contains 'a' at postion 9.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to MARC::Record->encoding('UTF-8') in case of a SRU
server in C4::Breeding.

Same use exists in Koha::MetadataRecord::Authority::get_from_breeding().

In case of import via Z3950, MarcToUTF8Record() is called,
 which calls SetMarcUnicodeFlag(),
 which calls MARC::Record->encoding('UTF-8')

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check also Authorities import via SRU
13) Check also SRU imports on a MARC21 database

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #106397|0                           |1
        is obsolete|                            |

--- Comment #13 from Fridolin SOMERS <[hidden email]> ---
Created attachment 106399
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=106399&action=edit
Bug 23542: Fix SRU import encoding

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.

Tests show that difference between Z39.50 server and SRU is that leader
contains 'a' at postion 9.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to MARC::Record->encoding('UTF-8') in case of a SRU
server in C4::Breeding.

Same use exists in Koha::MetadataRecord::Authority::get_from_breeding().

In case of import via Z3950, MarcToUTF8Record() is called,
 which calls SetMarcUnicodeFlag(),
 which calls MARC::Record->encoding('UTF-8')

Patch also uses system preference marcflavor instead of SRU syntax to
build MARC::Record, like for z3950.

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check also Authorities import via SRU
13) Check also SRU imports on a MARC21 database

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Fridolin SOMERS <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #106399|0                           |1
        is obsolete|                            |

--- Comment #14 from Fridolin SOMERS <[hidden email]> ---
Created attachment 106400
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=106400&action=edit
Bug 23542: Fix SRU import encoding

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.

Tests show that difference between Z39.50 server and SRU is that leader
contains 'a' at postion 9.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to MARC::Record->encoding('UTF-8') in case of a SRU
server in C4::Breeding.

Same use exists in Koha::MetadataRecord::Authority::get_from_breeding().

In case of import via Z3950, MarcToUTF8Record() is called,
 which calls SetMarcUnicodeFlag(),
 which calls MARC::Record->encoding('UTF-8')

Patch also uses system preference marcflavor instead of SRU syntax to
build MARC::Record, like for z3950.

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check also Authorities import via SRU
13) Check also SRU imports on a MARC21 database

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #15 from Fridolin SOMERS <[hidden email]> ---
(In reply to Fridolin SOMERS from comment #11)

> Ah I think I found the correct same code :
>
> In Koha::MetadataRecord::Authority :
> get_from_breeding {
>     my $record=eval
> {MARC::Record->new_from_xml(StripNonXmlChars($marcxml),'UTF-8',
>         (C4::Context->preference("marcflavour") eq
> "UNIMARC"?"UNIMARCAUTH":C4::Context->preference("marcflavour")))};
>     return if ($@);
>     $record->encoding('UTF-8');
>
> So it is more "$record->encoding('UTF-8')" that is missing. It sets 'a' in
> leader position 9.
> This is called in SetMarcUnicodeFlag().
>
> See also Bug 18153.

Here you see that MARC::Records->new_from_usmarc() calls :
  $marc->encoding() eq 'UTF-8'
https://metacpan.org/source/GMCHARLT/MARC-Record-2.0.7/lib/MARC/File/USMARC.pm#L172

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Marcel de Rooy <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Signoff               |BLOCKED

--- Comment #16 from Marcel de Rooy <[hidden email]> ---
Looking here

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #17 from Marcel de Rooy <[hidden email]> ---
Without your patch with BNF:

200 1  _aStrate-Ã -gemmes
       _bTexte imprimé
       _fTerry Pratchett
       _g[trad. par Dominique Haas]

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #18 from Marcel de Rooy <[hidden email]> ---
The crux is adding the $marcrecord->encoding('UTF-8') !
You should not change the new_from_xml call. We should respect the syntax
parameter here instead of the marc flavour of the Koha instance.
If you configured BNF correctly, it will say UNIMARC in the syntax field.

Amending the patch.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Marcel de Rooy <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Patch complexity|Trivial patch               |Small patch
             Status|BLOCKED                     |Signed Off

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

Marcel de Rooy <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #106400|0                           |1
        is obsolete|                            |

--- Comment #19 from Marcel de Rooy <[hidden email]> ---
Created attachment 106487
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=106487&action=edit
Bug 23542: Fix SRU import encoding

When importing records from a SRU server, the diacritics have bad encoding.
I reproduce with BNF server so it may be a UNIMARC issue.

Tests show that difference between Z39.50 server and SRU is that leader
contains 'a' at postion 9.
Looking at MARC::Record->encoding() shows that encoding depends on leader even
for UNIMARC.
So this patch adds a call to MARC::Record->encoding('UTF-8') in case of a SRU
server in C4::Breeding.

Same use exists in Koha::MetadataRecord::Authority::get_from_breeding().

In case of import via Z3950, MarcToUTF8Record() is called,
 which calls SetMarcUnicodeFlag(),
 which calls MARC::Record->encoding('UTF-8')

Test plan :
1) Use a UNIMARC database
2) Configure a connexion to a UNIMARC SRU, for example BNF,
   see
https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding : diacritic on 'a' of title 'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Check also Authorities import via SRU
13) Check also SRU imports on a MARC21 database

Signed-off-by: Marcel de Rooy <[hidden email]>
Amended: Removed change to new_from_xml call. We should respect syntax.
But the added MARC::Record encoding does the tric! Which is implicit
for Z3950 targets where MarcToUTF8Record does the same.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
Reply | Threaded
Open this post in threaded view
|

[Bug 23542] SRU import encoding issue

bugzilla-daemon
In reply to this post by bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23542

--- Comment #20 from Marcel de Rooy <[hidden email]> ---
(In reply to Marcel de Rooy from comment #18)
> The crux is adding the $marcrecord->encoding('UTF-8') !
> You should not change the new_from_xml call. We should respect the syntax
> parameter here instead of the marc flavour of the Koha instance.
> If you configured BNF correctly, it will say UNIMARC in the syntax field.
>
> Amending the patch.

Also note that we have syntax USMARC for many MARC21 targets. This is the field
value for the MARC21/USMARC option in the syntax combo.
Changing it would break the MARC21 search.

--
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
[hidden email]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
12