Z39.50 encoding problem (tested on 2 koha servers)

Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi again.
For as long as I can remember (3.2 --> 16.11), I have not been able to get the right encoding when searching other servers over Z39.50.
I've tried all the available encodings (utf8, EUC-KR, ISO_5426, ISO_6937, ISO_8859-1, MARC-8), but I always get strange characters in the results.
I always thought the problem was on the other servers, which might be using an unusual charset.
So I installed 2 Koha servers with the same options, enabled the Z39.50 server on one, and searched it over Z39.50 from the other. The problem persists!! :( Accented characters (é, ã, í, ç) come out wrong.

Ubuntu Servers:
LANG=pt_PT.UTF-8
LANGUAGE=pt:pt_BR:en
LC_CTYPE="pt_PT.UTF-8"
LC_NUMERIC=pt_PT
LC_TIME=pt_PT
LC_COLLATE="pt_PT.UTF-8"
LC_MONETARY=pt_PT
LC_MESSAGES="pt_PT.UTF-8"
LC_PAPER=pt_PT
LC_NAME=pt_PT
LC_ADDRESS=pt_PT
LC_TELEPHONE=pt_PT
LC_MEASUREMENT=pt_PT
LC_IDENTIFICATION=pt_PT
LC_ALL=

Koha with UNIMARC

I've checked some configuration files like:
/etc/koha/sites/koha/koha-conf.xml
/etc/koha/zebradb/retrieval-info-bib-dom.xml
/etc/koha/unimarc-retrieval-info-bib-grs1.xml
/usr/share/koha/intranet/cgi-bin/cataloguing/z3950_search.pl

Everything is set with:
syntax="unimarc" name="F">
<marc inputformat="marc" outputformat="marcxml"
inputcharset="utf-8"/>

Thanks for any help
Koha version: 16.05.05
       - -
José Anjos

Re: Z39.50 encoding problem (tested on 2 koha servers)

barton
On Tue, Jun 27, 2017 at 5:17 AM, anjoze <[hidden email]> wrote:

> José


José,

Have you tried connecting via yaz-client? What about another z39.50 client?
It would be useful to know whether the encoding errors are in Koha or in
zebra.

--Barton
_______________________________________________
Koha mailing list  http://koha-community.org
[hidden email]
https://lists.katipo.co.nz/mailman/listinfo/koha

Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi Barton,
I've done some tests based on this page:
https://wiki.koha-community.org/wiki/Troubleshooting_Zebra
I will put the results here:
yaz-client -c /etc/koha/zebradb/ccl.properties koha_test:9998/biblios
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID     : 81
Name   : Zebra Information Server/GFS/YAZ
Version: 4.2.30 98864b44c654645bc16b2c54f822dc2e45a93031
Options: search present delSet triggerResourceCtrl scan sort extendedServices namedResultSets
Elapsed: 0.001579
Z>

Z> f homem
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1, setno 1
SearchResult-1: term=homem cnt=1
records returned: 0
Elapsed: 0.003632
Z>

Z> format xml
Z> show 1

Sent presentRequest (1+1).
Records: 1
[biblios]Record type: XML
<?xml version="1.0" encoding="UTF-8"?>
<record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.loc.gov/MARC21/slim" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">

  <leader>     nam  22        4500</leader>
  <controlfield tag="001">2</controlfield>
  <datafield tag="010" ind1=" " ind2=" ">
    <subfield code="a">978-989-628-078-9</subfield>
  </datafield>
  <datafield tag="021" ind1=" " ind2=" ">
    <subfield code="a">PT</subfield>
    <subfield code="b">278876/08</subfield>
  </datafield>
  <datafield tag="090" ind1=" " ind2=" ">
    <subfield code="a">2</subfield>
  </datafield>
  <datafield tag="101" ind1="1" ind2=" ">
    <subfield code="a">por</subfield>
  </datafield>
  <datafield tag="102" ind1=" " ind2=" ">
    <subfield code="a">PT</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">20100225d2008    m||y0frey5003    ba</subfield>
  </datafield>
  <datafield tag="200" ind1="1" ind2=" ">
    <subfield code="a">Um homem de aluguer</subfield>
    <subfield code="f">Neus Arqués</subfield>
    <subfield code="g">trad. Helena Pitta</subfield>
  </datafield>
  <datafield tag="205" ind1=" " ind2=" ">
    <subfield code="a">1ª ed</subfield>
  </datafield>
  <datafield tag="210" ind1=" " ind2=" ">
    <subfield code="a">Matosinhos</subfield>
    <subfield code="c">QuidNovi</subfield>
    <subfield code="d">2008</subfield>
  </datafield>
  <datafield tag="215" ind1=" " ind2=" ">
    <subfield code="a">175 p.</subfield>
    <subfield code="d">24 cm</subfield>
  </datafield>
  <datafield tag="225" ind1="2" ind2=" ">
    <subfield code="a">Ficção estrangeira</subfield>
  </datafield>
  <datafield tag="304" ind1=" " ind2=" ">
    <subfield code="a">Tít. orig.: Um hombre de pago</subfield>
  </datafield>
  <datafield tag="675" ind1=" " ind2=" ">
    <subfield code="a">821.134.2-31"20"</subfield>
    <subfield code="v">BN</subfield>
    <subfield code="z">por</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Arqués</subfield>
    <subfield code="b">Neus</subfield>
    <subfield code="f">1964-</subfield>
    <subfield code="9">1</subfield>
  </datafield>
  <datafield tag="702" ind1=" " ind2=" ">
    <subfield code="a">Pitta</subfield>
    <subfield code="b">Helena</subfield>
    <subfield code="4">730</subfield>
    <subfield code="9">2</subfield>
  </datafield>
  <datafield tag="801" ind1=" " ind2="0">
    <subfield code="a">PT</subfield>
    <subfield code="b">BN</subfield>
    <subfield code="g">RPC</subfield>
  </datafield>
  <datafield tag="830" ind1=" " ind2=" ">
    <subfield code="c">test</subfield>
    <subfield code="d">12-06-2017</subfield>
  </datafield>
  <datafield tag="990" ind1=" " ind2=" ">
    <subfield code="c">MON</subfield>
  </datafield>
  <datafield tag="995" ind1=" " ind2=" ">
    <subfield code="0">0</subfield>
    <subfield code="1">0</subfield>
    <subfield code="4">8231_ARQ</subfield>
    <subfield code="6">000001</subfield>
    <subfield code="9">1</subfield>
    <subfield code="b">TESTE</subfield>
    <subfield code="f">000001</subfield>
    <subfield code="k">82-31 ARQ</subfield>
    <subfield code="o">0</subfield>
    <subfield code="r">MON</subfield>
    <subfield code="s">Oferta</subfield>
  </datafield>
</record>
nextResultSetPosition = 2
Elapsed: 0.039125
 

As you can see, the results are fine: all the characters are correct.
So the problem is in Koha...



Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
More tests

Test 1:
I've enabled the Z39.50 server on both servers.
Server1 --> Server2: NO problem with character encoding
Server2 --> Server1: Problem with character encoding

I've compared several files, and they are identical on both servers:
/var/lib/koha/koha/biblios
/etc/koha/sites/koha/zebra-biblios-dom.cfg
/etc/koha/zebradb/pqf.properties
/etc/koha/zebradb/ccl.properties

Test 2:
Server2: Save the record in MARCXML
Server1: Stage MARC records for import --> Problem with character encoding

Where can I find more settings for Zebra or something related?
Thanks

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi José Anjos,

Please run this test:
Server1: Save the record in MARCXML
Server2: Stage MARC records for import --> Result = ?

Bye
Zeno Tajoli


On 28/06/2017 14:30, anjoze wrote:

> More tests
>
> *Test 1:*
> I've enable Z39.50 server on both servers.
> Server1 --> Server2: *NO problem* with characters encoding
> Server2 --> Server1: *Problem* with characters encoding
>
> I've compared multiple files but everything is equal:
> /var/lib/koha/koha/biblios
> /etc/koha/sites/koha/zebra-biblios-dom.cfg
> /etc/koha/zebradb/pqf.properties
> /etc/koha/zebradb/ccl.properties
>
> *Test 2:*
> Server2: Save the record in MRCXML
> Server1: Stage MARC records for import --> *Problem* with characters
> encoding
>
> Where can I find more settings for Zebra or something related?
> Thanks

--
Zeno Tajoli
/SVILUPPO PRODOTTI CINECA/ - Automazione Biblioteche
Email: [hidden email] Fax: 051/6132198
*CINECA* Consorzio Interuniversitario - Sede operativa di Segrate (MI)

Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi Zeno Tajoli,

Server1: Save the record in MARCXML
Server2: Stage MARC records for import --> Result: No problems, all characters OK

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi,

Given those results, I think your problem lies in the MySQL setup.
Log in to the mysql command line and run:
mysql> SHOW VARIABLES LIKE '%char%';

In my UNIMARC setup (Koha 3.20.7, Debian 8, MySQL 5.5) the result is:
mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Does Server1 give the same result as Server2?

Another possible source of the problem is the 'locale' setup.
On my server:
koha@debian8:~$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

As you can see, many more variables carry '.UTF-8' here than in your 'locale' output.
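
A quick way to spot that difference is to scan the `locale` output for categories whose value has no `.UTF-8` suffix. What follows is only a minimal Python sketch, not part of Koha; the function name `non_utf8_categories` is made up for illustration, and the sample text is the locale listing posted at the start of this thread.

```python
def non_utf8_categories(locale_output):
    """Return the LC_* categories whose value is set but lacks a UTF-8 codeset."""
    missing = []
    for line in locale_output.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep or not name.startswith("LC_"):
            continue  # skip LANG, LANGUAGE and anything that is not LC_*=value
        value = value.strip('"')
        if name == "LC_ALL" and value == "":
            continue  # an empty LC_ALL only means "no global override"
        # Accept both ".UTF-8" and ".utf8" spellings of the codeset.
        if not value.upper().replace("-", "").endswith(".UTF8"):
            missing.append(name)
    return missing


# Locale listing copied from the first post of this thread.
SAMPLE = """\
LANG=pt_PT.UTF-8
LANGUAGE=pt:pt_BR:en
LC_CTYPE="pt_PT.UTF-8"
LC_NUMERIC=pt_PT
LC_TIME=pt_PT
LC_COLLATE="pt_PT.UTF-8"
LC_MONETARY=pt_PT
LC_MESSAGES="pt_PT.UTF-8"
LC_PAPER=pt_PT
LC_NAME=pt_PT
LC_ADDRESS=pt_PT
LC_TELEPHONE=pt_PT
LC_MEASUREMENT=pt_PT
LC_IDENTIFICATION=pt_PT
LC_ALL=
"""

print(non_utf8_categories(SAMPLE))
```

On a live system the same function could be fed the text printed by the `locale` command instead of the pasted sample.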

More on this topic:
https://wiki.koha-community.org/wiki/Encoding_and_Character_Sets_in_Koha

Bye
Zeno Tajoli

Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Here is the result:

Locale is the same on both servers:
LANG=pt_PT.UTF-8
LANGUAGE=pt:pt_BR:en
LC_CTYPE="pt_PT.UTF-8"
LC_NUMERIC=pt_PT
LC_TIME=pt_PT
LC_COLLATE="pt_PT.UTF-8"
LC_MONETARY=pt_PT
LC_MESSAGES="pt_PT.UTF-8"
LC_PAPER=pt_PT
LC_NAME=pt_PT
LC_ADDRESS=pt_PT
LC_TELEPHONE=pt_PT
LC_MEASUREMENT=pt_PT
LC_IDENTIFICATION=pt_PT
LC_ALL=

SERVER1:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

SERVER2:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
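
The two tables differ in only a few rows, which is easy to miss by eye. As a rough sketch (plain Python, independent of Koha; `parse_variables` and `diff_variables` are illustrative names, not existing tools), the `SHOW VARIABLES` output can be parsed and compared like this, using the values posted above:

```python
def parse_variables(table_text):
    """Parse the ASCII table printed by the mysql client into a dict."""
    variables = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border lines such as +------+ and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            variables[cells[0]] = cells[1]
    return variables


def diff_variables(a, b):
    """Return {name: (value_in_a, value_in_b)} for every variable that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


SERVER1 = """
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
"""

SERVER2 = """
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
"""

differences = diff_variables(parse_variables(SERVER1), parse_variables(SERVER2))
for name, (server1, server2) in differences.items():
    print(f"{name}: Server1={server1}  Server2={server2}")
```

Only the rows that differ are printed, so the latin1 defaults on Server2 stand out immediately.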

Shouldn't Z39.50 be independent of the MySQL schema?
Otherwise I will always have problems connecting to other servers, because they have different MySQL schemas.
I've tested with my Server1 and Server2, and I also have character problems when I connect to other public servers.
So if I need the same database schema on both sides to get the right characters, I will probably never have Z39.50 working properly unless I get lucky.

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose',

On 28/06/2017 17:10, anjoze wrote:

> Locale is the same on both servers:
> LANG=pt_PT.UTF-8
> LANGUAGE=pt:pt_BR:en
> LC_CTYPE="pt_PT.UTF-8"
> LC_NUMERIC=pt_PT
> LC_TIME=pt_PT
> LC_COLLATE="pt_PT.UTF-8"
> LC_MONETARY=pt_PT
> LC_MESSAGES="pt_PT.UTF-8"
> LC_PAPER=pt_PT
> LC_NAME=pt_PT
> LC_ADDRESS=pt_PT
> LC_TELEPHONE=pt_PT
> LC_MEASUREMENT=pt_PT
> LC_IDENTIFICATION=pt_PT
> LC_ALL=

OK, but the setup can be improved:
set "pt_PT.UTF-8" on all the variables.
"LANGUAGE=pt:pt_BR:en" is fine as it is.

> *SERVER1*:
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8mb4                    |
> | character_set_connection | utf8mb4                    |
> | character_set_database   | utf8mb4                    |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8mb4                    |
> | character_set_server     | utf8mb4                    |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+
>
> *SERVER2:*
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8                       |
> | character_set_connection | utf8                       |
> | character_set_database   | latin1                     |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8                       |
> | character_set_server     | latin1                     |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+

Server2 is clearly wrong; you need to fix /etc/mysql/my.cnf.
Server1 could be OK, but as far as I know 'utf8mb4' is not the MySQL default on Ubuntu,
so check all the conf files in /etc/mysql and its subdirectories.

> Shouldn't be Z39.50 independent of Mysql schema?
This is not the MySQL schema, it is the MySQL setup.
Z39.50 is not independent of the MySQL setup.
When you see data in Z39.50 derived cataloguing, that data does not
come directly from the Z39.50 server. It is first saved in a MySQL table and
only AFTERWARDS shown in the browser.
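
To illustrate why an intermediate storage step matters, here is a small Python sketch of what typically happens when UTF-8 text passes through a layer that treats its bytes as Latin-1 (for example a connection or table configured as latin1). It is a generic demonstration of the effect, not Koha's actual code path, and the function names are made up for the example:

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then (wrongly) decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(mojibake):
    """Undo the damage, assuming exactly one UTF-8-as-Latin-1 misreading."""
    return mojibake.encode("latin-1").decode("utf-8")


original = "Ficção estrangeira"       # field 225 of the record shown earlier
damaged = misread_as_latin1(original)

print(damaged)           # each accented letter becomes a two-character sequence
print(repair(damaged))   # the original text again
```

The data on the wire can be perfectly valid UTF-8, as yaz-client shows, and still end up garbled if any step between the Z39.50 response and the browser makes the wrong assumption about the bytes.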

As you can see here:
https://lists.katipo.co.nz/pipermail/koha/2017-June/048353.html
if you use yaz-client [a direct view of the Z39.50 data], there are no encoding problems.

The same applies to staged records.

If I have understood correctly, Server2 is a test server, right?
If so, you could fix the locale and the MySQL setup and then try Z39.50
from it again.

If all goes well, you can start thinking about changes in production.
Making changes like these on a production server with data IS DIFFICULT and
RISKY.
You need to find a good MySQL DBA and follow his/her advice.



Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
I've looked at the tables with Workbench, and all of them have their collation set to utf8_unicode_ci.
Then I added this to my MySQL conf:
[client]
default-character-set=utf8

[mysql]
default-character-set=utf8

[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8


Then:
mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

I still have the same problem with Z39.50 characters.

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose',

On 05/07/2017 15:14, anjoze wrote:

> I've looked at tables with Workbench and all tables have Collation set
> utf8_unicode_ci
> Then I've added to my mysql conf:
> [client]
> default-character-set=utf8
>
> [mysql]
> default-character-set=utf8
>
> [mysqld]
> collation-server = utf8_unicode_ci
> init-connect='SET NAMES utf8'
> character-set-server = utf8
>
> Then:
> mysql> SHOW VARIABLES LIKE '%char%';
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8                       |
> | character_set_connection | utf8                       |
> | character_set_database   | utf8                       |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8                       |
> | character_set_server     | utf8                       |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+
>
> Still have the same problem with Z39.50 characters.

One last thing to try:
stop Apache, Plack and Zebra.
On the Linux command line, go to the correct directory and, as the koha user, run:
./cleanup_database.pl --verbose --z3950

Then retest the Z39.50 searches.

Bye
Zeno Tajoli



Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi Zeno,

The result:
Purging Z39.50 records from import tables.
Done with purging Z39.50 records from import tables.

But it is still not working :(
The problem is still exactly the same after everything we have done.
Server2 can get the right encoding from Server1.
Neither server can get the right encoding from any Portuguese server.
Example: Portuguese National Library
BNP - Catálogo da Biblioteca Nacional de Portugal
Hostname catalogo.bnportugal.pt
Port 210
Database bn
Syntax unimarc


I've talked with people from other Koha libraries, and none of them has Z39.50 working correctly.

Thanks for your help

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose,

On 06/07/2017 11:40, anjoze wrote:
> None of both server can get the right encoding from any Portuguese servers.
> Ex: *Portuguese National Library*
> BNP - Catálogo da Biblioteca Nacional de Portugal
> Hostname catalogo.bnportugal.pt

For this host, which value did you enter in the 'Encoding' field?
Try entering 'ISO_5426'.

Bye
Zeno Tajoli


Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi Zeno,

I've tried all the encodings on both servers without success.
With ISO_5426 the results window is OK, but when I preview or import the MARC record it is not.
Example:
Results window: Title: A criação de caracóis
MARC record : 200 1  _aÂA Âcriação de caracóis
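
A stray "Â" in front of otherwise correct text is often a sign that a character in the range U+0080 to U+00BF, which UTF-8 encodes as the byte 0xC2 plus one more byte, was read as Latin-1 somewhere along the way. Whether that is what happens in the record above is an assumption, not something confirmed in this thread. The Python sketch below (generic, not taken from Koha; the sample title with a hidden U+0098 control character is hypothetical) shows the effect and a small helper for listing the code points of a suspicious string:

```python
import unicodedata


def describe(text):
    """List each character as (code point, Unicode name or a placeholder)."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name, probably a control character>"))
        for ch in text
    ]


def misread_as_latin1(text):
    """Show text as it looks when its UTF-8 bytes are decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


# Hypothetical input: an invisible C1 control character (U+0098) before "A".
title = "\u0098A"
garbled = misread_as_latin1(title)

for code_point, name in describe(garbled):
    print(code_point, name)
```

Running `describe` on the text copied from the MARC preview would show whether an invisible control character sits next to each "Â".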

Cheers

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose'
On 06/07/2017 12:29, anjoze wrote:
> I've tried with all the encoding on both servers without success.
> With ISO5426 the Results widow is OK but when I preview or import Marc
> record it's not ok
> EX:
> Results window: Title: A criação de caracóis
> MARC record : 200 1  _aÂA Âcriação de caracóis

Trying with the Tamil demo [http://kpro.tamil.fr/], Z39.50 works well [with
Firefox].
I don't know what else to try; you will need to debug the Perl code.

Bye
Zeno Tajoli


Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Yes, it is strange.
Later I will try to find the problem, but it will probably take a long time.
Then, if I reach a conclusion, I'll let you know.
Thank you very much for your help

Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
I've done some more tests:
I installed a new VM with MariaDB.
I checked the MariaDB configuration:
character-set-server  = utf8mb4
collation-server      = utf8mb4_general_ci

This is the default charset/collation, and it should be fully compatible with all special characters.


Checked Locale:
LANG=pt_PT.UTF-8
LANGUAGE=pt:pt_BR:en <-- This is correct on a PT installation
LC_CTYPE="pt_PT.UTF-8"
LC_NUMERIC=pt_PT
LC_TIME=pt_PT
LC_COLLATE="pt_PT.UTF-8"
LC_MONETARY=pt_PT
LC_MESSAGES="pt_PT.UTF-8"
LC_PAPER=pt_PT
LC_NAME=pt_PT
LC_ADDRESS=pt_PT
LC_TELEPHONE=pt_PT
LC_MEASUREMENT=pt_PT
LC_IDENTIFICATION=pt_PT
LC_ALL=


Then I created a Koha instance with UNIMARC, language FR --> Same problem
Then I created a Koha instance with UNIMARC, language EN --> Same problem

I made a schema-only (no data) dump of the DB:  mysqldump --no-data [db_name] -u[user] -p[password] > schemafile.sql
At the beginning I have this:
Server version 10.0.29-MariaDB-0ubuntu0.16.04.1
SET NAMES utf8mb4 */;


Then all the tables have:
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;

ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
/*!40101 SET character_set_client = @saved_cs_client */;



Example of one table:
-- Table structure for table `import_records`
--

DROP TABLE IF EXISTS `import_records`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `import_records` (
  `import_record_id` int(11) NOT NULL AUTO_INCREMENT,
  `import_batch_id` int(11) NOT NULL,
  `branchcode` varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
  `record_sequence` int(11) NOT NULL DEFAULT '0',
  `upload_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `import_date` date DEFAULT NULL,
  `marc` longblob NOT NULL,
  `marcxml` longtext COLLATE utf8_unicode_ci NOT NULL,
  `marcxml_old` longtext COLLATE utf8_unicode_ci NOT NULL,
  `record_type` enum('biblio','auth','holdings') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'biblio',
  `overlay_status` enum('no_match','auto_match','manual_match','match_applied') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'no_match',
  `status` enum('error','staged','imported','reverted','items_reverted','ignored') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'staged',
  `import_error` mediumtext COLLATE utf8_unicode_ci,
  `encoding` varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `z3950random` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`import_record_id`),
  KEY `branchcode` (`branchcode`),
  KEY `batch_sequence` (`import_batch_id`,`record_sequence`),
  KEY `batch_id_record_type` (`import_batch_id`,`record_type`),
  CONSTRAINT `import_records_ifbk_1` FOREIGN KEY (`import_batch_id`) REFERENCES `import_batches` (`import_batch_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
/*!40101 SET character_set_client = @saved_cs_client */;


I can't see any problem here...
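
The same check can be automated over the whole dump instead of reading it table by table. This is a minimal Python sketch, assuming the dump follows the usual mysqldump layout (`CREATE TABLE ... ) ENGINE=... DEFAULT CHARSET=...`); the function name `table_charsets` is illustrative, and the sample is a shortened copy of the `import_records` definition above:

```python
import re

# Matches one CREATE TABLE statement and captures its name and default charset.
TABLE_RE = re.compile(
    r"CREATE TABLE `(?P<table>[^`]+)`.*?\)\s*ENGINE=\w+[^;]*?DEFAULT CHARSET=(?P<charset>\w+)",
    re.DOTALL,
)


def table_charsets(schema_sql):
    """Return {table name: default charset} for every table in a schema dump."""
    return {m.group("table"): m.group("charset") for m in TABLE_RE.finditer(schema_sql)}


SAMPLE = """
DROP TABLE IF EXISTS `import_records`;
CREATE TABLE `import_records` (
  `import_record_id` int(11) NOT NULL AUTO_INCREMENT,
  `marc` longblob NOT NULL,
  `marcxml` longtext COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`import_record_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
"""

charsets = table_charsets(SAMPLE)
unexpected = {table: cs for table, cs in charsets.items() if cs not in ("utf8", "utf8mb4")}
print(charsets)
print("tables with an unexpected charset:", unexpected)
```

For a real dump, the text would be read from the file first, for example with `open("schemafile.sql", encoding="utf-8").read()`.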

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose',

On 07/07/2017 11:21, anjoze wrote:
> I've done some more tests:
> Installed a new VM with mariadb.

About your new VM:
-- Which OS?
-- Which Koha version?

Bye
Zeno Tajoli


Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Hi Zeno,

Ubuntu 16.04 LTS 64-bit (installation language PT_PT)

Koha version: 17.05.01.000
Perl version: 5.022001
Zebra version: 2.0.59

Maybe later I will create a new VM and install it in English.

Thanks

Re: Z39.50 encoding problem (tested on 2 koha servers)

Zeno Tajoli-2
Hi Jose',

On 07/07/2017 12:02, anjoze wrote:
> Ubuntu 16.04LTS 64bits (Language installation PT_PT)
>
> Koha version: 17.05.01.000
> Perl version: 5.022001
> Zebra version: 2.0.59
> Maybe later I will create a new VM and install it in En

I tried to replicate your setup,
and I found that the problem is in the Koha code, not in the MySQL or system setup.

I filed a bug here:
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18910

I don't think it is an easy bug to fix properly.

Bye
Zeno Tajoli




Re: Z39.50 encoding problem (tested on 2 koha servers)

anjoze
Zeno,
Thank you very much for your work.
Cheers,
José Anjos

Re: Z39.50 encoding problem (tested on 2 koha servers)

Fridolin SOMERS
Thank you so much, Zeno.
We had been looking for this bug for weeks ;)

On 07/07/2017 at 16:12, Tajoli Zeno wrote:

> Hi Jose',
>
> Il 07/07/2017 12:02, anjoze ha scritto:
>> Ubuntu 16.04LTS 64bits (Language installation PT_PT)
>>
>> Koha version: 17.05.01.000
>> Perl version: 5.022001
>> Zebra version: 2.0.59
>> Maybe later I will create a new VM and install it in En
>
> I try to replicate your setup.
> And I find that the problem is in Koha code, not in MySQL on system setup.
>
> I fill a bug here:
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18910
>
> I don't think is a easy bug to fix a proper way.
>
> Bye
> Zeno Tajoli
>
>
>

--
Fridolin SOMERS
Biblibre - Pôles support et système
[hidden email]