Zebra config problem (still 1)


Paul POULAIN-2
OK guys, the problem with CQL2RPN seems solved.

But the other one is still here.

Let me explain again (sorry to bug you, but I'm really stuck) :

zebra.cfg is :
================================
profilePath: ${srcdir:-.}:/usr/local/share/idzebra/tab/
attset: bib1.att
attset: explain.att
recordId: (bib1, Identifier-standard)
recordType: grs.xml
lockDir: lock
setTmpDir: tmp
keyTmpDir: tmp
memMax: 100
perm.anonymous: rw
encoding utf-8
storeKeys:1
storeData:1
===============================

I have a collection.abs in my directory, so that Zebra can read the
output of MARC::Record->as_XML :
========================
name collection
attset bib1.att
esetname F @
esetname B @
marc usmarc.mar
xpath disable
all any
melm 090$a identifier-standard,identifier-standard:p
melm 700 author,author:p
melm 200$a title,title:p
melm 200$e title,title:p
melm 020$a isbn
melm 011$a issn

(yes, on my sample the 090$a is my primary key)

========================

and when I run the following script :
my $Zpackage = $Zconn->package();
$Zpackage->option(databaseName => 'Koha');
$Zpackage->option(action => "specialUpdate");
$Zpackage->option(record => $record);
$Zpackage->send("update");
$Zpackage->destroy;

I get :
15:40:34-03/02 zebrasrv(1) [log] Received DB Update
15:40:34-03/02 zebrasrv(1) [log] action
15:40:34-03/02 zebrasrv(1) [log] specialUpdate
15:40:34-03/02 zebrasrv(1) [log] database: Koha
15:40:34-03/02 zebrasrv(1) [log][app2] zebra_register_open rw = 1 useshadow=0 p=0x8146570,n=,rp=(none)
15:40:34-03/02 zebrasrv(1) [log] profilePath=.:/usr/local/share/idzebra/tab/ cwd=/home/paul/koha.dev/head/misc/zebra/unimarc
15:40:34-03/02 zebrasrv(1) [log] record 0 type XML
15:40:34-03/02 zebrasrv(1) [log] 3129 bytes:
<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...
15:40:34-03/02 zebrasrv(1) [warn] Record didn't contain match fields in (bib1, identifier-standard)
15:40:34-03/02 zebrasrv(1) [warn] Bad match criteria (recordID)
15:40:34-03/02 zebrasrv(1) [log] zebra_update_record returned res=1
15:40:34-03/02 zebrasrv(1) [warn] zebra_update_record failed r=1
15:40:34-03/02 zebrasrv(1) [log] zebra_end_trans
15:40:34-03/02 zebrasrv(1) [log] sorting section 1
15:40:34-03/02 zebrasrv(1) [log] Iterations . . .     36
15:40:34-03/02 zebrasrv(1) [log] Distinct words .     18
15:40:34-03/02 zebrasrv(1) [log] Updates. . . . .     14
15:40:34-03/02 zebrasrv(1) [log] Deletions. . . .      2
15:40:34-03/02 zebrasrv(1) [log] Insertions . . .      2
15:40:34-03/02 zebrasrv(1) [log][app2] zebra_register_close p=0x8146570
15:40:34-03/02 zebrasrv(1) [log] Records:       0 i/u/d 0/0/0
15:40:34-03/02 zebrasrv(1) [log] user/system: 1/0
15:40:34-03/02 zebrasrv(1) [request] EsRequest  ERROR 224 update_record failed
15:40:34-03/02 zebrasrv(1) [session] Connection closed by client


If I remove the recordId line in zebra.cfg, everything goes well, except
I need this line to be able to modify my records !

PS : bib1 is the standard bib1.att file, containing :
att 1007            Identifier-standard
--
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)


_______________________________________________
Koha-devel mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/koha-devel

Re: Zebra config problem (still 1)

Adam Dickmeiss
Paul POULAIN wrote:

> OK guys, the problem with CQL2RPN seems solved.
>
> But the other one is still here.
>
> I explain again (sorry to bug you, but i'm really stuck) :
>
> zebra.cfg is :
> ================================
> profilePath: ${srcdir:-.}:/usr/local/share/idzebra/tab/
> attset: bib1.att
> attset: explain.att
> recordId: (bib1, Identifier-standard)
> recordType: grs.xml

[snip]

> xpath disable
> all any
> melm 090$a    identifier-standard,identifier-standard:p
> melm 700    author,author:p
[snip]

> cwd=/home/paul/koha.dev/head/misc/zebra/unimarc
> 15:40:34-03/02 zebrasrv(1) [log] record 0 type XML
> 15:40:34-03/02 zebrasrv(1) [log] 3129 bytes:
> <?xml version="1.0" encoding="UTF-8"?>
> <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...


Your root element is collection. Not record. I don't think melm will
match that. Had you used record as root element - it should do it.

It's always a good idea to try things out with
   zebraidx -s update testrec.xml
and see what gets matched.. (Look for the Idx: lines).
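
A minimal sketch of the point about the root element, written in Python rather than the Perl used elsewhere in this thread. It splits a MARCXML <collection> into standalone documents whose root is <record>. The function name split_collection and the sample data are assumptions made for illustration; whether Zebra then matches the melm lines is exactly what the zebraidx -s test is meant to show, and this sketch does not verify it.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def split_collection(marcxml):
    """Return every <record> inside a MARCXML document as a standalone XML string."""
    # Serialize the MARC namespace as the default one (no "ns0:" prefixes).
    ET.register_namespace("", MARC_NS)
    root = ET.fromstring(marcxml)
    # iter() also yields the root itself, so a bare <record> document passes through.
    records = [el for el in root.iter()
               if el.tag in ("record", "{%s}record" % MARC_NS)]
    docs = []
    for rec in records:
        rec.tail = None  # drop whitespace that followed </record> in the collection
        docs.append(ET.tostring(rec, encoding="unicode"))
    return docs


# Hypothetical one-record collection, modelled on the record posted in this thread.
sample = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim">'
    '<record><datafield tag="090" ind1=" " ind2=" ">'
    '<subfield code="a">16</subfield></datafield></record>'
    "</collection>"
)
for doc in split_collection(sample):
    print(doc)
```

Each returned string could be saved as its own testrec.xml and fed to zebraidx -s update to compare the Idx: lines against the <collection> version.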

/ Adam


Re: Zebra config problem (still 1)

Tümer Garip
In reply to this post by Paul POULAIN-2
Adam wrote in reply to Paul:
> cwd=/home/paul/koha.dev/head/misc/zebra/unimarc
> 15:40:34-03/02 zebrasrv(1) [log] record 0 type XML
> 15:40:34-03/02 zebrasrv(1) [log] 3129 bytes:
> <?xml version="1.0" encoding="UTF-8"?>
> <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...

>> Your root element is collection. Not record. I don't think melm will
>> match that. Had you used record as root element - it should do it.
>>
>> It's always a good idea to try things out with
>> zebraidx -s update testrec.xml
>> and see what gets matched.. (Look for the Idx: lines).


Ooops, this is getting more complicated for me.
MARC21 slim defines the MARCXML as
<collection><record></record></collection>. The perl MARC module
produces XML according to this.

When you run zebraidx on proper MARCXML, you get an error saying
that collection.abs could not be found.
So zebraidx (and ZOOM, for that matter) forces you to rename marc21.abs
to collection.abs.

To make things more complicated, everything seems to work until you try
to use the RecordID feature.
If your XML file omits the MARC tags below 10, then RecordID works
every time. However, with tags below 10 present, about 30% of my 50k
records failed with the bad-match error.

So I am trying to work out whether the records reproduced by Koha are
the cause or not. I'll post the outcome on this list.
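
One way to narrow this down is to sort records by whether they carry fields tagged below 010 and whether they contain the match field at all. The following is only a sketch, in Python rather than Perl; describe_record and its defaults (090$a as the match field, as in Paul's collection.abs) are illustrative assumptions, not part of Zebra or Koha.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Strip any '{namespace}' prefix from an element tag."""
    return element.tag.rsplit("}", 1)[-1]


def describe_record(record_xml, match_tag="090", match_code="a"):
    """Report the fields tagged below 010 and the match-field values of one record."""
    record = ET.fromstring(record_xml)
    low_tags = []
    match_values = []
    for field in record.iter():
        if local_name(field) not in ("controlfield", "datafield"):
            continue
        tag = field.get("tag", "")
        if tag.isdigit() and int(tag) < 10:
            low_tags.append(tag)
        if tag == match_tag:
            for sub in field:
                if local_name(sub) == "subfield" and sub.get("code") == match_code:
                    match_values.append(sub.text)
    return {"low_tags": low_tags, "match_values": match_values}


# Hypothetical record, reduced from the one posted in this thread.
record_xml = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<controlfield tag="001">19</controlfield>'
    '<datafield tag="090" ind1=" " ind2=" ">'
    '<subfield code="9">16</subfield><subfield code="a">16</subfield>'
    "</datafield></record>"
)
print(describe_record(record_xml))
```

Running this over the failing and the succeeding records would show whether the bad-match errors really line up with the presence of low-numbered tags.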

Regards,
Tumer Garip




Re: Zebra config problem (still 1)

Paul POULAIN-2
In reply to this post by Adam Dickmeiss
Adam Dickmeiss wrote:
(my answer to Adam's question is at the end)

I want to describe my whole history with zebra, so that you are aware
of everything I did, and maybe understand why I am really starting to feel
*discouraged* :
* just in case you don't know : I've been Koha Release Manager for
versions 2.0 and 2.2. I'm the main (almost the only) author of the MARC
support in Koha.
* when Joshua was nominated 3.0 Release Manager, he suggested adopting
Zebra. At first, I was not very happy with this proposal, as it
adds a new tool to Koha and makes installation more complex. But other
arguments convinced me it was the way to go.
* Thus I set up zebra on my computer, and began to move the MARC stuff to
zebra. I succeeded in getting something working correctly after about
a week of work. The problem was that the zebra indexing was done
through a perl exec() and zebraidx.
So I waited very impatiently for Perl-ZOOM, leaving the code as it was
for some months (2-3 ?).
When Perl-ZOOM arrived, I was very very happy.
But now I'm not happy at all anymore, as I ran into many many many
problems and feel quite stuck and alone with them.
I don't want to count how many days I've spent on koha/zebra without
success, but it's something like 6-7 full days, probably more :-(

Here is a summary of all my problems :
* at first, I tried to set up an iso2709 (full MARC) DB. I ran into "Error
updating 10002 => Encoding failed". After investigating and asking this
list
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00015.html and
following thread), it appeared that iso2709 support was problematic and
that I had better go with XML.
That seemed a good idea to me, as XML is much more comprehensible and a
sexy technology ;-)
* Thus, I changed some code in Koha to use the MARCXML package
(http://search.cpan.org/~esummers/MARC-XML-0.81/lib/MARC/File/XML.pm)
* But I still ran into the "Error updating 10002". After investigating a
little bit more, Adam finally caught the culprit
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00034.html).
This time it was a compilation problem !!!
* Could it be my last problem ? No, unfortunately. I ran into the 2
recent problems : impossible to search, failure to index with RecordId.
* It finally turned out, thanks to Mike
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00038.html),
that the search features were not in the official yaz package, and a new
package has been released !
* I'm still stuck with the indexing problem. I really thought I wanted
to do something simple : index MARCXML data (produced by Ed's package)
into zebra. Why it does not work is NOT clear to me.
I solved one problem by renaming marc21.abs to collection.abs, but I
didn't see anything about this anywhere, and if Tümer had not noticed it,
I would not have found it myself ! (and it's still unclear to me why you
have a marc21.abs when MARCXML speaks of a <collection> tag)

Now, I'm afraid there's still something undocumented somewhere, or
buggy, or unreleased, or something like that.
I'm really starting to feel discouraged and alone.
Many thanks to Tümer, who pointed out some problems to me, but seems as
stuck as I am :-(

I'll end with an answer to Adam's suggestion to run zebraidx -s update
testrec.xml :

 >> <?xml version="1.0" encoding="UTF-8"?>
 >> <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 >> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...
 > Your root element is collection. Not record. I don't think melm will
 > match that. Had you used record as root element - it should do it.
 >
 > It's always a good idea to try things out with
 >   zebraidx -s update testrec.xml
 > and see what gets matched.. (Look for the Idx: lines).

for XML :
 > <?xml version="1.0" encoding="UTF-8"?>
 > <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 > xsi:schemaLocation="http://www.loc.gov/MARC21/slim
 > http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
 > xmlns="http://www.loc.gov/MARC21/slim">
 > <record>
 > <leader>00543     2200181   4500</leader>
 > <controlfield tag="001">19</controlfield>
 > <datafield tag="010" ind1=" " ind2=" ">
 > <subfield code="a">2010140001</subfield>
 > <subfield code="d">45 F</subfield>
 > </datafield>
 > <datafield tag="090" ind1=" " ind2=" ">
 > <subfield code="9">16</subfield>
 > <subfield code="a">16</subfield>
 > </datafield>
 > <datafield tag="100" ind1=" " ind2=" ">
 > <subfield code="a">1995                y0fre 0103    ba</subfield>
 > </datafield>
 > <datafield tag="101" ind1=" " ind2=" ">
 > <subfield code="a">fre</subfield>
 > </datafield>
 > <datafield tag="105" ind1=" " ind2=" ">
 > <subfield code="a">y       00  y</subfield>
 > </datafield>
 > <datafield tag="200" ind1="1" ind2=" ">
 > <subfield code="a">Pour l'honneur de l'esprit humain</subfield>
 > <subfield code="b">LIVR</subfield>
 > <subfield code="e">Les mathematiques aujourd'hui</subfield>
 > <subfield code="f">Jean DIEUDONNE</subfield>
 > </datafield>
 > <datafield tag="995" ind1=" " ind2=" ">
 > <subfield code="b">CDI</subfield>
 > <subfield code="c">CDI</subfield>
 > <subfield code="e">SL</subfield>
 > <subfield code="f">Non inventorie</subfield>
 > <subfield code="j">000006</subfield>
 > <subfield code="o">2</subfield>
 > <subfield code="9">27</subfield>
 > </datafield>
 > </record>
 > </collection>

with zebraidx -s update testrec.xml I get (many lines snipped, complete
log at end of mail) :
 > Record type: 'collection'
 >     Local tag: 'collection'
 >          tag=collection/
 >                 Local tag: 'subfield'
 >                      tag=subfield/datafield/record/collection/
 >                     Data: '16'
 >               Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
 >               Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
 >                         Idx: [w]bib1:Any [1016] data XData:"16"
 >                      tag=subfield/datafield/record/collection/
 >                 Data: '
 >                 '
 >             Local tag: 'datafield'
 >                  tag=datafield/record/collection/
 >                 Data: '
 >                         '
 >                 Local tag: 'subfield'
 >                      tag=subfield/datafield/record/collection/
 >                     Data: 'Pour l'honneur de l'esprit humain'
 > Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
 > Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
 > Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
 >                      tag=subfield/datafield/record/collection/
 >                 Data: '
 >                         '
 > 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
 > 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
 > 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
 > 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
 > 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
 > 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
 > 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
 > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close
p=0x8106c70
 > 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
 > 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
 > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
 > 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
 > [paul@bureau unimarc]$


If I read this correctly, Identifier-standard [1007] is correctly
detected, but the update still does not work.
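
To check this without reading the whole trace by eye, the Idx: lines can be pulled out of saved zebraidx -s output. This is a small sketch in Python (not Perl), assuming the line shape seen in the log above; the regular expression and the name indexed_terms are illustrative and have not been checked against other Zebra versions.

```python
import re
from collections import defaultdict

# Shape of the "Idx:" lines in the log above, e.g.
#   Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
IDX_LINE = re.compile(
    r'Idx: \[(?P<register>\w)\](?P<attset>\w+):(?P<name>[\w-]+) '
    r'\[(?P<number>\d+)\] data XData:"(?P<data>.*)"'
)


def indexed_terms(log_text):
    """Map (index name, register type) to the list of values indexed under it."""
    found = defaultdict(list)
    for line in log_text.splitlines():
        match = IDX_LINE.search(line)
        if match:
            found[(match["name"], match["register"])].append(match["data"])
    return dict(found)


# A few lines copied from the trace above, quote markers included.
log_text = "\n".join([
    ' >                     Data: \'16\'',
    ' >               Idx: [w]bib1:Identifier-standard [1007] data XData:"16"',
    ' >               Idx: [p]bib1:Identifier-standard [1007] data XData:"16"',
    ' >                         Idx: [w]bib1:Any [1016] data XData:"16"',
])
terms = indexed_terms(log_text)
print(("Identifier-standard", "w") in terms)
```

An empty result for Identifier-standard would point at the .abs mapping; a non-empty one, as here, suggests the mapping works under zebraidx and the problem lies elsewhere.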






The complete log from zebraidx :
==========================================================

> Record type: 'collection'
>     Local tag: 'collection'
>          tag=collection/
>         Data: '
>
>         '
>         Local tag: 'record'
>              tag=record/collection/
>             Data: '
>                 '
>             Local tag: 'leader'
>                  tag=leader/record/collection/
>                 Data: '00543     2200181   4500'
>                  tag=leader/record/collection/
>             Data: '
>                 '
>             Local tag: 'controlfield'
>                  tag=controlfield/record/collection/
>                 Data: '19'
>                  tag=controlfield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '2010140001'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '45 F'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '16'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '16'
>                         Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
>                         Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
>                         Idx: [w]bib1:Any [1016] data XData:"16"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '1995                y0fre 0103    ba'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'fre'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'y       00  y'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Pour l'honneur de l'esprit humain'
>                         Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>                         Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>                         Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'LIVR'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Les mathematiques aujourd'hui'
>                         Idx: [w]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>                         Idx: [p]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>                         Idx: [w]bib1:Any [1016] data XData:"Les mathematiques aujourd'hui"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Jean DIEUDONNE'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'CDI'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'CDI'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'SL'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Non inventorie'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '000006'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '2'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '27'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>         '
>              tag=record/collection/
>         Data: '
> '
>          tag=collection/
> -------------
>
> 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
> 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
> 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
> 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
> 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
> 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
> 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close p=0x8106c70
> 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
> 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
> 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
> [paul@bureau unimarc]$                                                      

--
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)



Re: Zebra config problem (still 1)

Adam Dickmeiss
Paul,

please send your marcxml record file + zebra.cfg + collection.abs . I
might be able to see what's wrong, then

/ Adam

Paul POULAIN wrote:

> Adam Dickmeiss a écrit :
> (answer to Adam question at the end)
>
> I want to completly describe my history with zebra, to let you be aware
> of all I did, and maybe understand why I begin to really feel
> *discouraged* :
> * just in case you don't know : i've been Koha Release Manager for
> version 2.0 and 2.2. I'm the main -almost only- author of the MARC
> support in Koha.
> * when the 3.0 Release Manager was nominated, Joshua, he suggested to
> adopt Zebra. At first, I was not very happy with this proposal, as it
> adds a new tool for Koha, and makes install more complex. But other args
> convinced me it was the way to go.
> * Thus I set up zebra on my computer, and began to move MARC stuff to
> zebra. I succedeed to have something working correctly after something
> like a week of work. The problem being that the zebra indexing was done
> through a perl exec() and zebraidx.
> So, I waited for Perl-ZOOM very impatiently, letting the code as it for
> some months (2-3 ?).
> When Perl-ZOOM arrived, I was very very happy.
> But now i'm really no more happy at all, as I ran into many many many
> problems and feel quite stuck and alone with the problem.
> I don't want to count how many days I've spend on koha/zebra without
> success, but that's something like 6-7 full days, probably more :-(
>
> Here is a summary of all my problems :
> * at 1st, I tried to setup a iso2709 (full MARC) DB. I ran into "Error
> updating 10002 => Encoding failed". After investigating and asking this
> list,
> (http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00015.html and
> following thread) it appears that iso2709 support was problematic and
> that I had better going XML.
> That seemed a good idea to me, as XML is highly more comprehensive and a
> sex-appealing technology ;-)
> * Thus, I changes some code in Koha to use MARCXML package
> (http://search.cpan.org/~esummers/MARC-XML-0.81/lib/MARC/File/XML.pm)
> * But I still ran into the "Error updating 10002" After investigating a
> little bit more, adam finaly caught the culprit
> (http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00034.html).
> This time it was a compilation problem !!!
> * Could it be my last problem ? no, unfortunatly. I ran into the 2
> recent problems : impossible to search, failure to index with RecordId.
> * It appears finally to Mike
> (http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00038.html)
> that the search features were not in official yaz package, and a new
> package has been released !
> * I'm still stuck with the indexing problem. I really thought I wanted
> to do something simple : index MARCXML data (produced by ed package)
> into zebra. Why it does not work is NOT clear to me.
> I solved a problem with marc21.abs to be renamed to collection.abs, but
> didn't saw anything on this, and if Tümer had not seen this, I would not
> have found it myself ! (and i't still unclear to me why you have a
> marc21.abs where MACXML speaks of <collection> tag)
>
> Now,I'm afraid there's still something undocumented somewhere, or
> bugged, or unreleased, or something like this.
> I really begin to feel discouraged and alone.
> Many thanks to Tümer that pointed me some problems, but seems as stuck
> as me :-(
>
> I end with an answer to Adam suggestion with zebraidx -s update
> testrec.xml :
>
>  >> <?xml version="1.0" encoding="UTF-8"?>
>  >> <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>  >> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...
>  > Your root element is collection. Not record. I don't think melm will
>  > match that. Had you used record as root element - it should do it.
>  >
>  > It's always a good idea to try things out with
>  >   zebraidx -s update testrec.xml
>  > and see what gets matched.. (Look for the Idx: lines).
>
> for XML :
>  > <?xml version="1.0" encoding="UTF-8"?>
>  > <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.loc.gov/MARC21/slim 
> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
> xmlns="http://www.loc.gov/MARC21/slim">  
>  > <record>
>  >     <leader>00543     2200181   4500</leader>
>  >     <controlfield tag="001">19</controlfield>
>  >     <datafield tag="010" ind1=" " ind2=" ">
>  >         <subfield code="a">2010140001</subfield>
>  >         <subfield code="d">45 F</subfield>
>  >     </datafield>
>  >     <datafield tag="090" ind1=" " ind2=" ">
>  >         <subfield code="9">16</subfield>
>  >         <subfield code="a">16</subfield>
>  >     </datafield>
>  >     <datafield tag="100" ind1=" " ind2=" ">
>  >         <subfield code="a">1995                y0fre 0103    
> ba</subfield>
>  >     </datafield>
>  >     <datafield tag="101" ind1=" " ind2=" ">
>  >         <subfield code="a">fre</subfield>
>  >     </datafield>
>  >     <datafield tag="105" ind1=" " ind2=" ">
>  >         <subfield code="a">y       00  y</subfield>
>  >     </datafield>
>  >     <datafield tag="200" ind1="1" ind2=" ">
>  >         <subfield code="a">Pour l'honneur de l'esprit humain</subfield>
>  >         <subfield code="b">LIVR</subfield>
>  >         <subfield code="e">Les mathematiques aujourd'hui</subfield>
>  >         <subfield code="f">Jean DIEUDONNE</subfield>
>  >     </datafield>
>  >     <datafield tag="995" ind1=" " ind2=" ">
>  >         <subfield code="b">CDI</subfield>
>  >         <subfield code="c">CDI</subfield>
>  >         <subfield code="e">SL</subfield>
>  >         <subfield code="f">Non inventorie</subfield>
>  >         <subfield code="j">000006</subfield>
>  >         <subfield code="o">2</subfield>
>  >         <subfield code="9">27</subfield>
>  >     </datafield>
>  > </record>
>  > </collection>
>
> with zebraidx -s update testrec.xml I get (many lines snipped, complete
> log at end of mail) :
>  > Record type: 'collection'
>  >     Local tag: 'collection'
>  >          tag=collection/
>  >                 Local tag: 'subfield'
>  >                      tag=subfield/datafield/record/collection/
>  >                     Data: '16'
>  >               Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
>  >               Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
>  >                         Idx: [w]bib1:Any [1016] data XData:"16"
>  >                      tag=subfield/datafield/record/collection/
>  >                 Data: '
>  >                 '
>  >             Local tag: 'datafield'
>  >                  tag=datafield/record/collection/
>  >                 Data: '
>  >                         '
>  >                 Local tag: 'subfield'
>  >                      tag=subfield/datafield/record/collection/
>  >                     Data: 'Pour l'honneur de l'esprit humain'
>  > Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>  > Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>  > Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
>  >                      tag=subfield/datafield/record/collection/
>  >                 Data: '
>  >                         '
>  > 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
>  > 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
>  > 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
>  > 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
>  > 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
>  > 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
>  > 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
>  > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close p=0x8106c70
>  > 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
>  > 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
>  > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
>  > 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
>  > [paul@bureau unimarc]$
>
>
> If I read this correctly, the Identifier-standard [1007] is correctly
> detected, but it does not work anymore.
>
>
>
>
>
>
> The complete log from zebraidx :
> ==========================================================
>
>> Record type: 'collection'
>>     Local tag: 'collection'
>>          tag=collection/
>>         Data: '
>>
>>         '
>>         Local tag: 'record'
>>              tag=record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'leader'
>>                  tag=leader/record/collection/
>>                 Data: '00543     2200181   4500'
>>                  tag=leader/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'controlfield'
>>                  tag=controlfield/record/collection/
>>                 Data: '19'
>>                  tag=controlfield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '2010140001'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '45 F'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '16'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '16'
>>                         Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
>>                         Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
>>                         Idx: [w]bib1:Any [1016] data XData:"16"
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '1995                y0fre 0103    ba'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'fre'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'y       00  y'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'Pour l'honneur de l'esprit humain'
>>                         Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>>                         Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>>                         Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'LIVR'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'Les mathematiques aujourd'hui'
>>                         Idx: [w]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>>                         Idx: [p]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>>                         Idx: [w]bib1:Any [1016] data XData:"Les mathematiques aujourd'hui"
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'Jean DIEUDONNE'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>                 '
>>             Local tag: 'datafield'
>>                  tag=datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'CDI'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'CDI'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'SL'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: 'Non inventorie'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '000006'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '2'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                         '
>>                 Local tag: 'subfield'
>>                      tag=subfield/datafield/record/collection/
>>                     Data: '27'
>>                      tag=subfield/datafield/record/collection/
>>                 Data: '
>>                 '
>>                  tag=datafield/record/collection/
>>             Data: '
>>         '
>>              tag=record/collection/
>>         Data: '
>> '
>>          tag=collection/
>> -------------
>>
>> 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
>> 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
>> 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
>> 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
>> 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
>> 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
>> 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
>> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close p=0x8106c70
>> 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
>> 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
>> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
>> 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
>> [paul@bureau unimarc]$
>
>



_______________________________________________
Koha-devel mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/koha-devel

Re: Zebra config problem (still 1)

Adam Dickmeiss
In reply to this post by Tümer Garip
Tümer Garip wrote:

> Adam wrote in reply to Paul:
>
>> cwd=/home/paul/koha.dev/head/misc/zebra/unimarc
>> 15:40:34-03/02 zebrasrv(1) [log] record 0 type XML
>> 15:40:34-03/02 zebrasrv(1) [log] 3129 bytes:
>> <?xml version="1.0" encoding="UTF-8"?>
>> <collection
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...
>
>
>
>>> Your root element is collection, not record. I don't
>>> think melm will match that. Had you used record as the root
>>> element, it should do it.
>>>
>>> It's always a good idea to try things out with
>>> zebraidx -s update testrec.xml
>>> and see what gets matched. (Look for the Idx: lines.)
>
>
>
> Oops, this is getting more complicated for me.
> MARC21 slim defines MARCXML as
> <collection><record></record></collection>. The Perl MARC module
> produces XML according to this.
>
The MARC21 schema also allows <record>...</record> as the root. The
collection element was only added as a container for several records
(XML allows exactly one root element).
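
As a rough illustration of the point above (not something from the Zebra or MARC::Record code), a collection can be split so that each record becomes its own root element before it is handed to zebraidx or sent as an update. This is a minimal Python sketch using only the standard library; the function name split_collection is made up for this example, and it assumes the input is well-formed MARCXML with either collection or record as root.

```python
import xml.etree.ElementTree as ET

# MARC21 slim namespace; registering it with an empty prefix keeps the
# serialized records free of generated "ns0:" prefixes.
MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("", MARC_NS)


def local_name(tag):
    """Strip a leading '{namespace}' part from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def split_collection(xml_text):
    """Return every <record> as a standalone XML string.

    Hypothetical helper: accepts a document whose root is either
    <collection> (one or more records inside) or a single <record>.
    """
    root = ET.fromstring(xml_text)
    if local_name(root.tag) == "record":
        records = [root]
    else:
        records = [child for child in root if local_name(child.tag) == "record"]
    # tostring() also emits the whitespace that followed the element,
    # so strip it to get a clean one-record document.
    return [ET.tostring(rec, encoding="unicode").strip() for rec in records]
```

Each returned string has record as its root, which is the shape the melm rules are expected to match according to the advice quoted above.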

> When you run zebraidx on proper MARCXML, you get an error saying
> collection.abs could not be found.
> So zebraidx (and ZOOM, for that matter) forces you to rename
> marc21.abs to collection.abs.
>
> To make things more complicated, everything seems to work until you try
> to use the recordId feature.
> If your XML file omits the MARC tags below 10, then recordId works
> every time. However, with tags below 10 present, about 30% of my 50k
> records failed with a bad-match error.
>
> So I am trying to work out whether the records reproduced by Koha are
> the cause or not. I'll post the outcome on this list.
I hope to get a sample, so that I can try this out myself.
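
When trying a sample, the Idx: lines printed by zebraidx -s can be filtered to see whether a record actually produced a key for the recordId match. The following is a small Python sketch, not part of Zebra; the helper names are invented here, and the pattern only assumes the line layout shown in the log earlier in this thread (Idx: [register]attset:name [number] data XData:"..."), with each Idx: entry on a single unwrapped line.

```python
import re

# Layout taken from the log lines in this thread, for example:
#   Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
IDX_RE = re.compile(
    r'Idx:\s*\[(\w)\](\w+):([\w-]+)\s*\[(\d+)\]\s*data\s+XData:"(.*)"'
)


def index_entries(log_text):
    """Collect one dict per Idx: line found in zebraidx -s output."""
    entries = []
    for line in log_text.splitlines():
        match = IDX_RE.search(line)
        if match:
            register, attset, name, number, data = match.groups()
            entries.append({
                "register": register,   # e.g. "w" (word) or "p" (phrase)
                "attset": attset,       # e.g. "bib1"
                "name": name,           # e.g. "Identifier-standard"
                "number": int(number),  # e.g. 1007
                "data": data,
            })
    return entries


def match_keys(entries, name="Identifier-standard"):
    """Return the distinct data values indexed under the given attribute."""
    return sorted({e["data"] for e in entries if e["name"] == name})
```

An empty result from match_keys for a record would suggest that nothing was extracted for the match key, which is one thing worth ruling out before looking further into the bad-match errors.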

/ Adam

>
> Regards,
> Tumer Garip
>
>


