A few ZOOM::Package questions

Joshua Ferraro-3
Hi guys,

Just got a few questions about ZOOM::Package:

In the documentation, the example listed is:

$p->send("createdb");

I'm assuming this is a typo and the actual service
type is just 'create' ... right?

Does Zebra support the 'itemorder' service type? If so, how is it
used?

Will ZOOM support the 'drop' send() method at some point?

What are the scenarios in which the 'commit' method would be
useful?

What is the xmlupdate method? :-)

We're trying to figure out when it would be useful (if ever) to
use recordIdOpaque and/or recordIdName. The only reference to them
I can find is here: http://www.indexdata.dk/zebra/NEWS where it
says:

"Allow Remote insert/delete/replace/update with record, recordIdNumber
(sysno) and/or recordIdOpaque(user supplied record Id). If both
IDs are omitted internal record ID match is assumed (recordId: - in
zebra cfg)."

Could someone provide a couple of scenarios where these would be used?

When would it be useful to populate the databaseName option? Isn't
that already provided by the $conn object? Or can we connect to
multiple databases in that object and then use that option to
specify which one a given package operation should be performed
on?

Finally, no mention is made of the record update 'syntax' option in
the ZOOM documentation. Is it supported? If so, what are the valid
values?

Thanks,

--
Joshua Ferraro               VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                                Featuring Koha Open-Source ILS
[hidden email] |Full Demos at http://liblime.com/koha |1(888)KohaILS


_______________________________________________
Koha-zebra mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/koha-zebra

Re: A few ZOOM::Package questions

Sebastian Hammer
I can answer one of these. I think the others are mostly for Mike.

Joshua Ferraro wrote:

>Hi guys,
>
>Just got a few questions about ZOOM::Package:
>
>In the documentation, the example listed is:
>
>$p->send("createdb");
>
>I'm assuming this is a typo and the actual service
>type is just 'create' ... right?
>
>Does Zebra support the 'itemorder' service type? if so, how is it
>used?
>  
>
No. ItemOrder is generally used to support ILL (interlibrary loan),
which of course isn't relevant for Zebra.

There are a couple of profiles at the Z39.50 maintenance agency
describing how to use ItemOrder to carry bits and pieces of the ISO ILL
protocol through Z39.50. Mind you, this is very rarely used, to my
knowledge. The more common approaches involve ILL directly over
TCP/IP or SMTP. Of course, some consortia use NCIP to support ILL-like
operations as well.

--Seb

>Will ZOOM support the 'drop' send() method at some point?
>
>What are the case scenerios where the 'commit' method would be
>useful?
>
>What is the xmlupdate method? :-)
>
>We're trying to figure out when it would be useful (if ever) to
>use recordIdOpaque and/or recordIdName. The only reference to them
>I can find is here: http://www.indexdata.dk/zebra/NEWS where it
>says:
>
>"Allow Remote insert/delete/replace/update with record, recordIdNumber
>(sysno) and/or recordIdOpaque(user supplied record Id). If both
>IDs are omitted internal record ID match is assumed (recordId: - in
>zebra cfg)."
>
>Could someone provide a couple case scenerios where these would be used?
>
>When would it be useful to populate the databaseName option? Isn't
>that already provided by the $conn object? or can we connect to
>multiple databases in that object and then specify which one we
>want the specific package operation to be performed using that
>option?
>
>Finally, no mention is made of the record update 'syntax' option in
>the ZOOM documentation. Is it supported? if so, what are the valid
>values?
>
>Thanks,
>
>  
>

--
Sebastian Hammer, Index Data
[hidden email]   www.indexdata.com
Ph: (603) 209-6853




Re: A few ZOOM::Package questions

Mike Taylor-2
In reply to this post by Joshua Ferraro-3
> Date: Fri, 17 Feb 2006 08:44:05 -0800
> From: Joshua Ferraro <[hidden email]>
>
> Just got a few questions about ZOOM::Package:
>
> In the documentation, the example listed is:
>
> $p->send("createdb");
>
> I'm assuming this is a typo and the actual service
> type is just 'create' ... right?

Yes!  Thanks for spotting this, I've now fixed it in CVS.

> Will ZOOM support the 'drop' send() method at some point?

ZOOM already does support it; the problem is that Zebra doesn't --
there's a known bug whereby that call corrupts the registers.  So
until that's fixed (and it is on the list), while you _can_ send a
"drop" request from ZOOM, you'd better not!  :-)

> What are the scenarios in which the 'commit' method would be
> useful?

When using shadow registers in Zebra.

> What is the xmlupdate method? :-)

Don't know -- that's one for Adam to answer.

> We're trying to figure out when it would be useful (if ever) to use
> recordIdOpaque and/or recordIdName. The only reference to them I can
> find is here: http://www.indexdata.dk/zebra/NEWS where it says:
>
> "Allow Remote insert/delete/replace/update with record,
> recordIdNumber (sysno) and/or recordIdOpaque(user supplied record
> Id). If both IDs are omitted internal record ID match is assumed
> (recordId: - in zebra cfg)."
>
> Could someone provide a couple of scenarios where these would be
> used?

I've not heard of "recordIdNumber" before, but I am guessing that it
just means "the unique ID of the record, as extracted from the record
itself using the rules specified in the Zebra configuration".  As you
know, every record in a Zebra database is identified by a, uh,
identifier.  That can either be drawn from the record itself (which
is probably what you want when adding MARC records that have IDs in
them), or you can specify an ID explicitly when you add the record.

I don't know about scenarios -- again, Adam would have a better
handle on that.

> When would it be useful to populate the databaseName option?

Always, when creating or destroying a database.

> Isn't that already provided by the $conn object?

No.  You might have a connection pointing at one database, and want to
create or destroy a different one.
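
A minimal sketch of that case, assuming the Perl ZOOM module and a
$conn that is already a ZOOM::Connection. The helper name
create_database and the host, port and database names in the comments
are made up for illustration; only package(), option(), send() and
destroy() are taken from the ZOOM::Package interface.

```perl
use strict;
use warnings;

# Create a database whose name may differ from the one the connection
# was opened with. $conn is assumed to be a ZOOM::Connection.
sub create_database {
    my ($conn, $dbname) = @_;
    my $p = $conn->package();              # new extended-services package
    $p->option(databaseName => $dbname);   # the database to create
    $p->send("create");                    # service type is "create"
    $p->destroy();                         # free the package
    return $dbname;
}

# Hypothetical usage (host, port and names are illustrative only):
#   use ZOOM;
#   my $conn = new ZOOM::Connection("localhost", 9999,
#                                   databaseName => "marc");
#   create_database($conn, "dc");   # connection still points at "marc"
```

The same shape with send("drop") would destroy a database, but given
the register-corruption bug mentioned above it is probably best left
alone for now.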

> or can we connect to multiple databases in that object and then
> specify which one we want the specific package operation to be
> performed using that option?

There's no "connect to multiple databases" functionality in ZOOM, no.
(What would such functionality mean?)

> Finally, no mention is made of the record update 'syntax' option in
> the ZOOM documentation. Is it supported? if so, what are the valid
> values?

Sorry -- again, I don't know.  Adam's yer man.

 _/|_ ___________________________________________________________________
/o ) \/  Mike Taylor  <[hidden email]>  http://www.miketaylor.org.uk
)_v__/\  "I wrote `Microsoft's a monopolist' and the New York Times wanted
         to edit it to say, `Microsoft is innovative'" -- Steve Wozniak
         on writing for the NYT.




Re: A few ZOOM::Package questions

Joshua Ferraro-3
On Mon, Feb 20, 2006 at 11:39:54AM +0000, Mike Taylor wrote:
> > What are the case scenerios where the 'commit' method would be
> > useful?
>
> When using shadow registers in Zebra.
Which we definitely want to use. So do we 'commit' after
multiple update services or after each one?

A somewhat related question is what you recommend for handling
the $Zconn object. Our early proof-of-concept code had a new
connection object being created and destroyed for every update
action -- so ... should there be a single $Zconn for the whole
system? Or should we handle each incoming request as a separate
connection (for search, update, create, and drop, when available)?
One idea proposed was having our Context.pm module handle the
connection in the same way we currently handle the dbh connection
to MySQL ... but I know Z39.50 is a bit more stateful than MySQL...
any suggestions?

> > What is the xmlupdate method? :-)
>
> Don't know -- that's one for Adam to answer.
Cool ... hopefully he's cc'd on this...

> > We're trying to figure out when it would be useful (if ever) to use
> > recordIdOpaque and/or recordIdName. The only reference to them I can
> > find is here: http://www.indexdata.dk/zebra/NEWS where it says:
> >
> > "Allow Remote insert/delete/replace/update with record,
> > recordIdNumber (sysno) and/or recordIdOpaque(user supplied record
> > Id). If both IDs are omitted internal record ID match is assumed
> > (recordId: - in zebra cfg)."
> >
> > Could someone provide a couple case scenerios where these would be
> > used?
>
> I've not heard of "recordIdNumber" before, but I am guessing that it
> just means "the unique ID of the record, as extracted from the record
> itself using the rules specified in the Zebra configuration".  As you
> know, every record in a Zebra database is identified by a, uh,
> identifier.  That can either be drawn from the record itself (which
> is probably what you want when adding MARC records that have IDs in
> them), or you can specify an ID explicitly when you add the record.
>
> I don't know about case scenerios -- again, Adam would have a better
> handle on that.
Ditto.

> > When would it be useful to populate the databaseName option?
>
> Always, when creating or destroying a database.
>
> > Isn't that already provided by the $conn object?
>
> No.  You might have a connection pointing at one database, and want to
> create or destroy a different one.
>
> > or can we connect to multiple databases in that object and then
> > specify which one we want the specific package operation to be
> > performed using that option?
>
> There's no "connect to multiple databases" functionality in ZOOM, no.
> (What would such functionality mean?)
In my estimation, that would mean that you could connect to multiple
databases in the connection object, then specify which one you want
to interact with for a given operation. One way this could be used would
be for a Zebra installation that had two databases, one for full-text
items in Dublin Core, and one using MARC records -- the MARC editor would
first commit the full-text stuff to the DC db then the MARC records
to the MARC db. But I think I understand now that the connection object
only allows a single connection at a time.

> > Finally, no mention is made of the record update 'syntax' option in
> > the ZOOM documentation. Is it supported? if so, what are the valid
> > values?
>
> Sorry -- again, I don't know.  Adam's yer man.
Ditto.

Cheers,


Re: A few ZOOM::Package questions -- some Adam questions

Sebastian Hammer
Joshua Ferraro wrote:

>On Mon, Feb 20, 2006 at 11:39:54AM +0000, Mike Taylor wrote:
>  
>
>>>What are the case scenerios where the 'commit' method would be
>>>useful?
>>>      
>>>
>>When using shadow registers in Zebra.
>>    
>>
>Which we definitely want to use. So do we 'commit' after
>multiple update services or after each one?
>  
>
Either way should be fine. All changes you've made prior to the commit
become visible instantly when the commit process begins (because at that
time, the UI begins to read the 'dirty' blocks of the index from the
shadow files instead of from the main index).

>A somewhat related question is what you recommend for handling
>the $Zconn object. Our early proof-of-concept code had a new
>connection object being created and destroyed for every update
>action -- so ... should their be a single $Zconn for the whole
>system? or should we handle each incoming request as a separate
>connection (for search, update, create, drop (when available).
>One idea proposed was having our Context.pm module handle the
>connection in the same way we currently handle the dbh connection
>to MySQL ... but I know Z39.50 is a bit more stateful than MySQL...
>any suggestions?
>  
>
I'm not sure it makes much of a difference. If you do have multiple
records you need to update, it should perform better if you can update
them all at once, but usually I guess you'll be doing single records.

>>>What is the xmlupdate method? :-)
>>>      
>>>
>>Don't know -- that's one for Adam to answer.
>>    
>>
>Cool ... hopefully he's cced on this...
>  
>
He is now, at least.

>>>We're trying to figure out when it would be useful (if ever) to use
>>>recordIdOpaque and/or recordIdName. The only reference to them I can
>>>find is here: http://www.indexdata.dk/zebra/NEWS where it says:
>>>
>>>"Allow Remote insert/delete/replace/update with record,
>>>recordIdNumber (sysno) and/or recordIdOpaque(user supplied record
>>>Id). If both IDs are omitted internal record ID match is assumed
>>>(recordId: - in zebra cfg)."
>>>
>>>Could someone provide a couple case scenerios where these would be
>>>used?
>>>      
>>>
>>I've not heard of "recordIdNumber" before, but I am guessing that it
>>just means "the unique ID of the record, as extracted from the record
>>itself using the rules specified in the Zebra configuration".  As you
>>know, every record in a Zebra database is identified by a, uh,
>>identifier.  That can either be drawn from the record itself (which
>>is probably what you want when adding MARC records that have IDs in
>>them), or you can specify an ID explicitly when you add the record.
>>
>>I don't know about case scenerios -- again, Adam would have a better
>>handle on that.
>>    
>>
>Ditto.
>  
>
I think chapter 5 should provide a reasonable explanation of the
options. If you want to update records remotely, as you do, you will
either need to explicitly provide an identifier, or have Zebra derive
one from the record (i.e. a field 001 or similar).  
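
A rough sketch of the "explicitly provide an identifier" route,
assuming the Perl ZOOM module. The helper name update_with_opaque_id is
hypothetical; the option names (action, recordIdOpaque, record) are the
ones discussed in this thread and in the ZOOM extended-services
options, and how Zebra matches the ID still depends on the server
configuration.

```perl
use strict;
use warnings;

# Insert-or-replace one record under a caller-chosen identifier.
# $conn is assumed to be a ZOOM::Connection.
sub update_with_opaque_id {
    my ($conn, $id, $xml) = @_;
    my $p = $conn->package();
    $p->option(action         => "specialUpdate");  # insert, or replace if present
    $p->option(recordIdOpaque => $id);              # user-supplied record ID
    $p->option(record         => $xml);             # the record itself
    $p->send("update");
    $p->destroy();
    return $id;
}
```

To have Zebra derive the identifier from the record instead, the same
call without the recordIdOpaque option should do, provided a recordId
rule is set in the Zebra configuration.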

>>>When would it be useful to populate the databaseName option?
>>>      
>>>
>>Always, when creating or destroying a database.
>>
>>    
>>
>>>Isn't that already provided by the $conn object?
>>>      
>>>
>>No.  You might have a connection pointing at one database, and want to
>>create or destroy a different one.
>>
>>    
>>
>>>or can we connect to multiple databases in that object and then
>>>specify which one we want the specific package operation to be
>>>performed using that option?
>>>      
>>>
>>There's no "connect to multiple databases" functionality in ZOOM, no.
>>(What would such functionality mean?)
>>    
>>
>In my estimation, that would mean that you could connect to multiple
>databases in the connection object, then specify which one you want
>to interact with for a given operation. One way this could be used would
>be for a Zebra installation that had two databases, one for full-text
>items in Dublin Core, and one using MARC records -- the MARC editor would
>first commit the full-text stuff to the DC db then the MARC records
>to the MARC db. But I think I understand now that the connection object
>only allows a single connection at a time.
>  
>
In Z39.50, you don't really 'connect' to a database; you connect to a
target. In the wire protocol, the database name is provided for each
search, update, etc. operation. Providing the database name at the
yaz_connect stage is meant as a convenience, but the name is really
just stored internally until a search is set off.

If you specify multiple databases in yaz_connect (this can be done by
separating the names with the '+' character in the ZOOM API), then the
server, if it supports multiple database names (Zebra does, AFAIK),
will search across those logical database names.
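
As an illustration of the '+' convention, here is a small sketch
assuming the Perl ZOOM module. The helper name search_across and the
database names in the comment are invented; option(), search_pqf() and
size() are the calls it relies on.

```perl
use strict;
use warnings;

# Run one PQF query across several logical databases and return the
# hit count. $conn is assumed to be a ZOOM::Connection.
sub search_across {
    my ($conn, $pqf, @dbs) = @_;
    # The name is only stored here; it goes on the wire with the search.
    $conn->option(databaseName => join("+", @dbs));
    my $rs = $conn->search_pqf($pqf);
    return $rs->size();
}

# Hypothetical usage (database names are illustrative only):
#   my $hits = search_across($conn, '@attr 1=4 dinosaur', "marc", "dc");
```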

>>>Finally, no mention is made of the record update 'syntax' option in
>>>the ZOOM documentation. Is it supported? if so, what are the valid
>>>values?
>>>      
>>>
>>Sorry -- again, I don't know.  Adam's yer man.
>>    
>>
>Ditto.
>  
>
To my knowledge, Zebra only supports updating via XML at the moment.

--Seb


Re: A few ZOOM::Package questions

Adam Dickmeiss
In reply to this post by Joshua Ferraro-3
Joshua Ferraro wrote:

> On Mon, Feb 20, 2006 at 11:39:54AM +0000, Mike Taylor wrote:
>
>>>What are the case scenerios where the 'commit' method would be
>>>useful?
>>
>>When using shadow registers in Zebra.
>
> Which we definitely want to use. So do we 'commit' after
> multiple update services or after each one?
>
> A somewhat related question is what you recommend for handling
> the $Zconn object. Our early proof-of-concept code had a new
> connection object being created and destroyed for every update
> action -- so ... should their be a single $Zconn for the whole
> system? or should we handle each incoming request as a separate
> connection (for search, update, create, drop (when available).
> One idea proposed was having our Context.pm module handle the
> connection in the same way we currently handle the dbh connection
> to MySQL ... but I know Z39.50 is a bit more stateful than MySQL...
> any suggestions?
>
>
>>>What is the xmlupdate method? :-)
>>
>>Don't know -- that's one for Adam to answer.
xmlupdate is a privately defined extended service. The package itself is
an XML document, and its semantics are defined by the XML itself. Zebra
doesn't support this, so you shouldn't worry about it. 'xmlupdate' is a
bad name, admittedly. It should have been called 'xmlesprivate' or similar.

>
> Cool ... hopefully he's cced on this...
>
>
>>>We're trying to figure out when it would be useful (if ever) to use
>>>recordIdOpaque and/or recordIdName. The only reference to them I can
>>>find is here: http://www.indexdata.dk/zebra/NEWS where it says:
>>>
>>>"Allow Remote insert/delete/replace/update with record,
>>>recordIdNumber (sysno) and/or recordIdOpaque(user supplied record
>>>Id). If both IDs are omitted internal record ID match is assumed
>>>(recordId: - in zebra cfg)."
>>>
>>>Could someone provide a couple case scenerios where these would be
>>>used?
>>
>>I've not heard of "recordIdNumber" before, but I am guessing that it
>>just means "the unique ID of the record, as extracted from the record
>>itself using the rules specified in the Zebra configuration".  As you
>>know, every record in a Zebra database is identified by a, uh,
>>identifier.  That can either be drawn from the record itself (which
>>is probably what you want when adding MARC records that have IDs in
>>them), or you can specify an ID explicitly when you add the record.
>>

recordIdNumber is the ID that Zebra automatically assigns to each record
(currently an integer). It's the <localnumber>..</localnumber> that is
part of the XML in grs.-class records, i.e.

</Date-of-Last-Modification>

   <idzebra xmlns="http://www.indexdata.dk/zebra/">
     <size>2704</size>
     <localnumber>14</localnumber>
     <filename>gils/esdd0006.grs</filename>
   </idzebra>
</gils>

>>I don't know about case scenerios -- again, Adam would have a better
>>handle on that.
>
> Ditto.
Not of much help, I must admit. I don't think recordIdNumber has been
used in any real-life application.

>
>
>>>When would it be useful to populate the databaseName option?
>>
>>Always, when creating or destroying a database.
>>
>>
>>>Isn't that already provided by the $conn object?
>>
>>No.  You might have a connection pointing at one database, and want to
>>create or destroy a different one.
>>
>>
>>>or can we connect to multiple databases in that object and then
>>>specify which one we want the specific package operation to be
>>>performed using that option?
>>
>>There's no "connect to multiple databases" functionality in ZOOM, no.
>>(What would such functionality mean?)
>
> In my estimation, that would mean that you could connect to multiple
> databases in the connection object, then specify which one you want
> to interact with for a given operation. One way this could be used would
> be for a Zebra installation that had two databases, one for full-text
> items in Dublin Core, and one using MARC records -- the MARC editor would
> first commit the full-text stuff to the DC db then the MARC records
> to the MARC db. But I think I understand now that the connection object
> only allows a single connection at a time.
>
>
>>>Finally, no mention is made of the record update 'syntax' option in
>>>the ZOOM documentation. Is it supported? if so, what are the valid
>>>values?
>>
syntax is the record syntax attached to each record in an update. By
default (if syntax is not given), XML is used.

Again, Zebra doesn't use this. Other "better" servers might. Had syntax
been a string, it could have been the record type (such as grs.xml). But
it isn't: it is an OID, and currently we don't map from OID to filter.

/ Adam

>>Sorry -- again, I don't know.  Adam's yer man.
>
> Ditto.
>
> Cheers,
>




Re: A few ZOOM::Package questions -- some Adam questions

Mike Taylor-2
In reply to this post by Sebastian Hammer
To expand on a couple of Seb's answers ...

> Date: Tue, 21 Feb 2006 10:27:03 -0500
> From: Sebastian Hammer <[hidden email]>
>
>> Which we definitely want to use. So do we 'commit' after
>> multiple update services or after each one?
>
> Either way should be fine.. all changes you've made prior to the
> commit become visible instantly when the commit process begins
> (because at that time, the UI begins to read the 'dirty' blocks of
> the index from the shadow files instead of from the main index.

Either way will work "mechanically", so you have a choice to make.
That choice should be driven primarily by your notion of what constitutes
an atomic transaction, semantically speaking, in your application.
For example, if you change an author-name in an authority record, and
also change the corresponding author fields in all your bibliographic
records, then that is, semantically speaking, a single operation (even
though it's implemented by updates to multiple records), so it should
all be a single transaction, i.e. don't call "commit" until you've
made all the changes.  A secondary concern is efficiency: in general,
it will (I think?  Adam?) be more efficient to do many updates
followed by a single commit, than to do many update-commit pairs.
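
The "many updates, one commit" pattern might look roughly like this
with the Perl ZOOM module. This is a sketch under assumptions: the
helper name update_then_commit is hypothetical, the records are taken
to be XML strings, and the module is assumed to raise an exception when
a send() fails, in which case the commit is never reached.

```perl
use strict;
use warnings;

# Send one "update" package per record, then a single "commit".
# $conn is assumed to be a ZOOM::Connection; @records are XML strings.
sub update_then_commit {
    my ($conn, @records) = @_;
    for my $xml (@records) {
        my $p = $conn->package();
        $p->option(action => "specialUpdate");  # insert or replace
        $p->option(record => $xml);
        $p->send("update");                     # goes to the shadow register
        $p->destroy();
    }
    my $c = $conn->package();
    $c->send("commit");                         # one commit for the whole batch
    $c->destroy();
    return scalar @records;                     # number of records sent
}
```

Passing in all the records that make up one semantic change (the
authority record plus every affected bibliographic record, in the
example above) keeps them inside a single transaction.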

>> A somewhat related question is what you recommend for handling the
>> $Zconn object. Our early proof-of-concept code had a new connection
>> object being created and destroyed for every update action -- so
>> ... should their be a single $Zconn for the whole system? or should
>> we handle each incoming request as a separate connection (for
>> search, update, create, drop (when available).  One idea proposed
>> was having our Context.pm module handle the connection in the same
>> way we currently handle the dbh connection to MySQL ... but I know
>> Z39.50 is a bit more stateful than MySQL...  any suggestions?
>
> I'm not sure it makes much of a difference. If you do have multiple
> records you need to update, it should perform better if you can
> update them all at once.. but usually I guess you'll be doing single
> records.

It's certainly conventional to make a single connection and use that
for the whole application.  When connecting to a remote server (which
is of course the classic scenario for Z39.50) this is a _big_
efficiency win: for example, the old Library of Congress server used
to take about twenty times as long to do a connect-init-search
sequence as a single search.  However, in the case of a Zebra-enabled
Koha, you presumably run the Zebra on the same box as the client code,
so you should have no DNS lookup and negligible connection time; and
Zebra handles Init very quickly.  Still and all, Init is another
round-trip, so if you're going to be doing many searches, it's still
going to be noticeably better to re-use the same connection for them
all instead of repeatedly tearing it down, remaking it and re-doing
the Init round-trip.
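
One way to get that re-use in a Context.pm-style module is a small
cache around the connection. The sketch below is only an illustration:
the name zconn, the cache key and the host, port and database in the
comment are all made up, and a real version would probably also want to
reconnect if the server goes away.

```perl
use strict;
use warnings;

my %zconn_cache;   # one connection per target, kept for the life of the process

# Return the cached connection for $key, building it on first use.
# $factory is a code reference that opens the connection.
sub zconn {
    my ($key, $factory) = @_;
    $zconn_cache{$key} ||= $factory->();   # connect (and Init) only once
    return $zconn_cache{$key};
}

# Hypothetical usage with the Perl ZOOM module:
#   use ZOOM;
#   my $conn = zconn("localhost:9999/koha", sub {
#       ZOOM::Connection->new("localhost", 9999, databaseName => "koha");
#   });
```

Taking the constructor as a code reference keeps the caching logic
independent of how the connection is actually opened.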

>>>> or can we connect to multiple databases in that object and then
>>>> specify which one we want the specific package operation to be
>>>> performed using that option?
>>>
>>> There's no "connect to multiple databases" functionality in ZOOM, no.
>>> (What would such functionality mean?)
>>
>> In my estimation, that would mean that you could connect to
>> multiple databases in the connection object, then specify which one
>> you want to interact with for a given operation. One way this could
>> be used would be for a Zebra installation that had two databases,
>> one for full-text items in Dublin Core, and one using MARC records
>> -- the MARC editor would first commit the full-text stuff to the DC
>> db then the MARC records to the MARC db.

Oh, you can certainly do _that_.  Just reset the connection object's
"databaseName" option before each operation:

        $conn->option(databaseName => "marcDB");
        # Do stuff in the database of MARC records
        $conn->option(databaseName => "dc");
        # Do stuff in the database of Dublin Core records

>> But I think I understand now that the connection object only allows
>> a single connection at a time.

Yes; but connections are different from databases!

> If you yaz_connect to multiple databases in yaz_connect (this can be
> done by separating the names with the '+' character in the ZOOM
> API), then the server, if it supports multiple database names (Zebra
> does, AFAIK), will search across those logical database names.

... i.e. it implements a union catalogue.  That can be useful, but (if
I've understood you correctly) it's a _different_ useful thing from
the one you're asking about.

 _/|_ ___________________________________________________________________
/o ) \/  Mike Taylor  <[hidden email]>  http://www.miketaylor.org.uk
)_v__/\  "Sure it stinks, but only a little stink; not the horrendous
         stench you might find in some other alleged ``science'' reports"
         -- Thomas R. Holtz, Jr. in a mellow mood


