Formatting control number searches

Formatting control number searches

barton
I'm working on generating links for 7XX linking fields in XSLT, à la Bug 15140 - Add MARC21 776 to OPAC and staff display (https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15140).

This takes the 773$w and uses the XSLT template 'extractControlNumber', which strips the leading MARCOrgCode, e.g. turning (OCoLC)2776588 into 2776588 ... and then the search URL is


The problem is that 2776588 isn't the actual OCLC control number. The formatting rules are here: https://help.oclc.org/Metadata_Services/OCLC-MARC_records/Content_designators_for_bibliographic_data/20Bibliographic_record_control_fields

In short:

1-8 digits: zero-pad to 8 digits, prepend 'ocm':

2776588 -> ocm02776588

9 digits: prepend 'ocn'

10 or more digits: prepend 'on'
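The padding and prefix rules above can be sketched outside XSLT as follows. This is only an illustration in Python, not Koha code: the function name format_oclc_number is made up, and treating leading zeros in $w as insignificant is an assumption.

```python
from typing import Optional

OCLC_PREFIX = "(OCoLC)"


def format_oclc_number(subfield_w: str) -> Optional[str]:
    """Turn a $w value such as '(OCoLC)2776588' into the prefixed 001 form.

    Returns None when the value does not look like an OCLC control number.
    """
    value = subfield_w.strip()
    if not value.startswith(OCLC_PREFIX):
        return None
    digits = value[len(OCLC_PREFIX):].strip()
    if not (digits.isascii() and digits.isdigit()):
        return None
    # Assumption: leading zeros in $w carry no meaning, so normalise them away.
    digits = str(int(digits))
    if len(digits) <= 8:
        return "ocm" + digits.zfill(8)  # 1-8 digits: zero-pad to 8, prepend 'ocm'
    if len(digits) == 9:
        return "ocn" + digits           # 9 digits: prepend 'ocn'
    return "on" + digits                # 10 or more digits: prepend 'on'


if __name__ == "__main__":
    for w in ("(OCoLC)2776588", "(OCoLC)112776588", "(OCoLC)1112776588"):
        print(w, "->", format_oclc_number(w))
```

Returning None for anything without the (OCoLC) prefix leaves the caller free to fall back to the plain stripped number.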

I tried formatting the search with a leading '*':


as well as with leading and trailing '*'


Neither of these worked.

Formatting the control number according to the OCLC rules above did work:


So I have three questions:

1) Is there a way to format a Control-number search that doesn't force me to generate the control number according to the OCLC rules above?

2) If not, does anyone have pointers about how to do the following:

* Test for (OCoLC) ->
  * Test length of control number
    * < 9 -> 0 pad to 8 digits, prepend 'ocm'
    * 9  -> prepend 'ocn'
    * > 9  -> prepend 'on'

3) Are there other control number formatting rules for other MARCOrgCodes? 

_______________________________________________
Koha-devel mailing list
[hidden email]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

Re: Formatting control number searches

Katrin Fischer-2

Hi Barton,

Control-number is the index on 001. 001 should have the number and 003 the MarcOrgCode, which is why the code is stripped from $w for the search. I don't know about OCLC's practices, so I can't tell how numbers are handled there. The examples here show a number with ocm in 001:

http://www.loc.gov/marc/bibliographic/bd001.html

The description for $w (http://www.loc.gov/marc/bibliographic/bd76x78x.html) doesn't have a matching example:

"System control number of the related record preceded by the MARC code, enclosed in parentheses, for the agency to which the control number applies."

Hope this helps,

Katrin


Re: Formatting control number searches

barton


On Thu, May 24, 2018 at 3:34 PM, Katrin Fischer <[hidden email]> wrote:

Hi Barton,

Control-number is the index on 001. 001 should have the number and 003 the MarcOrgCode, that's why it's stripped from $w for search. I don't know about OCLCs practices, so can't tell how numbers are handled there. The examples here show a number with ocm in 001:

http://www.loc.gov/marc/bibliographic/bd001.html

From that link:

Contains the control number assigned by the organization creating, using, or distributing the record. For interchange purposes, documentation of the structure of the control number and input conventions should be provided to exchange partners by the organization initiating the interchange.

That potentially means that we would have to write XSLT to transform the links in $w for each 'exchange partner' -- i.e. test the Marc Org Code, then apply a bunch of rules to generate a value that we can search for.

The examples don't leave me brimming with confidence that most exchange partners will use the same format for $w (after the Org Code) as for the 001:

001 #880524405##
003 CaOONL

001 ###86104385#
003 DLC

001 ocm14919759
003 OCoLC

001 #####9007496
003 DNLM

The description for $w (http://www.loc.gov/marc/bibliographic/bd76x78x.html) doesn't have a matching example:

"System control number of the related record preceded by the MARC code, enclosed in parentheses, for the agency to which the control number applies."

Hope this helps,

Katrin

Well, at the very least, it lets me know what I'm getting myself into.

I wonder if there's a way of translating the values found in $w into 001 outside of XSLT, which is not well suited to the task. Could we do it in Perl, and stash the results in some 9XX field?

I was kind of hoping that we would be able to use whatever we got back from extractControlNumber as a base for any transformations. That may or may not be a safe assumption.
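To make the "test the Marc Org Code, then apply a bunch of rules" idea concrete, here is a rough sketch of a per-agency lookup table. It is written in Python purely for illustration (the thread suggests Perl). The names FORMATTERS and control_number_for_search are hypothetical, and only the OCLC rule comes from this thread; any other agency would need its own documented rule added to the table.

```python
from typing import Callable, Dict


def _oclc(number: str) -> str:
    """OCLC rule from this thread: pad and prefix according to digit count."""
    if len(number) <= 8:
        return "ocm" + number.zfill(8)
    if len(number) == 9:
        return "ocn" + number
    return "on" + number


# Hypothetical registry: MARC organization code -> rule that builds the 001 value.
FORMATTERS: Dict[str, Callable[[str], str]] = {
    "OCoLC": _oclc,
}


def control_number_for_search(subfield_w: str) -> str:
    """Return the value to search for in the Control-number index.

    Falls back to the number with the org code stripped (what
    extractControlNumber is described as returning) when no rule is registered.
    """
    org, sep, number = subfield_w.strip().partition(")")
    if not sep or not org.startswith("("):
        return subfield_w.strip()  # no "(code)" prefix, use the value as-is
    number = number.strip()
    formatter = FORMATTERS.get(org[1:])
    return formatter(number) if formatter else number


if __name__ == "__main__":
    # The second value is a made-up example for an agency without a special rule.
    for w in ("(OCoLC)2776588", "(DE-101)123456789"):
        print(w, "->", control_number_for_search(w))
```

With this shape, the stripped number stays the base for every transformation, and agencies whose $w already matches their 001 need no entry at all.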
 



Re: Formatting control number searches

Katrin Fischer-2

I don't know about other sources, but in Germany I've never encountered data where the numbers in $w minus MarcOrgCode don't match the one in 001. Maybe this problem is just specific to OCLC?

Katrin


Re: Formatting control number searches

barton
Just for reference, I figured out how to do the OCLC number reformatting in XSLT.

If we have a file 'format_oclc.marcxml':

<?xml version="1.0" encoding="UTF-8"?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="w">(OCoLC)2776588</subfield>
    <subfield code="w">(OCoLC)12776588</subfield>
    <subfield code="w">(OCoLC)112776588</subfield>
  </datafield>
</record>

And an xslt file 'format_oclc.xslt':

<!DOCTYPE stylesheet >
<xsl:stylesheet version="1.0" xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

    <!-- Convert a $w value such as "(OCoLC)2776588" into the form used in the OCLC 001 -->
    <xsl:template name="format_OCLC_number">
        <xsl:param name="controlnumber"/>
        <xsl:variable name="OCLC_number" select="substring-after($controlnumber,'(OCoLC)')"/>
        <xsl:variable name="OCLC_length" select="string-length($OCLC_number)"/>
        <xsl:if test="$OCLC_number">
            <xsl:choose>
                <!-- 1-8 digits: zero-pad to 8 digits, prepend 'ocm' -->
                <xsl:when test="$OCLC_length &lt; 9">
                    <xsl:variable name="OCLC_number_padding" select="substring( '00000000', 1, 8 - $OCLC_length)"/>
                    <xsl:variable name="formatted_OCLC_number" select="concat( $OCLC_number_padding, $OCLC_number )"/>
                    <xsl:value-of select="concat( 'ocm', $formatted_OCLC_number )" />
                </xsl:when>
                <!-- 9 digits: prepend 'ocn' -->
                <xsl:when test="$OCLC_length = 9">
                    <xsl:value-of select="concat( 'ocn', $OCLC_number )" />
                </xsl:when>
                <!-- 10 or more digits: prepend 'on' -->
                <xsl:otherwise>
                    <xsl:value-of select="concat( 'on', $OCLC_number )" />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:if>
    </xsl:template>

    <!-- Format every 773$w in the record, one value after another -->
    <xsl:template match="marc:record">
        <xsl:for-each select="marc:datafield[@tag=773]">
            <xsl:for-each select="current()/marc:subfield[@code='w']">
                <xsl:call-template name="format_OCLC_number">
                    <xsl:with-param name="controlnumber" select="current()"/>
                </xsl:call-template>
                <xsl:value-of select="'
                '" />
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

we can run xsltproc:

$ xsltproc format_oclc.xslt format_oclc.marcxml
<?xml version="1.0"?>
ocm02776588                 ocm12776588                 ocn112776588

I'll be filing a bug/patch relatively soon that incorporates this.

