[Bug 21972] New: Record matching rule for authorities only works for first 20 authority records

bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=21972

            Bug ID: 21972
           Summary: Record matching rule for authorities only works for
                    first 20 authority records
 Change sponsored?: ---
           Product: Koha
           Version: 18.11
          Hardware: All
                OS: All
            Status: NEW
          Severity: major
          Priority: P5 - low
         Component: MARC Authority data support
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

Created attachment 82955
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=82955&action=edit
MARC File containing geographical authority data

The summary already reflects my first interpretation. Here is what is happening:

I am trying to import some custom authority data (see attached mrc file)
and have defined a matching rule accordingly:

Match threshold: 1000
Record type: Authority record

Match point 1:
Search index: indentifier-standard
Score: 500
Tag: 024
Subfields: a
Offset/Length: 0

Match point 2:
Search index: indentifier-standard
Score: 500
Tag: 024
Subfields: 2
Offset/Length: 0

While using the command line interface (staging + commit) to automatically
update authority data, I noticed that the number of successfully matched
records was far too low.

As a strict test I then did the following:
1. Import the data using the bulkimport script
2. Upload the same data in the staging interface
3. See which records match and which do not, in order to find a pattern in
those that fail.

The only pattern I encountered was the following: only the first 20 records
imported initially are actually matched. That is, only authorities with ID 1 to
20 are successfully matched with the staged import; everything else is reported
as "No Match".

I tried this with different sets of place data, still the same result.

--
You are receiving this mail because:
You are watching all bug changes.
You are the assignee for the bug.
_______________________________________________
Koha-bugs mailing list
[hidden email]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

--- Comment #1 from Simon Hohl <[hidden email]> ---
Regarding step 1, I should add that I used a fresh Koha install (without any
authority data imported yet).

--- Comment #2 from Simon Hohl <[hidden email]> ---
Some further testing:

1) If I remove the match point for subfield 024$2, the staged records are all
matched correctly. But this is not a viable solution for a production
environment, because I basically stop checking if the ID in 024$a really is an
iDAI.gazetteer ID.
2) I suspect the error is caused by this line:

my ( $authresults, $total ) = $searcher->search_auth_compat( $search_query, 0, 20 );
See: https://gitlab.com/koha-community/Koha/blob/master/C4/Matcher.pm#L718

Is it possible that the matching rule searches for existing authority records
whose 024$2 subfield has the same value (= iDAI.gazetteer), but the result is
cut off at 20 results? Then, because 'iDAI.gazetteer' is the 024$2 value of ALL
place authority records, the result would contain only the first 20 records
returned, everything else would be discarded, and the check on 024$a would
somehow be ignored.

I have never coded anything in Perl, so I am not quite sure how the different
rules get applied - I may be wrong.
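
The hypothesis above can be illustrated with a toy model. This is a minimal Python sketch, not Koha's Matcher.pm: it assumes each match point runs its own search, each search returns at most 20 record IDs, and the scores of the returned records are summed and compared with the threshold. The names `search`, `get_matches` and `RESULT_LIMIT`, the subfield keys, and the 50 sample records are all made up for illustration.

```python
from collections import defaultdict

RESULT_LIMIT = 20   # assumption: mirrors the literal 20 in the quoted call
THRESHOLD = 1000    # match threshold from the reported matching rule


def search(index, field, value, limit):
    """Hypothetical stand-in for an index search: IDs of matching records,
    in ascending ID order, truncated to `limit` (None means no limit)."""
    hits = [rec_id for rec_id, rec in sorted(index.items()) if rec[field] == value]
    return hits[:limit]


def get_matches(index, incoming, match_points, limit=RESULT_LIMIT):
    """Sum the score of every match point whose search returned a record,
    then keep the records that reach the threshold."""
    scores = defaultdict(int)
    for field, score in match_points:
        for rec_id in search(index, field, incoming[field], limit):
            scores[rec_id] += score
    return [rec_id for rec_id, total in scores.items() if total >= THRESHOLD]


# 50 existing authorities: unique 024$a, identical 024$2 (made-up sample data).
index = {i: {"024a": f"place-{i}", "0242": "iDAI.gazetteer"} for i in range(1, 51)}
match_points = [("024a", 500), ("0242", 500)]

# Stage the same 50 records again and see which ones find a match.
matched = [i for i, rec in index.items() if get_matches(index, rec, match_points)]
unlimited = [i for i, rec in index.items()
             if get_matches(index, rec, match_points, limit=None)]

print(matched)         # only IDs 1 to 20 reach the threshold
print(len(unlimited))  # without the cut-off, all 50 records match
```

Under these assumptions the 024$a search always finds the right record (500 points), but the 024$2 search only ever returns IDs 1 to 20, so records 21 and above never collect the second 500 points. That would also be consistent with the observation that removing the 024$2 match point makes all records match.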

--- Comment #3 from Simon Hohl <[hidden email]> ---
I worked around this by creating a composite subfield 024$9 containing both
values from $2 and $a. Then I updated the matching rule to only check $9. But I
would still consider this a bug.
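
A minimal Python sketch of the composite-key idea, for illustration only. It treats a 024 field as a plain dict of subfield code to value. The helper name `add_composite_key`, the ":" separator, and the sample ID values are assumptions; the comment does not say how the two values were actually combined.

```python
def add_composite_key(field_024, separator=":"):
    """Return a copy of a 024 field with $9 set to '<$2><separator><$a>'.

    Hypothetical helper: the field is modelled as a dict mapping subfield
    codes to values; the separator is an assumption.
    """
    updated = dict(field_024)
    updated["9"] = f"{field_024['2']}{separator}{field_024['a']}"
    return updated


# Made-up sample data: 50 place authorities sharing the same source in $2.
fields = [{"a": str(n), "2": "iDAI.gazetteer"} for n in range(1, 51)]
keyed = [add_composite_key(f) for f in fields]

print(keyed[0]["9"])  # iDAI.gazetteer:1
```

Because each composite value is unique to one record, a single match point on $9 carrying the full score would return at most one hit per search, so a cut-off at 20 results would no longer matter.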

Martin Renvoize <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|18.11                       |master
                 CC|                            |martin.renvoize@ptfs-europe
                   |                            |.com

--- Comment #4 from Martin Renvoize <[hidden email]> ---
Hard-limiting the results to 20 does indeed feel somewhat wrong to me.
However, it's not specific to 18.11.x, as this code is also in master, so I'm
marking it as a bug there.
