Low Zebra performance for persistent/re-used connections

Low Zebra performance for persistent/re-used connections

David Cook

Hi all,

 

I’m writing a custom script that uses C4::Context->Zconn to create a Zebra connection and then I’m sending a bunch of queries.

 

If I don’t have the GATEWAY_INTERFACE environment variable defined, that method returns the same connection on every call. (Note that I’m working with Unix sockets and not TCP sockets.)
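To illustrate the behaviour described above, here is a minimal Python sketch of a connection provider that caches one connection unless GATEWAY_INTERFACE is set. It is not Koha's actual Perl code: the `ConnectionProvider` class, its `get` method, and the `connect` factory are hypothetical names, and returning a fresh connection when the variable is set is an assumption inferred from the description.

```python
import os
from typing import Any, Callable, Mapping, Optional


class ConnectionProvider:
    """Hypothetical stand-in for a Zconn-style accessor (not Koha's code)."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        # `connect` is an assumed factory that opens a new Zebra connection.
        self._connect = connect
        self._cached: Optional[Any] = None

    def get(self, environ: Optional[Mapping[str, str]] = None) -> Any:
        env = os.environ if environ is None else environ
        # Assumption: under CGI (GATEWAY_INTERFACE set) no caching happens.
        if "GATEWAY_INTERFACE" in env:
            return self._connect()
        # Otherwise the first connection is created once and re-used.
        if self._cached is None:
            self._cached = self._connect()
        return self._cached
```

A command line script normally runs without GATEWAY_INTERFACE, so every call to `get` in this sketch hands back the same object, which matches the re-use case described in this message.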

 

When re-using the same connection, my script uses 100% CPU.

 

When using new connections for every request, my script uses 3% CPU.

 

When re-using the same connection, Zebra seems pretty fast for the first 1000-2000 queries (I didn’t capture exact numbers), but performance then degrades rapidly. By around 3000 queries, Zebra is only processing about 4-6 queries per second.

 

When using new connections, Zebra is handling about 93 queries per second.
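A comparison like the one above can be made repeatable by recording queries per second for each window of queries, which shows when throughput starts to drop. The following is a minimal Python sketch under stated assumptions: `connect` and `run_query` are hypothetical callables standing in for whatever opens a Zebra connection and sends one query, and the connection object is assumed to have a `close()` method.

```python
import time
from typing import Any, Callable, Iterable, List


def measure_throughput(
    connect: Callable[[], Any],
    run_query: Callable[[Any, str], Any],
    queries: Iterable[str],
    reuse: bool,
    window: int = 500,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Return queries per second for each full window of `window` queries."""
    rates: List[float] = []
    conn = connect() if reuse else None  # one shared connection if re-using
    window_start = clock()
    done_in_window = 0
    for query in queries:
        if not reuse:
            conn = connect()  # fresh connection for every query
        run_query(conn, query)
        if not reuse:
            conn.close()
        done_in_window += 1
        if done_in_window == window:
            now = clock()
            elapsed = now - window_start
            rates.append(done_in_window / elapsed if elapsed > 0 else float("inf"))
            window_start = now
            done_in_window = 0
    if reuse and conn is not None:
        conn.close()
    # A trailing partial window is ignored to keep the rates comparable.
    return rates
```

Running it once with `reuse=True` and once with `reuse=False` over the same query list gives two lists of rates that can be compared window by window.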

 

Zebra forks a process for every connection, so I’d expect more overhead from creating a new process for every request, but throughput is roughly 15 to 23 times higher (about 93 versus 4-6 queries per second) when forcing a new connection for each request than when re-using the connection.

 

It looks like Zebra doesn’t expect a single connection/worker process to handle too many requests. Or maybe it’s an issue with the ZOOM library. I don’t know. I don’t know enough about the internals of either of them.

 

So this could affect any command line tool that handles high volumes of Zebra queries. While re-using connections makes sense in theory, it seems to actually cause problems when talking to Zebra.

 

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St

Ultimo, NSW 2007

Australia

 

Office: 02 9212 0899

Direct: 02 8005 0595

 


_______________________________________________
Koha-devel mailing list
[hidden email]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

Re: Low Zebra performance for persistent/re-used connections

David Cook

And now I’ve tried again today with re-using the connection, and the performance is amazing. CPU usage is much higher than 3% but nowhere near 100%. Looks like I may just have been misusing the ZOOM library.

 


From: David Cook [mailto:[hidden email]]
Sent: Monday, 17 December 2018 6:07 PM
To: '[hidden email]' <[hidden email]>
Subject: Low Zebra performance for persistent/re-used connections

 
