Tuesday, December 14, 2010

Slowness with RPC as root on AIX

I was responsible for writing a little RPC client (and server) program that talks to our Unidata database to retrieve some arbitrary values.  The client has been running fine on Linux, but when we moved it to our AIX server, we noticed something interesting: in a loop of 1000 calls, it would take about 13 seconds on Linux, and 45 seconds on AIX.  It was absolutely maddening, and made no sense since it was the exact same code, just recompiled (and our AIX box has FAR superior hardware specs).

Then I made a discovery: when we ran the program as a non-root user, we got comparable times to the Linux box.  So, what is it about root on AIX that was causing the holdup?  Well, apparently, when an RPC client runs as root on AIX, something built in makes it use reserved (privileged) ports, meaning ports below 1024, for that kind of communication.  So, instead of having thousands of ports to communicate on, the client program had a very limited subset, and each request ended up waiting for one of those ports to become available before it could complete.
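
If you want to check whether your own client is hitting this, one rough approach is to look at the local port it connects from (for example in the output of netstat -an) and see which side of the 1024 boundary it falls on.  Here is a minimal sketch of that check; port_class is a made-up helper name, not something that ships with AIX:

```shell
#!/bin/sh
# port_class is a hypothetical helper: it classifies a local port number.
# Ports below 1024 are the reserved (privileged) range on Unix-like systems.
port_class() {
    if [ "$1" -lt 1024 ]; then
        echo "reserved"
    else
        echo "unreserved"
    fi
}
```

If the client's local ports keep coming back as reserved when run as root and unreserved otherwise, you are probably looking at the same behavior.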

Long story short, the fix was to set the setuid bit on the client application (whose owner and group are non-root), which forces it to always run as that non-root user, even when root launches it.  In case you come across a similar issue, here are the steps to fix it:
1) Use chown and chgrp to make sure that the owner and group of the application are non-root
2) Enable the setuid bit on the application with the following syntax:
chmod u+s AppNameHere
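
The steps above can be sketched as a couple of small shell helpers.  This is only a sketch: make_setuid_nonroot and check_setuid_nonroot are made-up names, and the user and group you pass in are whatever non-root account you choose.  The setuid bit is set last because on many systems changing a file's ownership clears it.

```shell
#!/bin/sh
# Hypothetical helper: apply the fix. Must be run as root.
# Usage: make_setuid_nonroot <app> <non-root user> <non-root group>
make_setuid_nonroot() {
    chown "$2" "$1" &&   # step 1: non-root owner
    chgrp "$3" "$1" &&   # step 1: non-root group
    chmod u+s "$1"       # step 2: setuid bit, set after the ownership change
}

# Hypothetical helper: verify the result without changing anything.
check_setuid_nonroot() {
    # -u is true when the file exists and has its setuid bit set
    if [ ! -u "$1" ]; then
        echo "setuid bit not set"
        return 1
    fi
    # ls -ln prints the numeric owner UID in the third column
    owner_uid=$(ls -ln "$1" | awk '{print $3}')
    if [ "$owner_uid" -eq 0 ]; then
        echo "owned by root"
        return 1
    fi
    echo "ok"
}
```

For example, make_setuid_nonroot ./AppNameHere appuser appgroup followed by check_setuid_nonroot ./AppNameHere should print ok, assuming appuser and appgroup exist on your system.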

This caused quite the headache and resulted in a support call to IBM, so hopefully this will help someone out there!
