Archive for the ‘dtrace’ Category

It didn’t take long for the fragile 15-year-old Solaris app to break again. Six years ago we moved it to Solaris 10 quite successfully thanks to library interposers, which let us change the hostid seen by a process to match the old hostid it expected from nine-year-old hardware. It was much happier on new hardware.
A few months ago I wanted to move it into a virtual Solaris environment (a Zone) rather than waste an entire physical box on this application. Solaris had recently added the ability to set your own hostid for a zone, so it looked like it should be a slam dunk. The issue we ran into, which Dtrace solved, was that the inode numbers for the zone’s root directory were wonky. Dtrace fixed that by editing the process’s memory: copyin reads the directory entry and copyout writes the corrected one back.
All’s well that ends well, but this hasn’t ended yet.
It’s running on a test system and it is time to move it to a production system so we can have our test platform back.
So I loaded up some new T3-1B systems with Solaris 10 release 10. And I shouldn’t have. Or maybe it’s good I did, since finding the issue earlier beats finding it later.
The problem is that with this release the System Info HW Provider is now ‘Oracle Corporation’ instead of Sun_Microsystems. Unfortunately this breaks flexlm, which wants to confirm that the license it is trying to use is running on the hardware it was issued for. 😦

I’ve been hacking and slashing my way to a unified solution and it’s mostly there. It starts up, but I’m getting an error associated with creating/managing the lock file. I’ll pick that up tomorrow, but without further ado here is my SI_HW_Provider fix.

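To replace “Oracle Corporation” with Sun_Microsystems:

#!/usr/sbin/dtrace -Cs
#include <sys/systeminfo.h>
#include <sys/dirent.h>

#pragma D option destructive

/* Remember the user buffer that sysinfo(SI_HW_PROVIDER, ...) will fill. */
syscall::systeminfo:entry
/arg0 == SI_HW_PROVIDER && execname == "binaryname"/
{
        self->mach = arg1;
}

/* Overwrite the provider string after the kernel has written it. */
syscall::systeminfo:return
/self->mach/
{
        copyoutstr("Sun_Microsystems ", self->mach, 18);
        self->mach = 0;
}

/* Remember the directory-entry buffer passed to getdents(). */
syscall::getdents*:entry
/execname == "binaryname"/
{
        self->buf = arg1;
}

/* Rewrite the inode number of the first entry returned. */
syscall::getdents*:return
/self->buf && arg1 > 0/
{
        this->dep = (struct dirent *)copyin(self->buf, sizeof (struct dirent));
        this->dep->d_ino = 4;
        copyout(this->dep, self->buf, sizeof (struct dirent));
}

/* Clear the saved buffer address so it is not reused by a later call. */
syscall::getdents*:return
/self->buf/
{
        self->buf = 0;
}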

Dtrace Saves the Day

Posted: July 5, 2011 in dtrace, solaris, tech

Background

Ten years ago when I got here we had an X Windows app that was old then. I think it qualifies for the Antiques Roadshow now. The application and its data date from the beginning of the Station project, when it was just Freedom, not “International”.

The tool, which probably should remain nameless to protect the guilty, was sold, and new versions came out that ran on Windows only. We had an administrator spend about six months migrating the data and datafiles to the Windows version, only to have the users refuse to run it. So they stayed on the original X Windows app, designed and initially installed for Solaris 2.5.1, just as it was on the old e4000.

Five years ago we really wanted to get rid of the e-series of SPARC systems, especially this 4000, and get to Solaris 10 at the same time. We contacted the vendor of the application, only to find it had been sold again and the latest owner was not able to issue new license files for the Unix application. A real Unix admin doesn’t take ‘no’ for an answer, so an administrator found a tool that is compiled with the target hostid and loaded through LD_PRELOAD. This effectively changes the hostid seen by any process (like the Flexible License Manager, or flexlm) to the old hostid. Using this we were able to migrate the database, binaries, and data files to a Blade 2000 running Solaris 10. The users were very happy with the upgrade, and we were too, as we now had a single Unix OS to support for the environment.

2011

It’s now 2011 and the Blade 2000 needs to be retired. The hostid won’t be a problem thanks to our last fix, but current systems are so powerful that there has to be a better way than dedicating a box to this tired old product: specifically, Solaris Containers/Zones. A way to easily keep this running another ten years for the life of Station… That’s my goal.

A new feature for Zones allows administrators to assign a hostid to a zone, so the original problem goes away. This should make our life easier. But try as we might, the Flex License Manager process (lmgrd) starts but will not start the subprocess containing the licenses.

Googling for the error reveals some others have encountered this or similar errors:

 7/05 14:17:03 (lmgrd) FLEXlm (v2.40) started on xxxxxx (Sun) (7/5/111)
 7/05 14:17:03 (lmgrd) License file: "/tools/flexlm/xxxxx.license.dat"
 7/05 14:17:03 (lmgrd) Started YYYY
 7/05 14:17:03 (YYYY) Cannot open daemon lock file
 7/05 14:17:03 (lmgrd) MULTIPLE "YYYY" servers running.
 7/05 14:17:03 (lmgrd) Please kill, and run lmreread
 7/05 14:17:03 (lmgrd)
 7/05 14:17:03 (lmgrd) This error probably results from either:
 7/05 14:17:03 (lmgrd)   1. Another copy of lmgrd running
 7/05 14:17:03 (lmgrd)   2. A prior lmgrd was killed with "kill -9"
 7/05 14:17:03 (lmgrd)       (which would leave the vendor daemon running)
 7/05 14:17:03 (lmgrd) To correct this, do a "ps -ax | grep YYYY"
 7/05 14:17:03 (lmgrd)   (or equivalent "ps" command)
 7/05 14:17:03 (lmgrd) and kill the "YYYY" process
 7/05 14:17:03 (lmgrd)

While a solution was not found immediately, an answer began to emerge. In particular, in one post Peter says “This can be fixed using a fairly simple dtrace script to fool cdslmd into seeing /. and /.. with the same inode number (getdents64()).”

In a non-global zone under Solaris 10 the root file system does not start its inode numbering at 0, because it is not a root filesystem. It is just a directory in an existing filesystem of the global zone, so “.” and “..” in the zone’s root report different inode numbers. In our case, the fourth zone is getting pretty high up the inode tree and we are at six digits! It turns out this is the problem with the license manager daemon. I have no idea why, but it is the problem. To get around this issue I tried creating a new file system for the root directory of the non-global zone and was successful in that the zone ran and the license manager started. But this had unintended side effects and broke Solaris’ Live Upgrade process and other utilities for system maintenance. In other words, the medicine would kill you faster than the disease.
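To see the mismatch for yourself, here is a minimal sketch built on the same ‘ls -ai’ listing used for testing later in this post. The function name dot_inodes is made up for illustration, and it assumes the listing prints one "inode name" pair per line when piped, which is how ls normally behaves.

```shell
# dot_inodes [DIR]: report whether "." and ".." in DIR carry the same
# inode number in the directory listing. DIR defaults to /.
dot_inodes() {
    ls -ai "${1:-/}" | awk '
        $2 == "."  { here = $1 }
        $2 == ".." { parent = $1 }
        END {
            # Appending "" forces a string comparison of the two numbers.
            if (here "" == parent "") {
                print "match", here, parent
            } else {
                print "mismatch", here, parent
            }
        }'
}

dot_inodes /
```

At a true filesystem root the two numbers should agree. In a zone root like the one described above, I would expect this to print "mismatch" followed by the two inode numbers.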

Back to Peter’s assertion that dtrace could fix this, much like LD_PRELOAD fixed the old hostid problem. But how? I researched the functions and probes within dtrace, but the heights it can reach are way, way over my head. All my research into dtrace’s power kept leading me to Brendan Gregg’s blog and book, so I reached out to Brendan through Twitter and managed to pique his curiosity. Within a day he had a blog post up that led me to a solution.

My solution in four “easy” steps:

  1. Grant Dtrace privileges to the non-global zone
  2. Grant dtrace_user privilege to the user running the flexible license server within the non-global zone
  3. Write a dtrace script that detects the YYYY license package running and will fix the inode numbers as they pass through memory and cleanly terminates
  4. Add new dtrace script to execute and detach before the normal license manager process starts– and sleep ten seconds just in case the system is busy.

So to bring it out in more detail, read on.

Grant Dtrace privileges to the non-global zone

Assuming your zone is already running, use this command to add the privileges:

zonecfg -z zonename 'set limitpriv="default,dtrace_proc,dtrace_user"'

Then halt and boot the zone (zoneadm -z zonename halt, then zoneadm -z zonename boot) so the new privilege limit takes effect.

When you log in as root to the zone you should be able to use ‘dtrace -l’ to see a list of available probes. Before the change the list would be empty.

Grant dtrace_user privilege to the user running the flexible license server within the non-global zone

Log in as root to the zone and use this command to grant the dtrace_user privilege to the user account that will run the license manager. This should not be root…

echo "flexlm::::defaultpriv=basic,dtrace_user" >>/etc/user_attr

Now log in or su to the user account and test that the privilege is assigned by using the same command we used to test as root, above: ‘dtrace -l’.
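Because that echo appends blindly, running it twice leaves two entries for the same user. A small guard along these lines may help. This is only a sketch: the function name grant_dtrace_user is hypothetical, and it simply refuses to touch a user who already has an entry rather than trying to merge attributes.

```shell
# grant_dtrace_user USER [FILE]: append the user_attr entry only if USER
# has no entry in FILE yet. FILE defaults to /etc/user_attr.
grant_dtrace_user() {
    user=$1
    file=${2:-/etc/user_attr}
    if grep "^$user:" "$file" >/dev/null 2>&1; then
        echo "$user already has an entry in $file; edit it by hand" >&2
        return 1
    fi
    echo "$user::::defaultpriv=basic,dtrace_user" >> "$file"
}
```

Run as root inside the zone, for example as `grant_dtrace_user flexlm`. Passing a scratch file as the second argument lets you try it without touching the real /etc/user_attr.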

Write a dtrace script that detects the YYYY license package running and will fix the inode numbers as they pass through memory and cleanly terminates

This is the magic section, and it is all owed to Brendan Gregg. Your license process is the second process, the one started by the license manager daemon. In my case the license manager is lmgrd and my license process is in all caps, like “YYYY”, but that is not always the case. Whatever your license process is called, that name is what needs to go into the dtrace script below. The script runs before the license manager starts up, detects the license process, and fixes the inode number in the process’s memory before it sees the unmatched values of “.” and “..”. Just be sure to replace “zonename” with your zone name, “procname” with your license process name, and the inode value with your inode value.

#!/usr/sbin/dtrace -Cs
/* line 7 must be changed to your zonename and process name */
/* line 16 must be changed to the inode number of the root dir's .. */
#pragma D option destructive
#include <sys/dirent.h>
syscall::getdents*:entry
/zonename == "zonename" && execname == "procname"/
{
        self->buf = arg1;
}
syscall::getdents*:return
/self->buf && arg1 > 0/
{
/* modify first entry of ls(1) getdents() */
this->dep = (struct dirent *)copyin(self->buf, sizeof (struct dirent));
this->dep->d_ino = 415469;
copyout(this->dep, self->buf, sizeof (struct dirent));
exit(0);
}
syscall::getdents*:return
/self->buf/
{
self->buf = 0;
}

Save that and make sure your flexlm user has execute permission on it. If you want to test it, change procname to “ls”. Run the script as your flexlm user in one terminal window and, in another terminal window as the same user, run ‘ls -ai /’. Compare the output with and without the dtrace script running. The script should terminate quietly once the ls has completed. When you are satisfied that it works, change procname back to the license process and proceed to modify your SMF start script.

Add new dtrace script to execute and detach before the normal license manager process starts– and sleep ten seconds just in case the system is busy.

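My Service Management Facility service is set to execute /tools/flexlm/SMF/flexlm. I added two lines above the actual lmgrd start: one launches the dtrace script in the background, and the other sleeps (15 seconds in my case) to give it time to attach. You may start it differently, but here is my example.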

case "$1" in
'start')
   echo "Starting up FLEXlm ..."
   su - flexlm -c "/tools/flexlm/fix_inode_start.sh " &
   sleep 15
   su - flexlm -c "/tools/flexlm/lmgrd -c /tools/flexlm/xxxxx.license.dat -l /tools/flexlm/log " &
;;
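A fixed sleep is a guess about how busy the system is. One possible refinement, sketched here and not what I actually run, is to poll until the dtrace process exists instead of always waiting the full time. The helper name wait_for is made up. Note that a running dtrace process does not prove its probes are enabled yet, so a short sleep afterwards may still be wise.

```shell
# wait_for LIMIT CMD...: retry CMD once a second until it succeeds,
# giving up with exit status 1 after LIMIT seconds.
wait_for() {
    limit=$1
    shift
    waited=0
    until "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=`expr "$waited" + 1`
    done
    return 0
}

# Hypothetical use in the 'start' branch, in place of "sleep 15":
#   wait_for 15 pgrep -u flexlm dtrace || echo "dtrace fix not running" >&2
```

The loop uses expr and backticks rather than newer arithmetic syntax so that it should also work under the older Bourne shell.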

And that is it. It’s enough to run. It needs some tweaking and polish, but it is enough to know we can go forward with moving this system to a zone.

root@xxxxxx:~ $ps -fu flexlm
     UID   PID  PPID   C    STIME TTY         TIME CMD
root@xxxxxx:~ $svcadm enable flexlm
root@xxxxxx:~ $ps -fu flexlm
     UID   PID  PPID   C    STIME TTY         TIME CMD
  flexlm  2873  2872   1 14:49:10 ?           0:02 /usr/sbin/dtrace -Cs /tools/flexlm/fix_inode_start.sh
root@xxxxxx:~ $ps -fu flexlm
     UID   PID  PPID   C    STIME TTY         TIME CMD
  flexlm  2906  2897   0 14:49:26 ?           0:00 YYYY -T xxxxxx 4 -c /tools/flexlm/xxxxxx.license.dat
  flexlm  2873  8499   1 14:49:10 ?           0:03 /usr/sbin/dtrace -Cs /tools/flexlm/fix_inode_start.sh
  flexlm  2897  8499   0 14:49:25 ?           0:00 /tools/flexlm/lmgrd -c /tools/flexlm/xxxxxx.license.dat -l /tools/flexlm/log
root@xxxxxxx:~ $ps -fu flexlm
     UID   PID  PPID   C    STIME TTY         TIME CMD
  flexlm  2906  2897   0 14:49:26 ?           0:00 YYYY -T xxxxxx 4 -c /tools/flexlm/xxxxxx.license.dat
  flexlm  2897  8499   0 14:49:25 ?           0:00 /tools/flexlm/lmgrd -c /tools/flexlm/xxxxxx.license.dat -l /tools/flexlm/log