File descriptor leak

nemesis9 · 15 July 2011 21:56

Our custom Red Hat Linux system uses fvwm. The primary application on this system is VMware. A user logs in and gets a menu of all the VMs she is allowed to run. We have things pretty locked down, so the user has little control over anything. One thing the user can control is how the VM is stopped when she is done. We have an fvwm menu item called “Player Preferences” which lets the user select one of two VMware options for how the VM is stopped when the user presses the ‘X’ in the upper right corner of the window manager border. When she selects one of these options, we write the appropriate entry into the .vmx file.
These two options are: 1) Power down the VM; or 2) Suspend the VM.

There are no issues with option 1. However, if the user selects option 2, then the following behavior will occur.

After bringing up and suspending the VM two or three times, the user can no longer do anything: none of the fvwm menu items respond. Since the user has no way to bring up a shell, and logout, reboot, and shutdown don’t work from the menu, she is forced to power off the machine with the power button.

A debug version of the system shows that fvwm is holding nearly 1000 FIFOs under the user’s name, and hence the menu items no longer work, since no more FIFOs can be opened. Note that these are just the FIFOs held by the fvwm command under the user’s UID, as reported by the lsof command.

If the user chooses the power-down option for the VM, none of this happens. In this case, right after login, fvwm holds about 80 FIFOs; after the VM is launched, fvwm holds about 200 FIFOs; then, when the VM is powered down, we go back to about 80 FIFOs.

In the VM suspend case, the number of FIFOs keeps climbing until it reaches the limit, at which point the system is unresponsive. In a debug build of the system, I can bring up an xterm with a hot-key combination and kill the fvwm process.

I am just wondering if anyone has any ideas. Is there more info I could provide? What can I do to narrow down the problem?

Thanks,
Tony

thomasadam · 15 July 2011 23:36

How is this anything to do with FVWM?

– Thomas Adam

nemesis9 · 18 July 2011 16:31

I didn’t say it was fvwm, I just asked if anyone had any ideas. Perhaps you could start by educating me on how it is definitely not fvwm.

thomasadam · 18 July 2011 17:34

Do you see this on proper hardware?

– Thomas Adam

nemesis9 · 18 July 2011 18:31

Yes, it is a production Dell OptiPlex 755, Core 2 Duo, NVIDIA G71 (Quadro FX 1500).
I am suspecting vmplayer at the moment, but the FIFOs show up as owned by the user under the fvwm process. How are these FIFOs created? Does vmplayer create them? I would hope they would be cleaned up once the app has exited, but apparently they are not.

I put a watch on “lsof | grep fvwm | wc -l”. The number increases each time I launch vmplayer and never decreases, until eventually we are hung.

Another case where this occurs is running “xmessage "Hello"”. Each instance allocates 2 FIFOs, and they are never destroyed after xmessage exits.
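
For a narrower count than grepping lsof output (which matches every line that mentions fvwm, not only FIFOs), one option is to look at a single process's descriptors directly. This is a minimal sketch, assuming Linux with /proc mounted and permission to read the target's fd directory; the function name count_fifos is made up for illustration.

```python
import os
import stat


def count_fifos(pid):
    """Return how many open descriptors of process `pid` refer to FIFOs/pipes."""
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            # stat() follows the /proc link to the open file itself,
            # so the mode describes the pipe, socket, or file behind the fd.
            mode = os.stat(os.path.join(fd_dir, name)).st_mode
        except OSError:
            # The descriptor was closed between listdir() and stat().
            continue
        if stat.S_ISFIFO(mode):
            count += 1
    return count


# Example: count_fifos(os.getpid()) for this process, or pass the pid of fvwm.
```

Calling this with the fvwm pid before and after launching a single application would show whether the extra FIFOs are really held by the fvwm process itself or by one of its children.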

Thanks,
Tony

thomasadam · 18 July 2011 18:35

Well, obviously. I would have thought that obvious, no?

FVWM itself doesn’t go hoarding FIFOs, except when it creates two to communicate with any FVWM modules; that’s the only time. So these FIFOs you speak of, whilst under FVWM, aren’t being spawned by FVWM at all.

– Thomas Adam

nemesis9 · 18 July 2011 21:19

No, it is definitely NOT obvious, and it is not vmplayer, since it occurs with any X application, including xmessage as I pointed out in the previous post. This leads me to believe it may be at a lower level, like the X libraries coupled with some configuration perhaps.

thomasadam · 18 July 2011 21:45

But fundamentally you have two different things here:

1) Processes spawned as a child of fvwm (pretty much anything launched via X once fvwm loads)
2) A bunch of FIFOs being opened.

There is nothing which FVWM does to hold open FIFOs which isn’t about module communication. That’s the extent of what FVWM does with FIFOs to be able to communicate with the module: one channel for writing, the other for reading.
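
To illustrate the general pattern of one pipe per direction between a parent and a child, here is a minimal sketch. It is not FVWM's code: the function name run_module and the toy child command are invented for the example. The part that matters for leaks is that the parent closes every pipe end it no longer needs; skipping those closes would leave descriptors behind on each launch.

```python
import os
import subprocess
import sys

# Toy child: reads all of stdin and writes it back upper-cased.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def run_module(message):
    """Send `message` (bytes) to a child over one pipe, read the reply over another."""
    to_child_r, to_child_w = os.pipe()      # parent writes, child reads
    from_child_r, from_child_w = os.pipe()  # child writes, parent reads

    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=to_child_r,
        stdout=from_child_w,
    )
    # The child has its own copies now; the parent closes the child's ends.
    os.close(to_child_r)
    os.close(from_child_w)

    os.write(to_child_w, message)
    os.close(to_child_w)  # signals end of input to the child

    chunks = []
    while True:
        chunk = os.read(from_child_r, 4096)
        if not chunk:  # empty read: the child closed its end
            break
        chunks.append(chunk)
    os.close(from_child_r)
    proc.wait()
    return b"".join(chunks)
```

After run_module returns, the parent holds the same number of descriptors as before the call, which is the behavior one would expect from any well-behaved parent process.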

If it’s anything else, then it’s not FVWM’s problem – but don’t be blindsided by them appearing to be under FVWM, just due to process management, for example. At this point, I’d recommend looking into using ltrace and netstat, etc., to see where these things are coming from, as well as perhaps using limits.conf (assuming you use PAM?) to control how many FIFOs a given process can create to see what breaks.
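
On the limits side, a minimal sketch of what hitting the limit looks like from inside a process, assuming Linux and Python's standard resource module. The nofile item in limits.conf corresponds to RLIMIT_NOFILE, which caps open descriptors of every kind rather than FIFOs specifically; a common default soft limit of 1024 would be consistent with the "nearly 1000" figure reported earlier, though that is a guess about this system. The function name pipes_until_limit and the value 64 are illustrative only.

```python
import errno
import os
import resource


def pipes_until_limit(soft_limit):
    """Lower the soft open-file limit, then leak pipes until the kernel refuses.

    Returns the number of pipes created before os.pipe() failed with EMFILE.
    The leaked pipes are closed and the original limit restored on the way out.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    leaked = []
    try:
        while True:
            try:
                leaked.append(os.pipe())  # two descriptors per pipe
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                # EMFILE: this process has reached its descriptor limit.
                return len(leaked)
    finally:
        for r, w in leaked:
            os.close(r)
            os.close(w)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
```

With a soft limit of 64, the loop stops after roughly 30 pipes, since each pipe costs two descriptors and a few are already in use. A process that leaks two FIFOs per launched application would run into the same EMFILE error, only more slowly.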

– Thomas Adam