[Haifux] Login console freezes: Eli's weekly riddle
eli at billauer.co.il
Fri Oct 29 19:04:04 MSD 2010
Maxim Kovgan wrote:
> 1) I hope I didn't miss a sentence in one of the emails "I've searched the google for 48 hours for similar problem..."
I always check Google first, only to find out that it's adapting itself
to "common people". For example, looking for "pts", I get hits on the
word "points". Looks like we're losing Google too.
As for the arguments about TTYs blocking, I don't think that is relevant
during the open() call. Opening a TTY/PTY is doing nothing except
allocating memory and registering stuff, as far as I know. No reason it
should take any time. Unless...
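A quick way to check that claim from user space is to time the open()
itself. This is only a minimal sketch: the function name
time_ptmx_open_ms() is made up for illustration, and it assumes a Linux
box where /dev/ptmx exists.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std= modes */
#endif
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns how long open("/dev/ptmx") took, in
 * milliseconds, or -1.0 if the open failed (e.g. no devpts mounted). */
double time_ptmx_open_ms(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    int fd = open("/dev/ptmx", O_RDWR | O_NOCTTY);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (fd < 0)
        return -1.0;
    close(fd);

    /* Convert the timespec difference to milliseconds. */
    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_nsec - start.tv_nsec) / 1e6;
}
```

Calling this in a loop from main() and printing the result should show
well under a millisecond per open on a healthy system. A value in the
seconds range would point at something blocking inside the kernel.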
Since there's a little kernel hacker hiding in me, I decided to put
printk's all over the pty.c file. I ended up with 37 of them, and here's
the interesting part (in the modified code):
static int ptmx_open(struct inode *inode, struct file *filp)
struct tty_struct *tty;
/* find a device that is not in use. */
printk(KERN_ALERT "34: pty_open to lock\n");
printk(KERN_ALERT "35: pty_open locked\n");
OK, so I wrote "pty_open" where it should be "ptmx_open". But the
numbers are the real identifiers.
And here's a small snippet from my /var/log/messages:
Oct 29 16:14:34 ocho kernel: 02: pty_close to lock ffff880200f48800
Oct 29 16:14:34 ocho kernel: 03: pty_close did lock ffff880200f48800
Oct 29 16:14:58 ocho kernel: 34: pty_open to lock
Oct 29 16:15:13 ocho kernel: 35: pty_open locked
Hmmm... 15 seconds to acquire a lock. On other 34-35 pairs this took
no time, of course. In case you wonder, tty_lock() and tty_unlock() acquire
and release the global mutex for all TTY-related stuff. They were introduced
somewhere between my old 2.6.35 and 2.6.36. Could that have something to
do with my problem?
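For anyone who wants to hunt for slow 34-35 pairs in a longer log, the
gap can be computed mechanically. A rough sketch, assuming the classic
"Mon DD HH:MM:SS host ..." syslog layout shown above and that both lines
are from the same day. The function names are invented for this example.

```c
#include <stdio.h>

/* Parse the "HH:MM:SS" field of a syslog line such as
 * "Oct 29 16:14:58 ocho kernel: ..." and return seconds since midnight,
 * or -1 if the line does not match that layout. */
int syslog_seconds_of_day(const char *line)
{
    int h, m, s;

    /* Skip the month name and the day of month, then read the time. */
    if (sscanf(line, "%*s %*d %d:%d:%d", &h, &m, &s) != 3)
        return -1;
    return h * 3600 + m * 60 + s;
}

/* Seconds elapsed between two same-day log lines, or -1 on a parse error. */
int syslog_gap_seconds(const char *before, const char *after)
{
    int t0 = syslog_seconds_of_day(before);
    int t1 = syslog_seconds_of_day(after);

    if (t0 < 0 || t1 < 0)
        return -1;
    return t1 - t0;
}
```

Fed the "34:" and "35:" lines from the snippet above, syslog_gap_seconds()
returns 15. The "02:" and "03:" pair gives 0.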
Or maybe, it has something to do with pty_close() looking like this (the
printks above omitted)?
static void pty_close(struct tty_struct *tty, struct file *filp)
[ ... do the stuff, but no hassle with locks... ]
Hmmm... Calling tty_unlock() first and then tty_lock(). And then return.
Maybe I missed something here, but this doesn't look good to me. And no,
I didn't read it wrong: all other calls are lock(), then unlock(). Of course.
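To be fair to the code, an unlock-then-lock order is only balanced if the
function is entered with the lock already held by its caller, and drops it
around some work. Whether that is the case for the real pty_close() is not
something I've established here. The toy model below (plain user-space C,
all names hypothetical, no real locking) just spells out that condition.

```c
#include <assert.h>

/* Toy stand-in for a global lock: it only tracks whether it is held. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* taking a lock we already hold would deadlock */
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);    /* releasing a lock we do not hold is a bug */
    l->held = 0;
}

/* Hypothetical close handler: unlock first, lock again before returning.
 * This only makes sense if the caller holds the lock on entry. */
void toy_close(struct toy_lock *l)
{
    toy_lock_release(l);
    /* ... work that must not run under the lock ... */
    toy_lock_acquire(l);
}

/* Hypothetical caller that takes the lock around the close handler.
 * Returns 1 if the lock state stayed consistent throughout. */
int toy_release_path(void)
{
    struct toy_lock l = { 0 };

    toy_lock_acquire(&l);
    toy_close(&l);            /* returns with the lock held again */
    int held_after_close = l.held;
    toy_lock_release(&l);

    return held_after_close && !l.held;
}
```

If toy_close() were called without the lock held, the first assert in
toy_lock_release() would fire. That is the situation I'm worried about.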
I've sent a little note to the relevant kernel maintainer about this. In
the meantime, I'll stick to my older kernel. I've had enough of those oopses.
Thanks to Guy again for that little tip.