[Haifux] Login console freezes: Eli's weekly riddle

Eli Billauer eli at billauer.co.il
Fri Oct 29 19:04:04 MSD 2010


Maxim Kovgan wrote:

> 1) I hope I didn't miss a sentence in one of the emails "I've searched the google for 48 hours for similar problem..."
>   
I always check Google first, only to find that it's adapting itself 
to "common people". For example, searching for "pts" gets me hits on 
the word "points". Looks like we're losing Google too.

As for the arguments about TTYs blocking, I don't think that's relevant 
during the open() call. As far as I know, opening a TTY/PTY does nothing 
except allocate memory and register stuff. No reason it should take any 
time. Unless...

Since there's a little kernel hacker hiding in me, I decided to put 
printk's all over the pty.c file. I ended up with 37 of them, and here's 
the interesting part (in the modified code):

<snip>
static int ptmx_open(struct inode *inode, struct file *filp)
{
    struct tty_struct *tty;
    int retval;
    int index;

    nonseekable_open(inode, filp);

    /* find a device that is not in use. */
    printk(KERN_ALERT  "34: pty_open to lock\n");
    tty_lock();
    printk(KERN_ALERT  "35: pty_open locked\n");
<snip>

OK, so I wrote "pty_open" where it should be "ptmx_open". But the 
numbers are the real identifiers.

And here's a small snippet from my /var/log/messages:

Oct 29 16:14:34 ocho kernel: 02: pty_close to lock ffff880200f48800
Oct 29 16:14:34 ocho kernel: 03: pty_close did lock ffff880200f48800
Oct 29 16:14:58 ocho kernel: 34: pty_open to lock
Oct 29 16:15:13 ocho kernel: 35: pty_open locked

Hmmm... 15 seconds to acquire a lock. On other 34-35 pairs this took 
no time, of course. In case you wonder, tty_lock() and tty_unlock() 
acquire and release the global mutex for all TTY-related stuff. It was 
introduced somewhere between my old 2.6.35 and 2.6.36. Could it have 
something to do with my problem?
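
For anyone who wants to look for the same stall without patching the 
kernel, here's a minimal userspace sketch that just times open() on 
/dev/ptmx. The helper name timed_open_ms() and the DEMO_MAIN switch are 
made up for this example, and it times the whole open() call, not only 
the wait between printk 34 and 35.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns how long open() on `path` took, in
 * milliseconds, or -1.0 if the open failed. The descriptor is closed
 * right away; only the open() call itself is timed.
 */
double timed_open_ms(const char *path)
{
    struct timespec t0, t1;
    int fd;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    fd = open(path, O_RDWR | O_NOCTTY);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (fd < 0)
        return -1.0;
    close(fd);

    return (double)(t1.tv_sec - t0.tv_sec) * 1e3 +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

#ifdef DEMO_MAIN    /* build with -DDEMO_MAIN for a standalone run */
int main(void)
{
    double ms = timed_open_ms("/dev/ptmx");

    if (ms < 0.0) {
        fprintf(stderr, "could not open /dev/ptmx\n");
        return 1;
    }
    printf("open(/dev/ptmx) took %.3f ms\n", ms);
    return 0;
}
#endif
```

Running this in a loop while logging in and out on other terminals 
should show whether some opens take seconds rather than a fraction of a 
millisecond.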

Or maybe, it has something to do with pty_close() looking like this (the 
printks above omitted)?

static void pty_close(struct tty_struct *tty, struct file *filp)
{
    [ ... do the stuff, but no hassle with locks... ]
        tty_unlock();    /* <-- unlock comes first */
        tty_vhangup(tty->link);
        tty_lock();
    }
}

Hmmm... Calling tty_unlock() first, then tty_lock(), and then returning. 
Maybe I missed something here, but this doesn't look good to me. And I 
checked that I didn't read it wrong: all other calls are lock(), then 
unlock(). Of course.
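
To make the unlock-then-lock ordering concrete, here's a small userspace 
analogy with a pthread mutex. It is not kernel code: big_lock, 
close_like() and still_held() are invented names, and the snippet of 
pty_close() doesn't show whether its caller already holds the lock on 
entry, so the sketch tries both assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the global TTY mutex. */
static pthread_mutex_t big_lock;

/* Error-checking type: unlocking a mutex we don't hold returns EPERM. */
void big_lock_init(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&big_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}

/* Same ordering as the tail of pty_close(): unlock, work, lock, return.
 * Returns what the unlock call returned (0, or EPERM if not held). */
int close_like(void)
{
    int rc = pthread_mutex_unlock(&big_lock);

    /* ... the hangup work would go here ... */
    pthread_mutex_lock(&big_lock);
    return rc;
}

/* Returns 1 if big_lock is currently held, 0 if it is free. */
int still_held(void)
{
    if (pthread_mutex_trylock(&big_lock) == 0) {
        pthread_mutex_unlock(&big_lock);
        return 0;
    }
    return 1;
}

#ifdef DEMO_MAIN    /* build with -DDEMO_MAIN (and -pthread) to run */
int main(void)
{
    int rc;

    big_lock_init();

    /* Assumption A: the lock is NOT held on entry. */
    rc = close_like();
    printf("A: unlock %s, lock held on return: %d\n",
           rc == EPERM ? "failed" : "ok", still_held());
    pthread_mutex_unlock(&big_lock);

    /* Assumption B: the caller holds the lock on entry. */
    pthread_mutex_lock(&big_lock);
    rc = close_like();
    printf("B: unlock %s, lock held on return: %d\n",
           rc == EPERM ? "failed" : "ok", still_held());
    pthread_mutex_unlock(&big_lock);
    return 0;
}
#endif
```

Under assumption A the function returns with a lock that nobody is 
going to release, so the next locker would hang. Under assumption B the 
pair is balanced, and the lock is merely dropped while the hangup work 
runs. Which of the two applies depends on the callers, which aren't 
shown here.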

I've sent a little note to the relevant kernel maintainer about this. In 
the meantime, I'll stick to my older kernel. Had enough of those oopses.

Thanks to Guy again for that little tip.

   Eli

-- 
Web: http://www.billauer.co.il
