[Haifux] some additions and errata to today's lecture

guy keren choo at actcom.co.il
Tue Mar 15 02:35:47 MSK 2011


1. etzion asked about controlling the age of dirty pages before pdflush 
   flushes them - the default value is 30 seconds, and can be seen by:

    cat /proc/sys/vm/dirty_expire_centisecs

    (the time there is in centi-seconds, i.e. hundredths of a second, so
30 seconds shows up as 3000). it can be changed by echoing the desired
time into that file, e.g. to change it to 40 seconds:

     echo 4000 > /proc/sys/vm/dirty_expire_centisecs

this parameter (and some other pdflush-related parameters) is described
in the link i mentioned during the meeting today - that talks about
configuring pdflush for write-intensive workloads:

    http://www.westnet.com/~gsmith/content/linux-pdflush.htm
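
as a rough sketch of the unit conversion for this file (the helper name
secs_to_centisecs is made up for this example, and actually writing the
value needs root):

```shell
# convert seconds to centi-seconds (1 second = 100 centi-seconds).
# note: secs_to_centisecs is a hypothetical helper, not a standard tool.
secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

f=/proc/sys/vm/dirty_expire_centisecs

# show the current value, if this system exposes the file.
if [ -r "$f" ]; then
    echo "current: $(cat "$f") centi-seconds"
fi

# to set the age to 40 seconds (as root):
#   secs_to_centisecs 40 > /proc/sys/vm/dirty_expire_centisecs
```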

2. if you read the above link - you'll see that when there are too
many dirty pages - the writing to disk is done directly by the processes
calling the write() system call (if you check their stack traces -
you'll see these processes waiting for the page transfer to complete).
this serves as a flow-control mechanism, that slows down the processes
that fill up the dirty cache.
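
to watch this from user-space, one can look at how much dirty data is
pending and at the thresholds involved. a minimal sketch, assuming a
linux /proc layout: the meminfo_kb helper is made up for this example,
and dirty_background_ratio / dirty_ratio are the standard vm sysctls for
these thresholds (we did not go over them in detail in the lecture):

```shell
# print the value (in kB) of one field from a meminfo-style file.
# usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
# note: meminfo_kb is a hypothetical helper, not a standard tool.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "dirty:     $(meminfo_kb Dirty) kB"
    echo "writeback: $(meminfo_kb Writeback) kB"
fi

# background write-out starts at the first threshold; processes that
# write get throttled at the second one.
for f in /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    fi
done
```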

3. regarding whether the write system call copies the data directly into
the page-cache, or into a temporary buffer - it indeed copies the data
directly into the page-cache. the generic write() system call passes
control to the file-system's code - and this eventually allocates the
page-cache pages, maps each one into the kernel's address space, and
copies the user's data into it. note: i checked
this for the ext3 file-system in kernel 2.6.18 - but the code it uses
resides in generic kernel code - so i think other file-systems will
behave the same.

4. regarding the output of iostat -x:

the 'wrqm/s' (write requests merged per-second) field shows the
number of write requests that were merged into existing requests by the
elevator. i.e. if 3 requests were merged together, '2' will be added to
this counter (the first of these requests was not merged. the other two
were merged into the first). this is, at least, according to the code of
kernel 2.6.18.
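
the counting rule above can be sketched as follows (merged_count is a
made-up helper for illustration, and the column position assumes the
usual /proc/diskstats layout, where the cumulative "writes merged"
counter is the 9th column of a full-length line):

```shell
# each group of N requests merged into one adds N-1 to the counter,
# since the first request of the group is not counted as merged.
# note: merged_count is a hypothetical helper, not part of iostat.
merged_count() {
    total=0
    for n in "$@"; do
        total=$(( total + n - 1 ))
    done
    echo "$total"
}

merged_count 3        # prints 2
merged_count 3 1 2    # prints 3

# the raw cumulative counters per device (only lines with the full
# set of fields; older kernels print short lines for partitions).
if [ -r /proc/diskstats ]; then
    awk 'NF >= 14 { print $3, $9 }' /proc/diskstats
fi
```

iostat reports a rate, so the number it shows is the change in this
counter over the sampling interval, divided by the interval length.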


if i forgot some question, or you have a question about what we covered
today - please shout!

--guy



