[Haifux] some additions and errata to today's lecture

guy keren choo at actcom.co.il
Tue Mar 15 04:52:39 MSK 2011

someone reminded me about the "small trail through the linux kernel"
link i mentioned. it is:


note that it is from 2001 and relates to kernel 2.4 (or even older) -
but the general picture has not completely changed.

you can find more up-to-date info about this in the book "the linux
kernel, 3rd edition" - in the VFS chapter.
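
regarding the dirty_expire_centisecs question in the quoted message
below: as the file's name says, its value is counted in centiseconds
(hundredths of a second). here is a small sketch of the conversion,
assuming a POSIX shell. the helper names secs_to_centisecs and
centisecs_to_secs are made up for this example, and the write is left
commented out because it needs root:

```shell
#!/bin/sh
# Convert between seconds and the centisecond units used by
# /proc/sys/vm/dirty_expire_centisecs (1 second = 100 centiseconds).
# The two helper names below are hypothetical, chosen for this sketch.

secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

f=/proc/sys/vm/dirty_expire_centisecs

# Reading the current value needs no special privileges.
if [ -r "$f" ]; then
    cur=$(cat "$f")
    echo "current expiry: $cur centisecs = $(centisecs_to_secs "$cur") seconds"
fi

# Value that would have to be written for a 40-second expiry.
echo "40 seconds = $(secs_to_centisecs 40) centisecs"

# Writing requires root, so it is only shown here as a comment:
# echo "$(secs_to_centisecs 40)" > "$f"
```

a value written this way does not survive a reboot.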


On Tue, 2011-03-15 at 01:35 +0200, guy keren wrote:
> 1. etzion asked about controlling the age of dirty pages before pdflush
>    flushes them - the default value is 30 seconds, and can be seen by:
>     cat /proc/sys/vm/dirty_expire_centisecs
>     (the time there is in centiseconds, i.e. hundredths of a second, so
> the default reads 3000). it can be changed by echoing the desired time
> into that file, e.g. to change it to 40 seconds:
>      echo 4000 > /proc/sys/vm/dirty_expire_centisecs
> this parameter (and some other pdflush-related parameters) is described
> in the link i mentioned during the meeting today - that talks about
> configuring pdflush for write-intensive workloads:
>     http://www.westnet.com/~gsmith/content/linux-pdflush.htm
> 2. if you read the above link - you'll see that in case there are too
> many dirty pages - the writing to disk is done directly by processes
> calling the write() system call (if you'll check the stack trace -
> you'll see these processes waiting for the page transfer to complete).
> this is done to serve as a flow-control mechanism, that slows down the
> processes that fill up the dirty cache.
> 3. regarding whether the write system call copies the data directly into
> the page-cache, or into a temporary buffer - it indeed copies the data
> directly into the page-cache. the generic write() system call passes
> control to the file-system's code - and this eventually allocates the
> pages, then maps the page with the user-data into the kernel's address
> space, and copies the data into the page-cache's page. note: i checked
> this for the ext3 file-system in kernel 2.6.18 - but the code it uses
> resides in generic kernel code - so i think other file-systems will
> behave the same.
> 4. regarding the output of iostat -x:
>     the 'wrqm/s' (write requests merged per-second) field, shows the
> number of write requests that were merged into existing requests by the
> elevator. i.e. if 3 requests were merged together, '2' will be added to
> this counter (the first of these requests was not merged. the other two
> were merged into the first). this is at least what the code of
> kernel 2.6.18 does.
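
regarding item 4 above, the counting rule can be sketched in shell.
this is only an illustration, not what iostat itself runs: the function
names are made up, 'sda' is a placeholder device name, and the rate is
rounded down because shell arithmetic is integer-only. the cumulative
"writes merged" counter is read from column 9 of /proc/diskstats
(counting major, minor and device name as the first three columns):

```shell
#!/bin/sh
# Hypothetical helpers illustrating how the write-merge counter grows.

# Merges counted when a run of N adjacent write requests collapses into
# one: the first request is not a merge, each of the others is.
merges_for_run() {
    if [ "$1" -gt 0 ]; then
        echo $(( $1 - 1 ))
    else
        echo 0
    fi
}

# Per-second rate from two samples of a cumulative counter.
# args: earlier_value later_value interval_in_seconds
rate_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# Cumulative "writes merged" counter of one device, from /proc/diskstats.
writes_merged() {
    awk -v dev="$1" '$3 == dev { print $9 }' /proc/diskstats
}

echo "3 requests merged together add $(merges_for_run 3) to the counter"

dev=sda   # placeholder device name, adjust for the machine at hand
if [ -r /proc/diskstats ]; then
    a=$(writes_merged "$dev")
    if [ -n "$a" ]; then
        sleep 1
        b=$(writes_merged "$dev")
        echo "$dev: about $(rate_per_sec "$a" "$b" 1) write merges/s"
    fi
fi
```
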
> if i forgot some question, or you have a question about what we covered
> today - please shout!
> --guy
