git.openfabrics.org - ~shefty/rdma-dev.git/commit
ext4: fix ext4_flush_completed_IO wait semantics
author Dmitry Monakhov <dmonakhov@openvz.org>
Fri, 5 Oct 2012 15:31:55 +0000 (11:31 -0400)
committer Theodore Ts'o <tytso@mit.edu>
Fri, 5 Oct 2012 15:31:55 +0000 (11:31 -0400)
commit c278531d39f3158bfee93dc67da0b77e09776de2
tree b83341e04d54b3f1cd8171f43ec77bbfba06e571
parent 041bbb6d369811e948ae01f3d00414264076be35

BUG #1) All places where we call ext4_flush_completed_IO are broken
    because buffered io and DIO/AIO go through three stages:
    1) submitted io,
    2) completed io (in i_completed_io_list), conversion pending,
    3) finished io (conversion done).
    Calling ext4_flush_completed_IO flushes only the requests
    that are in stage (2), which is wrong because:
     1) punch_hole and truncate _must_ wait for all outstanding unwritten io
        regardless of its state.
     2) fsync and nolock_dio_read should also wait, because there is
        a time window between end_page_writeback() and ext4_add_complete_io().
        As a result, integrity fsync is broken in the case of a buffered write
        to a fallocated region:
        fsync                                      blkdev_completion
         ->filemap_write_and_wait_range
                                                   ->ext4_end_bio
                                                     ->end_page_writeback
          <-- filemap_write_and_wait_range return
         ->ext4_flush_completed_IO
           sees empty i_completed_io_list, but a pending
           conversion still exists
                                                     ->ext4_add_complete_io
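
The window in the diagram above can be illustrated with a small
single-threaded user-space model. This is only a sketch: the struct and
the model_* functions are hypothetical names invented for illustration,
not kernel code, and the three counters simply stand in for the three
stages listed under BUG #1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_inode {
	int submitted;  /* stage 1: io submitted, not yet on the completed list */
	int completed;  /* stage 2: on the completed list, conversion pending */
	int finished;   /* stage 3: conversion done */
};

/* Stage 1 -> 2: stands in for ext4_add_complete_io() running after
 * end_page_writeback() has already woken the fsync caller. */
static void model_add_complete(struct model_inode *in)
{
	assert(in->submitted > 0);
	in->submitted--;
	in->completed++;
}

/* Old semantics: convert only what is already on the completed list.
 * Returns the number of conversions performed. */
static int model_flush_completed(struct model_inode *in)
{
	int converted = in->completed;

	in->finished += converted;
	in->completed = 0;
	return converted;
}

/* True while any unwritten io still awaits conversion. */
static bool model_conversion_pending(const struct model_inode *in)
{
	return in->submitted + in->completed > 0;
}
```

In this model, calling model_flush_completed() while one io is still in
stage 1 converts nothing, yet model_conversion_pending() stays true;
only after model_add_complete() runs does a second flush pick the
request up. That is the same ordering the diagram shows for fsync.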

BUG #2) Race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series

This patch makes the following changes:
1) ext4_flush_completed_io() now first tries to flush completed io and then
   waits for any outstanding unwritten io via ext4_unwritten_wait().
2) Rename the function to a more appropriate name, ext4_flush_unwritten_io.
3) Assert that all callers of ext4_flush_unwritten_io hold i_mutex, to
   prevent an endless wait.
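
The changes listed above can be sketched as a user-space model. This is
an illustration under stated assumptions, not the code from the patch:
every sketch_* name is hypothetical, the "wait" is a loop that drives
completions itself rather than a sleep, and a warning counter stands in
for the kernel assertion on i_mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's types or functions. */
struct sketch_inode {
	bool mutex_held;  /* stands in for the caller holding i_mutex */
	int  completed;   /* io on the completed list, conversion pending */
	int  in_flight;   /* unwritten io submitted, not yet on the list */
	int  warnings;    /* failed "caller holds i_mutex" checks */
};

/* Models one in-flight io completing and being queued for conversion. */
static void sketch_complete_one(struct sketch_inode *in)
{
	assert(in->in_flight > 0);
	in->in_flight--;
	in->completed++;
}

/* Step 1: convert everything already on the completed list. */
static int sketch_flush_completed(struct sketch_inode *in)
{
	int n = in->completed;

	in->completed = 0;
	return n;
}

/* Step 2: do not return until no unwritten io is outstanding.
 * Assumption: the loop drives completions itself; a real waiter
 * would sleep until the count drops to zero. */
static int sketch_unwritten_wait(struct sketch_inode *in)
{
	int n = 0;

	while (in->in_flight > 0) {
		sketch_complete_one(in);
		n += sketch_flush_completed(in);
	}
	return n;
}

/* New semantics: flush first, then wait. Returns conversions done. */
static int sketch_flush_unwritten_io(struct sketch_inode *in)
{
	int n;

	if (!in->mutex_held)
		in->warnings++;  /* the patch asserts here instead of counting */
	n = sketch_flush_completed(in);
	return n + sketch_unwritten_wait(in);
}
```

With one request on the completed list and two still in flight,
sketch_flush_unwritten_io() accounts for all three before returning,
where the old flush-only behaviour would have stopped after the first.
The loop terminates in the model because nothing adds to in_flight
while it runs; presumably that is the role i_mutex plays for the real
callers.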

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
fs/ext4/ext4.h
fs/ext4/extents.c
fs/ext4/file.c
fs/ext4/fsync.c
fs/ext4/indirect.c
fs/ext4/page-io.c