Pull request for 4.15 #7

Closed
wants to merge 109 commits into from

Commits on Oct 30, 2017

  1. Btrfs: remove batch plug in run_scheduled_IO

    The block layer limits the plug size, i.e. BLK_MAX_REQUEST_COUNT == 16, so
    we gain nothing by batching 64 bios here.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 219d33b
  2. Btrfs: move finish_wait out of the loop

    If we're still going to wait after schedule(), we don't have to do
    finish_wait() to remove our %wait_queue_entry since prepare_to_wait()
    won't add the same %wait_queue_entry twice.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 69cc715
  3. Btrfs: use wait_event instead of a single function

    Since TASK_UNINTERRUPTIBLE has been used here, wait_event() can do the
    same job.
    
    Signed-off-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 45bac0f
  4. Btrfs: protect conditions within root->log_mutex while waiting

    Both wait_for_commit() and wait_for_writer() check the condition
    outside of the mutex lock.
    
    This refactors code a bit to be lock safe.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 49e83f5
  5. btrfs: Remove redundant forward declarations

    Some static functions are needlessly forward declared. Let's remove those
    declarations since they add no value.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit f78541d
  6. btrfs: declare TRACE_DEFINE_ENUM for each of show_flush_state enum

    So that perf can show the state symbol.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    Commit 012e513
  7. Btrfs: make some volumes.c functions static

    These aren't used outside of volumes.c.
    
    Signed-off-by: Omar Sandoval <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    osandov authored and kdave committed Oct 30, 2017
    Commit c9162bd
  8. Btrfs: fix __user casting in ioctl.c

    Signed-off-by: Omar Sandoval <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    osandov authored and kdave committed Oct 30, 2017
    Commit 718dc5f
  9. btrfs: copy fsid to super_block s_uuid

    We didn't copy fsid to struct super_block.s_uuid, so Overlay disables the
    index feature with btrfs as the lower FS.
    
    kernel: overlayfs: fs on '/lower' does not support file handles, falling back to index=off.
    
    Fix this by publishing the fsid through struct super_block.s_uuid.
    
    [ dsterba: I think that setting s_uuid is the last missing bit. Overlay
      needs the file handle encoding support from the lower filesystem, which
      is supported. Filling the whole filesystem id is correct, the subvolume
      id is encoded in the file handle buffer from inside btrfs_encode_fh. ]
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    Commit ee87cf5
  10. Btrfs: search parity device wisely

    After mapping a block with BTRFS_MAP_WRITE, the parity stripes have been
    sorted to the end, so this search can start from the first parity
    stripe.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ copied changelog as a comment ]
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 9cd3a7e
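
The search described in commit 10 can be illustrated with a minimal userland sketch. The struct and function names below are hypothetical stand-ins, not the kernel's types; the only idea taken from the changelog is that parity stripes sit at the end of the stripe array, so the loop can begin at the first parity index.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a mapped RAID5/6 stripe set. */
struct stripe_map {
    int num_stripes;   /* data stripes plus parity stripes */
    int nr_data;       /* number of data stripes; parity follows them */
    int dev_ids[8];    /* device id backing each stripe */
};

/*
 * Return the index of the parity stripe that lives on dev_id, or -1.
 * Parity stripes are sorted to the end, so start at nr_data, not 0.
 */
static int find_parity_stripe(const struct stripe_map *map, int dev_id)
{
    for (int i = map->nr_data; i < map->num_stripes; i++) {
        if (map->dev_ids[i] == dev_id)
            return i;
    }
    return -1;
}
```

A device that only backs a data stripe is never matched, which is the intended behaviour when looking specifically for a parity device.
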
  11. Btrfs: do not async submit for nodatasum inodes

    While we submit direct writes, if the inode is flagged with nodatasum,
    there's no benefit to submit asynchronously, because
    
    a) we don't have to calculate checksum across processors,
    
    b) and direct IO has started a plug, but async submit makes us queue
    IO on each device's scheduled IO list instead of DIO's plug list, so
    that IOs get far fewer merges in general.
    
    Let's use sync submit for nodatasum inodes.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 9b4a9b2
  12. btrfs: Remove unused variable

    Src was initially part of 31ff1cd ("Btrfs: Copy into the log tree in
    big batches"), however 16e7549 ("Btrfs: incompatible format change
    to remove hole extents") changed parameters passed to copy_items which
    made the src variable redundant.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Timofey Titovets <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit 8ca1995
  13. btrfs: Remove unused parameters from various functions

    iterate_dir_item:found_key - introduced in 31db9f7 ("Btrfs:
      introduce BTRFS_IOC_SEND for btrfs send/receive"), yet never used.
    
    record_ref:num - ditto
    
    This is a first pass with the low-hanging fruit. There are still quite a
    few unused parameters in some functions which have to abide by a callback
    interface.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit a035751
  14. btrfs: Remove unused arguments from btrfs_changed_cb_t

    btrfs_changed_cb_t represents the signature of the callback being passed
    to btrfs_compare_trees. Currently there is only one such callback,
    namely changed_cb in send.c. This function doesn't really use the first
    2 parameters, i.e. the roots. Since there are no other functions
    implementing the btrfs_changed_cb_t let's remove the unused parameters
    from the prototype and implementation.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit ee8c494
  15. btrfs: Remove unused parameter from check_direct_IO

    Introduced by 5a5f79b ("Btrfs: allow unaligned DIO") and never
    used. The buffered fallback from unaligned DIO works as expected.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Timofey Titovets <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit 8c70c9f
  16. btrfs: Rework error handling of add_extent_mapping in __btrfs_alloc_chunk
    
    Currently the code executes add_extent_mapping and if it is successful
    it links the new mapping, it then proceeds to unlock the extent mapping
    tree and check for failure and handle them. Instead, rework the code to
    only perform a single check if add_extent_mapping has failed and handle
    it, otherwise the code continues in a linear fashion. No functional
    changes.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit 1efb72a
  17. btrfs: Remove redundant argument of __link_block_group

    __link_block_group is called from only 2 places and at each call site the
    space_info being passed is the same as the space info assigned to the passed
    cache struct. Let's remove the redundant argument and make the function
    reference the space_info from the passed block_group_cache. No functional
    changes.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ renamed to link_block_group ]
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit c434d21
  18. btrfs: tests: Fix a memory leak in error handling path in 'run_test()'

    If 'btrfs_alloc_path()' fails, we must free the resources already
    allocated, as done in the other error handling paths in this function.
    
    Signed-off-by: Christophe JAILLET <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    tititiou36 authored and kdave committed Oct 30, 2017
    Commit 9ca2e97
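
The leak fixed in commit 18 follows a common pattern: an early exit after a late allocation fails skips the cleanup of earlier allocations. A minimal sketch of the usual remedy, a single cleanup label, is shown here. All names are hypothetical, and the `fail_path` flag merely simulates btrfs_alloc_path() returning NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical resources standing in for what the test allocates. */
struct test_ctx { void *root; void *inode; void *path; };

/* Counts frees so the cleanup path can be observed from outside. */
static int frees_done;

static void release(void *p)
{
    if (p) {
        free(p);
        frees_done++;
    }
}

/* fail_path != 0 simulates the path allocation failing. */
static int run_test_sketch(int fail_path)
{
    struct test_ctx ctx = { NULL, NULL, NULL };
    int ret = -ENOMEM;

    ctx.root = malloc(16);
    if (!ctx.root)
        goto out;
    ctx.inode = malloc(16);
    if (!ctx.inode)
        goto out;
    ctx.path = fail_path ? NULL : malloc(16);
    if (!ctx.path)
        goto out;   /* returning directly here would leak root and inode */
    ret = 0;
out:
    release(ctx.path);
    release(ctx.inode);
    release(ctx.root);
    return ret;
}
```
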
  19. btrfs: Clean up dead code in root-tree

    The value of variable 'can_recover' is never used after being set, thus
    it should be removed, as it was never used since the first commit
    68a7342 ("Btrfs: cleanup orphaned root orphan item").
    
    Signed-off-by: Christos Gkekas <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    chggr authored and kdave committed Oct 30, 2017
    Commit fa0d088
  20. btrfs: avoid null pointer dereference on fs_info when calling btrfs_crit

    There are checks on fs_info in __btrfs_panic to avoid dereferencing a
    null fs_info, however, there is a call to btrfs_crit that may also
    dereference a null fs_info. Fix this by adding a check to see if fs_info
    is null and only print the s_id if fs_info is non-null.
    
    Detected by CoverityScan CID#401973 ("Dereference after null check")
    
    Fixes: efe120a ("Btrfs: convert printk to btrfs_ and fix BTRFS prefix")
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Colin Ian King authored and kdave committed Oct 30, 2017
    Commit 3993b11
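
The guard described in commit 20 amounts to choosing a fallback string when fs_info is NULL. A small sketch under assumptions: the two structs are hypothetical miniatures of the kernel structures, and the "<unknown>" fallback text is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal stand-ins for the kernel structures. */
struct super_block_sketch { char s_id[32]; };
struct fs_info_sketch { struct super_block_sketch *sb; };

/* Return a printable filesystem id even when fs_info is NULL. */
static const char *safe_s_id(const struct fs_info_sketch *fs_info)
{
    return fs_info ? fs_info->sb->s_id : "<unknown>";
}
```
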
  21. btrfs: convert all mount option checking code to use btrfs_test_opt

    Signed-off-by: Satoru Takeuchi <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    satoru-takeuchi authored and kdave committed Oct 30, 2017
    Commit d8953d6
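
Commit 21 has no changelog body, so here is a rough idea of what an option-test helper of this kind looks like. This is a sketch only: the macro name, struct, and flag values are invented for illustration and are not copied from the kernel headers.

```c
#include <assert.h>

/* Illustrative flag values; the real BTRFS_MOUNT_* bits may differ. */
#define SKETCH_MOUNT_NODATASUM (1UL << 0)
#define SKETCH_MOUNT_SSD       (1UL << 1)

/* Hypothetical miniature of the structure holding the mount options. */
struct fs_info_sketch { unsigned long mount_opt; };

/* Token pasting builds the full flag name from the short option name. */
#define sketch_test_opt(fs_info, opt) \
    (((fs_info)->mount_opt & SKETCH_MOUNT_##opt) != 0)
```

Routing every check through one helper keeps open-coded bit tests out of the call sites, so a later change to how options are stored touches a single definition.
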
  22. Btrfs: make plug in writing meta blocks really work

    We have started a plug in btrfs_write_and_wait_marked_extents() but the
    generated IOs actually go to the device's scheduled IO list, where the work
    is done in another task, so the started plug has no effect.
    
    And since we wait for IOs immediately after writing meta blocks, it's
    the same case as writing log tree, doing sync submit can merge more
    IOs.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 6300463
  23. Btrfs: remove bio_flags which indicates a meta block of log-tree

    Since both committing transaction and writing log-tree are doing
    plugging on metadata IO, we can unify to use %sync_writers to benefit
    both cases, instead of checking bio_flags while writing meta blocks of
    log-tree.
    
    We can remove this bio_flags because in order to write dirty blocks,
    log tree also uses btrfs_write_marked_extents(), inside which we
    have enabled %sync_writers, therefore, every write goes in a
    synchronous way, so does checksumming.
    
    Please also note that bio_flags is applied per-context while
    %sync_writers is applied per-inode, so this might incur some overhead, i.e.
    
    1) while log tree is flushing its dirty blocks via
       btrfs_write_marked_extents(), in which %sync_writers is increased
       by one.
    
    2) in the meantime, some writeback operations may happen upon btrfs's
       metadata inode, so these writes go synchronously, too.
    
    However, AFAICS, the overhead is not a big one while the win is that
    we unify the two places that need synchronous submission and remove a
    special hack/flag.
    
    This removes the bio_flags related stuff for writing log-tree.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 18fdc67
  24. Btrfs: fix confusing worker helper info in stacktrace

    We've seen the following backtrace stack in ftrace or dmesg log,
    
      kworker/u16:10-4244  [000] 241942.480955: function:             btrfs_put_ordered_extent
      kworker/u16:10-4244  [000] 241942.480956: kernel_stack:         <stack trace>
    => finish_ordered_fn (ffffffffa0384475)
    => btrfs_scrubparity_helper (ffffffffa03ca577)        <-----"incorrect"
    => btrfs_freespace_write_helper (ffffffffa03ca98e)    <-----"correct"
    => process_one_work (ffffffff81117b2f)
    => worker_thread (ffffffff81118c2a)
    => kthread (ffffffff81121de0)
    => ret_from_fork (ffffffff81d7087a)
    
    btrfs_freespace_write_helper is actually calling normal_worker_helper
    instead of btrfs_scrubparity_helper, so somehow the kernel has parsed the
    incorrect function address while unwinding the stack;
    btrfs_scrubparity_helper really shouldn't show up.
    
    It's caused by the compiler inlining our helper function; adding a
    noinline tag fixes that.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ use noinline_for_stack ]
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 6939f66
  25. btrfs: return -ENOMEM on allocation failure in btrfsic

    Forward the correct return value -ENOMEM from btrfsic_dev_state_alloc()
    too.
    
    Signed-off-by: Allen Pais <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ adjust changelog ]
    Signed-off-by: David Sterba <[email protected]>
    Allen Pais authored and kdave committed Oct 30, 2017
    Commit 3afb0c5
  26. btrfs: make array types static const, reduces object code size

    Don't populate the read-only array types on the stack, instead make
    it static const.  Makes the object code smaller by nearly 60 bytes:
    
    Before:
       text	   data	    bss	    dec	    hex	filename
      90536	   6552	     64	  97152	  17b80	fs/btrfs/ioctl.o
    
    After:
       text	   data	    bss	    dec	    hex	filename
      90414	   6616	     64	  97094	  17b46	fs/btrfs/ioctl.o
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Colin Ian King authored and kdave committed Oct 30, 2017
    Commit 315d8e9
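
The size saving reported in commit 26 comes from where the array is stored. A minimal sketch, with a hypothetical lookup function and made-up table contents: a plain local array is typically rebuilt on the stack on each call, while a `static const` array is emitted once into read-only data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup: is `type` one of the accepted values? */
static int type_is_known(int type)
{
    /*
     * static const: the table lives in read-only data and needs no
     * per-call initialisation code, unlike `const int types[] = {...}`.
     */
    static const int types[] = { 1, 2, 4, 8 };

    for (size_t i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
        if (types[i] == type)
            return 1;
    }
    return 0;
}
```

This matches the size(1) output in the changelog, where text shrinks and data grows slightly: the initialisation code disappears and the table moves into the data section.
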
  27. Btrfs: fix memory leak in raid56

    The local bio_list may have pending bios when doing cleanup, it can
    end up with memory leak if they don't get freed.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    Commit 785884f
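
The cleanup described in commit 27 can be pictured as draining a local list before leaving the function. The sketch below uses a hypothetical singly linked list in place of the kernel's bio_list, and free() in place of dropping a bio reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list standing in for a local bio_list. */
struct node { struct node *next; };
struct node_list { struct node *head; };

static void list_push(struct node_list *l, struct node *n)
{
    n->next = l->head;
    l->head = n;
}

static struct node *list_pop(struct node_list *l)
{
    struct node *n = l->head;

    if (n)
        l->head = n->next;
    return n;
}

/* Cleanup path: release every entry still queued; returns the count freed. */
static int drain_list(struct node_list *l)
{
    struct node *n;
    int freed = 0;

    while ((n = list_pop(l)) != NULL) {
        free(n);
        freed++;
    }
    return freed;
}
```
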
  28. Btrfs: send, apply asynchronous page cache readahead to enhance page read
    
    By analyzing perf data for btrfs send, we found that it spends a large
    amount of CPU time in page_cache_sync_readahead. This cost can be reduced
    by switching to the asynchronous variant. The overall performance gains on
    HDD and SSD were 9 and 15 percent, respectively, when simply sending a
    large file.
    
    Signed-off-by: Kuanling Huang <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    peterh-syno authored and kdave committed Oct 30, 2017
    Commit eef16ba
  29. btrfs: change how we decide to commit transactions during flushing

    Nikolay reported that generic/273 was failing currently with ENOSPC.
    Turns out this is because we get to the point where the outstanding
    reservations are greater than the pinned space on the fs.  This is a
    mistake, previously we used the current reservation amount in
    may_commit_transaction, not the entire outstanding reservation amount.
    Fix this to find the minimum byte size needed to make progress in
    flushing, and pass that into may_commit_transaction.  From there we can
    make a smarter decision on whether to commit the transaction or not.
    This fixes the failure in generic/273.
    
    From Nikolay, IOW: when we go to the final stage of deciding whether to
    do trans commit, instead of passing all the reservations from all
    tickets we just pass the reservation for the current ticket. Otherwise,
    in case all reservations exceed pinned space, then we don't commit
    transaction and fail prematurely. Before we passed num_bytes from
    flush_space, where num_bytes was the sum of all pending reservations,
    but now all we do is take the first ticket and commit the trans if we
    can satisfy that.
    
    Fixes: 957780e ("Btrfs: introduce ticketed enospc infrastructure")
    Cc: [email protected] # 4.8
    Reported-by: Nikolay Borisov <[email protected]>
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Tested-by: Nikolay Borisov <[email protected]>
    [ added Nikolay's comment ]
    Signed-off-by: David Sterba <[email protected]>
    Josef Bacik authored and kdave committed Oct 30, 2017
    Commit 996478c
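
The difference between the two decisions in commit 29 can be shown with a deliberately tiny model. Everything here is hypothetical: the ticket struct, both function names, and the reduction of the decision to a single comparison against pinned space. The real may_commit_transaction() considers more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-simplified model of a queued reservation ticket. */
struct ticket { uint64_t bytes; };

/* Old behaviour in this model: pinned space must cover all tickets. */
static int should_commit_sum(const struct ticket *t, size_t n, uint64_t pinned)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += t[i].bytes;
    return n > 0 && pinned >= total;
}

/* New behaviour in this model: only the first ticket must be satisfiable. */
static int should_commit_first(const struct ticket *t, size_t n, uint64_t pinned)
{
    return n > 0 && pinned >= t[0].bytes;
}
```

With tickets of 4096, 8192 and 16384 bytes and 6000 bytes pinned, the summed check refuses to commit while the first-ticket check allows it, which is the premature ENOSPC the changelog describes.
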
  30. Btrfs: cleanup 'start' subtraction from try uncompressed inline extent

    Was added in:
      c8b9781
      "Btrfs: Add zlib compression support"
    It has survived until now (since 08.10.2008).
    
    Because 'start' is checked for zero before the branch, it's safe to remove
    that subtraction.
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: Satoru Takeuchi <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Oct 30, 2017
    Commit 6018ba0
  31. btrfs: Refactor check_leaf function for later expansion

    Current check_leaf() function does a good job checking key order and
    item offset/size.
    
    However it only checks from slot 0 to the last but one slot, this is
    good but makes later expansion hard.
    
    So this refactoring iterates from slot 0 to the last slot.
    For key comparison, it uses a key with all 0 as initial key, so all
    valid keys should be larger than that.
    
    And for item size/offset checks, it compares current item end with
    previous item offset.
    For slot 0, use leaf end as a special case.
    
    This makes later item/key offset checks and item size checks easier to
    implement.
    
    Also, make check_leaf() return -EUCLEAN rather than -EIO to indicate an
    error.
    
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    Commit c3267bb
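
The loop shape described in commit 31 can be sketched in userland as follows. The structs are simplified assumptions rather than the on-disk format, and the function name is hypothetical. The sketch keeps the three ideas from the changelog: iterate every slot, start from an all-zero key, and compare each item's end with the previous item's offset (or with the leaf end for slot 0).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value, defined here only for portability */
#endif

/* Simplified stand-ins for a btrfs key and leaf item. */
struct key  { uint64_t objectid; uint8_t type; uint64_t offset; };
struct item { struct key key; uint32_t offset; uint32_t size; };

static int key_cmp(const struct key *a, const struct key *b)
{
    if (a->objectid != b->objectid)
        return a->objectid < b->objectid ? -1 : 1;
    if (a->type != b->type)
        return a->type < b->type ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

static int check_leaf_sketch(const struct item *items, int nritems,
                             uint32_t leaf_data_size)
{
    struct key prev_key;

    memset(&prev_key, 0, sizeof(prev_key)); /* all-zero initial key */

    for (int slot = 0; slot < nritems; slot++) {
        /* Slot 0 ends at the leaf end; later items end where the previous begins. */
        uint32_t expected_end = slot == 0 ? leaf_data_size
                                          : items[slot - 1].offset;

        if (key_cmp(&prev_key, &items[slot].key) >= 0)
            return -EUCLEAN; /* keys must be strictly increasing */
        if ((uint64_t)items[slot].offset + items[slot].size != expected_end)
            return -EUCLEAN; /* gap or overlap in item data */
        prev_key = items[slot].key;
    }
    return 0;
}
```

Because every slot goes through the same loop body, a per-item check added later runs for the last slot too, which is the expansion the changelog has in mind.
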
  32. btrfs: Check if item pointer overlaps with the item itself

    Function check_leaf() checks if any item pointer points outside of the
    leaf, but it doesn't check if the pointer overlaps with the item itself.
    
    Normally only the last item may be the victim, but adding such check is
    never a bad idea anyway.
    
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    Commit 7f43d4a
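
The overlap check in commit 32 compares two byte positions inside a leaf: where the item header for a slot ends, and where that item's data begins. A rough sketch, assuming a layout with a fixed-size header followed by fixed-size item headers; the sizes and function names below are illustrative, not taken from the on-disk format definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes; the real values come from the on-disk format. */
#define SKETCH_HEADER_SIZE 100u
#define SKETCH_ITEM_SIZE    25u

/* Byte offset, from the leaf start, where the item header for `slot` ends. */
static uint32_t item_header_end(int slot)
{
    return SKETCH_HEADER_SIZE + (uint32_t)(slot + 1) * SKETCH_ITEM_SIZE;
}

/* Nonzero if the data of `slot`, starting at data_start, overlaps its header. */
static int item_overlaps_header(int slot, uint32_t data_start)
{
    return item_header_end(slot) > data_start;
}
```

Item headers grow forward from the leaf header while item data grows backward from the leaf end, so the last slot is the one whose header and data sit closest together. That is why the changelog says only the last item is normally at risk.
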
  33. btrfs: Add sanity check for EXTENT_DATA when reading out leaf

    Add extra checks for items with EXTENT_DATA type.  This checks the
    following things:
    
    0) Key offset
       All key offsets must be aligned to sectorsize.
       Inline extent must have 0 for key offset.
    
    1) Item size
       Uncompressed inline file extent size must match item size.
       (Compressed inline file extent has no information about its on-disk size.)
       Regular/preallocated file extent size must be a fixed value.
    
    2) Every member of regular file extent item
       Including alignment for bytenr and offset, possible value for
       compression/encryption/type.
    
    3) Type/compression/encode must be one of the valid values.
    
    This should be the most comprehensive and strict check in the context
    of btrfs_item for EXTENT_DATA.
    
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ switch to BTRFS_FILE_EXTENT_TYPES, similar to what
      BTRFS_COMPRESS_TYPES does ]
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    Commit 40c3c40
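
A subset of the EXTENT_DATA checks listed in commit 33 can be sketched as below. This covers only key offset alignment, the inline-extent rule, the type range, and bytenr alignment. The enum values, function names, and the choice of -EUCLEAN as the error code are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value, defined here only for portability */
#endif

/* Illustrative extent type values for this sketch. */
enum { SK_INLINE = 0, SK_REG = 1, SK_PREALLOC = 2, SK_NR_TYPES = 3 };

static int is_aligned(uint64_t v, uint32_t a)
{
    return v % a == 0;
}

static int check_extent_data_sketch(uint64_t key_offset, unsigned type,
                                    uint64_t disk_bytenr, uint32_t sectorsize)
{
    if (!is_aligned(key_offset, sectorsize))
        return -EUCLEAN;            /* 0) key offset alignment */
    if (type >= (unsigned)SK_NR_TYPES)
        return -EUCLEAN;            /* 3) type must be a known value */
    if (type == SK_INLINE)
        return key_offset == 0 ? 0 : -EUCLEAN; /* inline: key offset 0 */
    if (!is_aligned(disk_bytenr, sectorsize))
        return -EUCLEAN;            /* 2) bytenr alignment */
    return 0;
}
```
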
  34. btrfs: Add checker for EXTENT_CSUM

    EXTENT_CSUM checker is a relatively easy one, only needs to check:
    
    1) Objectid
       Fixed to BTRFS_EXTENT_CSUM_OBJECTID
    
    2) Key offset alignment
       Must be aligned to sectorsize
    
    3) Item size alignment
       Must be aligned to csum size
    
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    Commit 4b865ca
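
The three EXTENT_CSUM conditions in commit 34 translate almost directly into code. A minimal sketch: the function name is hypothetical, and the objectid constant is assumed to mirror the kernel's BTRFS_EXTENT_CSUM_OBJECTID.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's BTRFS_EXTENT_CSUM_OBJECTID (-10ULL). */
#define SKETCH_EXTENT_CSUM_OBJECTID ((uint64_t)-10)

/* Return 1 if the csum item passes all three checks, 0 otherwise. */
static int csum_item_ok(uint64_t objectid, uint64_t key_offset,
                        uint32_t item_size, uint32_t sectorsize,
                        uint32_t csum_size)
{
    if (objectid != SKETCH_EXTENT_CSUM_OBJECTID)
        return 0;   /* 1) objectid is fixed */
    if (key_offset % sectorsize != 0)
        return 0;   /* 2) key offset aligned to sectorsize */
    if (item_size % csum_size != 0)
        return 0;   /* 3) item size aligned to csum size */
    return 1;
}
```
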
  35. btrfs: undo writable superblock when sprouting fails

    When a new device is being added to a seed FS, the seed FS is marked
    writable, but when we fail to bring in the new device, we fail to undo
    the writable part. This patch fixes it.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    Commit 0af2c4b
  36. btrfs: fix BUG_ON in btrfs_init_new_device()

    Instead of BUG_ON, return an error to the caller, and handle the failure
    condition by aborting the transaction and going through the error path.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    Commit d31c32f
  37. btrfs: error out if btrfs_attach_transaction() fails

    btrfs_init_new_device() calls btrfs_attach_transaction() to
    commit sys chunks, and it should error out if it fails.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    Commit 7132a26
  38. btrfs: Explicitly handle btrfs_update_root failure

    btrfs_update_root can fail and it aborts the transaction, the correct
    way to handle an aborted transaction is to explicitly end with
    btrfs_end_transaction.  Even now the code is correct since
    btrfs_commit_transaction would handle an aborted transaction but this is
    more of an implementation detail. So let's be explicit in handling
    failure in btrfs_update_root.
    
    Furthermore btrfs_commit_transaction can also fail and by ignoring its
    return value we could have left the in-memory copy of the root item in
    an inconsistent state. So capture the error value which allows us to
    correctly revert the RO/RW flags in case of commit failure.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit 9417ebc
  39. btrfs: Refactor transaction handling in received subvolume ioctl

    If btrfs_commit_transaction fails it will proceed to call
    cleanup_transaction, which in turn already does btrfs_abort_transaction.
    So let's remove the unnecessary code duplication. Also let's be explicit
    about handling failure of btrfs_uuid_tree_add by calling
    btrfs_end_transaction.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    Commit efd3815
  40. btrfs: Fix bool initialization/comparison

    Bool initializations should use true and false. Bool tests don't need
    comparisons.
    
    Signed-off-by: Thomas Meyer <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    thomasmey authored and kdave committed Oct 30, 2017
    Commit 897ca81
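
The two conventions named in commit 40 look like this in practice. The function below is a hypothetical example, not code from the patch. The comments show the forms being replaced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: report whether any flag in the array is set. */
static bool any_set(const bool *flags, int n)
{
    bool found = false;         /* rather than: bool found = 0; */

    for (int i = 0; i < n; i++) {
        if (flags[i])           /* rather than: if (flags[i] == true) */
            found = true;
    }
    return found;
}
```
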
  41. btrfs: prefix sysfs attribute struct names

    Currently struct names for sysfs are generated only based on the
    attribute names. This means that attribute names cannot be reused in
    multiple places throughout the complete btrfs sysfs hierarchy.
    
    E.g. allocation/data/total_bytes and allocation/data/single/total_bytes
    result in the same struct name btrfs_attr_total_bytes. A workaround for
    this case was made in the past by ad hoc creating an extra macro
    wrapper, BTRFS_RAID_ATTR, that inserts some extra text in the struct
    name.
    
    Instead of polluting sysfs.h with such kind of extra macro definitions,
    and only doing so when there are collisions, use a prefix which gets
    inserted in the struct name, so we keep everything nicely grouped
    together by default.
    
    Current collections of attributes are:
    * (the toplevel, empty prefix)
    * allocation
    * space_info
    * raid
    * features
    
    Signed-off-by: Hans van Kranenburg <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    knorrie authored and kdave committed Oct 30, 2017
    a969f4c
  42. btrfs: use appropriate replacements for __sb_{start,end}_write calls

    Commit a53f4f8 ("btrfs: Don't call btrfs_start_transaction() on
    frozen fs to avoid deadlock.") started using internal calls and we
    replace them with more suitable ones.
    
    Signed-off-by: Rakesh Pandit <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    rakeshpandit authored and kdave committed Oct 30, 2017
    a7e3c5f
  43. Btrfs: compress_file_range remove dead variable num_bytes

    Remove dead assignment of num_bytes.
    
    Also, as num_bytes is only used in the will_compress block as a copy of
    total_in, just replace it with total_in and drop num_bytes entirely.
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Oct 30, 2017
    1170862
  44. btrfs: Move leaf and node validation checker to tree-checker.c

    There is no doubt that the comprehensive tree block checker will grow,
    so moving it into its own file is quite reasonable.
    
    Signed-off-by: Qu Wenruo <[email protected]>
    [ wording adjustments ]
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    557ea5d
  45. btrfs: tree-checker: Enhance btrfs_check_node output

    Use an inline function to replace the macro since we don't need
    stringification.
    (The macro still exists until all callers get updated.)
    
    Also add more info about the error, and replace EIO with EUCLEAN.
    
    For the nr_items error, report whether it's too large or too small, and
    output the valid value range.
    
    For the node block pointer, add a new alignment checker.
    
    For key order, also output the next key to make the problem more
    obvious.
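
    A rough userspace sketch of the kind of checks described above. The
    function names, message wording and the [1, max_items] range are
    assumptions made for illustration; they are not the code in
    tree-checker.c.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value; fallback for platforms without it */
#endif

/*
 * Hypothetical nr_items check: returns 0 when nr_items lies in
 * [1, max_items]; otherwise writes a message that says whether the value
 * is too small or too large, plus the valid range, and returns -EUCLEAN.
 */
int demo_check_node_nritems(unsigned int nr_items, unsigned int max_items,
                            char *buf, size_t len)
{
    if (nr_items == 0 || nr_items > max_items) {
        snprintf(buf, len,
                 "corrupt node: nritems=%u is too %s, valid range is [1, %u]",
                 nr_items, nr_items == 0 ? "small" : "large", max_items);
        return -EUCLEAN;
    }
    return 0;
}

/*
 * Hypothetical block pointer alignment check.
 * sectorsize is assumed to be a power of two.
 */
int demo_check_blockptr(uint64_t bytenr, uint32_t sectorsize)
{
    return (bytenr & (uint64_t)(sectorsize - 1)) ? -EUCLEAN : 0;
}
```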
    
    Signed-off-by: Qu Wenruo <[email protected]>
    [ wording adjustments, unindented long strings ]
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    bba4f29
  46. btrfs: tree-checker: Enhance output for btrfs_check_leaf

    Enhance the output to print:
    1) the reason
    2) the bad value, if the reason is not sufficient
    3) the good value (range)
    
    Signed-off-by: Qu Wenruo <[email protected]>
    [ wording, unindented long strings ]
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    478d01b
  47. btrfs: tree-checker: Enhance output for check_csum_item

    Output the bad value and expected good value (or its alignment).
    
    Signed-off-by: Qu Wenruo <[email protected]>
    [ unindent long strings ]
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    d508c5f
  48. btrfs: tree-checker: Enhance output for check_extent_data_item

    Output the invalid member name and its bad value, along with its
    expected value range or alignment.
    
    Signed-off-by: Qu Wenruo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Qu Wenruo authored and kdave committed Oct 30, 2017
    8806d71
  49. Btrfs: remove nr_async_bios

    This was intended to congest higher layers so that they would not send
    bios, but:
    
    1) The congested bit has been taken by writeback.
    
    Async bios come from buffered writes and DIO writes.
    
    For DIO writes, we want to submit them ASAP, while for buffered writes,
    writeback uses balance_dirty_pages() to throttle how many dirty pages we
    can have.
    
    2) No one is waiting for %nr_async_bios to drop to zero.
    
    Historically, it was introduced along with changes which let the
    checksumming workload spread across different CPUs.  At that time,
    pdflush was used instead of per-bdi flushing; perhaps pdflush did not
    have the necessary information for writeback to do throttling.
    
    We can safely remove it now.
    
    Signed-off-by: Liu Bo <[email protected]>
    [ additional explanation from mails, removed unused variable 'limit' ]
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    f851689
  50. Btrfs: do not make defrag wait on async_delalloc_pages

    By setting compression for a defrag task, the task will start IO at
    the end of defrag.
    
    After the combo of filemap_flush(), we've already made sure that
    dirty pages have made progress via async compress thread because the
    second filemap_flush() will wait for page lock, which won't be
    unlocked until those pages have been marked as writeback and ordered
    extents have been queued.
    
    And this is for per-inode defrag, it's not helpful to wait on a global
    %async_delalloc_pages and %nr_async_submits from fs_info.
    
    Although waiting on %nr_async_submits means that all bios are
    submitted down to per-device schedule IO lists, it doesn't wait for
    their completions, thus users still need to do fsync/sync to make sure
    the data is on disk.  With this change, pages are still guaranteed to be
    marked with writeback bits and to be submitted asynchronously shortly,
    so the behavior of the defrag option '-c' remains unchanged.
    
    Signed-off-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    80e03a2
  51. Btrfs: remove nr_async_submits and async_submit_draining

    Now that we have the combo of flushing twice, which makes sure IO
    has started since the second flush will wait for the page lock, which
    won't be released until page writeback is set and ordered extents are
    queued, we don't need %async_submit_draining, %async_delalloc_pages
    and %nr_async_submits to tell whether the IO has actually started.
    
    Moreover, all the flushers in use are followed by functions that wait
    for ordered extents to complete, so %nr_async_submits, which tracks
    whether bio's async submit has made progress, doesn't really make
    sense.
    
    However, %async_delalloc_pages is still required by shrink_delalloc()
    as that function doesn't flush twice in the normal case (just issues a
    writeback with WB_REASON_FS_FREE_SPACE).
    
    Signed-off-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Oct 30, 2017
    736cd52
  52. btrfs: tree-checker: use %zu format string for size_t

    We now get a harmless compile-time warning on 32-bit architectures:
    
    fs/btrfs/tree-checker.c: In function 'check_extent_data_item':
    fs/btrfs/tree-checker.c:189:70: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'unsigned int' [-Werror=format=]
    
    This changes the format string to use %zu instead of %lu for size_t.
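
    A small sketch of why %zu is the portable choice: size_t is
    'unsigned int' on many 32-bit targets and 'unsigned long' on common
    64-bit ones, so %lu only matches on the latter. The function name and
    message text are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: %zu matches size_t on both 32-bit and 64-bit
 * targets, whereas %lu triggers -Wformat where size_t is unsigned int.
 */
int demo_format_item_size(char *buf, size_t len, size_t item_size)
{
    return snprintf(buf, len, "invalid item size, have %zu", item_size);
}
```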
    
    Fixes: c1f6520 ("btrfs: tree-checker: Enhance output for check_extent_data_item")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    arndb authored and kdave committed Oct 30, 2017
    709a95c
  53. btrfs: Clean up unused variables in free-space-tree.c

    Remove variables 'start' and 'end', which are set but never used.
    
    Signed-off-by: Christos Gkekas <[email protected]>
    Reviewed-by: Omar Sandoval <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    chggr authored and kdave committed Oct 30, 2017
    9e882d6
  54. btrfs: add_missing_dev() should return the actual error

    add_missing_dev() can return a device pointer so that IS_ERR/PTR_ERR can
    be used to check for the actual error that occurred in the function.
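
    For readers unfamiliar with the idiom, here is a userspace sketch of
    encoding an errno value in a returned pointer. The demo_* helpers mimic
    the kernel's ERR_PTR/IS_ERR/PTR_ERR but are re-implemented here for
    illustration, and demo_add_missing_dev is a hypothetical stand-in for
    the real function.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers; names are hypothetical. */
static inline void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int demo_is_err(const void *ptr)
{
    /* The top DEMO_MAX_ERRNO addresses are reserved for error codes. */
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_device {
    int missing;
};

/* Returns a valid device, or an encoded errno instead of a bare NULL. */
struct demo_device *demo_add_missing_dev(int simulate_enomem)
{
    struct demo_device *dev;

    if (simulate_enomem)
        return demo_err_ptr(-ENOMEM);

    dev = calloc(1, sizeof(*dev));
    if (!dev)
        return demo_err_ptr(-ENOMEM);
    dev->missing = 1;
    return dev;
}
```

    A caller then checks demo_is_err(dev) and propagates demo_ptr_err(dev)
    rather than guessing an error code.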
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Liu Bo <[email protected]>
    [ minor error message adjustment ]
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    adfb69a
  55. btrfs: fix EIO misuse to report missing degraded option

    EIO is only for IO failures on the device, so avoid it here. Use ENOENT as
    that's the closest error code describing what happened.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ update changelog ]
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    45dbdbc
  56. btrfs: declare btrfs_report_missing_device() static

    Signed-off-by: Anand Jain <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    5a2b8e6
  57. btrfs: fix use of error or warning for missing device

    When a device is missing without the -o degraded option, it's an error,
    so report it as an error instead of a warning.  When the -o degraded
    option is provided, log the missing device as a warning.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ switch error to bool ]
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    2b902df
  58. btrfs: fix send ioctl on 32bit with 64bit kernel

    We pass in a pointer in our send arg struct, which means the struct size
    doesn't match between 32bit user space and 64bit kernel space.  Fix this by
    adding a compat mode and doing the appropriate conversion.
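
    The size mismatch and the conversion can be sketched as follows. The
    structs are simplified, hypothetical versions of the send arguments, and
    __attribute__((packed)) is a GCC/Clang extension used here to imitate a
    32-bit layout; this is not the code added by the commit.

```c
#include <stdint.h>

/* Hypothetical argument struct as a 64-bit kernel lays it out. */
struct demo_send_args {
    int64_t   send_fd;
    uint64_t  clone_sources_count;
    uint64_t *clone_sources;    /* 8 bytes on 64-bit, 4 bytes on 32-bit */
    uint64_t  flags;
};

/* The same fields as a 32-bit process would lay them out. */
struct demo_send_args_32 {
    int64_t  send_fd;
    uint64_t clone_sources_count;
    uint32_t clone_sources;     /* 32-bit user pointer */
    uint64_t flags;
} __attribute__((packed));

/* Widen the 32-bit layout into the native one, field by field. */
void demo_send_args_from_32(struct demo_send_args *dst,
                            const struct demo_send_args_32 *src)
{
    dst->send_fd = src->send_fd;
    dst->clone_sources_count = src->clone_sources_count;
    dst->clone_sources = (uint64_t *)(uintptr_t)src->clone_sources;
    dst->flags = src->flags;
}
```

    Because the two sizes differ, the ioctl number (which encodes the
    argument size) differs as well, which is why a separate compat entry
    point is needed.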
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ move structure to the beginning, next to receive 32bit compat ]
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    2351f43
  59. btrfs: scrub: get rid of sector_t

    The use of sector_t is not necessary, it's just for a warning.  Switch to
    u64, rename the variable and use byte units instead of 512b, i.e.
    dropping the >> 9 shifts.  The messages are adjusted as well.
    
    Reviewed-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Oct 30, 2017
    6aa2126
  60. btrfs: rename page offset parameter in submit_extent_page

    We're going to remove sector_t and will use 'offset', so this patch
    frees the name.
    
    Reviewed-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Oct 30, 2017
    6c5a4e2
  61. btrfs: get rid of sector_t and use u64 offset in submit_extent_page

    The use of sector_t in the callchain of submit_extent_page is not
    necessary.  Switch to u64 and rename the variable and use byte units
    instead of 512b, ie.  dropping the >> 9 shifts and avoiding the
    con(tro)versions of sector_t.
    
    Reviewed-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Oct 30, 2017
    6273b7f
  62. btrfs: add ref-verify mount option

    This adds the infrastructure for turning ref verify on and off for a
    mount, to be used by a later patch.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ enhance btrfs_print_mod_info to print if ref-verify is compiled in ]
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    fb59237
  63. btrfs: pass root to various extent ref mod functions

    We need the actual root for the ref verifier tool to work, so change
    these functions to pass the root around instead.  This will be used in
    a subsequent patch.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    84f7d8e
  64. Btrfs: add a extent ref verify tool

    We were having corruption issues that were tied back to problems with
    the extent tree.  In order to track them down I built this tool to try
    and find the culprit, which was pretty successful.  If you compile with
    this tool enabled, it will verify live every ref update that the fs makes
    and make sure it is consistent and valid.  I've run this through
    xfstests and haven't gotten any false positives.  Thanks,
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ update error messages, add fixup from Dan Carpenter to handle errors
      of read_tree_block ]
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    fd708b8
  65. Btrfs: only check delayed ref usage in should_end_transaction

    We were only doing btrfs_check_space_for_delayed_refs() if the metadata
    space was full, ie we couldn't allocate chunks.  This assumes we'll be
    able to allocate chunks during transaction commit, but since nothing
    does a LIMIT flush during the transaction commit this won't actually
    happen unless we happen to run shy of actual space.  We already take
    into account a full fs in btrfs_check_space_for_delayed_refs() so just
    kill this extra check to make sure we're ending the transaction when we
    need to.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    7c77743
  66. btrfs: add a helper to return a head ref

    Simplify the error handling in __btrfs_run_delayed_refs by breaking out
    the code used to return a head back to the delayed_refs tree for
    processing into a helper function.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    2eadaa2
  67. btrfs: move extent_op cleanup to a helper

    Move the extent_op cleanup for an empty head ref to a helper function to
    help simplify __btrfs_run_delayed_refs.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    b00e625
  68. btrfs: breakout empty head cleanup to a helper

    Move this code out to a helper function to further simplify
    __btrfs_run_delayed_refs.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    194ab0b
  69. btrfs: move ref_mod modification into the if (ref) logic

    We only use this logic if our ref isn't a ref_head, so move it up into
    the if (ref) case since we know that this is a normal ref and not a
    delayed ref head.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    1ce7a5e
  70. btrfs: move all ref head cleanup to the helper function

    We do a couple different cleanup operations on the ref head.  We adjust
    counters, we'll free any reserved space if we didn't end up using the
    ref, and we clear the pending csum bytes.  Move all these disparate
    things into cleanup_ref_head and clean up the logic in
    __btrfs_run_delayed_refs so that it handles the !ref case more cleanly,
    as well as making run_one_delayed_ref() only deal with real refs and not
    the ref head.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    c1103f7
  71. btrfs: remove delayed_ref_node from ref_head

    This is just excessive information in the ref_head, and makes the code
    complicated.  It is a relic from when we had the heads and the refs in
    the same tree, which is no longer the case.  With this removal I've
    cleaned up a bunch of the cruft around this old assumption as well.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    d278850
  72. btrfs: remove type argument from comp_tree_refs

    We can get this from the ref we've passed in.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    3b60d43
  73. btrfs: add assertions for releasing trans handle reservations

    These are useful for debugging problems where we mess with
    trans->block_rsv to make sure we're not screwing something up.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Oct 30, 2017
    bf2681c
  74. btrfs: use BLK_STS defines where needed

    In a few places we can use BLK_STS_OK and BLK_STS_NOTSUPP.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Satoru Taekeuchi <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ dropped first hunk btrfs_endio_direct_read ]
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    2dbe0c7
  75. btrfs: cleanup extent locking sequence

    Code cleanup for better understanding:
    Rename the variable needs_unlock to extent_locked to show state as
    opposed to action. Change the type to int, to reduce code in the
    critical path.
    
    Signed-off-by: Goldwyn Rodrigues <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    goldwynr authored and kdave committed Oct 30, 2017
    79f015f
  76. btrfs: use need_full_stripe() in __btrfs_map_block()

    A cleanup patch: use need_full_stripe() to replace the open-coded check.
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    de48373
  77. btrfs: fix false EIO for missing device

    When one of the devices is missing, bbio_error() takes care of setting
    the error status. If it is the only IO pending in that stripe, it
    fails to check the status of the other IO at %bbio_error before setting
    the error %bi_status for the %orig_bio. Fix this by checking whether
    %bbio->error has exceeded %bbio->max_errors.
    
    With the reproducer below, the fdatasync error is seen intermittently.
    
     mount -o degraded /dev/sdc /btrfs
     dd status=none if=/dev/zero of=$(mktemp /btrfs/XXX) bs=4096 count=1 conv=fdatasync
    
     dd: fdatasync failed for ‘/btrfs/LSe’: Input/output error
    
     The problem is intermittent because the following conditions,
     which depend on timing, have to be met:
     In btrfs_map_bio()
      - in RAID1, the missing device has to be at %dev_nr = 1
     In bbio_error()
      - before bbio_error() is called, the bio of the not-missing
        device at %dev_nr = 0 must be completed so that the below
        condition is true
         if (atomic_dec_and_test(&bbio->stripes_pending)) {
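
    The decision described above can be modelled in userspace with C11
    atomics. This is a simplified sketch under assumptions: struct demo_bbio
    and demo_bbio_error are hypothetical and only model the counters named
    in the message, not the real bio handling.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical model of the counters mentioned in the commit message. */
struct demo_bbio {
    atomic_int stripes_pending; /* IOs of this stripe still in flight */
    atomic_int error;           /* number of failed IOs so far */
    int max_errors;             /* failures the RAID profile tolerates */
};

/*
 * Record one failed IO (e.g. the missing device).
 * Returns 1 while other IOs are still pending. Once the last IO is
 * accounted for, returns -EIO only if the failure count exceeds
 * max_errors, and 0 otherwise.
 */
int demo_bbio_error(struct demo_bbio *bbio)
{
    atomic_fetch_add(&bbio->error, 1);

    /* atomic_fetch_sub() returns the old value: 1 means we were last. */
    if (atomic_fetch_sub(&bbio->stripes_pending, 1) != 1)
        return 1;

    return atomic_load(&bbio->error) > bbio->max_errors ? -EIO : 0;
}
```

    With two stripes and max_errors = 1, a single missing device no longer
    produces -EIO even when its error is the last IO to be accounted for.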
    
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: Liu Bo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Oct 30, 2017
    102ed2c
  78. btrfs: Use bd_dev to generate index when dev_state_hashtable add items.

    Fix missing change from commit f8f84b2
    ("btrfs: index check-integrity state hash by a dev_t").
    
    Function btrfsic_dev_state_hashtable_lookup uses dev_t to generate hashval
    when looking up a btrfsic_dev_state in the hash table. So when we add a
    btrfsic_dev_state into the hash table, it should also use dev_t.
    
    Reproducer of this bug:
    Use MOUNT_OPTIONS="-o check_int" when running xfstests; the device cannot
    be mounted successfully, so xfstests cannot run.
    
    Signed-off-by: Gu JinXiang <[email protected]>
    Reviewed-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Gu JinXiang authored and kdave committed Oct 30, 2017
    859a58a
  79. btrfs: Replace opencoded sizes with their symbolic constants

    Currently btrfs' code uses a mix of opencoded sizes and defines from sizes.h.
    Let's unify the code base to always use the symbolic constants. No functional
    changes.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Oct 30, 2017
    d4417e2

Commits on Nov 1, 2017

  1. btrfs: allow to set compression level for zlib

    Preliminary support for setting compression level for zlib, the
    following works:
    
    $ mount -o compress=zlib                # default
    $ mount -o compress=zlib0               # same
    $ mount -o compress=zlib9               # level 9, slower sync, less data
    $ mount -o compress=zlib1               # level 1, faster sync, more data
    $ mount -o remount,compress=zlib3	# level set by remount
    
    The compress-force option works the same as compress.  The level is visible in
    the same format in /proc/mounts. Level set via file property does not
    work yet.
    
    Required patch: "btrfs: prepare for extensions in compression options"
    
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Nov 1, 2017
    f51d2b5
  2. btrfs: allow setting zlib compression level via :9

    This is bikeshedding, but it seems people are drastically more likely to
    understand "zlib:9" as compression level rather than an algorithm
    version compared to "zlib9".
    
    Based on feedback on the mailing list, the ":9" will be the only accepted
    syntax. The level must be a single digit. An unrecognized format will
    result in the default, for forward compatibility, in a similar way the
    compression algorithm specifier was relaxed in commit
    a7164fa ("btrfs: prepare for extensions in compression
    options").
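
    The accepted syntax can be sketched as a small parser. This is an
    illustration under assumptions: demo_zlib_level is a hypothetical name,
    and 0 is used here to mean "use the default level".

```c
#include <string.h>

/*
 * Hypothetical parser for the value of the compress option.
 * Returns 1..9 for "zlib:1" .. "zlib:9". Anything else, including a
 * bare "zlib" or a malformed suffix, returns 0 (default level).
 */
unsigned int demo_zlib_level(const char *opt)
{
    if (strncmp(opt, "zlib", 4) != 0)
        return 0;

    /* Exactly one digit after the colon, then end of string. */
    if (opt[4] == ':' && opt[5] >= '1' && opt[5] <= '9' && opt[6] == '\0')
        return (unsigned int)(opt[5] - '0');

    return 0;
}
```

    Falling back to the default instead of rejecting the option is what
    keeps older kernels mountable with newer option strings.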
    
    Signed-off-by: Adam Borowski <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ tighten the accepted format ]
    Signed-off-by: David Sterba <[email protected]>
    kilobyte authored and kdave committed Nov 1, 2017
    fa4d885
  3. btrfs: remove BUG_ON in btrfs_rm_dev_replace_free_srcdev()

    That was only an extra check to tackle a few bugs around this area; now
    it's safe to remove it.  Replace it by an ASSERT.
    
    Signed-off-by: Anand Jain <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Nov 1, 2017
    6dd38f8
  4. btrfs: send: remove unused code

    This code was first introduced in 31db9f7 ("Btrfs: introduce
    BTRFS_IOC_SEND for btrfs send/receive") and it was not functional, then
    it got slightly refactored in e938c8a ("Btrfs: code cleanups for
    send/receive"), alas it was still dead. So let's remove it for good!
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Nov 1, 2017
    eb7b9d6
  5. btrfs: add a flag to iterate_inodes_from_logical to find all extent refs for uncompressed extents

    The LOGICAL_INO ioctl provides a backward mapping from extent bytenr and
    offset (encoded as a single logical address) to a list of extent refs.
    LOGICAL_INO complements TREE_SEARCH, which provides the forward mapping
    (extent ref -> extent bytenr and offset, or logical address).  These are
    useful capabilities for programs that manipulate extents and extent
    references from userspace (e.g. dedup and defrag utilities).
    
    When the extents are uncompressed (and not encrypted and not other),
    check_extent_in_eb performs filtering of the extent refs to remove any
    extent refs which do not contain the same extent offset as the 'logical'
    parameter's extent offset.  This prevents LOGICAL_INO from returning
    references to more than a single block.
    
    To find the set of extent references to an uncompressed extent from [a, b),
    userspace has to run a loop like this pseudocode:
    
    	for (i = a; i < b; ++i)
    		extent_ref_set += LOGICAL_INO(i);
    
    At each iteration of the loop (up to 32768 iterations for a 128M extent),
    data we are interested in is collected in the kernel, then deleted by
    the filter in check_extent_in_eb.
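
    The iteration count quoted above follows from dividing the extent size
    by the block size. A tiny sketch, assuming a 4 KiB block size (the
    message does not state it) and a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Number of per-block LOGICAL_INO calls needed to cover an extent,
 * rounding up for a partial trailing block. block_bytes must be nonzero.
 */
uint64_t demo_logical_ino_calls(uint64_t extent_bytes, uint64_t block_bytes)
{
    return (extent_bytes + block_bytes - 1) / block_bytes;
}
```

    For a 128 MiB extent and 4 KiB blocks this gives 32768 calls, each of
    which collects the full ref set in the kernel before filtering it away.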
    
    When the extents are compressed (or encrypted or other), the 'logical'
    parameter must be an extent bytenr (the 'a' parameter in the loop).
    No filtering by extent offset is done (or possible?) so the result is
    the complete set of extent refs for the entire extent.  This removes
    the need for the loop, since we get all the extent refs in one call.
    
    Add an 'ignore_offset' argument to iterate_inodes_from_logical,
    [...several levels of function call graph...], and check_extent_in_eb, so
    that we can disable the extent offset filtering for uncompressed extents.
    This flag can be set by an improved version of the LOGICAL_INO ioctl to
    get either behavior as desired.
    
    There is no functional change in this patch.  The new flag is always
    false.
    
    Signed-off-by: Zygo Blaxell <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ minor coding style fixes ]
    Signed-off-by: David Sterba <[email protected]>
    Zygo Blaxell authored and kdave committed Nov 1, 2017
    c995ab3
  6. btrfs: add a flags argument to LOGICAL_INO and call it LOGICAL_INO_V2

    Now that check_extent_in_eb()'s extent offset filter can be turned off,
    we need a way to do it from userspace.
    
    Add a 'flags' field to the btrfs_logical_ino_args structure to disable
    extent offset filtering, taking the place of one of the existing
    reserved[] fields.
    
    Previous versions of LOGICAL_INO neglected to check whether any of the
    reserved fields have non-zero values.  Assigning meaning to those fields
    now may change the behavior of existing programs that left these fields
    uninitialized.  The lack of a zero check also means that new programs
    have no way to know whether the kernel is honoring the flags field.
    
    To avoid these problems, define a new ioctl LOGICAL_INO_V2.  We can
    use the same argument layout as LOGICAL_INO, but shorten the reserved[]
    array by one element and turn it into the 'flags' field.  The V2 ioctl
    explicitly checks that reserved fields and unsupported flag bits are zero
    so that userspace can negotiate future feature bits as they are defined.
    
    Since the memory layouts of the two ioctls' arguments are compatible,
    there is no need for a separate function for logical_to_ino_v2 (contrast
    with tree_search_v2 vs tree_search where the layout and code are quite
    different).  A version parameter and an 'if' statement will suffice.
    
    Now that we have a flags field in logical_ino_args, add a flag
    BTRFS_LOGICAL_INO_ARGS_IGNORE_OFFSET to get the behavior we want,
    and pass it down the stack to iterate_inodes_from_logical.
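
    The argument check can be sketched as below. This is a standalone model
    under stated assumptions: the sketch_* names are hypothetical stand-ins
    for the real UAPI definitions, and returning -EINVAL for rejected
    arguments is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flag bit and the set of known flags. */
#define SKETCH_IGNORE_OFFSET	(1ULL << 0)
#define SKETCH_SUPPORTED_FLAGS	SKETCH_IGNORE_OFFSET

/* Same layout for both versions; V1 treats flags as one more reserved slot. */
struct sketch_logical_ino_args {
	uint64_t logical;
	uint64_t size;
	uint64_t reserved[3];
	uint64_t flags;
	uint64_t inode;
};

/* One function serves both ioctls: a version parameter and an 'if'. */
static int sketch_check_args(int version,
			     const struct sketch_logical_ino_args *args,
			     int *ignore_offset)
{
	assert(version == 1 || version == 2);
	if (version == 1) {
		/* V1 never validated these fields, so keep ignoring them. */
		*ignore_offset = 0;
		return 0;
	}
	/* V2: reserved fields and unknown flag bits must be zero. */
	if (args->reserved[0] || args->reserved[1] || args->reserved[2])
		return -EINVAL;
	if (args->flags & ~SKETCH_SUPPORTED_FLAGS)
		return -EINVAL;
	*ignore_offset = (args->flags & SKETCH_IGNORE_OFFSET) != 0;
	return 0;
}
```

    Because V2 rejects unknown bits, a program can probe for a future flag by
    setting it and checking for the error.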
    
    Motivation and background, copied from the patchset cover letter:
    
    Suppose we have a file with one extent:
    
        root@tester:~# zcat /usr/share/doc/cpio/changelog.gz > /test/a
        root@tester:~# sync
    
    Split the extent by overwriting it in the middle:
    
        root@tester:~# cat /dev/urandom | dd bs=4k seek=2 skip=2 count=1 conv=notrunc of=/test/a
    
    We should now have 3 extent refs to 2 extents, with one block unreachable.
    The extent tree looks like:
    
        root@tester:~# btrfs-debug-tree /dev/vdc -t 2
        [...]
                item 9 key (1103101952 EXTENT_ITEM 73728) itemoff 15942 itemsize 53
                        extent refs 2 gen 29 flags DATA
                        extent data backref root 5 objectid 261 offset 0 count 2
        [...]
                item 11 key (1103175680 EXTENT_ITEM 4096) itemoff 15865 itemsize 53
                        extent refs 1 gen 30 flags DATA
                        extent data backref root 5 objectid 261 offset 8192 count 1
        [...]
    
    and the ref tree looks like:
    
        root@tester:~# btrfs-debug-tree /dev/vdc -t 5
        [...]
                item 6 key (261 EXTENT_DATA 0) itemoff 15825 itemsize 53
                        extent data disk byte 1103101952 nr 73728
                        extent data offset 0 nr 8192 ram 73728
                        extent compression(none)
                item 7 key (261 EXTENT_DATA 8192) itemoff 15772 itemsize 53
                        extent data disk byte 1103175680 nr 4096
                        extent data offset 0 nr 4096 ram 4096
                        extent compression(none)
                item 8 key (261 EXTENT_DATA 12288) itemoff 15719 itemsize 53
                        extent data disk byte 1103101952 nr 73728
                        extent data offset 12288 nr 61440 ram 73728
                        extent compression(none)
        [...]
    
    There are two references to the same extent with different, non-overlapping
    byte offsets:
    
        [------------------72K extent at 1103101952----------------------]
        [--8K----------------|--4K unreachable----|--60K-----------------]
        ^                                         ^
        |                                         |
        [--8K ref offset 0--][--4K ref offset 0--][--60K ref offset 12K--]
                             |
                             v
                             [-----4K extent-----] at 1103175680
    
    We want to find all of the references to extent bytenr 1103101952.
    
    Without the patch (and without running btrfs-debug-tree), we have to
    do it with 18 LOGICAL_INO calls:
    
        root@tester:~# btrfs ins log 1103101952 -P /test/
        Using LOGICAL_INO
        inode 261 offset 0 root 5
    
        root@tester:~# for x in $(seq 0 17); do btrfs ins log $((1103101952 + x * 4096)) -P /test/; done 2>&1 | grep inode
        inode 261 offset 0 root 5
        inode 261 offset 4096 root 5   <- same extent ref as offset 0
                                       (offset 8192 returns empty set, not reachable)
        inode 261 offset 12288 root 5
        inode 261 offset 16384 root 5  \
        inode 261 offset 20480 root 5  |
        inode 261 offset 24576 root 5  |
        inode 261 offset 28672 root 5  |
        inode 261 offset 32768 root 5  |
        inode 261 offset 36864 root 5  \
        inode 261 offset 40960 root 5   > all the same extent ref as offset 12288.
        inode 261 offset 45056 root 5  /  More processing required in userspace
        inode 261 offset 49152 root 5  |  to figure out these are all duplicates.
        inode 261 offset 53248 root 5  |
        inode 261 offset 57344 root 5  |
        inode 261 offset 61440 root 5  |
        inode 261 offset 65536 root 5  |
        inode 261 offset 69632 root 5  /
    
    In the worst case the extents are 128MB long, and we have to do 32768
    iterations of the loop to find one 4K extent ref.
    
    With the patch, we just use one call to map all refs to the extent at once:
    
        root@tester:~# btrfs ins log 1103101952 -P /test/
        Using LOGICAL_INO_V2
        inode 261 offset 0 root 5
        inode 261 offset 12288 root 5
    
    The TREE_SEARCH ioctl allows userspace to retrieve the offset and
    extent bytenr fields easily once the root, inode and offset are known.
    This is sufficient information to build a complete map of the extent
    and all of its references.  Userspace can use this information to make
    better choices to dedup or defrag.
    
    Signed-off-by: Zygo Blaxell <[email protected]>
    Reviewed-by: Hans van Kranenburg <[email protected]>
    Tested-by: Hans van Kranenburg <[email protected]>
    [ copy background and motivation from cover letter ]
    Signed-off-by: David Sterba <[email protected]>
    Zygo Blaxell authored and kdave committed Nov 1, 2017
    d24a67b
  7. btrfs: increase output size for LOGICAL_INO_V2 ioctl

    Build-server workloads have hundreds of references per file after dedup.
    Multiply by a few snapshots and we quickly exhaust the limit of 2730
    references per extent that can fit into a 64K buffer.
    
    Raise the limit to 16M to be consistent with other btrfs ioctls
    (e.g. TREE_SEARCH_V2, FILE_EXTENT_SAME).
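
    The 2730 figure can be reproduced with a little arithmetic. The sizes
    below are assumptions of this sketch: a 16-byte header at the start of
    the output buffer, followed by three 64-bit values (inode, offset, root)
    per reference. sketch_max_refs() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: a 16-byte header, then three u64 values per reference. */
#define SKETCH_HEADER_BYTES	16u
#define SKETCH_REF_BYTES	(3u * sizeof(uint64_t))

/* Number of whole references that fit in an output buffer of this size. */
static size_t sketch_max_refs(size_t buffer_bytes)
{
	assert(buffer_bytes >= SKETCH_HEADER_BYTES);
	return (buffer_bytes - SKETCH_HEADER_BYTES) / SKETCH_REF_BYTES;
}
```

    Under these assumptions a 64K buffer holds 2730 references and a 16M
    buffer holds 699050.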
    
    To minimize surprising userspace behavior, apply this change only to
    the LOGICAL_INO_V2 ioctl.
    
    Signed-off-by: Zygo Blaxell <[email protected]>
    Reviewed-by: Hans van Kranenburg <[email protected]>
    Tested-by: Hans van Kranenburg <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Zygo Blaxell authored and kdave committed Nov 1, 2017
    b115e3b
  8. Btrfs: rework outstanding_extents

    Right now we jump through a lot of weird hoops around outstanding_extents
    in order to keep the extent count consistent.  This is because we logically
    transfer the outstanding_extent count from the initial reservation
    through the set_delalloc_bits.  This makes it pretty difficult to get a
    handle on how and when we need to mess with outstanding_extents.
    
    Fix this by revamping the rules of how we deal with outstanding_extents.
    Now instead everybody that is holding on to a delalloc extent is
    required to increase the outstanding extents count for itself.  This
    means we'll have something like this
    
    btrfs_delalloc_reserve_metadata	- outstanding_extents = 1
     btrfs_set_extent_delalloc	- outstanding_extents = 2
    btrfs_delalloc_release_extents	- outstanding_extents = 1
    
    for an initial file write.  Now take the append write where we extend an
    existing delalloc range but still under the maximum extent size
    
    btrfs_delalloc_reserve_metadata - outstanding_extents = 2
      btrfs_set_extent_delalloc
        btrfs_set_bit_hook		- outstanding_extents = 3
        btrfs_merge_extent_hook	- outstanding_extents = 2
    btrfs_delalloc_release_extents	- outstanding_extents = 1
    
    In order to make the ordered extent transition we of course must now
    make ordered extents carry their own outstanding_extent reservation, so
    for cow_file_range we end up with
    
    btrfs_add_ordered_extent	- outstanding_extents = 2
    clear_extent_bit		- outstanding_extents = 1
    btrfs_remove_ordered_extent	- outstanding_extents = 0
    
    This makes all manipulations of outstanding_extents much more explicit.
    Every successful call to btrfs_delalloc_reserve_metadata _must_ now be
    combined with btrfs_delalloc_release_extents, even in the error case, as
    that is the only function that actually modifies the
    outstanding_extents counter.
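
    The counting rule can be modelled with a plain counter. This is only a
    toy model of the sequences listed above: struct sketch_inode,
    sketch_hold() and sketch_drop() are hypothetical names, and each holder
    is reduced to one increment and one matching decrement.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-inode counter. */
struct sketch_inode {
	unsigned int outstanding_extents;
};

/* Every holder of a delalloc extent takes its own count... */
static unsigned int sketch_hold(struct sketch_inode *inode)
{
	return ++inode->outstanding_extents;
}

/* ...and drops exactly that count when it is done. */
static unsigned int sketch_drop(struct sketch_inode *inode)
{
	assert(inode->outstanding_extents > 0);
	return --inode->outstanding_extents;
}

/*
 * Initial write followed by the ordered extent transition.  trace[]
 * records the counter after each step; returns 1 if nothing leaked.
 */
static int sketch_write_and_flush(struct sketch_inode *inode,
				  unsigned int trace[6])
{
	trace[0] = sketch_hold(inode);	/* metadata reserved */
	trace[1] = sketch_hold(inode);	/* delalloc bits set */
	trace[2] = sketch_drop(inode);	/* reservation released */
	trace[3] = sketch_hold(inode);	/* ordered extent added */
	trace[4] = sketch_drop(inode);	/* delalloc bits cleared */
	trace[5] = sketch_drop(inode);	/* ordered extent removed */
	return inode->outstanding_extents == 0;
}
```

    The recorded values are 1, 2, 1, 2, 1, 0, matching the initial write and
    cow_file_range sequences in the message.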
    
    The drawback to this is now we are much more likely to have transient
    cases where outstanding_extents is much larger than it actually should
    be.  This could happen before as we manipulated the delalloc bits, but
    now it happens basically at every write.  This may put more pressure on
    the ENOSPC flushing code, but I think making this code simpler is worth
    the cost.  I have another change coming to mitigate this side-effect
    somewhat.
    
    I also added trace points for the counter manipulation.  These were used
    by a bpf script I wrote to help track down leak issues.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    8b62f87
  9. btrfs: add tracepoints for outstanding extents mods

    This is handy for tracing problems with modifying the outstanding
    extents counters.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    dd48d40
  10. btrfs: make the delalloc block rsv per inode

    The way we handle delalloc metadata reservations has gotten
    progressively more complicated over the years.  There is so much cruft
    and weirdness around keeping the reserved count and outstanding counters
    consistent and handling the error cases that it's impossible to
    understand.
    
    Fix this by making the delalloc block rsv per-inode.  This way we can
    calculate the actual size of the outstanding metadata reservations every
    time we make a change, and then reserve the delta based on that amount.
    This greatly simplifies the code everywhere, and makes the error
    handling in btrfs_delalloc_reserve_metadata far less terrifying.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    69fe2d7
  11. btrfs: switch args for comp_*_refs

    Make it more consistent: we want the inserted ref to be compared against
    what's already in there.  This will make the order go from lowest seq ->
    highest seq, which will make us more likely to make forward progress if
    there's a seqlock currently held.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    c7ad7c8
  12. btrfs: add a comp_refs() helper

    Instead of open-coding the delayed ref comparisons, add a helper to do
    the comparisons generically and use that everywhere.  We compare
    sequence numbers last for following patches.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    1d148e5
  13. btrfs: track refs in a rb_tree instead of a list

    If we get a significant amount of delayed refs for a single block (think
    modifying multiple snapshots) we can end up spending an ungodly amount
    of time looping through all of the entries trying to see if they can be
    merged.  This is because we only add them to a list, so we have O(2n)
    for every ref head.  This doesn't make any sense as we likely have refs
    for different roots, and so they cannot be merged.  Tracking in a tree
    will allow us to break as soon as we hit an entry that doesn't match,
    making our worst case O(n).
    
    With this we can also merge entries more easily.  Before we had to hope
    that matching refs were on the ends of our list, but with the tree we
    can search down to exact matches and merge them at insert time.
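
    Merging at insert time can be sketched with an ordered tree. The sketch
    uses a plain unbalanced binary search tree as a stand-in for the kernel
    rb_tree, and a reduced, hypothetical key (type, root, seq); the real
    comparison covers more fields.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced delayed ref; the real key has more fields. */
struct sketch_ref {
	int type;
	unsigned long long root;
	unsigned long long seq;
	int ref_mod;			/* accumulated ref count change */
	struct sketch_ref *left, *right;
};

/* Compare type and root first, sequence number last. */
static int sketch_comp_refs(const struct sketch_ref *a,
			    const struct sketch_ref *b)
{
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->root != b->root)
		return a->root < b->root ? -1 : 1;
	if (a->seq != b->seq)
		return a->seq < b->seq ? -1 : 1;
	return 0;
}

/*
 * Insert a ref, or fold it into an existing node with an equal key.
 * Returns 1 if a node was added, 0 if merged, -1 if allocation failed.
 */
static int sketch_insert(struct sketch_ref **rootp, int type,
			 unsigned long long root, unsigned long long seq,
			 int mod)
{
	struct sketch_ref key = { type, root, seq, mod, NULL, NULL };
	struct sketch_ref **link = rootp;

	assert(rootp != NULL);
	while (*link) {
		/* The inserted ref is compared against what is in the tree. */
		int cmp = sketch_comp_refs(&key, *link);

		if (cmp == 0) {
			(*link)->ref_mod += mod;
			return 0;
		}
		link = cmp < 0 ? &(*link)->left : &(*link)->right;
	}
	*link = malloc(sizeof(**link));
	if (!*link)
		return -1;
	**link = key;
	return 1;
}

/* Free the whole tree. */
static void sketch_free(struct sketch_ref *node)
{
	if (!node)
		return;
	sketch_free(node->left);
	sketch_free(node->right);
	free(node);
}
```

    The search stops at the first exact match, so a mergeable ref is found
    without walking entries for unrelated roots.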
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    0e0adbc
  14. btrfs: don't call btrfs_start_delalloc_roots in flushoncommit

    We're holding the sb_start_intwrite lock at this point, and doing async
    filemap_flush of the inodes will result in a deadlock if we freeze the
    fs during this operation.  This is because we could do a
    btrfs_join_transaction() in the thread we are waiting on which would
    block at sb_start_intwrite, and thus deadlock.  Using
    writeback_inodes_sb() sidesteps the problem by not introducing all of
    these extra locking dependencies.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    ce8ea7c
  15. btrfs: move btrfs_truncate_block out of trans handle

    Since we do a delalloc reserve in btrfs_truncate_block we can deadlock
    with freeze.  If somebody else is trying to allocate metadata for this
    inode and it gets stuck in start_delalloc_inodes because of freeze we
    will deadlock.  Be safe and move this outside of a trans handle.  This
    also has a side-effect of making sure that we're not leaving stale data
    behind in the other_encoding or encryption case.  Not an issue now since
    nobody uses it, but it would be a problem in the future.
    
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    josefbacik authored and kdave committed Nov 1, 2017
    ddfae63
  16. Btrfs: compression: separate heuristic/compression workspaces

    The compression heuristic is not itself a compression type.  Because the
    current infrastructure provides workspaces for several compression types,
    it's difficult to just add a heuristic workspace.
    
    Just refactor the code to support compression/heuristic workspaces with
    maximum code sharing and minimum changes in it.
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ coding style fixes ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    4e439a0
  17. Btrfs: heuristic: add bucket and sample counters and other defines

    Add basic defines and structures for data sampling.
    
    Added macros:
     - For future sampling algo
     - For bucket size
    
    Heuristic workspace:
     - Add bucket for storing byte type counters
     - Add sample array for storing partial copy of input data range
     - Add counter to store the current sample size in the workspace
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ minor coding style fixes, comments updated ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    17b5a6c
  18. Btrfs: heuristic: implement sampling logic

    Copy sample data from the input data range to sample buffer then
    calculate byte value count for that sample into bucket.
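
    A minimal sketch of the two steps follows. The constants (16 bytes
    copied every 256 bytes, 256 buckets) and all sketch_* names are
    assumptions of this example, not values stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed constants for the sketch. */
#define SKETCH_READ_SIZE	16u	/* bytes copied per sample point */
#define SKETCH_INTERVAL		256u	/* distance between sample points */
#define SKETCH_BUCKETS		256u	/* one counter per byte value */

/*
 * Copy SKETCH_READ_SIZE bytes every SKETCH_INTERVAL bytes of the input
 * into sample[], then count byte values into bucket[].  Returns the
 * number of bytes placed in the sample.
 */
static size_t sketch_collect_sample(const uint8_t *in, size_t in_len,
				    uint8_t *sample, size_t sample_cap,
				    uint32_t bucket[SKETCH_BUCKETS])
{
	size_t pos, n = 0;

	assert(in && sample && bucket);
	for (pos = 0;
	     pos + SKETCH_READ_SIZE <= in_len &&
	     n + SKETCH_READ_SIZE <= sample_cap;
	     pos += SKETCH_INTERVAL) {
		memcpy(sample + n, in + pos, SKETCH_READ_SIZE);
		n += SKETCH_READ_SIZE;
	}

	memset(bucket, 0, SKETCH_BUCKETS * sizeof(bucket[0]));
	for (pos = 0; pos < n; pos++)
		bucket[sample[pos]]++;
	return n;
}
```

    Only the small sample buffer is analysed afterwards, which keeps the
    cost of the heuristic independent of most of the input.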
    
    Signed-off-by: Timofey Titovets <[email protected]>
    [ minor comment updates ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    a440d48
  19. Btrfs: heuristic: add detection of repeated data patterns

    Walk over data sample and use memcmp to detect repeated patterns, like
    zeros, but a bit more general.
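
    One simple way to do this with memcmp, shown here as an assumption rather
    than as the exact kernel code, is to compare the first half of the sample
    with the second half. sketch_sample_repeats() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if the second half of the sample repeats the first half. */
static bool sketch_sample_repeats(const uint8_t *sample, size_t sample_size)
{
	size_t half = sample_size / 2;

	assert(sample != NULL);
	if (half == 0)
		return false;
	return memcmp(sample, sample + half, half) == 0;
}
```

    A run of zeros passes this test, and so does any data whose period
    divides half the sample size.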
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ minor coding style fixes ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    1fe4f6f
  20. Btrfs: heuristic: add byte set calculation

    Calculate byte set size for data sample:
    - calculate how many unique bytes have been in the sample
    - for all bytes count > 0, check if we're still in the low count range
      (~25%), such data are easily compressible, otherwise further analysis
      is needed
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ update comments ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    a288e92
  21. Btrfs: heuristic: add byte core set calculation

    Calculate byte core set for data sample:
    - sort buckets' numbers in decreasing order
    - count how many values cover 90% of the sample
    
    If the core set size is low (<=25%), data are easily compressible.
    If the core set size is high (>=80%), data are not compressible.
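
    The two steps and the classification can be sketched as follows. The
    percentages come from the text above and are applied to 256 buckets; the
    sketch_* names and the exact comparisons are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_BUCKETS 256

/* qsort comparator: larger counts first. */
static int sketch_cmp_desc(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x < y) - (x > y);
}

/*
 * Sort the counters in decreasing order (in place) and return how many
 * of the most frequent byte values are needed to cover 90% of the sample.
 */
static uint32_t sketch_byte_core_set_size(uint32_t bucket[SKETCH_BUCKETS],
					  uint32_t sample_size)
{
	uint64_t threshold = (uint64_t)sample_size * 90 / 100;
	uint64_t covered = 0;
	uint32_t i;

	assert(sample_size > 0);
	qsort(bucket, SKETCH_BUCKETS, sizeof(bucket[0]), sketch_cmp_desc);
	for (i = 0; i < SKETCH_BUCKETS && bucket[i] > 0; i++) {
		covered += bucket[i];
		if (covered >= threshold)
			return i + 1;
	}
	return i;
}

/* 1: easily compressible, -1: not compressible, 0: undecided. */
static int sketch_classify_core_set(uint32_t core_set_size)
{
	if (core_set_size * 100 <= 25 * SKETCH_BUCKETS)
		return 1;
	if (core_set_size * 100 >= 80 * SKETCH_BUCKETS)
		return -1;
	return 0;
}
```

    A sample dominated by a handful of byte values yields a small core set; a
    nearly uniform sample needs most of the 256 values to reach 90%.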
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ update comments ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    858177d
  22. Btrfs: heuristic: add Shannon entropy calculation

    The byte distribution check in the heuristic filters out the edge cases
    but sometimes fails to classify the input data.
    
    Let's fix that by adding Shannon entropy calculation, that will cover
    classification of most other data types.
    
    As Shannon entropy needs log2 with some precision to work, let's use
    ilog2(N), and for increased precision, compute ilog2(pow(N, 4)).
    
    Shannon entropy has been slightly changed to avoid signed numbers and
    division.
    
    The entropy is calculated directly from the formula, replacing a
    precalculated table or chains of if-else.
    
    The accuracy errors of ilog2 are compensated by
    
    @ENTROPY_LVL_ACEPTABLE 70 -> 65
    @ENTROPY_LVL_HIGH      85 -> 80
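
    An integer-only version of the calculation can be sketched as below. It
    rewrites -sum(p/n * log2(p/n)) as sum(p * (log2(n) - log2(p))) / n so
    that no signed values or per-term divisions are needed. The sketch_*
    names are hypothetical and the portable ilog2 loop is a stand-in for the
    kernel helper.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BUCKETS 256

/* Integer floor(log2(n)) for n > 0. */
static uint32_t sketch_ilog2(uint64_t n)
{
	uint32_t r = 0;

	assert(n > 0);
	while (n >>= 1)
		r++;
	return r;
}

/* floor(log2(n^4)): about 4 * log2(n), keeping two fractional bits. */
static uint32_t sketch_ilog2_w(uint64_t n)
{
	assert(n > 0 && n <= 0xFFFF);	/* n^4 must fit in 64 bits */
	return sketch_ilog2(n * n * n * n);
}

/* Entropy of the sample as a percentage of the 8 bits/byte maximum. */
static uint32_t sketch_shannon_entropy(const uint32_t bucket[SKETCH_BUCKETS],
				       uint32_t sample_size)
{
	const uint32_t entropy_max = 8 * sketch_ilog2_w(2);
	const uint32_t sz_base = sketch_ilog2_w(sample_size);
	uint64_t sum = 0;
	uint32_t i;

	for (i = 0; i < SKETCH_BUCKETS; i++) {
		if (bucket[i] == 0)
			continue;
		/* counts never exceed sample_size, so this stays >= 0 */
		sum += (uint64_t)bucket[i] *
		       (sz_base - sketch_ilog2_w(bucket[i]));
	}
	return (uint32_t)(sum / sample_size * 100 / entropy_max);
}
```

    In this sketch a sample with one repeated byte value scores 0 and a
    perfectly uniform sample scores 100; the rounding of ilog2 is the reason
    the thresholds above were lowered.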
    
    Signed-off-by: Timofey Titovets <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ update comments ]
    Signed-off-by: David Sterba <[email protected]>
    nefelim4ag authored and kdave committed Nov 1, 2017
    1956243
  23. btrfs: Fix bug for misused dev_t when lookup in dev state hash table.

    Fix bug of commit 74d4699 ("block: replace bi_bdev with a gendisk
    pointer and partitions index").
    
    bio_dev(bio) is used to find the dev state in function
    __btrfsic_submit_bio. But when dev_state is added to the hashtable, it
    is using dev_t of block_device.
    
    bio_dev(bio) returns a dev_t of part0 which is different from dev_t in
    block_device(bd_dev). bd_dev in block_device represents the exact
    partition.
    
    block_device.bd_dev =
    	bio->bi_partno (same as block_device.bd_partno) + bio_dev(bio).
    
    When adding a dev_state into hashtable, we use the exact partition dev_t.
    So when looking it up, it should also use the exact partition dev_t.
    
    Reproducer of this bug:
    
    Use MOUNT_OPTIONS="-o check_int" and run btrfs/001 in fstests.
    Then there will be WARNING like below.
    
    WARNING:
    btrfs: attempt to write superblock which references block M @29523968 (sda7     /1111654400/2) which is never written!
    
    Signed-off-by: Gu JinXiang <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Gu JinXiang authored and kdave committed Nov 1, 2017
    d28e649

Commits on Nov 15, 2017

  1. Btrfs: add write_flags for compression bio

    The compression code path has only flagged bios with REQ_OP_WRITE no
    matter where the bios come from, but it could be a sync write if fsync
    starts this writeback or a normal writeback write if the wb kthread starts
    a periodic writeback.
    
    It breaks the rule that sync writes and writeback writes need to be
    differentiated from each other, because from the POV of the block layer,
    all bios need to be recognized by these flags in order to do some
    management, e.g. throttling.
    
    This passes writeback_control to compression write path so that it can
    send bios with proper flags to block layer.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Nov 15, 2017
    f82b735
  2. btrfs: Fix transaction abort during failure in btrfs_rm_dev_item

    btrfs_rm_dev_item calls several functions under an active transaction,
    however it fails to abort it if an error happens. Fix this by adding
    explicit btrfs_abort_transaction/btrfs_end_transaction calls.
    
    Signed-off-by: Nikolay Borisov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    lorddoskias authored and kdave committed Nov 15, 2017
    5e9f2ad
  3. btrfs: add missing device::flush_bio puts

    This fixes potential bio leaks in several error paths. Unfortunately
    the device structure freeing is opencoded in many places and I missed
    them when introducing the flush_bio.
    
    Most of the time, devices get freed through call_rcu(..., free_device),
    so at least it's not that easy to hit the leak, but it's still
    possible through the path that frees stale devices.
    
    Fixes: e0ae999 ("btrfs: preallocate device flush bio")
    Reviewed-by: Nikolay Borisov <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Nov 15, 2017
    3065ae5
  4. btrfs: dev_alloc_list is not protected by RCU, use normal list_del

    The dev_alloc_list list could be protected by various mutexes,
    depending on the context. The list tracks devices that can take part in
    allocating new chunks, so the closest mutex is chunk_mutex. Adding a new
    device from inside the ADD_DEV ioctl will need device_list_mutex and
    registering a new device from the ioctl needs uuid_mutex.
    
    All mutexes naturally guarantee exclusivity against the same context.
    The device ownership can move between the contexts and the exclusivity
    is guaranteed by other means, eg. during the mount with the uuid_mutex.
    
    There's no RCU involved for dev_alloc_list.
    
    Signed-off-by: David Sterba <[email protected]>
    kdave committed Nov 15, 2017
    619c47f
  5. Btrfs: bail out gracefully rather than BUG_ON

    If a file's DIR_ITEM key is invalid (due to memory errors) and gets
    written to disk, a future lookup_path can end up with a kernel panic due
    to BUG_ON().
    
    This gets rid of the BUG_ON(); instead, output the corrupted key and
    return ENOENT if it's invalid.
    
    Signed-off-by: Liu Bo <[email protected]>
    Reported-by: Guillaume Bouchard <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Liu Bo authored and kdave committed Nov 15, 2017
    56a0e70
  6. Btrfs: move definition of the function btrfs_find_new_delalloc_bytes

    Move the definition of the function btrfs_find_new_delalloc_bytes() closer
    to the function btrfs_dirty_pages(), because in a future commit it will be
    used exclusively by btrfs_dirty_pages(). This just moves the function's
    definition, with no functional changes at all.
    
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Nov 15, 2017
    f48bf66
  7. Btrfs: fix reported number of inode blocks after buffered append writes

    The patch from commit a7e3b97 ("Btrfs: fix reported number of inode
    blocks") introduced a regression where if we do a buffered write starting
    at position equal to or greater than the file's size and then stat(2) the
    file before writeback is triggered, the number of used blocks does not
    change (unless there's a prealloc/unwritten extent). Example:
    
      $ xfs_io -f -c "pwrite -S 0xab 0 64K" foobar
      $ du -h foobar
      0	foobar
      $ sync
      $ du -h foobar
      64K	foobar
    
    The first version of that patch didn't have this regression and the second
    version, which was the one committed, was made only to address some
    performance regression detected by the intel test robots using fs_mark.
    
    This fixes the regression by setting the new delalloc bit in the range, and
    doing it at btrfs_dirty_pages() while setting the regular delalloc bit as
    well, so that we set both bits at once and avoid navigating the inode's io
    tree twice. Doing it at btrfs_dirty_pages() is also the most meaningful
    place, as we should set the new delalloc bit whenever we set the delalloc
    bit, which happens only if we copied bytes into the pages at
    __btrfs_buffered_write().
    
    This was making some of LTP's du tests fail, which can be quickly run
    using a command line like the following:
    
      $ ./runltp -q -p -l /ltp.log -f commands -s du -d /mnt
    
    Fixes: a7e3b97 ("Btrfs: fix reported number of inode blocks")
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Nov 15, 2017
    e3b8a48