URL: <https://savannah.gnu.org/bugs/?63576>
Summary: Improve find manual with examples and explanations for -size option
Project: findutils
Submitter: lgtr
Submitted: Mon 26 Dec 2022 01:42:25 PM UTC
Category: documentation
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Release: 4.9.0
Discussion Lock: Any
Fixed Release: None

_______________________________________________________

Follow-up Comments:

-------------------------------------------------------
Date: Mon 26 Dec 2022 01:42:25 PM UTC By: LGTR <lgtr>

For this manual:

https://www.gnu.org/software/findutils/manual/html_mono/find.html#Size

I suggest clarifying the exact behavior of the "-size" option with further examples and explanations, since it is difficult to grasp what it really means. Even official GNU documentation, such as the tar manual, describes it incorrectly when selecting files by size:

https://www.gnu.org/software/tar/manual/tar.html#files
https://www.gnu.org/software/tar/manual/tar.html#nul

Basically, they treat the default block size as 1K instead of 512 bytes.

Some examples and explanations could read as follows:

----

*2.6 _-size_ options and parameters*

  $ find [path] -size [+-]n[bckwMG]

The range modifier ([+-]) and the letter ([bckwMG]) are optional. The number n is mandatory.

*2.6.1 BLOCK size and block number*

The number given to _-size_ is a count of BLOCKS, and the letter after it sets the size of each BLOCK. With no letter (no units given), the default BLOCK size is 512 bytes.
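The 512-byte default can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a find whose -size behaves as described in this report (4.9.0 is the release it was filed against). The scratch directory, the file names and the match helper are made up for illustration. It creates files of 1, 512, 513 and 1024 bytes and lists which of them are selected with no letter, with "b" (512-byte blocks) and with "k" (1024-byte blocks).

```shell
#!/bin/sh
# Illustration only: scratch files showing that "-size 1" counts 512-byte
# blocks, not kibibytes. File names and the match() helper are hypothetical.
tmp=$(mktemp -d)

for bytes in 1 512 513 1024; do
    # Name each file after its size, zero-padded so that sort keeps size order.
    head -c "$bytes" /dev/zero > "$tmp/$(printf 'f%04d' "$bytes")"
done

# Print the names of the files selected by one -size argument, on one line.
match() {
    find "$tmp" -type f -size "$1" -exec basename {} \; | sort | xargs
}

no_letter=$(match 1)    # no letter: default BLOCK size
letter_b=$(match 1b)    # explicit 512-byte blocks
letter_k=$(match 1k)    # 1024-byte blocks

echo "-size 1  : $no_letter"
echo "-size 1b : $letter_b"
echo "-size 1k : $letter_k"

rm -rf "$tmp"
```

If the default BLOCK size were 1K, the first and third lines would list the same files. With a 512-byte default, the first line matches the second instead, and the 513-byte and 1024-byte files are selected only by "-size 1k".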
If a letter [bckwMG] follows the number, the size of each BLOCK is:

* b: 512-byte block
* c: 1-byte block
* w: 2-byte block
* k: kibibytes (KiB), 1024-byte block
* M: mebibytes (MiB), 1024 * 1024 = 1048576-byte block
* G: gibibytes (GiB), 1024 * 1024 * 1024 = 1073741824-byte block

*2.6.2 Rounding to whole BLOCKS around the _-size_ value*

_find -size_ does not compare exact byte counts. It rounds each file's size up to a whole number of BLOCKS, using the BLOCK size selected by the letter after the number (which can be 1 byte), and compares that block count with n. A plain number therefore matches every file whose size falls within one BLOCK-wide range. A + or - sign in front of the number matches files above or below that range instead, as explained here:

* The "-" range modifier matches files of up to, and including, (n-1) BLOCKS:

    $ find . -size -400 -print

  Formula: "file-size <= (n-1)*BLOCK".
  This prints files of up to, and including, (400-1)*512 = 204288 bytes (199.5K).

* With no range modifier (think of it as ~ or ≈), it matches files larger than (n-1) BLOCKS, up to and including n BLOCKS:

    $ find . -size 400 -print

  Formula: "(n-1)*BLOCK < file-size <= n*BLOCK", which is the same as "file-size > (n-1)*BLOCK" and "file-size <= n*BLOCK".
  This prints files larger than (400-1)*512 = 204288 bytes, up to and including 400*512 = 204800 bytes.

* The "+" range modifier matches files strictly larger than n BLOCKS:

    $ find . -size +400 -print

  Formula: "file-size > n*BLOCK".
  This prints files larger than 400*512 = 204800 bytes.
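The three formulas can be tried out on files placed on either side of the 399-block and 400-block boundaries. This is a rough sketch, assuming a POSIX shell and a find that behaves as described here. The scratch directory, the file names and the match helper are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Illustration only: files sized around 399*512 = 204288 and 400*512 = 204800
# bytes. File names and the match() helper are hypothetical.
tmp=$(mktemp -d)

for bytes in 204288 204289 204800 204801; do
    head -c "$bytes" /dev/zero > "$tmp/f$bytes"   # file of exactly $bytes bytes
done

# Print the names of the files selected by one -size argument, on one line.
match() {
    find "$tmp" -type f -size "$1" -exec basename {} \; | sort | xargs
}

below=$(match -400)   # file-size <= 399*512
exact=$(match 400)    # 399*512 < file-size <= 400*512
above=$(match +400)   # file-size > 400*512

echo "-size -400 : $below"   # f204288
echo "-size  400 : $exact"   # f204289 f204800
echo "-size +400 : $above"   # f204801

rm -rf "$tmp"
```

Each of the four files is selected by exactly one of the three commands. A single extra byte (204289 or 204801) is enough to move a file into the next range.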
*2.6.3 Size ranges*

As these examples show, the three range modifiers, used with the same number and the same BLOCK size, together cover every file size. Each modifier covers a set of BLOCK counts, so no file size is left in a "gap".

To find files without any rounding, use the "c" letter, so that BLOCK equals 1 byte (one ASCII character). This is more intuitive for novice users, and most likely the behavior most users expect, since we are accustomed to bit/byte precision even when using multipliers. For 400*512 bytes, the example then becomes:

  $ find . -size -204800c -print
  prints files smaller than 204800 bytes.

  $ find . -size 204800c -print
  prints files of exactly 204800 bytes.

  $ find . -size +204800c -print
  prints files larger than 204800 bytes.

*2.6.4 Special cases*

Some cases can be difficult for novice users to understand, especially because _find_ operates in whole BLOCKS.

If the range modifier is "-" and n equals 1, only files of size 0 are found, regardless of the letter after the number (the BLOCK size). Any non-empty file rounds up to at least 1 BLOCK, and "-1" requires fewer than 1:

  $ find . -size -1 -print
  $ find . -size -1b -print
  $ find . -size -1c -print
  $ find . -size -1w -print
  $ find . -size -1k -print
  $ find . -size -1M -print
  $ find . -size -1G -print

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?63576>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/