forked from simonsj/fdupes-jody
-
Notifications
You must be signed in to change notification settings - Fork 0
/
jdupes.1
276 lines (257 loc) · 8.82 KB
/
jdupes.1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
.TH JDUPES 1
.\" NAME should be all caps, SECTION should be 1-8, maybe w/ subsection
.\" other parms are allowed: see man(7), man(1)
.SH NAME
jdupes \- finds and performs actions upon duplicate files
.SH SYNOPSIS
.B jdupes
[
.I options
]
.I DIRECTORY
\|.\|.\|.
.SH "DESCRIPTION"
Searches the given path(s) for duplicate files. Such files are found by
comparing file sizes, then partial and full file hashes, followed by a
byte-by-byte comparison.
.SH OPTIONS
.TP
.B -@ --loud
output annoying low-level debug info while running
.TP
.B -1 --one-file-system
do not match files that are on different filesystems or devices
.TP
.B -A --nohidden
exclude hidden files from consideration
.TP
.B -B --dedupe
issue the btrfs same-extents ioctl to trigger a deduplication on
disk. The program must be built with btrfs support for this option
to be available
.TP
.B -C --chunksize=\fIBYTES\fR
set the I/O chunk size manually; larger values may improve performance
on rotating media by reducing the number of head seeks required, but
also increases memory usage and can reduce performance in some cases
.TP
.B -D --debug
if this feature is compiled in, show debugging statistics and info
at the end of program execution
.TP
.B -d --delete
prompt user for files to preserve, deleting all others (see
.B CAVEATS
below)
.TP
.B -f --omitfirst
omit the first file in each set of matches
.TP
.B -H --hardlinks
normally, when two or more files point to the same disk area they are
treated as non-duplicates; this option will change this behavior
.TP
.B -h --help
displays help
.TP
.B -i --reverse
reverse (invert) the sort order of matches
.TP
.B -I --isolate
isolate each command-line parameter from one another; only match if the
files are under different parameter specifications
.TP
.B -L --linkhard
replace all duplicate files with hardlinks to the first file in each set
of duplicates
.TP
.B -m --summarize
summarize duplicate files information
.TP
.B -N --noprompt
when used together with \-\-delete, preserve the first file in each set of
duplicates and delete the others without prompting the user
.TP
.B -n --noempty
exclude zero-length files from consideration; this option is the default
behavior and does nothing (also see \fB\-z/--zeromatch\fP)
.TP
.B -O --paramorder
parameter order preservation is more important than the chosen sort; this
is particularly useful with the \fB\-N\fP option to ensure that automatic
deletion behaves in a controllable way
.TP
.B -o --order\fR=\fIWORD\fR
order files according to WORD:
time - sort by modification time
name - sort by filename (default)
.TP
.B -p --permissions
don't consider files with different owner/group or permission bits as
duplicates
.TP
.B -P --print=type
print extra information to stdout; valid options are:
early - matches that pass early size/permission/link/etc. checks
partial - files whose partial hashes match
fullhash - files whose full hashes match
.TP
.B -Q --quick
.B [WARNING: RISK OF DATA LOSS, SEE CAVEATS]
skip byte-for-byte verification of duplicate pairs (use hashes only)
.TP
.B -q --quiet
hide progress indicator
.TP
.B -R --recurse:
for each directory given after this option follow subdirectories
encountered within (note the ':' at the end of option; see the
Examples section below for further explanation)
.TP
.B -r --recurse
for every directory given follow subdirectories encountered within
.TP
.B -l --linksoft
replace all duplicate files with symlinks to the first file in each set
of duplicates
.TP
.B -S --size
show size of duplicate files
.TP
.B -s --symlinks
follow symlinked directories
.TP
.B -v --version
display jdupes version and compilation feature flags
.TP
.B -x --xsize=[+]SIZE (NOTE: deprecated in favor of \-X)
exclude files of size less than SIZE from consideration, or if SIZE is
prefixed with a '+' i.e.
jdupes -x +226 [files]
then exclude files larger than SIZE. Suffixes K/M/G can be used.
.TP
.B -X --exclude=spec:info
exclude files based on specified criteria; supported specs are:
.RS
.IP `size[+-=]:number[suffix]'
Match only if size is greater (+), less than (-), or equal to (=) the
specified number, with an optional multiplier suffix. The +/- and =
specifiers can be combined; ex :"size+=4K" will match if size is greater
than or equal to four kilobytes (4096 bytes). Suffixes supported are
K/M/G/T/P/E with a B or iB extension (all case-insensitive); no extension
or an IB extension specify binary multipliers while a B extension
specifies decimal multipliers (ex: 4K or 4KiB = 4096, 4KB = 4000.)
.RE
.TP
.B -z --zeromatch
consider zero-length files to be duplicates; this replaces the old
default behavior when \fB\-n\fP was not specified
.TP
.B -Z --softabort
if the user aborts the program (as with CTRL-C) act on the matches that
were found before the abort was received. For example, if -L and -Z are
specified, all matches found prior to the abort will be hard linked. The
default behavior without -Z is to abort without taking any actions.
.SH NOTES
A set of arrows are used in hard linking to show what action was taken on
each link candidate. These arrows are as follows:
.TP
.B ---->
This file was successfully hard linked to the first file in the duplicate
chain
.TP
.B -@@->
This file was successfully symlinked to the first file in the chain
.TP
.B -==->
This file was already a hard link to the first file in the chain
.TP
.B -//->
Linking this file failed due to an error during the linking process
.PP
Duplicate files are listed together in groups with each file displayed on a
separate line. The groups are then separated from each other by blank lines.
.SH EXAMPLES
.TP
.B jdupes a --recurse: b
will follow subdirectories under b, but not those under a.
.TP
.B jdupes a --recurse b
will follow subdirectories under both a and b.
.TP
.B jdupes -O dir1 dir3 dir2
will always place 'dir1' results first in any match set (where relevant)
.SH CAVEATS
Using
.B \-1
or
.BR \-\-one\-file\-system
prevents matches that cross filesystems, but a more relaxed form of this
option may be added that allows cross-matching for all filesystems that
each parameter is present on.
When using
.B \-d
or
.BR \-\-delete ,
care should be taken to insure against accidental data loss.
.B \-Z
or
.BR \-\-softabort
used to be --hardabort in jdupes prior to v1.5 and had the opposite behavior.
Defaulting to taking action on abort is probably not what most users would
expect. The decision to invert rather than reassign to a different switch
was made because this feature was still fairly new at the time of the change.
The
.B \-O
or
.BR \-\-paramorder
option allows the user greater control over what appears in the first
position of a match set, specifically for keeping the \fB\-N\fP option
from deleting all but one file in a set in a seemingly random way. All
directories specified on the command line will be used as the sorting
order of result sets first, followed by the sorting algorithm set by
the \fB\-o\fP or \fB\-\-order\fP option. This means that the order of
all match pairs for a single directory specification will retain the
old sorting behavior even if this option is specified.
When used together with options
.B \-s
or
.BR \-\-symlink ,
a user could accidentally preserve a symlink while deleting the file it
points to.
Furthermore, when specifying a particular directory more than once, all
files within that directory will be listed as their own duplicates,
leading to data loss should a user preserve a file without its "duplicate"
(the file itself!).
The
.B \-Q
or
.BR \-\-quick
option only reads each file once, hashes it, and performs comparisons
based solely on the hashes. There is a small but significant risk of a
hash collision which is the purpose of the failsafe byte-for-byte
comparison that this option explicitly bypasses. Do not use it on ANY data
set for which any amount of data loss is unacceptable. This option is not
included in the help text for the program due to its risky nature.
.B You have been warned!
Using the
.B \-C
or
.BR \-\-chunksize
option to override I/O chunk size can increase performance on rotating
storage media by reducing "head thrashing," reading larger amounts of
data sequentially from each file. This tunable size can have bad side
effects; the default size maximizes algorithmic performance without
regard to the I/O characteristics of any given device and uses a modest
amount of memory, but other values may greatly increase memory usage or
incur a lot more system call overhead. Try several different values to
see how they affect performance for your hardware and data set. This
option does not affect match results in any way, so even if it slows
down the file matching process it will not hurt anything.
.SH REPORTING BUGS
Send all bug reports to [email protected] or use the Issue tracker at
http://github.com/jbruchon/jdupes/issues
.SH AUTHOR
jdupes is a fork of 'fdupes' which is maintained by and contains
extra code copyrighted by Jody Bruchon <[email protected]>
Based on 'fdupes' created by Adrian Lopez <[email protected]>