The QDirStat Cache File Format V2.0
===================================
Author: Stefan Hundhammer <Stefan.Hundhammer@gmx.de>
Updated: 2024-03-01
QDirStat can read cache files in either gzip or plain text (uncompressed)
format. The file format is line oriented.
Empty lines as well as lines with a '#' character as their first
non-whitespace character are ignored.
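The reading rules above can be sketched in a few lines of Python. This is only
an illustration, not QDirStat's own reader: the helper name read_cache_lines
is hypothetical, and detecting gzip input by its two magic bytes is an
assumption made for this sketch.

```python
import gzip
from typing import Iterator


def read_cache_lines(path: str) -> Iterator[bytes]:
    """Yield the significant lines of a cache file as raw bytes."""
    # Assumption: gzip input is recognized by its magic bytes 0x1f 0x8b.
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as stream:
        for raw in stream:
            line = raw.strip()
            # Skip empty lines and lines whose first non-whitespace
            # character is '#'.
            if not line or line.startswith(b"#"):
                continue
            yield line
```

The lines are kept as bytes here because file names in the cache file are raw
byte sequences; the first line yielded is the header.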
To generate a cache file, you can use QDirStat ("File" -> "Write Cache File")
or the supplied qdirstat-cache-writer script in the scripts/ directory of the
QDirStat sources.
Example:
    [qdirstat 2.0 cache file]
    # Generated by qdirstat-cache-writer
    # Do not edit!
    #
    # Type path                        size   uid   gid  perm. mtime  <optional fields>
    D /work/home/sh/src/qdirstat       4096  1000  1000  0775  0x65e0ce47
    # Device: /dev/nvme0n1p5
    F .qmake.stash                      739  1000  1000  0664  0x65915c05
    F .gitignore                         22  1000  1000  0664  0x5fb9043d
    F LICENSE                         18092  1000  1000  0664  0x4ba94fed
    F qdirstat.pro.user               20372  1000  1000  0664  0x622b43a9
    F qdirstat.pro                      832  1000  1000  0664  0x6151907d
    F README.md                       45874  1000  1000  0664  0x65ce5618
    F Makefile                        35154  1000  1000  0664  0x65915c05
    D /work/home/sh/src/qdirstat/src  20480  1000  1000  0775  0x65d7ba63
    F OpenDirDialog.cpp                9411  1000  1000  0664  0x65b3e4b5
    F ui_mime-category-config-page.h  12390  1000  1000  0664  0x65931d62
    F HeaderTweaker.h                  4988  1000  1000  0664  0x6585900b
    F SizeColDelegate.cpp              6789  1000  1000  0664  0x6585900b
    F SystemFileChecker.h              1389  1000  1000  0664  0x65a95973
    F ui_filesystems-window.h          5155  1000  1000  0664  0x65915c09
    F PkgQuery.cpp                     4632  1000  1000  0664  0x6585900b
    F UnreadableDirsWindow.cpp         6945  1000  1000  0664  0x6585900b
    F file-type-stats-window.ui        3007  1000  1000  0664  0x615ffa2c
    F PercentileStats.cpp              4538  1000  1000  0664  0x6585900b
(End of example)
Header
======
The first line ( "[qdirstat 2.0 cache file]" ) is a header identifying the file
format. This document describes file format 2.0 from early 2024, which adds the
UID, GID and permissions fields.
For the older file format 1.0, see document cache-file-format-v10.txt in the
same directory. That older format is still supported.
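A reader might check the header along these lines. This sketch covers only the
2.0-style header shown in this document; the names HEADER_RE and parse_header
are hypothetical, and treating the version number as the only variable part of
the line is an assumption. The header of format 1.0 is described in the other
document and is not handled here.

```python
import re

# Assumption: only the version number varies in the header line.
HEADER_RE = re.compile(r"^\[qdirstat (\d+)\.(\d+) cache file\]$")


def parse_header(line: str) -> tuple[int, int]:
    """Return (major, minor) from a header line or raise ValueError."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a QDirStat cache file header: {line!r}")
    return int(match.group(1)), int(match.group(2))
```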
Data Lines
==========
The data lines are separated into fields by whitespace (blanks or tabs).
Fields are not surrounded by single or double quotes.
The maximum line length is 1024 bytes.
Mandatory fields in version 2.0 are:
- Type
- Path or name
- Size
- UID (numeric user ID)
- GID (numeric group ID)
- Permissions (octal)
- MTime (time of last modification)
After those mandatory fields there may be optional fields in this order:
- "blocks:" followed by a field with the number of blocks
- "links:" followed by a field with the number of links
The identifiers of those optional fields ("blocks:", "links:") are case
insensitive.
Mandatory fields in version 1.0:
- Type
- Path or name
- Size
- MTime (time of last modification)
No UID, GID, permissions; but the same optional fields as above.
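As an illustration of the version 2.0 field layout, a data line could be split
as sketched below. The Entry class and parse_data_line are hypothetical names,
and silently ignoring unknown optional identifiers is an assumption of this
sketch, not something the format prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_type: str
    path: str                     # still URL-encoded
    size: str                     # raw text, may carry a K/M/G suffix
    uid: int
    gid: int
    perm: int                     # parsed from octal
    mtime: str                    # raw text, hex or decimal
    blocks: Optional[int] = None
    links: Optional[int] = None


def parse_data_line(line: str) -> Entry:
    """Split a version 2.0 data line into its fields."""
    fields = line.split()         # blanks or tabs, no quoting
    if len(fields) < 7:
        raise ValueError(f"too few fields: {line!r}")
    entry = Entry(fields[0], fields[1], fields[2],
                  int(fields[3]), int(fields[4]),
                  int(fields[5], 8), fields[6])
    rest = fields[7:]
    # Optional fields come as identifier/value pairs; the identifiers
    # are case insensitive.
    for key, value in zip(rest[0::2], rest[1::2]):
        key = key.lower()
        if key == "blocks:":
            entry.blocks = int(value)
        elif key == "links:":
            entry.links = int(value)
    return entry
```

The path, size and mtime fields are left as text here because each has its
own decoding rules, described in the sections below.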
Fields
======
Type
----
Any of:
    "F"         plain file
    "D"         directory
    "L"         (symbolic) link
    "BlockDev"  block device i-node
    "CharDev"   character device i-node
    "FIFO"      FIFO (named pipe)
    "Socket"    socket
The type field is case insensitive.
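For a writer, the type field could be derived from the st_mode value of
lstat(), and a reader can normalize the case-insensitive spelling. Both
helpers below are hypothetical names in a minimal sketch; only the type
strings themselves come from the list above.

```python
import stat

# Canonical spellings from the list above, keyed by lower-case form.
TYPES = {t.lower(): t for t in
         ("F", "D", "L", "BlockDev", "CharDev", "FIFO", "Socket")}


def normalize_type(field: str) -> str:
    """Map any upper/lower-case spelling to the canonical one."""
    try:
        return TYPES[field.lower()]
    except KeyError:
        raise ValueError(f"unknown entry type: {field!r}") from None


def type_from_mode(st_mode: int) -> str:
    """Pick the type string for an st_mode value as returned by lstat()."""
    checks = (
        (stat.S_ISREG, "F"),
        (stat.S_ISDIR, "D"),
        (stat.S_ISLNK, "L"),
        (stat.S_ISBLK, "BlockDev"),
        (stat.S_ISCHR, "CharDev"),
        (stat.S_ISFIFO, "FIFO"),
        (stat.S_ISSOCK, "Socket"),
    )
    for predicate, name in checks:
        if predicate(st_mode):
            return name
    raise ValueError(f"unsupported file mode: {st_mode:o}")
```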
Path or Name
------------
Either an absolute path (starting with "/") or only a base name relative to
the last preceding full path in the file.
Directory entries are required to have an absolute path. Entries for plain
files, symlinks, or special files (devices, FIFOs, sockets) may have an
absolute or a relative path.
Hint: To save some disk space with relative paths, it makes sense to
list the plain files in a directory first and then descend into any
subdirectories when writing a cache file.
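Resolving relative names while reading might look like the following sketch.
It assumes that the "last preceding full path" is the path of the most recent
directory entry, as in the example at the top of this document; the function
name resolve_paths is hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def resolve_paths(entries: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Turn (type, path-or-name) pairs into (type, absolute path) pairs."""
    last_dir: Optional[str] = None
    for entry_type, path in entries:
        is_dir = entry_type.upper() == "D"
        if path.startswith("/"):
            full = path
        elif is_dir:
            # Directory entries are required to have an absolute path.
            raise ValueError(f"relative directory path: {path!r}")
        elif last_dir is None:
            raise ValueError(f"relative name before any directory: {path!r}")
        else:
            full = last_dir.rstrip("/") + "/" + path
        if is_dir:
            # Assumption: later relative names refer to this directory.
            last_dir = full
        yield entry_type, full
```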
Paths and names are URL-encoded, i.e. any character (in particular whitespace)
that might otherwise be some kind of delimiter is specified as its hex code
with preceding "%":
with blank -> with%20blank
with%percent -> with%25percent
Take special care for blanks, tabs, newlines, and percent characters. It
does not hurt to escape a few more characters than would strictly be
necessary.
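In Python, the standard urllib.parse module is one way to do this escaping on
raw byte strings. The wrapper names encode_name and decode_name are
hypothetical. quote() escapes more characters than strictly necessary, which
is harmless as noted above.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_name(name: bytes) -> str:
    """Percent-encode a raw file name; '/' is left alone for full paths."""
    return quote(name, safe="/")


def decode_name(field: str) -> bytes:
    """Undo the percent-encoding and return the raw bytes."""
    return unquote_to_bytes(field)


print(encode_name(b"with blank"))      # with%20blank
print(encode_name(b"with%percent"))    # with%25percent
```

Working on bytes rather than text avoids guessing the encoding of the file
name.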
As for encoding, unfortunately this is one big mess. 7 bit ASCII works
alright, but if there are any special characters, everything depends on the
locale in which the user created a file. There is no standard for file name
encodings in file systems; special characters may come in all kinds of
flavours - in Latin-1 (ISO-8859-1), Latin-2, UTF-8, Japanese, Korean, Chinese,
whatever. In an ideal world, the file system would take care of this and
normalize file names with non-ASCII (7 bit) characters, but that doesn't
happen. So if one user uses, say, Latin-1 and another uses UTF-8, a file system
may have files with different encodings for each file name. Worse yet, the same
(special, i.e. non-7-bit-ASCII) character will be stored in different
character representations for different file names.
Those character representations are what readdir() returns. There is no way to
tell in which encoding a name may come, so there is also no way to convert it
to a well-defined standard encoding like, say, UTF-8. So what gets stored in
the cache file is the same byte sequence as returned by readdir(). Those names
encoded in something other than the current locale of KDE where QDirStat is
running will of course be displayed with garbage letters, but this cannot be
helped. This is just the same as when the name is read directly from the file
system with readdir(). Bad luck.
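A small Python illustration of the problem: the same name yields different
byte sequences in Latin-1 and UTF-8, and decoding with the wrong encoding
produces garbage. The name "Bär" and the helper display_name are made up for
this example.

```python
def display_name(raw: bytes, encoding: str) -> str:
    """Decode raw file name bytes for display, replacing undecodable bytes."""
    return raw.decode(encoding, errors="replace")


latin1_bytes = "Bär".encode("latin-1")   # b'B\xe4r'
utf8_bytes = "Bär".encode("utf-8")       # b'B\xc3\xa4r'

print(display_name(utf8_bytes, "utf-8"))      # Bär
print(display_name(utf8_bytes, "latin-1"))    # BÃ¤r
print(display_name(latin1_bytes, "utf-8"))    # B, a replacement character, r
```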
Size
----
The entry's size (st_size in struct stat as returned by lstat() ).
Note: This is the entry's own size, not the accumulated size of all
children!
This size is given in bytes. It may also have a trailing unit (directly
following the number, without whitespace):
- "K" for kB (1024 bytes)
- "M" for MB (1024 kB)
- "G" for GB (1024 MB)
The size is always specified in integer numbers, never in fractional
numbers. So if it cannot be divided by a bigger unit without a fractional
part, use the next-lower unit that fits without a fraction.
Examples:
1024 -> 1K
1025 -> 1025 (NOT 1.01K or something like this!)
8589934592 -> 8G
8589934593 -> 8589934593 (bad luck)
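The unit rule can be sketched as follows. The names format_size and
parse_size are hypothetical, and accepting only upper-case unit letters when
parsing is an assumption of this sketch.

```python
# Largest unit first, so the biggest unit that divides evenly wins.
UNITS = (("G", 1024 ** 3), ("M", 1024 ** 2), ("K", 1024))


def format_size(size: int) -> str:
    """Write a size with the largest unit that leaves no fraction."""
    for suffix, factor in UNITS:
        if size >= factor and size % factor == 0:
            return f"{size // factor}{suffix}"
    return str(size)


def parse_size(field: str) -> int:
    """Convert a size field such as '8G' or '1025' to bytes."""
    for suffix, factor in UNITS:
        if field.endswith(suffix):
            return int(field[:-1]) * factor
    return int(field)


print(format_size(1024))          # 1K
print(format_size(1025))          # 1025
print(format_size(8589934592))    # 8G
print(format_size(8589934593))    # 8589934593
```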
UID
---
The numeric user ID of the owner of the file or directory.
This is the same number that you see with 'ls -ln' in the third field:
    % ls -ln cache-file-format.txt
    -rw-rw-r-- 1 1000 1000 7355 Feb 29 18:17 cache-file-format.txt
                 ^^^^
GID
---
The numeric group ID that the file or directory belongs to.
This is the same number that you see with 'ls -ln' in the fourth field:
    % ls -ln cache-file-format.txt
    -rw-rw-r-- 1 1000 1000 7355 Feb 29 18:17 cache-file-format.txt
                      ^^^^
Permissions
-----------
The numeric permissions in octal notation, just like what you would use with
the 'chmod' command. Examples:
0600 for rw-------
0644 for rw-r--r--
0755 for rwxr-xr-x
Notice that this does not contain the type (file, directory, symbolic link
etc.), just the permissions. It does contain extended permissions like the
setuid, setgid or sticky bit, though.
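With Python's stat module, stat.S_IMODE() strips the type bits from st_mode
and keeps the permission bits including setuid, setgid and sticky. The helper
names below are hypothetical, and the four-digit zero padding simply follows
the examples in this document.

```python
import stat


def format_permissions(st_mode: int) -> str:
    """Return the permission bits of st_mode as a 4-digit octal string."""
    return format(stat.S_IMODE(st_mode), "04o")


def parse_permissions(field: str) -> int:
    """Convert an octal permissions field back to an integer."""
    return int(field, 8)


print(format_permissions(stat.S_IFREG | 0o644))     # 0644
print(format_permissions(stat.S_IFDIR | 0o1777))    # 1777
```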
MTime
-----
The entry's last modification time as time_t, i.e., in seconds since
1970-01-01 00:00:00 UTC.
May be specified in hex (with preceding 0x) or decimal.
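Both notations can be read with a few lines of Python. The function names are
hypothetical; the sample value is the mtime of the first directory entry in
the example at the top of this document.

```python
from datetime import datetime, timezone


def parse_mtime(field: str) -> int:
    """Read an mtime field given in hex (0x prefix) or decimal."""
    if field.lower().startswith("0x"):
        return int(field, 16)
    return int(field, 10)


def format_mtime(seconds: int) -> str:
    """Write an mtime in the hex notation used in the example."""
    return f"0x{seconds:x}"


seconds = parse_mtime("0x65e0ce47")
print(seconds)                                            # 1709231687
print(datetime.fromtimestamp(seconds, tz=timezone.utc))   # as a UTC timestamp
```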
Blocks
------
If a file is a sparse file (and only then) it has a "blocks:" field.
This is the content of the st_blocks field of struct stat as returned by
lstat(): The number of disk blocks actually allocated.
This number multiplied by the block size may be less than what st_size
returns; in this case that file is considered to be a "sparse" file or a
file with "holes".
A block size of 512 bytes is assumed.
Example:
blocks: 17
This file has 17*512 bytes allocated.
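The comparison described above, written out as a minimal Python sketch. The
names allocated_bytes and is_sparse are hypothetical; the 512-byte block size
is the one stated in this section.

```python
BLOCK_SIZE = 512  # block size assumed by the cache file format


def allocated_bytes(blocks: int) -> int:
    """Bytes actually allocated for a given st_blocks value."""
    return blocks * BLOCK_SIZE


def is_sparse(size: int, blocks: int) -> bool:
    """True if less is allocated than st_size claims."""
    return allocated_bytes(blocks) < size


print(allocated_bytes(17))          # 8704
print(is_sparse(1_000_000, 17))     # True
print(is_sparse(8000, 17))          # False
```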
Links
-----
If a non-directory entry has more than one hard link, the entry has a
"links:" field indicating the number of hard links:
links: 7
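A writer could assemble the optional fields in the documented order along
these lines. The function optional_fields is a hypothetical helper; it leaves
the decision whether a file is sparse to the caller, who passes a block count
only in that case.

```python
from typing import Optional


def optional_fields(blocks: Optional[int], links: int, is_dir: bool) -> str:
    """Build the optional part of a data line: blocks first, then links."""
    parts = []
    if blocks is not None:
        # Caller passes a block count only for sparse files.
        parts.append(f"blocks: {blocks}")
    if not is_dir and links > 1:
        # Only non-directory entries with more than one hard link.
        parts.append(f"links: {links}")
    return " ".join(parts)


print(optional_fields(17, 7, False))      # blocks: 17 links: 7
print(optional_fields(None, 2, True))     # (empty string)
```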