Algebraicfile Format Specification
Overview
This specification describes the algebraicfile file format—a file format for storing or transmitting encrypted data. Typically, an algebraicfile contains encrypted data corresponding to a single file on a file system. The design of the file format allows for an algebraicfile to be read and written as a stream with constant in-memory overhead.
The specification describes version 5 of the format, which is the current version. Programs and packages that read algebraicfile-formatted files should support the latest and all previous versions. Readers must return an error if a version is unknown or unsupported. Writers may support only the latest version.
The recommended filename extension is .algebraic. The Uniform Type Identifier is org.littleroot.algebraicfile.
Each algebraicfile has 6 sections:
# | Section | Length | Encryption |
---|---|---|---|
1 | Identifier | 6 bytes | none |
2 | Header | 57 bytes | none |
3 | Metadata | variable, non-zero | XChaCha20-Poly1305 |
4 | Filler | variable, possibly zero | unspecified |
5 | Data | variable, possibly zero | libsodium crypto_secretstream |
6 | Checksum | 32 bytes | none |

Example Algebraicfile
The example below uses an original source file named hello.txt with the contents "hello, world\n".
% cat hello.txt
hello, world
% xxd hello.txt
00000000: 6865 6c6c 6f2c 2077 6f72 6c64 0a hello, world.
%
The file hello.txt was encrypted (with the option turned on to obfuscate the true length of the original data) into an algebraicfile named hello.txt.algebraic. Each section of the resulting algebraicfile is discussed below.
% xxd hello.txt.algebraic
00000000: 0c75 0d05 0e05 4d77 0805 b407 4a52 714c .u....Mw....JRqL
00000010: 9d28 1a11 5bed 0000 0001 0040 0000 0826 .(..[......@...&
00000020: dd45 b83f 8a34 4f41 2c95 831e adc7 9c1d .E.?.4OA,.......
00000030: 186d dce0 8dd4 7d00 0000 0000 0001 3528 .m....}.......5(
00000040: 94f3 4e1e 70c1 8c17 9a02 795f a817 aaa3 ..N.p.....y_....
00000050: c20a 04a3 cc47 55c6 03f2 4775 bf87 1caa .....GU...Gu....
00000060: d60c 4ee5 0723 dd24 5794 c240 8108 a45a ..N..#.$W..@...Z
00000070: 1fc5 c778 2b8e 7656 a394 6094 d0b9 1e89 ...x+.vV..`.....
00000080: a1c0 49a3 fc9c fd79 b1f6 6a11 0a4e 02bb ..I....y..j..N..
00000090: 943a c4bc a12f 627b 479d 2890 c83f 1b95 .:.../b{G.(..?..
000000a0: f021 39f2 ad04 b6c6 2ab5 443d c236 3c23 .!9.....*.D=.6<#
000000b0: 0864 4412 c4e4 4e98 b7e5 ddd9 f7bb 698b .dD...N.......i.
000000c0: 732e f0ae ecca f1a7 1b0f 2994 c855 efc3 s.........)..U..
000000d0: 28fe 7b1a 1481 3dfe 9c00 ec10 ff29 f27b (.{...=......).{
000000e0: 7c3f caef 5b63 e2a2 2c8e f15f f651 0b9b |?..[c..,.._.Q..
000000f0: 8250 7559 a066 4bc9 22f6 148d 7437 377b .PuY.fK."...t77{
00000100: 784b ffd9 97ac d8b6 4d74 882c c13e 2270 xK......Mt.,.>"p
00000110: 71ee 34be 4c62 7c89 6c9f 2052 b117 6b50 q.4.Lb|.l. R..kP
00000120: cafd df39 fa63 9e5f 8d19 35f3 6305 ae8f ...9.c._..5.c...
00000130: 0773 9834 d7d9 d5c4 b128 05dc f878 f570 .s.4.....(...x.p
00000140: 9c84 9e7e 5ae4 7ef3 da19 04e2 c6ae e915 ...~Z.~.........
00000150: 78a6 6999 506f 75f5 6d56 bde3 2d2f cf13 x.i.Pou.mV..-/..
00000160: 29d8 e045 6594 776a 4542 9d63 9dec b534 )..Ee.wjEB.c...4
00000170: 3921 25d0 6999 949c 2d49 67b8 02bd 39cc 9!%.i...-Ig...9.
00000180: d2fe 71a4 4333 3472 8a2b cf73 143a 7a52 ..q.C34r.+.s.:zR
00000190: 1f79 051c b244 3489 9659 18ac 7be6 7724 .y...D4..Y..{.w$
000001a0: 8bdf 7a62 1e77 2e2b 251e 05f5 2dfb df22 ..zb.w.+%...-.."
000001b0: 335c b52d b61c edfc 1d2e cb5c f70c 2600 3\.-.......\..&.
000001c0: a404 e458 f7db 0a1a cf21 dd ...X.....!.
%
File Structure
1. Identifier section
The Identifier section consists of the following struct binary-encoded in big-endian order.
struct {
    magic   [5]byte
    version uint8
}
The magic value is 0x0c 0x75 0x0d 0x05 0x0e. The version field is the algebraicfile format version as an integer; for example, for version 5 of the format the value is 0x05. Programs that read an algebraicfile should read the Identifier section and, based on the version number, adjust their parsing behavior for the remaining sections.
In the example algebraicfile xxd output from earlier, the Identifier section is the first 6 bytes (offsets 0x00 through 0x05) of this line:
00000000: 0c75 0d05 0e05 4d77 0805 b407 4a52 714c .u....Mw....JRqL
- SNIP -
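The check described in this section can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the names parse_identifier and SUPPORTED_VERSIONS are assumptions made for the example, and because this document describes only version 5, the sketch rejects every other version.

```python
import struct

MAGIC = bytes([0x0C, 0x75, 0x0D, 0x05, 0x0E])
SUPPORTED_VERSIONS = {5}  # assumption: this sketch only understands version 5


def parse_identifier(section: bytes) -> int:
    """Return the format version stored in the 6-byte Identifier section."""
    if len(section) != 6:
        raise ValueError("Identifier section must be exactly 6 bytes")
    # ">5sB" = big-endian, 5 raw bytes (magic) followed by one uint8 (version).
    magic, version = struct.unpack(">5sB", section)
    if magic != MAGIC:
        raise ValueError("bad magic value: not an algebraicfile")
    if version not in SUPPORTED_VERSIONS:
        # Readers must return an error for unknown or unsupported versions.
        raise ValueError(f"unsupported algebraicfile version: {version}")
    return version


# The first 6 bytes of the example file.
print(parse_identifier(bytes.fromhex("0c750d050e05")))  # prints 5
```

A reader would typically dispatch on the returned version before parsing the remaining sections.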
2. Header section
The Header section consists of the following struct binary-encoded in big-endian order.
struct {
    salt           [16]byte
    time           uint32
    mem            uint32
    threads        uint8
    metadata_nonce [24]byte
    nextlen        int64
}
The salt, time, mem, and threads fields are parameters for Argon2id key derivation. The mem value must be expressed in kibibytes (KiB). The metadata_nonce field is the nonce for encryption of the Metadata section. The nextlen field represents the length in bytes of the variable-length Metadata section that follows this section.
In the example algebraicfile xxd output from earlier, the Header section is the 57 bytes at offsets 0x06 through 0x3e of these lines:
00000000: 0c75 0d05 0e05 4d77 0805 b407 4a52 714c .u....Mw....JRqL
00000010: 9d28 1a11 5bed 0000 0001 0040 0000 0826 .(..[......@...&
00000020: dd45 b83f 8a34 4f41 2c95 831e adc7 9c1d .E.?.4OA,.......
00000030: 186d dce0 8dd4 7d00 0000 0000 0001 3528 .m....}.......5(
- SNIP -
which breaks down to the hex field values:
salt 4d770805b4074a52714c9d281a115bed
time 00000001
mem 00400000
threads 08
metadata_nonce 26dd45b83f8a344f412c95831eadc79c1d186ddce08dd47d
nextlen 0000000000000135
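As a cross-check of the breakdown above, the Header struct can be unpacked with Python's struct module. The sketch below is illustrative only; the Header tuple and the parse_header function are names invented for this example. The hex string is the Header section of the example file, assembled from the field values listed above.

```python
import struct
from typing import NamedTuple

# Big-endian, no padding: salt, time, mem, threads, metadata_nonce, nextlen.
HEADER_FORMAT = ">16sIIB24sq"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 57 bytes


class Header(NamedTuple):
    salt: bytes            # Argon2id salt
    time: int              # Argon2id time parameter
    mem: int               # Argon2id memory parameter, in KiB
    threads: int           # Argon2id threads parameter
    metadata_nonce: bytes  # nonce for the Metadata section
    nextlen: int           # length in bytes of the Metadata section


def parse_header(section: bytes) -> Header:
    """Decode the 57-byte Header section."""
    if len(section) != HEADER_SIZE:
        raise ValueError(f"Header section must be exactly {HEADER_SIZE} bytes")
    header = Header(*struct.unpack(HEADER_FORMAT, section))
    if header.nextlen <= 0:
        # The Metadata section is variable-length but never empty.
        raise ValueError("nextlen must be positive")
    return header


example = bytes.fromhex(
    "4d770805b4074a52714c9d281a115bed"                  # salt
    "00000001"                                          # time
    "00400000"                                          # mem
    "08"                                                # threads
    "26dd45b83f8a344f412c95831eadc79c1d186ddce08dd47d"  # metadata_nonce
    "0000000000000135"                                  # nextlen
)
header = parse_header(example)
print(header.time, header.mem, header.threads, header.nextlen)
# prints: 1 4194304 8 309
```

The mem value of 4194304 KiB corresponds to 4 GiB, and the nextlen value of 309 is the byte length of the Metadata section that follows.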
3. Metadata section
The Metadata section consists of a JSON-encoded
object, encrypted and authenticated with XChaCha20-Poly1305. The section
byte size includes the Poly1305 authentication tag (in other words, the
AEAD overhead). The nonce for the encryption is the metadata_nonce
value in the Header section. The encryption key is derived by hashing a
user-supplied password with Argon2id; the parameters for Argon2id must
match the values in the Header section.
The section largely consists of metadata of the original file. The structure of the JSON-encoded object is:
{
    cp: string  // packed copyfile(3) data, base64-encoded.
    fl: number  // length of the Filler section; int64.
    m:  number  // file mode bits; uint32; see Go type fs.FileMode for format.
    n:  string  // filename, final path element only, base64-encoded.
    l:  string  // linkname, present iff original file is a symbolic link, base64-encoded.
    u:  number  // file uid; int64.
    g:  number  // file gid; int64.
    mt: number  // file modification time; int64.
    at: number  // file access time; int64.
    ct: number  // file change time; int64.
    bt: number  // file birth time; int64.
    cs: number  // chunk size, in bytes, to use with libsodium crypto_secretstream functions; int64.
}
Details that apply to all fields:
All fields, except the cs field, are optional in the encoded JSON. For example, if an algebraicfile represents encrypted in-memory data, then fields such as the original file's name, its file mode bits, and its modification time are not applicable and hence can be omitted.
If a field's value is unavailable or invalid, writers must omit the property in its entirety in the encoded JSON. Readers must use "nil", "empty", or "zero" values for missing fields when decoding JSON. Readers must take into account integer precision and sign requirements when decoding numbers from JSON. Readers must skip without error unknown properties present in the encoded JSON.
Details for specific fields:
The fl field represents the length in bytes of the variable-length Filler section that follows this section. Note that if the property does not exist in the encoded JSON, readers must consider the value to be zero.
The l field represents the target name for a symbolic link. It must be present if and only if the original file corresponding to an algebraicfile is a symbolic link.
The cp field consists of metadata about the original file. The value is the base64-encoded result of copyfile(3) called with flags COPYFILE_ACL | COPYFILE_XATTR | COPYFILE_PACK. Writers should omit the field if the value cannot be constructed (e.g. because copyfile(3) isn't available).
In the example algebraicfile xxd output from earlier, the Metadata section is the 309 encrypted bytes at offsets 0x3f through 0x173 of the following lines, the length having been specified by the nextlen field in the Header section.
- SNIP -
00000030: 186d dce0 8dd4 7d00 0000 0000 0001 3528 .m....}.......5(
00000040: 94f3 4e1e 70c1 8c17 9a02 795f a817 aaa3 ..N.p.....y_....
00000050: c20a 04a3 cc47 55c6 03f2 4775 bf87 1caa .....GU...Gu....
00000060: d60c 4ee5 0723 dd24 5794 c240 8108 a45a ..N..#.$W..@...Z
00000070: 1fc5 c778 2b8e 7656 a394 6094 d0b9 1e89 ...x+.vV..`.....
00000080: a1c0 49a3 fc9c fd79 b1f6 6a11 0a4e 02bb ..I....y..j..N..
00000090: 943a c4bc a12f 627b 479d 2890 c83f 1b95 .:.../b{G.(..?..
000000a0: f021 39f2 ad04 b6c6 2ab5 443d c236 3c23 .!9.....*.D=.6<#
000000b0: 0864 4412 c4e4 4e98 b7e5 ddd9 f7bb 698b .dD...N.......i.
000000c0: 732e f0ae ecca f1a7 1b0f 2994 c855 efc3 s.........)..U..
000000d0: 28fe 7b1a 1481 3dfe 9c00 ec10 ff29 f27b (.{...=......).{
000000e0: 7c3f caef 5b63 e2a2 2c8e f15f f651 0b9b |?..[c..,.._.Q..
000000f0: 8250 7559 a066 4bc9 22f6 148d 7437 377b .PuY.fK."...t77{
00000100: 784b ffd9 97ac d8b6 4d74 882c c13e 2270 xK......Mt.,.>"p
00000110: 71ee 34be 4c62 7c89 6c9f 2052 b117 6b50 q.4.Lb|.l. R..kP
00000120: cafd df39 fa63 9e5f 8d19 35f3 6305 ae8f ...9.c._..5.c...
00000130: 0773 9834 d7d9 d5c4 b128 05dc f878 f570 .s.4.....(...x.p
00000140: 9c84 9e7e 5ae4 7ef3 da19 04e2 c6ae e915 ...~Z.~.........
00000150: 78a6 6999 506f 75f5 6d56 bde3 2d2f cf13 x.i.Pou.mV..-/..
00000160: 29d8 e045 6594 776a 4542 9d63 9dec b534 )..Ee.wjEB.c...4
00000170: 3921 25d0 6999 949c 2d49 67b8 02bd 39cc 9!%.i...-Ig...9.
- SNIP -
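The decoding rules for the Metadata JSON (zero or empty values for missing fields, integer range checks, unknown properties skipped) could look roughly like the Python sketch below. It starts from the already decrypted JSON bytes; the Argon2id and XChaCha20-Poly1305 steps are left out. The function names are invented for this example, the standard base64 alphabet is an assumption (the text above says only "base64-encoded"), rejecting non-integer JSON numbers is one possible reading of the precision requirement, and the sample object is made up rather than taken from the example file.

```python
import base64
import json

INT64_FIELDS = ("fl", "u", "g", "mt", "at", "ct", "bt", "cs")
BASE64_FIELDS = ("cp", "n", "l")


def _integer(raw: dict, name: str, lo: int, hi: int) -> int:
    value = raw.get(name, 0)  # a missing field decodes to zero
    if isinstance(value, bool) or not isinstance(value, int) or not lo <= value <= hi:
        raise ValueError(f"metadata field {name!r} is not an integer in range")
    return value


def decode_metadata(plaintext: bytes) -> dict:
    """Decode the decrypted Metadata JSON into a dict with every known field."""
    raw = json.loads(plaintext)
    if not isinstance(raw, dict):
        raise ValueError("metadata must be a JSON object")
    if "cs" not in raw:
        raise ValueError("metadata is missing the required cs field")
    meta = {name: _integer(raw, name, -2**63, 2**63 - 1) for name in INT64_FIELDS}
    meta["m"] = _integer(raw, "m", 0, 2**32 - 1)
    for name in BASE64_FIELDS:
        # Assumption: standard base64 alphabet; a missing field decodes to b"".
        meta[name] = base64.b64decode(raw.get(name, ""), validate=True)
    if meta["cs"] <= 0 or meta["fl"] < 0:
        raise ValueError("cs must be positive and fl must not be negative")
    # Unknown properties are skipped: only the names listed above are read.
    return meta


# Hypothetical metadata object, for illustration only.
sample = json.dumps({
    "n": base64.b64encode(b"hello.txt").decode("ascii"),
    "m": 420,
    "fl": 1,
    "cs": 4096,
    "future_field": "ignored",
}).encode("utf-8")
meta = decode_metadata(sample)
print(meta["n"], meta["fl"], meta["cs"])  # prints: b'hello.txt' 1 4096
```

Returning a plain dict keeps the sketch short; a real reader would more likely map these fields onto a typed structure.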
4. Filler section
The Filler section may be used to increase the size of an algebraicfile, in order to obfuscate the true length of the data. The number of bytes in the section must match the fl field in the Metadata section. The bytes must be indistinguishable from any actual encrypted data.
Readers may ignore the Filler section, by discarding or seeking past fl bytes after the Metadata section.
The Filler section can have zero length.
In the example algebraicfile xxd output from earlier, the Filler section is exactly 1 byte (the byte 0x69 at offset 0x174); the length would have been indicated by the encrypted fl field in the Metadata section.
- SNIP -
00000170: 3921 25d0 6999 949c 2d49 67b8 02bd 39cc 9!%.i...-Ig...9.
- SNIP -
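Once nextlen and fl are known, the byte range of every section follows from simple arithmetic. The Python sketch below illustrates this for a file whose total length is known up front; the function name section_ranges is invented for this example, and a streaming reader would instead consume the sections in order.

```python
IDENTIFIER_LEN = 6
HEADER_LEN = 57
CHECKSUM_LEN = 32


def section_ranges(file_len: int, nextlen: int, fl: int) -> dict:
    """Return {section name: (start, end)} as half-open byte ranges."""
    ranges = {}
    pos = 0
    for name, length in (
        ("identifier", IDENTIFIER_LEN),
        ("header", HEADER_LEN),
        ("metadata", nextlen),
        ("filler", fl),
    ):
        ranges[name] = (pos, pos + length)
        pos += length
    # The Data section is whatever lies between the Filler and the Checksum.
    data_end = file_len - CHECKSUM_LEN
    if data_end < pos:
        raise ValueError("file is too short for the declared section lengths")
    ranges["data"] = (pos, data_end)
    ranges["checksum"] = (data_end, file_len)
    return ranges


# Example file: 459 bytes in total, nextlen = 309, fl = 1.
ranges = section_ranges(459, 309, 1)
print(ranges["filler"], ranges["data"], ranges["checksum"])
# prints: (372, 373) (373, 427) (427, 459)
```

For the example file this places the single filler byte at offset 372 (0x174) and yields a 54-byte Data section.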
5. Data section
The Data section is all bytes after the Filler section but before the final, fixed-length Checksum section. The section consists of the original source file's data, encrypted and authenticated using libsodium's crypto_secretstream API. The initial key provided to crypto_secretstream_xchacha20poly1305_init_push must be the same key used to encrypt the Metadata section.
The chunk size for crypto_secretstream_xchacha20poly1305_push calls must match the cs value from the Metadata section. Writers may write fewer than cs bytes of data in the final call.
Writers may omit the Data section entirely when there is zero file data; doing so can save 41 bytes (24 bytes for the init_push header + 17 bytes for the push of the empty, final-tagged message).
In the example algebraicfile xxd output from earlier, the Data section is the 54 encrypted bytes at offsets 0x175 through 0x1aa of these lines:
- SNIP -
00000170: 3921 25d0 6999 949c 2d49 67b8 02bd 39cc 9!%.i...-Ig...9.
00000180: d2fe 71a4 4333 3472 8a2b cf73 143a 7a52 ..q.C34r.+.s.:zR
00000190: 1f79 051c b244 3489 9659 18ac 7be6 7724 .y...D4..Y..{.w$
000001a0: 8bdf 7a62 1e77 2e2b 251e 05f5 2dfb df22 ..zb.w.+%...-.."
- SNIP -
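The size of the Data section can be predicted from the plaintext length and the chunk size, which explains both the 54 bytes in the example and the 41 bytes saved by omitting an empty Data section. The Python sketch below shows the arithmetic under two assumptions: the writer attaches the final tag to the last data chunk instead of pushing a separate empty message, and the cs value of 4096 is hypothetical (the real value in the example is encrypted; any cs of at least 13 gives the same result here). The function name is invented for this example.

```python
HEADER_BYTES = 24  # crypto_secretstream_xchacha20poly1305_HEADERBYTES
A_BYTES = 17       # crypto_secretstream_xchacha20poly1305_ABYTES, added per push


def data_section_length(plain_len: int, cs: int, omit_empty: bool = False) -> int:
    """Return the Data section size for plain_len bytes pushed in chunks of cs."""
    if plain_len < 0 or cs <= 0:
        raise ValueError("plain_len must be >= 0 and cs must be > 0")
    if plain_len == 0 and omit_empty:
        return 0  # writers may omit the section when there is no file data
    # Ceiling division; at least one push is needed to carry the final tag.
    chunks = max(1, -(-plain_len // cs))
    return HEADER_BYTES + plain_len + chunks * A_BYTES


print(data_section_length(13, 4096))  # prints 54 ("hello, world\n" is 13 bytes)
print(data_section_length(0, 4096))   # prints 41
```

The second call shows the 41 bytes (24 + 17) that a writer avoids by omitting the section for empty input.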
6. Checksum section
The 32-byte checksum is the SHA-256 sum of all the bytes in the preceding sections. Readers may forgo checksum verification.
In the example algebraicfile xxd output from earlier, the Checksum section is the final 32 bytes, at offsets 0x1ab through 0x1ca of these lines:
- SNIP -
000001a0: 8bdf 7a62 1e77 2e2b 251e 05f5 2dfb df22 ..zb.w.+%...-.."
000001b0: 335c b52d b61c edfc 1d2e cb5c f70c 2600 3\.-.......\..&.
000001c0: a404 e458 f7db 0a1a cf21 dd ...X.....!.
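For readers that do verify the checksum, the check can be done in a single pass with constant memory by holding back the most recent 32 bytes while hashing everything before them. The Python sketch below is one way to do this, not a prescribed method; the function name verify_checksum and the chunk_size parameter are invented for this example, and the demonstration uses synthetic bytes rather than the example file.

```python
import hashlib
import io

CHECKSUM_LEN = 32


def verify_checksum(stream, chunk_size: int = 65536) -> bool:
    """Return True if the last 32 bytes equal the SHA-256 of all earlier bytes."""
    digest = hashlib.sha256()
    tail = b""  # most recent bytes, which may turn out to be the checksum
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        buf = tail + block
        # Everything except the last 32 bytes is certainly not the checksum.
        digest.update(buf[:-CHECKSUM_LEN])
        tail = buf[-CHECKSUM_LEN:]
    if len(tail) != CHECKSUM_LEN:
        raise ValueError("input is shorter than the Checksum section")
    return digest.digest() == tail


# Synthetic stand-in for sections 1 through 5, followed by its SHA-256 sum.
body = b"stand-in for the first five sections" * 100
contents = body + hashlib.sha256(body).digest()
print(verify_checksum(io.BytesIO(contents)))  # prints True
```

Holding back a 32-byte tail lets the reader hash the input without knowing the total length in advance, which fits streaming use.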