Meta format for archive-and-compression formats

Intro

The actual files in archive-and-compression formats are compressed with well-established algorithms; what remains is the metadata. By using a standardized data exchange format (like JSON) for the metadata, it becomes almost trivial to create your own archive-and-compression format.

Of course, the other archive-and-compression formats (List_of_archive_formats) use, as far as I can tell (I haven't checked them all), custom formats that most likely need less space for the metadata. But today the actual data is so big (images, videos ...) that the size of the metadata becomes negligible.

Format description

Name                    Offset (bytes)    Size (bytes)            Description
strID                   0                 4                       String to identify the file type (in my implementation: "mZip")
intVersion              4                 2                       Version number
intSerializationFormat  6                 1                       Serialization method of objMeta (0=json ...)
intSize                 7                 8                       Archive size (total, including headers and footers; set to 0 if the size is unknown)
File 0                  15
File 1
...
objMeta                 iMeta             (archive size)-iMeta-8  Serialized object of all the meta data of the files
iMeta                   (archive size)-8  8                       objMeta start position

The binary fields (intVersion, intSize and iMeta) are big-endian (My view on Big or little endian).
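
To make the layout concrete, here is a minimal Python sketch of packing the fixed header and locating objMeta through the trailing iMeta field. The function names (pack_header, unpack_header, read_meta) are made up for illustration and are not taken from my implementation; UTF-8 encoding of the JSON text is also an assumption, since the table does not specify it.

```python
import json
import struct

# Header: strID (4 bytes), intVersion (uint16), intSerializationFormat (uint8),
# intSize (uint64). ">" means big-endian without padding, so it is 15 bytes.
HEADER = struct.Struct(">4sHBQ")
# Trailer: iMeta (uint64), the start position of objMeta.
TRAILER = struct.Struct(">Q")


def pack_header(version, serialization_format=0, archive_size=0, str_id=b"mZip"):
    """Build the 15-byte header; archive_size=0 means 'size unknown'."""
    return HEADER.pack(str_id, version, serialization_format, archive_size)


def unpack_header(archive):
    """Decode the fixed header fields from the start of the archive."""
    str_id, version, fmt, size = HEADER.unpack_from(archive, 0)
    return {
        "strID": str_id,
        "intVersion": version,
        "intSerializationFormat": fmt,
        "intSize": size,
    }


def read_meta(archive):
    """Find objMeta via the last 8 bytes (iMeta) and deserialize it."""
    header = unpack_header(archive)
    if header["intSerializationFormat"] != 0:
        raise ValueError("only JSON (0) is handled in this sketch")
    meta_end = len(archive) - TRAILER.size  # objMeta ends where iMeta begins
    (i_meta,) = TRAILER.unpack_from(archive, meta_end)
    # ASSUMPTION: the JSON text is stored as UTF-8.
    return json.loads(archive[i_meta:meta_end].decode("utf-8"))
```

A reader only needs the last 8 bytes to find the meta data, which is why iMeta sits at the very end: the writer can stream the files first and append objMeta once all sizes are known.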

intSerializationFormat (how objMeta is serialized)

Several other data exchange formats exist.

Using a binary serialization format would allow binary data, for example thumbnails, to be stored directly in objMeta.
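
As a small illustration of why a text format is awkward here: JSON has no byte-string type, so a thumbnail would have to be wrapped in something like base64, which costs about a third more space. The sketch below uses Python's standard json and base64 modules; the field name "Thumbnail" and the stand-in bytes are hypothetical.

```python
import base64
import json

# Stand-in for real image bytes (hypothetical thumbnail, 768 bytes).
thumbnail = bytes(range(256)) * 3

# Python's json module refuses raw bytes, since JSON has no binary type.
try:
    json.dumps({"Thumbnail": thumbnail})
    json_accepts_bytes = True
except TypeError:
    json_accepts_bytes = False

# Workaround: base64-encode the bytes into ASCII text (4 chars per 3 bytes).
encoded = base64.b64encode(thumbnail).decode("ascii")
blob = json.dumps({"Thumbnail": encoded})

# Reading it back requires the reverse step.
restored = base64.b64decode(json.loads(blob)["Thumbnail"])
```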

Standard fields of objMeta

Exactly how objMeta should look is really a question of its own.

The structure that I use in my implementation (link below):

{
  IStart:[14, 564],                 // Start offsets of the files in the archive (bytes)
  StrName:["oak.txt", "oak.jpg"],   // File names
  IntCompMethod:[1, 0],             // Compression method (like zip, see more below)
  IntTMod:[1474476071, 1474476071], // Last modification times (Unix time)
  IntSize:[2010, 4043],             // File sizes (uncompressed)
  IntCompSize:[550, 4043],          // File sizes (compressed)
  StrSha1:["fb4a947efc3d959858e32ee856b4011a7e01e4f6", "ff45ba2084497bc0cdf49dc0aac52816f3c48143"] // SHA-1 hashes
}
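
Since the fields are parallel arrays, one index selects everything about one file. The following Python sketch shows how a single file could be pulled out of the archive with such a structure. It is only an illustration: read_entry is a made-up name, and mapping method 1 to zlib is an assumption for the sake of a runnable example (method 0 is treated as stored, which matches the equal sizes of "oak.jpg" above).

```python
import hashlib
import zlib


def read_entry(archive, meta, name):
    """Return the uncompressed bytes of the file called `name`."""
    i = meta["StrName"].index(name)  # raises ValueError if the name is absent
    start = meta["IStart"][i]
    raw = archive[start:start + meta["IntCompSize"][i]]

    method = meta["IntCompMethod"][i]
    if method == 0:
        data = raw  # stored without compression
    elif method == 1:
        # ASSUMPTION: method 1 is handled with zlib here for illustration only.
        data = zlib.decompress(raw)
    else:
        raise ValueError(f"unsupported compression method {method}")

    # Check the result against the recorded size and SHA-1 hash.
    if len(data) != meta["IntSize"][i]:
        raise ValueError("size mismatch")
    if hashlib.sha1(data).hexdigest() != meta["StrSha1"][i]:
        raise ValueError("SHA-1 mismatch")
    return data
```

Listing the archive needs no file data at all: iterating over StrName together with IntSize and IntTMod is enough.
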
Other fields that one might want to use:

IntCompMethod:

(like in the zip-format)

My implementation (mZipper)