[ARVADOS] created: 2.1.0-2347-g09d4326a8

Git user git at public.arvados.org
Fri Apr 15 20:54:58 UTC 2022

        at  09d4326a8a853f9a5f05acb434d92c06156ec107 (commit)

commit 09d4326a8a853f9a5f05acb434d92c06156ec107
Author: Peter Amstutz <peter.amstutz at curii.com>
Date:   Fri Apr 15 16:54:41 2022 -0400

    18894: Add section "estimating manifest size"
    Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz at curii.com>

diff --git a/doc/architecture/manifest-format.html.textile.liquid b/doc/architecture/manifest-format.html.textile.liquid
index 1780768bc..9efbef788 100644
--- a/doc/architecture/manifest-format.html.textile.liquid
+++ b/doc/architecture/manifest-format.html.textile.liquid
@@ -60,6 +60,28 @@ A normalized manifest is a manifest that meets the following additional restrict
 * Blocks within a stream are ordered based on order of file tokens of the stream.  A given block is listed at most once in a stream.
 * Filename must not contain @"/"@ (the stream name represents the path prefix)
+h3. Estimating manifest size
+Here's a formula for estimating manifest size as stored in the database, assuming efficiently packed blocks.
+manifest_size =
+   + (total data size / 64 MB) * 40
+   + sum(number of files * 20)
+   + sum(size of all directory paths)
+   + sum(size of all file names)
+Here is the size when including signatures.  The signatures authorize access to fetch the block from a Keep server, as <a href="#token_signatures">described below</a>.  The signed manifest is what is actually transferred to/from the API server and stored in RAM by @arv-mount@.
+manifest_size =
+   + (total data size / 64 MB) * 94
+   + sum(number of files * 20)
+   + sum(size of all directory paths)
+   + sum(size of all file names)
 h3. Example manifests
 A manifest with four files in two directories:
@@ -122,7 +144,7 @@ table(table table-bordered table-condensed).
 |@d41d8cd98f00b204e9800998ecf8427e+0+z@|Hint does not start with uppercase letter|
 |@d41d8cd98f00b204e9800998ecf8427e+0+Zfoo*bar@|Hint contains invalid character @*@|
-h3. Token signatures
+h3(#token_signatures). Token signatures
 A token signature (sign-hint) provides proof-of-access for a data block.  It is computed by taking a SHA1 HMAC of the blob signing token (a shared secret between the API server and keep servers), block digest, current API token, expiration timestamp, and blob signature TTL.
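
The two size formulas added by this commit can be tried out with a short Python sketch. This is only an illustration, not code from the Arvados tree: the function name estimate_manifest_size and its parameters are hypothetical, "64 MB" is assumed to mean 64 MiB (67108864 bytes), the block count is rounded up to a whole number of blocks, and path and file name sizes are assumed to be measured in UTF-8 bytes.

```python
import math

# Assumption: "64 MB" in the formula means a 64 MiB block.
BLOCK_SIZE = 64 * 1024 * 1024
UNSIGNED_BLOCK_BYTES = 40   # per-block cost as stored in the database
SIGNED_BLOCK_BYTES = 94     # per-block cost when signatures are included
PER_FILE_BYTES = 20         # per-file overhead from the formula


def estimate_manifest_size(total_data_size, dir_paths, file_names, signed=False):
    """Estimate manifest size in bytes, assuming efficiently packed blocks.

    total_data_size: total bytes of file data in the collection
    dir_paths:       iterable of directory (stream) path strings
    file_names:      iterable of file name strings, one per file
    signed:          use the per-block cost that includes signatures
    """
    dir_paths = list(dir_paths)
    file_names = list(file_names)
    per_block = SIGNED_BLOCK_BYTES if signed else UNSIGNED_BLOCK_BYTES
    # Assumption: a partial final block still costs a full block entry.
    blocks = math.ceil(total_data_size / BLOCK_SIZE)
    return (
        blocks * per_block
        + len(file_names) * PER_FILE_BYTES
        + sum(len(p.encode("utf-8")) for p in dir_paths)
        + sum(len(n.encode("utf-8")) for n in file_names)
    )


if __name__ == "__main__":
    dirs = ["./dir1", "./dir2"]
    files = ["a.txt", "b.txt", "c.txt", "d.txt"]
    one_gib = 1024 ** 3
    print(estimate_manifest_size(one_gib, dirs, files))               # unsigned
    print(estimate_manifest_size(one_gib, dirs, files, signed=True))  # signed
```

For 1 GiB of data (16 blocks), four files and two directories as above, the sketch gives 752 bytes unsigned and 1616 bytes signed; the difference comes entirely from the per-block term.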


