[arvados] updated: 2.5.0-233-g599e2e679
git repository hosting
git at public.arvados.org
Sun Mar 5 22:08:55 UTC 2023
Summary of changes:
doc/_includes/_container_runtime_constraints.liquid | 3 ++-
doc/admin/controlling-container-reuse.html.textile.liquid | 2 +-
doc/api/methods/container_requests.html.textile.liquid | 6 +++++-
lib/config/config.default.yml | 5 +++++
4 files changed, 13 insertions(+), 3 deletions(-)
via 599e2e6798a2497ed56b6dfe917a09260dd74407 (commit)
via dc1837da3a36ab3b273fef7245975ecd92524072 (commit)
via 6392400ce074288657ba47ce98997b502538edda (commit)
via 7541b9c86e07a65139ee508cc2476fdc77ee4310 (commit)
from e1c7395cf6e649876132030c2011434581d3a66a (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
commit 599e2e6798a2497ed56b6dfe917a09260dd74407
Author: Brett Smith <brett.smith at curii.com>
Date: Sun Mar 5 17:06:41 2023 -0500
19981: Explain how keep_cache_* runtime constraints affect reuse
Arvados-DCO-1.1-Signed-off-by: Brett Smith <brett.smith at curii.com>
diff --git a/doc/api/methods/container_requests.html.textile.liquid b/doc/api/methods/container_requests.html.textile.liquid
index 5e15df5ba..fad051f4b 100644
--- a/doc/api/methods/container_requests.html.textile.liquid
+++ b/doc/api/methods/container_requests.html.textile.liquid
@@ -139,9 +139,13 @@ h2(#scheduling_parameters). {% include 'container_scheduling_parameters' %}
h2(#container_reuse). Container reuse
-When a container request is "Committed", the system will try to find and reuse an existing Container with the same command, cwd, environment, output_path, container_image, mounts, secret_mounts, runtime_constraints, runtime_user_uuid, and runtime_auth_scopes being requested. (Hashes in the serialized fields environment, mounts and runtime_constraints use normalized key order.)
+When a container request is "Committed", the system will try to find and reuse an existing Container with the same command, cwd, environment, output_path, container_image, mounts, secret_mounts, runtime_constraints, runtime_user_uuid, and runtime_auth_scopes being requested.
+
+* The serialized fields environment, mounts, and runtime_constraints are normalized when searching.
+* The system will also search for containers with minor variations in the keep_cache_disk and keep_cache_ram runtime_constraints that should not affect the result. This searches for other common values for those constraints, so a container that used a non-default value for these constraints may not be reused by later container requests that use a different value.
In order of preference, the system will use:
+
* The first matching container to have finished successfully (i.e., reached state "Complete" with an exit_code of 0) whose log and output collections are still available.
* The oldest matching "Running" container with the highest progress, i.e., the container that is most likely to finish first.
* The oldest matching "Locked" container with the highest priority, i.e., the container that is most likely to start first.
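
The order of preference documented above can be sketched roughly as follows. This is an illustrative model only, not the Arvados implementation: the Candidate class, its field names (created_at, outputs_available, and so on), and the pick_reusable function are hypothetical, and "first matching" is read here as first in the order the candidates are supplied.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    """Hypothetical stand-in for a container that already matched the request."""
    uuid: str
    state: str
    created_at: float                 # smaller value means older
    exit_code: Optional[int] = None
    outputs_available: bool = False   # log and output collections still exist
    progress: float = 0.0
    priority: int = 0


def pick_reusable(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Apply the three documented preferences in order."""
    # 1. First container that finished successfully and still has its
    #    log and output collections.
    for c in candidates:
        if c.state == "Complete" and c.exit_code == 0 and c.outputs_available:
            return c
    # 2. Running container with the highest progress, oldest first on ties.
    running = [c for c in candidates if c.state == "Running"]
    if running:
        return min(running, key=lambda c: (-c.progress, c.created_at))
    # 3. Locked container with the highest priority, oldest first on ties.
    locked = [c for c in candidates if c.state == "Locked"]
    if locked:
        return min(locked, key=lambda c: (-c.priority, c.created_at))
    return None
```

Under this reading, a "Complete" container with a non-zero exit_code, or one whose collections have been deleted, is skipped and the search falls through to the "Running" and then "Locked" candidates.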
commit dc1837da3a36ab3b273fef7245975ecd92524072
Author: Brett Smith <brett.smith at curii.com>
Date: Sun Mar 5 16:57:11 2023 -0500
19981: Fix verb tense typo
Arvados-DCO-1.1-Signed-off-by: Brett Smith <brett.smith at curii.com>
diff --git a/doc/admin/controlling-container-reuse.html.textile.liquid b/doc/admin/controlling-container-reuse.html.textile.liquid
index 787828c14..5e52c6801 100644
--- a/doc/admin/controlling-container-reuse.html.textile.liquid
+++ b/doc/admin/controlling-container-reuse.html.textile.liquid
@@ -10,7 +10,7 @@ Copyright (C) The Arvados Authors. All rights reserved.
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
-Sometimes a container exited successfully but produced bad output, and re-running the workflow will cause it to re-use the bad container instead of running a new container. One way to deal with this is to re-run the entire workflow with reuse disable. Another way is for the workflow author to tweak the input data or workflow so that on re-run it produces a distinct container request. However, for large or complex workflows both these options may be impractical.
+Sometimes a container exited successfully but produced bad output, and re-running the workflow will cause it to re-use the bad container instead of running a new container. One way to deal with this is to re-run the entire workflow with reuse disabled. Another way is for the workflow author to tweak the input data or workflow so that on re-run it produces a distinct container request. However, for large or complex workflows both these options may be impractical.
To prevent an individual container from being reused in later workflows, an admin can manually change the state of the bad container record from @Complete@ to @Cancelled@. The following @arv@ command demonstrates how change a container state to @Cancelled@, where @xxxxx-xxxxx-xxxxxxxxxxxxxxx@ is the @UUID@ of the container:
commit 6392400ce074288657ba47ce98997b502538edda
Author: Brett Smith <brett.smith at curii.com>
Date: Sun Mar 5 16:55:42 2023 -0500
19981: Document the keep_cache_disk runtime constraint
Arvados-DCO-1.1-Signed-off-by: Brett Smith <brett.smith at curii.com>
diff --git a/doc/_includes/_container_runtime_constraints.liquid b/doc/_includes/_container_runtime_constraints.liquid
index 3b8df32d4..f6f42d255 100644
--- a/doc/_includes/_container_runtime_constraints.liquid
+++ b/doc/_includes/_container_runtime_constraints.liquid
@@ -12,7 +12,8 @@ table(table table-bordered table-condensed).
|_. Key|_. Type|_. Description|_. Notes|
|ram|integer|Number of ram bytes to be used to run this process.|Optional. However, a ContainerRequest that is in "Committed" state must provide this.|
|vcpus|integer|Number of cores to be used to run this process.|Optional. However, a ContainerRequest that is in "Committed" state must provide this.|
-|keep_cache_ram|integer|Number of keep cache bytes to be used to run this process.|Optional.|
+|keep_cache_disk|integer|When the container process accesses data from Keep via the filesystem, that data will be cached on disk, up to this amount in bytes.|Optional. If your cluster is configured to use a disk cache by default, the default size will match your @ram@ constraint, bounded between 2GiB and 32GiB.|
+|keep_cache_ram|integer|When the container process accesses data from Keep via the filesystem, that data will be cached in memory, up to this amount in bytes.|Optional. If your cluster is configured to use a RAM cache by default, the administrator sets a default cache size.|
|API|boolean|When set, ARVADOS_API_HOST and ARVADOS_API_TOKEN will be set, and container will have networking enabled to access the Arvados API server.|Optional.|
|cuda|object|Request CUDA GPU support, see below|Optional.|
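
The keep_cache_disk note above says the default size matches the ram constraint, bounded between 2GiB and 32GiB. A minimal sketch of that arithmetic, assuming the bounds are applied as a simple clamp (the function name default_keep_cache_disk is hypothetical and is not an Arvados API):

```python
GIB = 1 << 30  # bytes in one GiB


def default_keep_cache_disk(ram_bytes: int) -> int:
    """Clamp the container's ram constraint to the range [2 GiB, 32 GiB]."""
    return max(2 * GIB, min(ram_bytes, 32 * GIB))
```

With this reading, a container requesting 1 GiB of ram would get a 2 GiB disk cache, one requesting 8 GiB would get 8 GiB, and one requesting 64 GiB would be capped at 32 GiB.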
commit 7541b9c86e07a65139ee508cc2476fdc77ee4310
Author: Brett Smith <brett.smith at curii.com>
Date: Sun Mar 5 16:44:40 2023 -0500
19981: Add config note about impact of changing DefaultKeepCacheRAM
Arvados-DCO-1.1-Signed-off-by: Brett Smith <brett.smith at curii.com>
diff --git a/lib/config/config.default.yml b/lib/config/config.default.yml
index 527439571..b17523e35 100644
--- a/lib/config/config.default.yml
+++ b/lib/config/config.default.yml
@@ -992,6 +992,11 @@ Clusters:
# disk cache size will use a disk cache, sized to the
# container's RAM requirement (but with minimum 2 GiB and
# maximum 32 GiB).
+ #
+ # Note: If you change this value, containers that used the previous
+ # default value will only be reused by container requests that
+ # explicitly specify the previous value in their keep_cache_ram
+ # runtime constraint.
DefaultKeepCacheRAM: 0
# Number of times a container can be unlocked before being
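
The note added to DefaultKeepCacheRAM can be illustrated with a toy model. It assumes, purely for illustration, that the cluster default is filled into a request's runtime constraints before they are compared with an existing container's; the function names and the byte values below are hypothetical, and the real reuse search may differ.

```python
def effective_constraints(requested: dict, default_keep_cache_ram: int) -> dict:
    """Fill in keep_cache_ram from the cluster default when the request omits it."""
    filled = dict(requested)
    filled.setdefault("keep_cache_ram", default_keep_cache_ram)
    return filled


def can_reuse(container_constraints: dict, requested: dict,
              default_keep_cache_ram: int) -> bool:
    """Toy match: constraints must be equal once the default is applied."""
    return container_constraints == effective_constraints(
        requested, default_keep_cache_ram)


# Hypothetical values: the container ran under the old default.
OLD_DEFAULT = 268435456
NEW_DEFAULT = 536870912
old_container = {"ram": 1073741824, "vcpus": 1, "keep_cache_ram": OLD_DEFAULT}

# After the default changes, a request that omits keep_cache_ram no longer matches.
implicit_request = {"ram": 1073741824, "vcpus": 1}
# A request that names the previous value explicitly still matches.
explicit_request = {"ram": 1073741824, "vcpus": 1, "keep_cache_ram": OLD_DEFAULT}
```

In this model, can_reuse(old_container, implicit_request, NEW_DEFAULT) is false while can_reuse(old_container, explicit_request, NEW_DEFAULT) is true, which is the behavior the config note warns about.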
-----------------------------------------------------------------------
hooks/post-receive
--