[ARVADOS] updated: 5b970a6c9505527e146e73cb4756a64ecc1679cd
git at public.curoverse.com
Fri May 1 17:02:58 EDT 2015
Summary of changes:
crunch_scripts/crunchutil/vwd.py | 6 ++++--
crunch_scripts/run-command | 4 ++--
doc/user/topics/run-command.html.textile.liquid | 14 +++++++-------
3 files changed, 13 insertions(+), 11 deletions(-)
via 5b970a6c9505527e146e73cb4756a64ecc1679cd (commit)
from 0a87aad48d7fccfc4d7d56a8628370cb7370d792 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
commit 5b970a6c9505527e146e73cb4756a64ecc1679cd
Author: Peter Amstutz <peter.amstutz at curoverse.com>
Date: Fri May 1 17:02:54 2015 -0400
5787: Error writing output will cause task to fail. Tweak documentation page.
diff --git a/crunch_scripts/crunchutil/vwd.py b/crunch_scripts/crunchutil/vwd.py
index 5b9edf5..5e6f1be 100644
--- a/crunch_scripts/crunchutil/vwd.py
+++ b/crunch_scripts/crunchutil/vwd.py
@@ -40,7 +40,7 @@ def checkin(target_dir):
Keep as normal files (Keep does not support symlinks).
Symlinks to files in the keep mount will result in files in the new
- collection which reference existing Keep blocks, no data copying necessay.
+ collection which reference existing Keep blocks, no data copying necessary.
Returns a new Collection object, with data flushed but the collection record
not saved to the API.
@@ -56,6 +56,7 @@ def checkin(target_dir):
     logger = logging.getLogger("arvados")
+    caught_error = False
     for root, dirs, files in os.walk(target_dir):
         for f in files:
             try:
@@ -91,5 +92,6 @@ def checkin(target_dir):
dat = reader.read(64*1024)
             except (IOError, OSError) as e:
                 logger.error(e)
+                caught_error = True
-    return outputcollection
+    return (outputcollection, caught_error)
diff --git a/crunch_scripts/run-command b/crunch_scripts/run-command
index 682ebfc..1755eb9 100755
--- a/crunch_scripts/run-command
+++ b/crunch_scripts/run-command
@@ -426,10 +426,10 @@ if "task.vwd" in taskp and "task.foreach" in jobp:
if stat.S_ISLNK(s.st_mode):
os.unlink(os.path.join(root, f))
-outcollection = vwd.checkin(outdir).manifest_text()
+(outcollection, checkin_error) = vwd.checkin(outdir).manifest_text()
# Success if we ran any subprocess, and they all exited 0.
-success = rcode and all(status == 0 for status in rcode.itervalues())
+success = rcode and all(status == 0 for status in rcode.itervalues()) and not checkin_error
api.job_tasks().update(uuid=arvados.current_task()['uuid'],
body={
diff --git a/doc/user/topics/run-command.html.textile.liquid b/doc/user/topics/run-command.html.textile.liquid
index 64f7f79..1ef359d 100644
--- a/doc/user/topics/run-command.html.textile.liquid
+++ b/doc/user/topics/run-command.html.textile.liquid
@@ -73,12 +73,12 @@ table(table table-bordered table-condensed).
|$(dir ...) | Takes a reference to an Arvados collection or directory within an Arvados collection and evaluates to a directory path on the local file system where that directory can be accessed by your command. The path may include a file name, in which case it will evaluate to the parent directory of the file. Uses Python's os.path.dirname(), so "/foo/bar" will evaluate to "/foo" but "/foo/bar/" will evaluate to "/foo/bar". Will raise an error if the directory is not accessible. |
|$(basename ...) | Strip leading directory and trailing file extension from the path provided. For example, $(basename /foo/bar.baz.txt) will evaluate to "bar.baz".|
|$(glob ...) | Take a Unix shell path pattern (supports @*@ @?@ and @[]@) and search the local filesystem, returning the first match found. Use together with $(dir ...) to get a local filesystem path for Arvados collections. For example: $(glob $(dir $(mycollection)/*.bam)) will find the first .bam file in the collection specified by the user parameter "mycollection". If there is more than one match, which one is returned is undefined. Will raise an error if no matches are found.|
-|$(task.tmpdir)|Get the designated temporary directory. This directory will be discarded when the job completes.|
-|$(task.outdir)|Get the designated output directory. The contents of this directory will be saved to Keep when the job completes. A symlink to a file in the keep mount will reference existing Keep blocks in your job output collection, with no data copying or duplication.|
-|$(job.srcdir)|Get the path to your git checkout.|
-|$(node.cores)|Get the number of CPU cores on the node.|
-|$(job.uuid)|Get current job uuid.|
-|$(task.uuid)|Get current task uuid.|
+|$(task.tmpdir)|Designated temporary directory. This directory will be discarded when the job completes.|
+|$(task.outdir)|Designated output directory. The contents of this directory will be saved to Keep when the job completes. A symlink to a file in the keep mount will reference existing Keep blocks in your job output collection, with no data copying or duplication.|
+|$(job.srcdir)|Path to the git working directory ($CRUNCH_SRC).|
+|$(node.cores)|Number of CPU cores on the node.|
+|$(job.uuid)|Current job uuid ($JOB_UUID)|
+|$(task.uuid)|Current task uuid ($TASK_UUID)|
h3. Escape sequences
@@ -241,7 +241,7 @@ h3. task.vwd
Background: because Keep collections are read-only, this does not play well with certain tools that expect to be able to write their outputs alongside their inputs (such as tools that generate indexes that are closely associated with the original file.) The run-command's solution to this is the "virtual working directory".
-@task.vwd@ specifies a Keep collection with the starting contents of the directory. @run-command@ will then populate @task.outdir@ with directories and symlinks to mirror the contents of the @task.vwd@ collection. Your command will then be able to both access its input files and write its output files in @task.outdir@. When the command completes, the output collection will write both the output of your command and the contents of the starting collection. Note that files from the starting collection remain read-only and cannot be altered, but may be deleted or renamed.
+@task.vwd@ specifies a Keep collection with the starting contents of the output directory. @run-command@ will populate @task.outdir@ with directories and symlinks to mirror the contents of the @task.vwd@ collection. Your command will then be able to both access its input files and write its output files from within @task.outdir@. When the command completes, run-command will write the contents of the output directory, which will include the out of your command as well as symlinks to files in starting collection. Note that files from the starting collection remain read-only and cannot be altered, but may be deleted or renamed.
h3. task.foreach
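
The error-propagation pattern introduced by this commit can be illustrated with a small stand-alone sketch. It does not use the Arvados SDK: FakeCollection, checkin_sketch and task_succeeded are hypothetical names made up for this example, and the manifest format is invented. Only the pattern comes from the diff: record I/O errors during the directory walk, return the flag next to the collection, and fold the flag into the task's success value. Because the function returns a tuple, the caller in this sketch unpacks the tuple first and then calls manifest_text() on the collection element. The sketch uses dict.values() rather than itervalues() so that it also runs on Python 3.

```python
import logging
import os

logger = logging.getLogger("checkin_sketch")


class FakeCollection(object):
    """Hypothetical stand-in for the output collection; not the Arvados API."""

    def __init__(self):
        self.files = {}

    def add(self, relpath, data):
        self.files[relpath] = data

    def manifest_text(self):
        # Illustrative listing only; a real Keep manifest has a different format.
        return "\n".join(sorted(self.files))


def checkin_sketch(target_dir):
    """Copy every file under target_dir into a FakeCollection.

    Returns (collection, caught_error). caught_error is True if any file
    could not be read; the error is logged and the walk continues.
    """
    collection = FakeCollection()
    caught_error = False
    for root, dirs, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as reader:
                    collection.add(os.path.relpath(path, target_dir), reader.read())
            except (IOError, OSError) as e:
                logger.error(e)
                caught_error = True
    return (collection, caught_error)


def task_succeeded(rcode, checkin_error):
    """True only if a subprocess ran, all exited 0, and checkin saw no errors."""
    return bool(rcode) and all(s == 0 for s in rcode.values()) and not checkin_error
```

A caller would unpack the result before asking for the manifest, for example: collection, checkin_error = checkin_sketch(outdir), followed by collection.manifest_text() and task_succeeded(rcode, checkin_error).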
-----------------------------------------------------------------------
hooks/post-receive
--