[ARVADOS] created: 427c7d7de5017e322c5bf359ea878c9e7b406337
git at public.curoverse.com
Tue Feb 9 15:53:18 EST 2016
at 427c7d7de5017e322c5bf359ea878c9e7b406337 (commit)
commit 427c7d7de5017e322c5bf359ea878c9e7b406337
Author: Peter Amstutz <peter.amstutz at curoverse.com>
Date: Tue Feb 9 15:53:13 2016 -0500
8406: Treat EXIT_TEMPFAIL as EXIT_RETRY_UNLOCKED if we have previously gotten
EXIT_RETRY_UNLOCKED (because the job is now in "Running" state.)
diff --git a/services/api/lib/crunch_dispatch.rb b/services/api/lib/crunch_dispatch.rb
index 05f85c7..06a8a4b 100644
--- a/services/api/lib/crunch_dispatch.rb
+++ b/services/api/lib/crunch_dispatch.rb
@@ -637,7 +637,7 @@ class CrunchDispatch
jobrecord = Job.find_by_uuid(job_done.uuid)
- if exit_status == EXIT_RETRY_UNLOCKED
+ if exit_status == EXIT_RETRY_UNLOCKED or (exit_tempfail and @job_retry_counts[jobrecord.uuid])
# The job failed because all of the nodes allocated to it
# failed. Only this crunch-dispatch process can retry the job:
# it's already locked, and there's no way to put it back in the
-----------------------------------------------------------------------
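The changed condition above can be read as a small predicate. The following is a minimal sketch, not the code in crunch_dispatch.rb: the method name `retry_locally?`, the numeric exit codes, and the use of a plain Hash for the retry counts are all assumptions made for illustration. It also uses `key?` rather than the truthiness of the lookup, so that it behaves the same whether or not the hash has a default value.

```ruby
# Hypothetical stand-ins for the constants used by crunch-dispatch;
# the actual values are not shown in this diff.
EXIT_TEMPFAIL = 75
EXIT_RETRY_UNLOCKED = 93

# Returns true when this dispatcher should retry the job itself.
# job_retry_counts is assumed to be a Hash keyed by job UUID that gains
# an entry the first time a job exits with EXIT_RETRY_UNLOCKED.
def retry_locally?(exit_status, job_uuid, job_retry_counts)
  exit_tempfail = (exit_status == EXIT_TEMPFAIL)
  exit_status == EXIT_RETRY_UNLOCKED ||
    (exit_tempfail && job_retry_counts.key?(job_uuid))
end
```

Under these assumptions, a first EXIT_TEMPFAIL for a job is handled as before, while an EXIT_TEMPFAIL for a job that has already been retried once is routed to the same retry path as EXIT_RETRY_UNLOCKED.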
hooks/post-receive
--