[ARVADOS] created: 427c7d7de5017e322c5bf359ea878c9e7b406337

git at public.curoverse.com
Tue Feb 9 15:53:18 EST 2016


        at  427c7d7de5017e322c5bf359ea878c9e7b406337 (commit)


commit 427c7d7de5017e322c5bf359ea878c9e7b406337
Author: Peter Amstutz <peter.amstutz at curoverse.com>
Date:   Tue Feb 9 15:53:13 2016 -0500

    8406: Treat EXIT_TEMPFAIL as EXIT_RETRY_UNLOCKED if we have previously gotten
    EXIT_RETRY_UNLOCKED (because the job is now in "Running" state.)

diff --git a/services/api/lib/crunch_dispatch.rb b/services/api/lib/crunch_dispatch.rb
index 05f85c7..06a8a4b 100644
--- a/services/api/lib/crunch_dispatch.rb
+++ b/services/api/lib/crunch_dispatch.rb
@@ -637,7 +637,7 @@ class CrunchDispatch
 
     jobrecord = Job.find_by_uuid(job_done.uuid)
 
-    if exit_status == EXIT_RETRY_UNLOCKED
+    if exit_status == EXIT_RETRY_UNLOCKED or (exit_tempfail and @job_retry_counts[jobrecord.uuid])
       # The job failed because all of the nodes allocated to it
       # failed.  Only this crunch-dispatch process can retry the job:
       # it's already locked, and there's no way to put it back in the

-----------------------------------------------------------------------
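
The changed condition can be read in isolation as a small predicate. The
following is a minimal sketch, not the dispatcher's actual code: the module
name, the helper name, the constant values, and the use of a plain Hash for
the retry counts are all assumptions made for illustration. Only the shape of
the boolean test comes from the diff above.

```ruby
# Hypothetical stand-alone sketch of the condition changed in this commit.
module DispatchSketch
  # Assumed values for illustration; the diff does not show the definitions.
  EXIT_TEMPFAIL = 75        # matches EX_TEMPFAIL in sysexits.h
  EXIT_RETRY_UNLOCKED = 93  # placeholder value

  module_function

  # Returns true when an exit status should take the "retry while still
  # locked" path: either the job exited with EXIT_RETRY_UNLOCKED, or it
  # exited with EXIT_TEMPFAIL after this dispatcher has already recorded a
  # retry for it (so the job is already in the "Running" state).
  def retry_locked?(exit_status, job_uuid, job_retry_counts)
    exit_tempfail = (exit_status == EXIT_TEMPFAIL)
    exit_status == EXIT_RETRY_UNLOCKED ||
      (exit_tempfail && job_retry_counts.key?(job_uuid))
  end
end

# Usage with made-up job identifiers:
counts = { "job-1" => 1 }
puts DispatchSketch.retry_locked?(DispatchSketch::EXIT_TEMPFAIL, "job-1", counts)
puts DispatchSketch.retry_locked?(DispatchSketch::EXIT_TEMPFAIL, "job-2", counts)
```

The sketch tests membership with key? rather than indexing the hash. In Ruby
only nil and false are falsy, so an indexed lookup on a hash with a numeric
default would always be truthy. Whether that applies to @job_retry_counts is
not visible in this diff.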


hooks/post-receive