[ARVADOS] updated: 68fd6c636f39a90ce602c7bbc658f5e002aa3627

Git user git at public.curoverse.com
Fri Nov 18 12:14:25 EST 2016


Summary of changes:
 sdk/cli/bin/crunch-job | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

  discards  260442aac2e9f1ef6ae913a8c9e69ed9caa2e04c (commit)
       via  68fd6c636f39a90ce602c7bbc658f5e002aa3627 (commit)

This update added new revisions after undoing existing revisions.  That is
to say, the old revision is not a strict subset of the new revision.  This
situation occurs when you --force push a change and generate a repository
containing something like this:

 * -- * -- B -- O -- O -- O (260442aac2e9f1ef6ae913a8c9e69ed9caa2e04c)
            \
             N -- N -- N (68fd6c636f39a90ce602c7bbc658f5e002aa3627)

When this happens we assume that you've already had alert emails for all
of the O revisions, and so we here report only the revisions in the N
branch from the common base, B.

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.


commit 68fd6c636f39a90ce602c7bbc658f5e002aa3627
Author: Tom Clegg <tom at curoverse.com>
Date:   Fri Nov 18 12:14:15 2016 -0500

    10470: Recognize more slurm error messages.
    
    Example from slurm 14.03.9:
    
    srun: error: _server_read: fd 12 got error or unexpected eof reading header
    srun: error: step_launch_notify_io_failure: aborting, io error with slurmstepd on node 0
    srun: Job step aborted: Waiting up to 2 seconds for job step to finish.

diff --git a/sdk/cli/bin/crunch-job b/sdk/cli/bin/crunch-job
index be14be9..078f894 100755
--- a/sdk/cli/bin/crunch-job
+++ b/sdk/cli/bin/crunch-job
@@ -1519,7 +1519,7 @@ sub preprocess_stderr
         $st->{node}->{fail_count}++;
       }
     }
-    elsif ($line =~ /srun: error: (Node failure on|Aborting, .*\bio error\b)/) {
+    elsif ($line =~ /srun: error: .*\b(Node failure on|Aborting, .*\bio error\b)/i) {
       $jobstep[$jobstepidx]->{tempfail} = 1;
       if (defined($job_slot_index)) {
         $slot[$job_slot_index]->{node}->{fail_count}++;

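The effect of the regex change above can be checked against the three slurm 14.03.9 lines quoted in the commit message. The following is a minimal sketch, not code from crunch-job: it transcribes both patterns into Python's `re` module, which treats `\b`, `.*`, and alternation the same way Perl does for these expressions. The names `OLD`, `NEW`, `SLURM_14_03_9_LINES`, and `is_tempfail` are hypothetical, as is the node name `compute0` in the usage note.

```python
import re

# Patterns transcribed from the diff. The Perl /i flag becomes re.IGNORECASE.
OLD = re.compile(r"srun: error: (Node failure on|Aborting, .*\bio error\b)")
NEW = re.compile(
    r"srun: error: .*\b(Node failure on|Aborting, .*\bio error\b)",
    re.IGNORECASE,
)

# The three example lines from the commit message (slurm 14.03.9).
SLURM_14_03_9_LINES = [
    "srun: error: _server_read: fd 12 got error or unexpected eof reading header",
    "srun: error: step_launch_notify_io_failure: aborting, io error with slurmstepd on node 0",
    "srun: Job step aborted: Waiting up to 2 seconds for job step to finish.",
]


def is_tempfail(line, pattern=NEW):
    """Return True if the stderr line matches the given node-failure pattern."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for line in SLURM_14_03_9_LINES:
        print(is_tempfail(line, OLD), is_tempfail(line, NEW), line)
```

Only the second line matches, and only under the new pattern: the old one required `Aborting, ` with a capital A immediately after `srun: error: `, whereas this message has a function-name prefix and a lowercase `aborting`. A message in the older style, such as `srun: error: Node failure on compute0`, matches both patterns.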
-----------------------------------------------------------------------


hooks/post-receive