[arvados] Keepproxy Reader failed checksum

George Chlipala gchlip2 at uic.edu
Wed Sep 13 12:31:56 EDT 2017


To whom it may concern -

We are experiencing a large number of "Reader failed checksum" errors when
using the keepproxy and I cannot seem to figure out why.

Here is a snippet from our log for the keepproxy.

Sep 13 11:21:12 arvados keepproxy: 2017/09/13 11:21:12 128.248.171.19,
172.17.0.231:52236 GET
/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
200 67108864 34647446
http://arvados-keep01.cri.local:25107/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
Reader failed checksum
Sep 13 11:21:29 arvados keepproxy: 2017/09/13 11:21:29 128.248.171.19,
172.17.0.231:51688 GET
/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
200 67108864 30612102
http://arvados-keep01.cri.local:25107/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
Reader failed checksum
Sep 13 11:21:44 arvados keepproxy: 2017/09/13 11:21:44 128.248.171.19,
172.17.0.231:50472 GET
/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
200 67108864 34956478
http://arvados-keep01.cri.local:25107/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
Reader failed checksum
Sep 13 11:22:00 arvados keepproxy: 2017/09/13 11:22:00 128.248.171.19,
172.17.0.231:52286 GET
/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
200 67108864 63292254
http://arvados-keep01.cri.local:25107/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
Reader failed checksum
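
To make the entries above easier to compare, here is a minimal Python sketch that pulls the numeric fields out of one keepproxy log entry. It assumes (this is my reading of the excerpt, not something confirmed by the keepproxy documentation) that the two numbers after the status code are the expected block size and the number of bytes actually delivered to the client. The names parse_entry and ENTRY are made up for illustration.

```python
import re

# Assumed field order, inferred from the excerpt above:
#   METHOD path status expected_bytes delivered_bytes upstream_url error
ENTRY = re.compile(
    r"\b(?P<method>GET|PUT)\b.*?"
    r"\s(?P<status>\d{3})\s+(?P<expected>\d+)\s+(?P<sent>\d+)\s+"
    r"(?P<upstream>https?://\S+)\s*(?P<error>.*)$"
)


def parse_entry(text):
    """Parse one (possibly line-wrapped) keepproxy log entry into a dict."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    match = ENTRY.search(flat)
    if match is None:
        return None
    entry = match.groupdict()
    for key in ("status", "expected", "sent"):
        entry[key] = int(entry[key])
    # Bytes that were expected but never delivered (assumed interpretation).
    entry["shortfall"] = entry["expected"] - entry["sent"]
    return entry


sample = """Sep 13 11:21:12 arvados keepproxy: 2017/09/13 11:21:12 128.248.171.19,
172.17.0.231:52236 GET
/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
200 67108864 34647446
http://arvados-keep01.cri.local:25107/fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
Reader failed checksum"""

entry = parse_entry(sample)
print(entry["status"], entry["expected"], entry["sent"], entry["shortfall"])
```

If that reading of the fields is right, each of the four entries reports a 64 MiB block but a smaller delivered byte count, which may be worth checking before suspecting the stored data.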

If I look at the keepstore, I see the following messages associated with
those requests from the keepproxy...  In all cases the status code is 200.

Sep 13 11:21:12 prometheus keepstore:
time="2017-09-13T11:21:12.588429359-05:00" level=info msg=response
RequestID=bfpza7ye6fmw remoteAddr="172.17.0.227:59396" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9"
respBytes=67108864 respStatus=OK respStatusCode=200 timeToStatus=1.628499
timeTotal=15.000262 timeWriteBody=13.371763
Sep 13 11:21:14 prometheus keepstore:
time="2017-09-13T11:21:14.127563080-05:00" level=debug msg=request
RequestID=bfpzafjxba0b remoteAddr="172.17.0.227:59406" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
"
Sep 13 11:21:29 prometheus keepstore:
time="2017-09-13T11:21:29.125244128-05:00" level=info msg=response
RequestID=bfpzafjxba0b remoteAddr="172.17.0.227:59406" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9"
respBytes=67108864 respStatus=OK respStatusCode=200 timeToStatus=0.179427
timeTotal=14.997671 timeWriteBody=14.818244
Sep 13 11:21:29 prometheus keepstore:
time="2017-09-13T11:21:29.838397727-05:00" level=debug msg=request
RequestID=bfpzamrr4y9i remoteAddr="172.17.0.227:59408" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
"
Sep 13 11:21:44 prometheus keepstore:
time="2017-09-13T11:21:44.838600482-05:00" level=info msg=response
RequestID=bfpzamrr4y9i remoteAddr="172.17.0.227:59408" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9"
respBytes=67108864 respStatus=OK respStatusCode=200 timeToStatus=0.179266
timeTotal=15.000209 timeWriteBody=14.820943
Sep 13 11:21:45 prometheus keepstore:
time="2017-09-13T11:21:45.863245412-05:00" level=debug msg=request
RequestID=bfpzau4rwy8i remoteAddr="172.17.0.227:59412" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9
"
Sep 13 11:22:00 prometheus keepstore:
time="2017-09-13T11:22:00.828749773-05:00" level=info msg=response
RequestID=bfpzau4rwy8i remoteAddr="172.17.0.227:59412" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9"
respBytes=67108864 respStatus=OK respStatusCode=200 timeToStatus=0.508386
timeTotal=14.965493 timeWriteBody=14.457107
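
For anyone who wants to line these keepstore responses up against the keepproxy entries, here is a rough Python sketch that splits one keepstore message into its key=value fields. The parse_fields helper is hypothetical, and the regular expression is deliberately tolerant because the quoting around reqPath got mangled when the log was pasted into this mail.

```python
import re

# key=value pairs; a value is either a double-quoted string or a run of
# non-space characters (possibly empty, as with the wrapped reqPath field).
FIELD = re.compile(r'(\w+)=("[^"]*"|\S*)')


def parse_fields(text):
    """Return the key=value fields of one line-wrapped keepstore message."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    return {key: value.strip('"') for key, value in FIELD.findall(flat)}


sample = """Sep 13 11:21:12 prometheus keepstore:
time="2017-09-13T11:21:12.588429359-05:00" level=info msg=response
RequestID=bfpza7ye6fmw remoteAddr="172.17.0.227:59396" reqBytes=0
reqForwardedFor= reqMethod=GET reqPath=
fe194b0a4576d5c8d9d0a2055b132b60+67108864+Ab7cbd46f674287a7e906d009c18227475a83edbb@59cbcfe9"
respBytes=67108864 respStatus=OK respStatusCode=200 timeToStatus=1.628499
timeTotal=15.000262 timeWriteBody=13.371763"""

fields = parse_fields(sample)
print(fields["RequestID"], fields["respBytes"], fields["timeTotal"])
```

Run over the four response messages above, this gives respBytes=67108864 every time and timeTotal values of 15.000262, 14.997671, 15.000209 and 14.965493 seconds.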

When I check the blocks on the keepstore it seems that the blocks are not
corrupted.

ROOT [prometheus:/mnt/keep/fe1] # openssl dgst -md5 -hex *
MD5(fe125c2660a88a250a8d95542a80a665)= fe125c2660a88a250a8d95542a80a665
MD5(fe1341f67e7c0a53a0a67c664b1385a0)= fe1341f67e7c0a53a0a67c664b1385a0
MD5(fe1763cc1a8713b6c768852bca9207b1)= fe1763cc1a8713b6c768852bca9207b1
MD5(fe194b0a4576d5c8d9d0a2055b132b60)= fe194b0a4576d5c8d9d0a2055b132b60
MD5(fe1b04d3a017e84bef791ecaca2fb764)= fe1b04d3a017e84bef791ecaca2fb764
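
The same check can be scripted so it is easy to repeat across directories. This is only a sketch, and it assumes what the openssl output above suggests: each block file in the directory is named after the MD5 digest of its contents. The function names md5_of and find_mismatches are illustrative.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(directory):
    """List files whose name differs from the MD5 digest of their contents."""
    mismatched = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and md5_of(path) != path.name:
            mismatched.append(path.name)
    return mismatched
```

Calling find_mismatches("/mnt/keep/fe1") on the keepstore host should return an empty list if the blocks on disk are intact, which is what the openssl output above indicates.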

We are running the following versions of keepproxy and keepstore.

keepproxy-0.1.20170906144951.22418ed-1.x86_64
keepstore-0.1.20170906144951.22418ed-1.x86_64

Any help would be greatly appreciated.

Thanks!

George Chlipala, Ph.D.
Senior Research Specialist
Research Resources Center
University of Illinois at Chicago

phone: 312-413-1700
email: gchlip2 at uic.edu