<div dir="ltr">Tom -<div><br></div><div>Thanks for the response.<br><div><br></div><div>Primarily, I am having issues with the keep docker. Docker is providing fully qualified names, e.g. <a href="http://docker.io/arvados/jobs">docker.io/arvados/jobs</a>, and when the job runs, it seems to be looking for the relative name of the image, e.g. arvados/jobs. I can run the CWL script using cwltool just fine. It seems that Arvados has a problem with the docker image name. I had problems with the &lt;hash&gt;/json vs. &lt;hash&gt;.json issue and was able to apply the fix. It just seemed like the name issue was yet another problem, and I thought that if I could eliminate the "middle man" it would simplify the system and we could actually use Arvados.</div><div><br></div><div>Also, I have noticed that the Arvados keep docker cannot handle docker image registries that have a port. For example, we have a private registry of <a href="http://172.17.1.121:5000">172.17.1.121:5000</a>. If I try to add an image from the registry to the keep docker, e.g. <a href="http://172.17.1.121:5000/mytest">172.17.1.121:5000/mytest</a>, the keep docker will see the image name as 172.17.1.121 and the tag as 5000/mytest:latest. My solution is to set up a reverse proxy for this, but it is just another Arvados-ism that requires extra effort.</div><div><br></div><div>I am also very concerned about storage. We plan to have a number of tools that we would use both with Arvados and outside of Arvados. Considering that Arvados stores the docker image as a tarball of all layers, I am concerned that if we make a large number of tools available to Arvados, there will be multiple copies of dependency layers in the keep. Also, if we needed to update a dependency layer and decided to update the tools, it seems that we would also need to update Arvados along with our private registry. 
I do understand that the keep does have some level of data deduplication using blocks, which should help some; however, if the image tarballs are not, for lack of a better word, aligned properly, the resulting blocks may split differently and would not be properly deduplicated. Even if the images are 100% deduplicated, if these are tools in our private registry we would still have two copies, one in the private registry and one in the keep.</div></div><div><br></div><div>I can appreciate the issue of having reproducible pipelines, but couldn't you just use the image hashes, now provided by docker, to ensure that the tools of the pipeline have not changed?</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr">George Chlipala, Ph.D.<div><div>Senior Research Specialist</div><div>Research Resources Center</div><div>University of Illinois at Chicago</div></div><div><br></div><div>phone: 312-413-1700</div><div>email: <a href="mailto:gchlip2@uic.edu" target="_blank">gchlip2@uic.edu</a></div></div></div></div></div></div></div></div></div></div></div>
<br><div class="gmail_quote">On Wed, Aug 24, 2016 at 3:26 PM, Tom Morris <span dir="ltr"><<a href="mailto:tfmorris@curoverse.com" target="_blank">tfmorris@curoverse.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Wed, Aug 24, 2016 at 12:06 PM, George Chlipala <<a href="mailto:gchlip2@uic.edu">gchlip2@uic.edu</a>> wrote:<br>
> Is it possible to configure arvados to create and run jobs (pipelines)<br>
> without having to store the docker images in keep, i.e. just have the docker<br>
> on the compute nodes download docker images directly from a docker<br>
> repository?<br>
<br>
</span>This isn't currently possible. One problem with allowing this is that<br>
it would negatively impact the reproducibility of the pipeline.<br>
<br>
What is your motivation for wanting to do this?<br>
<span class="HOEnZb"><font color="#888888"><br>
Tom Morris<br>
Director, Product Management<br>
Curoverse<br>
</font></span></blockquote></div><br></div>