[arvados-dev] Docker image for Crunch jobs

Tom Clegg tom at curoverse.com
Mon Apr 7 13:42:10 EDT 2014


On Mon, Apr 7, 2014 at 10:29 AM, Brett Smith <brett at curoverse.com> wrote:
> One quirk about the current recipe is that the Docker image has to be run in
> privileged mode. arv-crunch-job automatically mounts Keep, and Docker's
> default security policy will prohibit this mount. Longer-term, I wonder if
> it would make sense for the parent compute node to mount Keep and make it
> available to the job container as a volume.

One complication here is that the arv-mount process has to use the
per-job auth token: keeping arv-mount alive longer should give us
better performance in general, but we don't want to share an arv-mount
process between consecutive jobs running on the same compute node.

It's possible for arv-mount to inspect the credentials in the calling
process's environment, which could give us a safety net in case
arv-mount doesn't get set up / taken down correctly between jobs.
"Calling process's credentials don't match what I'm using -> exit."
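
A minimal sketch of that safety net, to make the idea concrete. It assumes the per-job token reaches the caller's environment as ARVADOS_API_TOKEN and that the mount process can learn the calling pid somehow (not shown here). The function names are hypothetical, not existing arv-mount code.

```python
import hmac

# Assumption: the per-job token is exported to job processes under this name.
TOKEN_VAR = "ARVADOS_API_TOKEN"


def parse_environ(raw):
    """Parse the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
    return env


def read_caller_environ(pid):
    """Read the environment a process was started with (Linux /proc only)."""
    with open("/proc/%d/environ" % pid, "rb") as f:
        return parse_environ(f.read())


def caller_matches(caller_env, mount_token):
    """True only if the caller carries the same token the mount is using."""
    caller_token = caller_env.get(TOKEN_VAR)
    if caller_token is None:
        return False
    return hmac.compare_digest(caller_token.encode("utf-8"),
                               mount_token.encode("utf-8"))


def check_caller_or_exit(pid, mount_token):
    """Hypothetical hook: exit if the caller's token differs from ours."""
    if not caller_matches(read_caller_environ(pid), mount_token):
        raise SystemExit("caller credentials do not match mount credentials")
```

Two caveats: /proc/<pid>/environ only shows the environment the process was exec'd with, and it is readable only by the same user or root, so this is a safety net rather than a security boundary.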

> If we'd like to make this usable as quickly as possible, probably the best
> way to do that would be to add Dockerfiles as a new installation path in
> crunch-job. Much like it looks for install scripts in the repository, if it
> sees a Dockerfile, it should build that image and run the job inside it. The
> code for that is relatively straightforward; the hard part of this would be
> setting up Docker on the compute nodes, and thinking through the
> configuration to make it robust.

Wouldn't it be easier (and more reproducibility-friendly) to do the
container-building before the job is submitted, and provide a
container ID or collection hash in runtime_constraints when submitting
the job?

> We could also decide to roll that work into the larger Crunch v2 effort.

If "build your docker container, store it in Keep, and use it by
providing its collection hash when calling jobs.create" is close at
hand, we should do it now rather than wait for Crunch 2 to be ready.
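
For illustration, the request body might look something like the sketch below. The "docker_image" key inside runtime_constraints is a hypothetical name (nothing has been settled), and the locator, repository, and script values are placeholders.

```python
import json


def build_job_request(script, script_version, repository, image_locator,
                      script_parameters=None):
    """Build a jobs.create body that pins the job to a prebuilt image.

    "docker_image" is a hypothetical runtime_constraints key; image_locator
    would be the collection hash of the image stored in Keep.
    """
    return {
        "script": script,
        "script_version": script_version,
        "repository": repository,
        "script_parameters": script_parameters or {},
        "runtime_constraints": {"docker_image": image_locator},
    }


if __name__ == "__main__":
    # Placeholder values only; the locator below is not a real image.
    body = build_job_request(
        script="hash",
        script_version="master",
        repository="arvados",
        image_locator="d41d8cd98f00b204e9800998ecf8427e+0",
    )
    print(json.dumps(body, indent=2))
```

Since the image is identified by content hash, re-running the same request later would use exactly the same container contents, which is the reproducibility win.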

> After thinking it over some, I do think that this is the right layer to
> introduce Docker at; or at least, this is the layer where we need to be able
> to prepare and run customized Docker images. If we try to make this image's
> responsibilities part of a larger Docker container, we're introducing space
> where users have to worry that their customizations might interfere with
> Arvados' general operations, which is something we should avoid. We can
> still containerize the compute nodes; we should just keep it separate from
> this work.

Absolutely. We want minimal restrictions/requirements for these
per-job containers. What's the cleanest way to provide the Arvados
SDKs without imposing on user-provided images? Apart from the SDKs
used in the job, there should be no need for any Arvados stuff at all
in the containers where job tasks run, right?

> I'm interested in hearing other folks' thoughts about where work should be
> focused next.

If we can run job tasks in arbitrary user-provided containers, without
doing much work on crunch-job itself that will get thrown out when we
replace it, we should do that.

A Docker-based test suite that runs some jobs from the tutorials would
also be a big deal.


