[ARVADOS] updated: 3bf22ae161deadd56d20c4165d2ec569de2dcdef
git at public.curoverse.com
Fri May 22 10:03:25 EDT 2015
Summary of changes:
doc/_config.yml | 1 +
.../install-compute-node.html.textile.liquid | 103 +++++++++++++++++++++
.../install-crunch-dispatch.html.textile.liquid | 70 ++++++++++++--
3 files changed, 168 insertions(+), 6 deletions(-)
create mode 100644 doc/install/install-compute-node.html.textile.liquid
via 3bf22ae161deadd56d20c4165d2ec569de2dcdef (commit)
from 8b658d96a7bb087406fee97ee4ba9feb830abfd4 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
commit 3bf22ae161deadd56d20c4165d2ec569de2dcdef
Author: Ward Vandewege <ward at curoverse.com>
Date: Fri May 22 10:03:01 2015 -0400
Add installation instructions for compute nodes; update the installation
instructions for crunch dispatcher.
No issue #
diff --git a/doc/_config.yml b/doc/_config.yml
index e4dc782..1e68f08 100644
--- a/doc/_config.yml
+++ b/doc/_config.yml
@@ -152,6 +152,7 @@ navbar:
- install/install-keepproxy.html.textile.liquid
- install/install-arv-git-httpd.html.textile.liquid
- install/install-crunch-dispatch.html.textile.liquid
+ - install/install-compute-node.html.textile.liquid
- install/cheat_sheet.html.textile.liquid
- Software prerequisites:
- install/install-manual-prerequisites-ruby.html.textile.liquid
diff --git a/doc/install/install-compute-node.html.textile.liquid b/doc/install/install-compute-node.html.textile.liquid
new file mode 100644
index 0000000..dd64cd0
--- /dev/null
+++ b/doc/install/install-compute-node.html.textile.liquid
@@ -0,0 +1,103 @@
+---
+layout: default
+navsection: installguide
+title: Install a compute node
+...
+
+This installation guide assumes you are on a 64-bit Debian or Ubuntu system.
+
+h2. Install dependencies
+
+First add the Arvados apt repository, and then install a number of packages.
+
+<notextile>
+<pre><code>~$ <span class="userinput">echo "deb http://apt.arvados.org/ wheezy main" | sudo tee /etc/apt/sources.list.d/apt.arvados.org.list</span>
+~$ <span class="userinput">sudo /usr/bin/apt-key adv --keyserver pool.sks-keyservers.net --recv 1078ECD7</span>
+~$ <span class="userinput">sudo /usr/bin/apt-get update</span>
+~$ <span class="userinput">sudo /usr/bin/apt-get install python-pip python-pyvcf python-gflags python-google-api-python-client python-virtualenv libattr1-dev libfuse-dev python-dev python-llfuse fuse crunchstat python-arvados-fuse iptables ca-certificates lxc apt-transport-https docker.io</span>
+</code></pre>
+</notextile>
+
+h2. Install slurm and munge
+
+<notextile>
+<pre><code>~$ <span class="userinput">sudo /usr/bin/apt-get install slurm-llnl munge</span>
+</code></pre>
+</notextile>
+
+h2. Copy configuration files from the dispatcher (api)
+
+The @/etc/slurm-llnl/slurm.conf@ and @/etc/munge/munge.key@ files need to be identical across the dispatcher and all compute nodes. Copy the files you created in the "Install the Crunch dispatcher":{{site.baseurl}} step to this compute node.
+
+h2. Crunch user account
+
+* @adduser crunch@
+
+The crunch user should have the same UID, GID, and home directory on all compute nodes and on the dispatcher (api server).
+
+h2. Configure fuse
+
+Install this file as @/etc/fuse.conf@:
+
+<notextile>
+<pre>
+# Set the maximum number of FUSE mounts allowed to non-root users.
+# The default is 1000.
+#
+#mount_max = 1000
+
+# Allow non-root users to specify the 'allow_other' or 'allow_root'
+# mount options.
+#
+user_allow_other
+</pre>
+</notextile>
+
+h2. Tell the API server about this compute node
+
+Load your API superuser token on the compute node:
+
+<notextile>
+<pre><code>
+~$ <span class="userinput">HISTIGNORE=$HISTIGNORE:'export ARVADOS_API_TOKEN=*'</span>
+~$ <span class="userinput">export ARVADOS_API_TOKEN=@your-superuser-token@</span>
+~$ <span class="userinput">export ARVADOS_API_HOST=@uuid_prefix.your.domain@</span>
+~$ <span class="userinput">unset ARVADOS_API_HOST_INSECURE</span>
+</code>
+</pre>
+</notextile>
+
+Then execute this script to create a compute node object, and set up a cron job to have the compute node ping the API server every five minutes:
+
+<notextile>
+<pre><code>
+#!/bin/bash
+if ! test -f /root/node.json ; then
+ arv node create --node "{\"hostname\": \"$(hostname)\"}" > /root/node.json
+
+ # Make sure /dev/fuse permissions are correct (the device appears after fuse is loaded)
+ chmod 1660 /dev/fuse && chgrp fuse /dev/fuse
+fi
+
+UUID=`grep \"uuid\" /root/node.json |cut -f4 -d\"`
+PING_SECRET=`grep \"ping_secret\" /root/node.json |cut -f4 -d\"`
+
+if ! test -f /etc/cron.d/node_ping ; then
+ echo "*/5 * * * * root /usr/bin/curl -k -d ping_secret=$PING_SECRET https://api/arvados/v1/nodes/$UUID/ping" > /etc/cron.d/node_ping
+fi
+
+/usr/bin/curl -k -d ping_secret=$PING_SECRET https://api/arvados/v1/nodes/$UUID/ping?ping_secret=$PING_SECRET
+</code>
+</pre>
+</notextile>
+
+And remove your token from the environment:
+
+<notextile>
+<pre><code>
+~$ <span class="userinput">unset ARVADOS_API_TOKEN</span>
+~$ <span class="userinput">unset ARVADOS_API_HOST</span>
+</code>
+</pre>
+</notextile>
+
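Reviewer's sketch, not part of this commit: the node registration script above pulls @uuid@ and @ping_secret@ out of @/root/node.json@ with grep and cut, which works only if the JSON is pretty-printed with one key per line. The helpers below isolate that parsing so it can be tried against a sample file before the cron job is installed. The function names @node_field@ and @ping_url@, and the API host argument, are hypothetical and introduced only for illustration.

```shell
#!/bin/bash
# Sketch only: helper names and the api_host argument are assumptions,
# not part of the Arvados documentation or CLI.

# Print the value of a top-level string field from a pretty-printed JSON
# file (one "key": "value" pair per line, as the grep/cut approach assumes).
# Quoting the key keeps "uuid" from matching "owner_uuid".
node_field() {
  local file="$1" key="$2"
  grep "\"$key\"" "$file" | head -n 1 | cut -f4 -d'"'
}

# Print the ping URL for the node described in the given JSON file.
ping_url() {
  local file="$1" api_host="$2"
  printf 'https://%s/arvados/v1/nodes/%s/ping\n' \
    "$api_host" "$(node_field "$file" uuid)"
}
```

With these in place, the cron line could be assembled from @ping_url /root/node.json api@ and @node_field /root/node.json ping_secret@, which keeps the variable expansions quoted and makes the host name explicit.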
diff --git a/doc/install/install-crunch-dispatch.html.textile.liquid b/doc/install/install-crunch-dispatch.html.textile.liquid
index 231d1f4..4a695ca 100644
--- a/doc/install/install-crunch-dispatch.html.textile.liquid
+++ b/doc/install/install-crunch-dispatch.html.textile.liquid
@@ -21,19 +21,77 @@ Install the Python SDK and CLI tools on controller and all compute nodes.
* See "Python SDK":{{site.baseurl}}/sdk/python/sdk-python.html page for details.
-h4. Likely crunch job dependencies
+h4. Slurm
-On compute nodes:
+On the API server, install slurm and munge, and generate a munge key:
-* @pip install --upgrade pyvcf@
+<notextile>
+<pre><code>~$ <span class="userinput">sudo /usr/bin/apt-get install slurm-llnl munge</span>
+~$ <span class="userinput">sudo /usr/sbin/create-munge-key</span>
+</code></pre>
+</notextile>
-h4. Crunch user account
+Now we need to give slurm a configuration file in @/etc/slurm-llnl/slurm.conf@. Here's an example:
+
+<notextile>
+<pre>
+ControlMachine=uuid_prefix.your.domain
+SlurmctldPort=6817
+SlurmdPort=6818
+AuthType=auth/munge
+StateSaveLocation=/tmp
+SlurmdSpoolDir=/tmp/slurmd
+SwitchType=switch/none
+MpiDefault=none
+SlurmctldPidFile=/var/run/slurmctld.pid
+SlurmdPidFile=/var/run/slurmd.pid
+ProctrackType=proctrack/pgid
+CacheGroups=0
+ReturnToService=2
+TaskPlugin=task/affinity
+#
+# TIMERS
+SlurmctldTimeout=300
+SlurmdTimeout=300
+InactiveLimit=0
+MinJobAge=300
+KillWait=30
+Waittime=0
+#
+# SCHEDULING
+SchedulerType=sched/backfill
+SchedulerPort=7321
+SelectType=select/cons_res
+SelectTypeParameters=CR_CPU_Memory
+FastSchedule=1
+#
+# LOGGING
+SlurmctldDebug=3
+#SlurmctldLogFile=
+SlurmdDebug=3
+#SlurmdLogFile=
+JobCompType=jobcomp/none
+#JobCompLoc=
+JobAcctGatherType=jobacct_gather/none
+#
+# COMPUTE NODES
+NodeName=DEFAULT
+PartitionName=DEFAULT MaxTime=INFINITE State=UP
+PartitionName=compute Default=YES Shared=yes
+
+NodeName=compute[0-255]
+
+PartitionName=compute Nodes=compute[0-255]
+</pre>
+</notextile>
+
+Please make sure to update the value of the @ControlMachine@ parameter to the hostname of your dispatcher (api server).
-On compute nodes and controller:
+h4. Crunch user account
* @adduser crunch@
-The crunch user should have the same UID, GID, and home directory on all compute nodes and on the controller.
+The crunch user should have the same UID, GID, and home directory on all compute nodes and on the dispatcher (api server).
h4. Repositories
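Reviewer's sketch, not part of this commit: the dispatcher instructions ask the reader to set @ControlMachine@ in @/etc/slurm-llnl/slurm.conf@ to the dispatcher's hostname, and the same file is later copied to every compute node. A small check like the one below could catch a copy that still carries the example value. The function names @slurm_setting@ and @check_control_machine@ are hypothetical helpers, not slurm commands.

```shell
#!/bin/bash
# Sketch only: these helpers are assumptions for illustration and do not
# call any slurm tooling; they just read a KEY=value config file.

# Print the value of an uncommented KEY=value line from a slurm.conf-style
# file. Lines starting with '#' never match because the pattern is anchored.
slurm_setting() {
  local file="$1" key="$2"
  sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Succeed only if ControlMachine matches the expected dispatcher hostname.
check_control_machine() {
  local file="$1" expected="$2"
  [ "$(slurm_setting "$file" ControlMachine)" = "$expected" ]
}
```

For example, @check_control_machine /etc/slurm-llnl/slurm.conf "$(hostname -f)"@ run on the dispatcher returns a non-zero status if the value was left at the placeholder.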
-----------------------------------------------------------------------
hooks/post-receive
--