[arvados] updated: 2.6.0-354-gc68b24086
git repository hosting <git@public.arvados.org>
Fri Jul 28 14:44:15 UTC 2023
Summary of changes:
doc/_includes/_download_installer.liquid | 6 +-
.../_multi_host_install_custom_certificates.liquid | 2 +-
doc/install/salt-multi-host.html.textile.liquid | 112 ++++++++++++++++-----
.../local.params.example.multiple_hosts | 8 +-
tools/salt-install/provision.sh | 6 +-
5 files changed, 99 insertions(+), 35 deletions(-)
via c68b2408668c3bc2092bc7bc372a04154216c52c (commit)
via 746a4b3877acad15f5c58c649a15c873127573c9 (commit)
via d121cf6f26a78225df4213155f31e6b26ee93a43 (commit)
from 89da894c27064b96186de343d421e6422ff1c7d6 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email, so we list those
revisions in full below.
commit c68b2408668c3bc2092bc7bc372a04154216c52c
Author: Lucas Di Pentima <lucas.dipentima@curii.com>
Date: Fri Jul 28 11:43:39 2023 -0300
20610: Updates docs to reflect changes on arvados.sls configuration.
Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <lucas.dipentima@curii.com>
diff --git a/doc/install/salt-multi-host.html.textile.liquid b/doc/install/salt-multi-host.html.textile.liquid
index 3868483c4..e1476a827 100644
--- a/doc/install/salt-multi-host.html.textile.liquid
+++ b/doc/install/salt-multi-host.html.textile.liquid
@@ -192,7 +192,7 @@ The certificates will be requested from Let's Encrypt when you run the installer
* @cluster_int_cidr@ will be used to set @CLUSTER_INT_CIDR@
-* You'll also need @compute_subnet_id@ and @arvados_sg_id@ to set @DriverParameters.SubnetID@ and @DriverParameters.SecurityGroupIDs@ in @local_config_dir/pillars/arvados.sls@ and when you "create a compute image":#create_a_compute_image.
+* You'll also need @compute_subnet_id@ and @arvados_sg_id@ to set @COMPUTE_SUBNET@ and @COMPUTE_SG@ in @local.params@ and when you "create a compute image":#create_a_compute_image.
You can now proceed to "edit local.params* files":#localparams.
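For illustration, the @local.params@ settings referenced in the hunk above could end up looking like the excerpt below. The CIDR, subnet ID, and security group ID are hypothetical placeholders; use the values reported by the Terraform output.
<pre><code># Hypothetical local.params excerpt (placeholder values)
CLUSTER_INT_CIDR=10.1.0.0/16
COMPUTE_SUBNET="subnet-0123abcd4567efgh8"
COMPUTE_SG="sg-0123abcd4567efgh8"</code></pre>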
@@ -372,15 +372,14 @@ Follow "the instructions to build a cloud compute node image":{{site.baseurl}}/i
h3. Configure the compute image
-Once the image has been created, open @local_config_dir/pillars/arvados.sls@ and edit as follows (AWS specific settings described here, other cloud providers will have similar settings in their respective configuration section):
+Once the image has been created, open @local.params@ and edit as follows (AWS-specific settings are described here; you will need to make custom changes for other cloud providers). An illustrative excerpt is shown after this list:
-# In the @arvados.cluster.Containers.CloudVMs@ section:
-## Set @ImageID@ to the AMI produced by Packer
-## Set @DriverParameters.Region@ to the appropriate AWS region
-## Set @DriverParameters.AdminUsername@ to the admin user account on the image
-## Set the @DriverParameters.SecurityGroupIDs@ list to the VPC security group which you set up to allow SSH connections to these nodes
-## Set @DriverParameters.SubnetID@ to the value of SubnetId of your VPC
-# Update @arvados.cluster.InstanceTypes@ as necessary. The example instance types are for AWS, other cloud providers will of course have different instance types with different names and specifications.
+# Set @COMPUTE_AMI@ to the AMI produced by Packer
+# Set @COMPUTE_AWS_REGION@ to the appropriate AWS region
+# Set @COMPUTE_USER@ to the admin user account on the image
+# Set the @COMPUTE_SG@ list to the VPC security group which you set up to allow SSH connections to these nodes
+# Set @COMPUTE_SUBNET@ to the value of SubnetId of your VPC
+# Update @arvados.cluster.InstanceTypes@ in @local_config_dir/pillars/arvados.sls@ as necessary. The example instance types are for AWS; other cloud providers will of course have different instance types with different names and specifications.
(AWS specific) If m5/c5 node types are not available, replace them with m4/c4. You'll need to double check the values for Price and IncludedScratch/AddedScratch for each type that is changed.
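A hypothetical @local.params@ excerpt matching the numbered list above (the AMI ID, region, and user name are placeholders; only @InstanceTypes@ remains in @local_config_dir/pillars/arvados.sls@):
<pre><code># Hypothetical local.params excerpt (placeholder values)
COMPUTE_AMI="ami-0123abcd4567efgh8"        # AMI produced by Packer
COMPUTE_AWS_REGION="us-east-1"
COMPUTE_USER="admin"                       # admin account baked into the image
COMPUTE_SG="sg-0123abcd4567efgh8"          # VPC security group allowing SSH
COMPUTE_SUBNET="subnet-0123abcd4567efgh8"  # SubnetId of your VPC</code></pre>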
h2(#installation). Begin installation
commit 746a4b3877acad15f5c58c649a15c873127573c9
Author: Lucas Di Pentima <lucas.dipentima@curii.com>
Date: Fri Jul 28 11:32:07 2023 -0300
20610: Adds documentation for load-balancing & rolling upgrades.
Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <lucas.dipentima@curii.com>
diff --git a/doc/_includes/_download_installer.liquid b/doc/_includes/_download_installer.liquid
index 724ed1416..461debd49 100644
--- a/doc/_includes/_download_installer.liquid
+++ b/doc/_includes/_download_installer.liquid
@@ -9,7 +9,7 @@ SPDX-License-Identifier: CC-BY-SA-3.0
This is a package-based installation method; however, the installation script is currently distributed in source form via @git@. We recommend checking out the git tree on your local workstation, not directly on the target(s) where you want to install and run Arvados.
<notextile>
-<pre><code>git clone https://github.com/arvados/arvados.git
+<pre><code class="userinput">git clone https://github.com/arvados/arvados.git
cd arvados
git checkout {{ branchname }}
cd tools/salt-install
@@ -31,7 +31,7 @@ h3. Using Terraform (AWS specific)
If you are going to use Terraform to set up the infrastructure on AWS, you first need to install the "Terraform CLI":https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli and the "AWS CLI":https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html tool. Then you can initialize the installer.
<notextile>
-<pre><code>CLUSTER=xarv1
+<pre><code class="userinput">CLUSTER=xarv1
./installer.sh initialize ~/setup-arvados-${CLUSTER} {{local_params_src}} {{config_examples_src}} {{terraform_src}}
cd ~/setup-arvados-${CLUSTER}
</code></pre>
@@ -40,7 +40,7 @@ cd ~/setup-arvados-${CLUSTER}
h3. Without Terraform
<notextile>
-<pre><code>CLUSTER=xarv1
+<pre><code class="userinput">CLUSTER=xarv1
./installer.sh initialize ~/setup-arvados-${CLUSTER} {{local_params_src}} {{config_examples_src}}
cd ~/setup-arvados-${CLUSTER}
</code></pre>
diff --git a/doc/_includes/_multi_host_install_custom_certificates.liquid b/doc/_includes/_multi_host_install_custom_certificates.liquid
index 1a51f2991..eac40218c 100644
--- a/doc/_includes/_multi_host_install_custom_certificates.liquid
+++ b/doc/_includes/_multi_host_install_custom_certificates.liquid
@@ -42,7 +42,7 @@ Make sure that all the FQDNs that you will use for the public-facing application
Note: because the installer currently looks for a different certificate file for each service, if you use a single certificate, we recommend creating a symlink for each certificate and key file to the primary certificate and key, e.g.
<notextile>
-<pre><code>ln -s xarv1.crt ${CUSTOM_CERTS_DIR}/controller.crt
+<pre><code class="userinput">ln -s xarv1.crt ${CUSTOM_CERTS_DIR}/controller.crt
ln -s xarv1.key ${CUSTOM_CERTS_DIR}/controller.key
ln -s xarv1.crt ${CUSTOM_CERTS_DIR}/keepproxy.crt
ln -s xarv1.key ${CUSTOM_CERTS_DIR}/keepproxy.key
diff --git a/doc/install/salt-multi-host.html.textile.liquid b/doc/install/salt-multi-host.html.textile.liquid
index 27e732164..3868483c4 100644
--- a/doc/install/salt-multi-host.html.textile.liquid
+++ b/doc/install/salt-multi-host.html.textile.liquid
@@ -30,6 +30,8 @@ SPDX-License-Identifier: CC-BY-SA-3.0
## "Common problems and solutions":#common-problems
# "Initial user and login":#initial_user
# "Monitoring and Metrics":#monitoring
+# "Load balancing controllers":#load_balancing
+## "Rolling upgrades procedure":#rolling-upgrades
# "After the installation":#post_install
h2(#introduction). Introduction
@@ -114,14 +116,14 @@ h4. Set credentials
You will need an AWS access key and secret key to create the infrastructure.
-<pre><code>$ export AWS_ACCESS_KEY_ID="anaccesskey"
-$ export AWS_SECRET_ACCESS_KEY="asecretkey"</code></pre>
+<pre><code class="userinput">export AWS_ACCESS_KEY_ID="anaccesskey"
+export AWS_SECRET_ACCESS_KEY="asecretkey"</code></pre>
h4. Create the infrastructure
Build the infrastructure by running @./installer.sh terraform@. The last stage will output the information needed to set up the cluster's domain and continue with the installer. For example:
-<pre><code>$ ./installer.sh terraform
+<pre><code class="userinput">./installer.sh terraform
...
Apply complete! Resources: 16 added, 0 changed, 0 destroyed.
@@ -278,7 +280,7 @@ _AWS Specific: Go to the AWS console and into the VPC service, there is a column
h3. Parameters from @local.params.secrets@:
# Set each @KEY@ / @TOKEN@ / @PASSWORD@ to a random string. You can use @installer.sh generate-tokens@
-<pre><code>$ ./installer.sh generate-tokens
+<pre><code class="userinput">./installer.sh generate-tokens
BLOB_SIGNING_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
MANAGEMENT_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SYSTEM_ROOT_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
@@ -387,9 +389,7 @@ At this point, you are ready to run the installer script in deploy mode that wil
Run this in the @~/arvados-setup-xarv1@ directory:
-<pre>
-./installer.sh deploy
-</pre>
+<pre><code class="userinput">./installer.sh deploy</code></pre>
This will install and configure Arvados on all the nodes. It will take a while and produce a lot of logging. If it runs into an error, it will stop.
@@ -403,9 +403,7 @@ If you are running the diagnostics from one of the Arvados machines inside the p
You are an "external client" if you running the diagnostics from your workstation outside of the private network.
-<pre>
-./installer.sh diagnostics (-internal-client|-external-client)
-</pre>
+<pre><code class="userinput">./installer.sh diagnostics (-internal-client|-external-client)</code></pre>
h3(#debugging). Debugging issues
@@ -429,9 +427,7 @@ You can iterate on the config and maintain the cluster by making changes to @loc
If you are debugging a configuration issue on a specific node, you can speed up the cycle a bit by deploying just one node:
-<pre>
-./installer.sh deploy keep0.xarv1.example.com
-</pre>
+<pre><code class="userinput">./installer.sh deploy keep0.xarv1.example.com</code></pre>
However, once you have a final configuration, you should run a full deploy to ensure that the configuration has been synchronized on all the nodes.
@@ -452,7 +448,7 @@ If this happens, you need to
1. correct the database information
2. run @./installer.sh deploy xarv1.example.com@ to update the configuration on the API/controller node
3. Log in to the API/controller server node, then run this command to re-run the post-install script, which will set up the database:
-<pre>dpkg-reconfigure arvados-api-server</pre>
+<pre><code class="userinput">dpkg-reconfigure arvados-api-server</code></pre>
4. Re-run @./installer.sh deploy@ again to synchronize everything, and so that the install steps that need to contact the API server are run successfully.
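Taken together, the recovery steps above amount to something like the following sketch (the cluster domain is the example one used throughout this guide; @dpkg-reconfigure@ is run as root on the API/controller node):
<pre><code class="userinput"># Steps 1-2: after correcting the database settings, redeploy the API/controller node
./installer.sh deploy xarv1.example.com
# Step 3: on the API/controller node itself, re-run the package post-install script
dpkg-reconfigure arvados-api-server
# Step 4: back on the setup workstation, synchronize the whole cluster
./installer.sh deploy</code></pre>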
h4. Missing ENA support (AWS Specific)
@@ -463,9 +459,9 @@ h2(#initial_user). Initial user and login
At this point you should be able to log into the Arvados cluster. The initial URL will be
-https://workbench.${DOMAIN}@
+@https://workbench.${DOMAIN}@
-If you did *not* "configure a different authentication provider":#authentication you will be using the "Test" provider, and the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster. It uses the values of @INITIAL_USER@ and @INITIAL_USER_PASSWORD@ the @local.params@ file.
+If you did *not* "configure a different authentication provider":#authentication you will be using the "Test" provider, and the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster. It uses the values of @INITIAL_USER@ and @INITIAL_USER_PASSWORD@ from the @local.params*@ file.
If you *did* configure a different authentication provider, the first user to log in will automatically be given Arvados admin privileges.
@@ -473,9 +469,9 @@ h2(#monitoring). Monitoring and Metrics
You can monitor the health and performance of the system using the admin dashboard:
-https://grafana.${DOMAIN}@
+@https://grafana.${DOMAIN}@
-To log in, use username "admin" and @${INITIAL_USER_PASSWORD}@ from @local.conf@.
+To log in, use username "admin" and @${INITIAL_USER_PASSWORD}@ from @local.params.secrets@.
Once logged in, you will want to add the dashboards to the front page.
@@ -486,6 +482,69 @@ Once logged in, you will want to add the dashboards to the front page.
# Visit each dashboard, at the top of the page click on the star next to the title to "Mark as favorite"
# They should now be linked on the front page.
+h2(#load_balancing). Load balancing controllers (optional)
+
+In order to handle high loads and perform rolling upgrades, the controller & api services can be scaled out to a number of hosts, and the installer makes this implementation a fairly simple task.
+
+First, you should take care of the infrastructure deployment: if you use our Terraform code, you will need to set up the @terraform.tfvars@ in @terraform/vpc/@ so that in addition to the node named @controller@ (the load-balancer), a number of @controllerN@ nodes (backends) are defined as needed, and added to the @internal_service_hosts@ list.
+
+We suggest that the backend nodes hold only the controller & api services, so they can be created or destroyed as needed without disrupting other services. Because of this, you will need to set up a custom @dns_aliases@ variable map.
+
+The following is an example @terraform/vpc/terraform.tfvars@ file that describes a cluster with a load-balancer, 2 backend nodes, a separate database node, a keepstore node, and a workbench node that will also hold other miscellaneous services:
+
+<pre><code>region_name = "us-east-1"
+cluster_name = "xarv1"
+domain_name = "xarv1.example.com"
+internal_service_hosts = [ "keep0", "database", "controller1", "controller2" ]
+private_ip = {
+ controller = "10.1.1.11"
+ workbench = "10.1.1.15"
+ database = "10.1.2.12"
+ controller1 = "10.1.2.21"
+ controller2 = "10.1.2.22"
+ keep0 = "10.1.2.13"
+}
+dns_aliases = {
+ workbench = [
+ "ws",
+ "workbench2",
+ "keep",
+ "download",
+ "prometheus",
+ "grafana",
+ "*.collections"
+ ]
+}</code></pre>
+
+Once the infrastructure is deployed, you'll need to define which node will take the @balancer@ role in @local.params@, as shown in this partial example:
+
+<pre><code>...
+NODES=(
+ [controller.${DOMAIN}]=balancer
+ [controller1.${DOMAIN}]=api,controller
+ [controller2.${DOMAIN}]=api,controller
+ [database.${DOMAIN}]=database
+ [workbench.${DOMAIN}]=monitoring,workbench,workbench2,keepproxy,keepweb,websocket,keepbalance,dispatcher
+ [keep0.${DOMAIN}]=keepstore
+)
+...</code></pre>
+
+h3(#rolling-upgrades). Rolling upgrades procedure
+
+Once you have more than one controller backend node, you can take one of them out of the backend pool to upgrade it to a newer version of Arvados (which might involve applying database migrations) by adding its name to the @DISABLED_CONTROLLER@ variable in @local.params@. For example:
+
+<pre><code>...
+DISABLED_CONTROLLER="controller1"
+...</code></pre>
+
+Then, apply the configuration change to just the load-balancer:
+
+<pre><code class="userinput">./installer.sh deploy controller.xarv1.example.com</code></pre>
+
+This will allow you to make the necessary changes to the @controller1@ node without service disruption, as it will not receive any traffic until you remove it from the @DISABLED_CONTROLLER@ variable.
+
+You can do the same for the rest of the backend controllers one at a time to complete the upgrade.
+
h2(#post_install). After the installation
As part of the operation of @installer.sh@, it automatically creates a @git@ repository with your configuration templates. You should retain this repository but *be aware that it contains sensitive information* (passwords and tokens used by the Arvados services as well as cloud credentials if you used Terraform to create the infrastructure).
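As a sketch, one pass of the rolling-upgrade procedure described above might look like this. Hostnames follow the example cluster, and the assumption that a drained backend is upgraded by re-running the deploy against it is illustrative, not prescriptive:
<pre><code class="userinput"># In local.params: DISABLED_CONTROLLER="controller1"
./installer.sh deploy controller.xarv1.example.com   # balancer stops routing traffic to controller1
./installer.sh deploy controller1.xarv1.example.com  # upgrade the drained backend (assumed step)
# In local.params: DISABLED_CONTROLLER="" (or "controller2" to continue the rotation)
./installer.sh deploy controller.xarv1.example.com   # put controller1 back into the pool</code></pre>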
commit d121cf6f26a78225df4213155f31e6b26ee93a43
Author: Lucas Di Pentima <lucas.dipentima@curii.com>
Date: Fri Jul 28 10:48:40 2023 -0300
20610: Removes the need to manually set up the ENABLE_BALANCER variable.
Having the role->nodes map allows us to simplify manual configuration.
Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <lucas.dipentima@curii.com>
diff --git a/tools/salt-install/local.params.example.multiple_hosts b/tools/salt-install/local.params.example.multiple_hosts
index b70ad747a..2b4276e29 100644
--- a/tools/salt-install/local.params.example.multiple_hosts
+++ b/tools/salt-install/local.params.example.multiple_hosts
@@ -131,6 +131,13 @@ for node in "${!NODES[@]}"; do
done
done
+# Auto-detects load-balancing mode
+if [ -z "${ROLES['balancer']:-}" ]; then
+ ENABLE_BALANCER="no"
+else
+ ENABLE_BALANCER="yes"
+fi
+
# Host SSL port where you want to point your browser to access Arvados
# Defaults to 443 for regular runs, and to 8443 when called in Vagrant.
# You can point it to another port if desired
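For context, the @ROLES@ array consulted by the auto-detection above is built by inverting the @NODES@ map defined earlier in the same file. The following is a minimal sketch of that shape; the exact loop in the shipped example file may differ:
<pre><code>DOMAIN=xarv1.example.com
declare -A NODES=(
  [controller.${DOMAIN}]=balancer
  [controller1.${DOMAIN}]=api,controller
)
declare -A ROLES
for node in "${!NODES[@]}"; do
  # Split the comma-separated role list assigned to this node
  IFS=',' read -r -a node_roles <<< "${NODES[$node]}"
  for role in "${node_roles[@]}"; do
    ROLES["${role}"]="${ROLES["${role}"]:-} ${node}"
  done
done
# Any node carrying the 'balancer' role makes ${ROLES['balancer']:-} non-empty,
# so the snippet above sets ENABLE_BALANCER="yes".</code></pre>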
@@ -164,7 +171,6 @@ KEEPSTORE0_INT_IP=10.1.2.13
SHELL_INT_IP=10.1.2.17
# Load balancing settings
-ENABLE_BALANCER="no"
DISABLED_CONTROLLER=""
# Performance tuning parameters
diff --git a/tools/salt-install/provision.sh b/tools/salt-install/provision.sh
index 09edaa05f..0937d3a29 100755
--- a/tools/salt-install/provision.sh
+++ b/tools/salt-install/provision.sh
@@ -881,9 +881,9 @@ else
grep -q "letsencrypt" ${P_DIR}/top.sls || echo " - letsencrypt" >> ${P_DIR}/top.sls
grep -q "letsencrypt_${R}_configuration" ${P_DIR}/top.sls || echo " - letsencrypt_${R}_configuration" >> ${P_DIR}/top.sls
- sed -i "s/__CERT_REQUIRES__/cmd: create-initial-cert-${ROLES["balancer"]}*/g;
- s#__CERT_PEM__#/etc/letsencrypt/live/${ROLES["balancer"]}/fullchain.pem#g;
- s#__CERT_KEY__#/etc/letsencrypt/live/${ROLES["balancer"]}/privkey.pem#g" \
+ sed -i "s/__CERT_REQUIRES__/cmd: create-initial-cert-${ROLES['balancer']}*/g;
+ s#__CERT_PEM__#/etc/letsencrypt/live/${ROLES['balancer']}/fullchain.pem#g;
+ s#__CERT_KEY__#/etc/letsencrypt/live/${ROLES['balancer']}/privkey.pem#g" \
${P_DIR}/nginx_${R}_configuration.sls
if [ "${USE_LETSENCRYPT_ROUTE53}" = "yes" ]; then
-----------------------------------------------------------------------
hooks/post-receive