[arvados] updated: 2.7.0-6021-g58942469c2
git repository hosting
git at public.arvados.org
Thu Feb 29 14:52:21 UTC 2024
Summary of changes:
.../cwl/crunchstat-summary.html.textile.liquid | 183 +++++++++++----------
doc/user/cwl/cwl-run-options.html.textile.liquid | 3 +-
doc/user/cwl/images/crunchstat-summary-html.png | Bin 132084 -> 142417 bytes
.../tutorials/wgs-tutorial.html.textile.liquid | 3 +-
4 files changed, 99 insertions(+), 90 deletions(-)
via 58942469c2e85ddf9e1b369ea16d7c0c837eec63 (commit)
from d49af567353a4597e6a478ff871bdc6d3bd50f08 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
commit 58942469c2e85ddf9e1b369ea16d7c0c837eec63
Author: Peter Amstutz <peter.amstutz at curii.com>
Date: Thu Feb 29 09:52:05 2024 -0500
19744: Update documentation
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz at curii.com>
diff --git a/doc/user/cwl/crunchstat-summary.html.textile.liquid b/doc/user/cwl/crunchstat-summary.html.textile.liquid
index 55bc702b6a..a28acd56ec 100644
--- a/doc/user/cwl/crunchstat-summary.html.textile.liquid
+++ b/doc/user/cwl/crunchstat-summary.html.textile.liquid
@@ -11,6 +11,8 @@ SPDX-License-Identifier: CC-BY-SA-3.0
{% include 'tutorial_expectations' %}
+*Note:* Starting from Arvados 2.7.2, these reports are generated automatically by @arvados-cwl-runner@ and can be found as @usage_report.html@ in a container request's log collection.
+
The @crunchstat-summary@ tool can be used to analyze workflow and container performance. It can be installed from packages (@apt install python3-crunchstat-summary@ or @yum install rh-python36-python-crunchstat-summary@), or in a Python virtualenv (@pip install crunchstat_summary@). @crunchstat-summary@ analyzes the crunchstat lines from the logs of a container or workflow and generates a report in text or html format.
h2(#syntax). Syntax
@@ -48,105 +50,110 @@ optional arguments:
</code></pre>
</notextile>
+When @crunchstat-summary@ is given a container or container request uuid for a toplevel workflow runner container, it will generate a report for the whole workflow. If the workflow is big, it can take a long time to generate the report.
+
h2(#examples). Examples
@crunchstat-summary@ prints to stdout. The html report, in particular, should be redirected to a file and then loaded in a browser.
-An example text report for a single workflow step:
+The html report can be generated as follows:
<notextile>
-<pre><code>~$ <span class="userinput">crunchstat-summary --container-request pirca-xvhdp-rs0ef250emtmbj8 --format text</span>
-category metric task_max task_max_rate job_total
-blkio:0:0 read 63067755822 53687091.20 63067755822
-blkio:0:0 write 64484253320 16376234.80 64484253320
-cpu cpus 16 - -
-cpu sys 2147.29 0.60 2147.29
-cpu user 549046.22 15.99 549046.22
-cpu user+sys 551193.51 16.00 551193.51
-fuseop:create count 1 0.10 1
-fuseop:create time 0.01 0.00 0.01
-fuseop:destroy count 0 0 0
-fuseop:destroy time 0 0 0.00
-fuseop:flush count 12 0.70 12
-fuseop:flush time 0.00 0.00 0.00
-fuseop:forget count 0 0 0
-fuseop:forget time 0 0 0.00
-fuseop:getattr count 40 2.70 40
-fuseop:getattr time 0.00 0.00 0.00
-fuseop:lookup count 36 2.90 36
-fuseop:lookup time 0.67 0.07 0.67
-fuseop:mkdir count 0 0 0
-fuseop:mkdir time 0 0 0.00
-fuseop:on_event count 0 0 0
-fuseop:on_event time 0 0 0.00
-fuseop:open count 9 0.30 9
-fuseop:open time 0.00 0.00 0.00
-fuseop:opendir count 0 0 0
-fuseop:opendir time 0 0 0.00
-fuseop:read count 481185 409.60 481185
-fuseop:read time 370.11 2.14 370.11
-fuseop:readdir count 0 0 0
-fuseop:readdir time 0 0 0.00
-fuseop:release count 7 0.30 7
-fuseop:release time 0.00 0.00 0.00
-fuseop:rename count 0 0 0
-fuseop:rename time 0 0 0.00
-fuseop:rmdir count 0 0 0
-fuseop:rmdir time 0 0 0.00
-fuseop:setattr count 0 0 0
-fuseop:setattr time 0 0 0.00
-fuseop:statfs count 0 0 0
-fuseop:statfs time 0 0 0.00
-fuseop:unlink count 0 0 0
-fuseop:unlink time 0 0 0.00
-fuseop:write count 5414406 1123.00 5414406
-fuseop:write time 475.04 0.11 475.04
-fuseops read 481185 409.60 481185
-fuseops write 5414406 1123.00 5414406
-keepcache hit 961402 819.20 961402
-keepcache miss 946 0.90 946
-keepcalls get 962348 820.00 962348
-keepcalls put 961 0.30 961
-mem cache 22748987392 - -
-mem pgmajfault 0 - 0
-mem rss 27185491968 - -
-net:docker0 rx 0 - 0
-net:docker0 tx 0 - 0
-net:docker0 tx+rx 0 - 0
-net:ens5 rx 1100398604 - 1100398604
-net:ens5 tx 1445464 - 1445464
-net:ens5 tx+rx 1101844068 - 1101844068
-net:keep0 rx 63086467386 53687091.20 63086467386
-net:keep0 tx 64482237590 20131128.60 64482237590
-net:keep0 tx+rx 127568704976 53687091.20 127568704976
-statfs available 398721179648 - 398721179648
-statfs total 400289181696 - 400289181696
-statfs used 1568198656 0 1568002048
-time elapsed 34820 - 34820
-# Number of tasks: 1
-# Max CPU time spent by a single task: 551193.51s
-# Max CPU usage in a single interval: 1599.52%
-# Overall CPU usage: 1582.98%
-# Max memory used by a single task: 27.19GB
-# Max network traffic in a single task: 127.57GB
-# Max network speed in a single interval: 53.69MB/s
-# Keep cache miss rate 0.10%
-# Keep cache utilization 99.97%
-# Temp disk utilization 0.39%
-#!! bwamem-samtools-view max RSS was 25927 MiB -- try reducing runtime_constraints to "ram":27541477785
-#!! bwamem-samtools-view max temp disk utilization was 0% of 381746 MiB -- consider reducing "tmpdirMin" and/or "outdirMin"
+<pre><code>~$ <span class="userinput">crunchstat-summary --container-request pirca-xvhdp-rs0ef250emtmbj8 --format html > report.html</span>
</code></pre>
</notextile>
-When @crunchstat-summary@ is given a container or container request uuid for a toplevel workflow runner container, it will generate a report for the whole workflow. If the workflow is big, it can take a long time to generate the report.
+When loaded in a browser:
+
+!(full-width)images/crunchstat-summary-html.png!
+
+<br>
-The equivalent html report can be generated as follows:
+Using @--format text@ will print detailed usage and summary:
<notextile>
-<pre><code>~$ <span class="userinput">crunchstat-summary --container-request pirca-xvhdp-rs0ef250emtmbj8 --format html > report.html</span>
+<pre><code>~$ <span class="userinput">crunchstat-summary --container-request pirca-xvhdp-rs0ef250emtmbj8 --format text</span>
+category metric task_max task_max_rate job_total
+blkio:0:0 read 63067755822 53687091.20 63067755822
+blkio:0:0 write 64484253320 16376234.80 64484253320
+cpu cpus 16 - -
+cpu sys 2147.29 0.60 2147.29
+cpu user 549046.22 15.99 549046.22
+cpu user+sys 551193.51 16.00 551193.51
+fuseop:create count 1 0.10 1
+fuseop:create time 0.01 0.00 0.01
+fuseop:destroy count 0 0 0
+fuseop:destroy time 0 0 0.00
+fuseop:flush count 12 0.70 12
+fuseop:flush time 0.00 0.00 0.00
+fuseop:forget count 0 0 0
+fuseop:forget time 0 0 0.00
+fuseop:getattr count 40 2.70 40
+fuseop:getattr time 0.00 0.00 0.00
+fuseop:lookup count 36 2.90 36
+fuseop:lookup time 0.67 0.07 0.67
+fuseop:mkdir count 0 0 0
+fuseop:mkdir time 0 0 0.00
+fuseop:on_event count 0 0 0
+fuseop:on_event time 0 0 0.00
+fuseop:open count 9 0.30 9
+fuseop:open time 0.00 0.00 0.00
+fuseop:opendir count 0 0 0
+fuseop:opendir time 0 0 0.00
+fuseop:read count 481185 409.60 481185
+fuseop:read time 370.11 2.14 370.11
+fuseop:readdir count 0 0 0
+fuseop:readdir time 0 0 0.00
+fuseop:release count 7 0.30 7
+fuseop:release time 0.00 0.00 0.00
+fuseop:rename count 0 0 0
+fuseop:rename time 0 0 0.00
+fuseop:rmdir count 0 0 0
+fuseop:rmdir time 0 0 0.00
+fuseop:setattr count 0 0 0
+fuseop:setattr time 0 0 0.00
+fuseop:statfs count 0 0 0
+fuseop:statfs time 0 0 0.00
+fuseop:unlink count 0 0 0
+fuseop:unlink time 0 0 0.00
+fuseop:write count 5414406 1123.00 5414406
+fuseop:write time 475.04 0.11 475.04
+fuseops read 481185 409.60 481185
+fuseops write 5414406 1123.00 5414406
+keepcache hit 961402 819.20 961402
+keepcache miss 946 0.90 946
+keepcalls get 962348 820.00 962348
+keepcalls put 961 0.30 961
+mem cache 22748987392 - -
+mem pgmajfault 0 - 0
+mem rss 27185491968 - -
+net:docker0 rx 0 - 0
+net:docker0 tx 0 - 0
+net:docker0 tx+rx 0 - 0
+net:ens5 rx 1100398604 - 1100398604
+net:ens5 tx 1445464 - 1445464
+net:ens5 tx+rx 1101844068 - 1101844068
+net:keep0 rx 63086467386 53687091.20 63086467386
+net:keep0 tx 64482237590 20131128.60 64482237590
+net:keep0 tx+rx 127568704976 53687091.20 127568704976
+statfs available 398721179648 - 398721179648
+statfs total 400289181696 - 400289181696
+statfs used 1568198656 0 1568002048
+time elapsed 34820 - 34820
+# Elapsed time: 9h 40m 20s
+# Assigned instance type: m5.4xlarge
+# Instance hourly price: $0.768
+# Max CPU usage in a single interval: 1599.52%
+# Overall CPU usage: 1582.98%
+# Requested CPU cores: 16
+# Instance VCPUs: 16
+# Max memory used: 25926.11MB
+# Requested RAM: 50000.00MB
+# Maximum RAM request for this instance type: 61736.70MB
+# Max network traffic: 127.57GB
+# Max network speed in a single interval: 53.69MB/s
+# Keep cache miss rate: 0.10%
+# Keep cache utilization: 99.97%
+# Temp disk utilization: 0.39%
</code></pre>
</notextile>
-
-When loaded in a browser:
-
-!(full-width)images/crunchstat-summary-html.png!
diff --git a/doc/user/cwl/cwl-run-options.html.textile.liquid b/doc/user/cwl/cwl-run-options.html.textile.liquid
index 703ec89139..27db90fbd3 100644
--- a/doc/user/cwl/cwl-run-options.html.textile.liquid
+++ b/doc/user/cwl/cwl-run-options.html.textile.liquid
@@ -74,7 +74,8 @@ table(table table-bordered table-condensed).
|==--skip-schemas==| Skip loading of schemas|
|==--trash-intermediate==|Immediately trash intermediate outputs on workflow success.|
|==--no-trash-intermediate==|Do not trash intermediate outputs (default).|
-
+|==--enable-usage-report==|Create usage_report.html with a summary of each step's resource usage.|
+|==--disable-usage-report==|Disable usage report.|
h3(#names). Specify workflow and output names
diff --git a/doc/user/cwl/images/crunchstat-summary-html.png b/doc/user/cwl/images/crunchstat-summary-html.png
index 488541b3a7..3832734e70 100644
Binary files a/doc/user/cwl/images/crunchstat-summary-html.png and b/doc/user/cwl/images/crunchstat-summary-html.png differ
diff --git a/doc/user/tutorials/wgs-tutorial.html.textile.liquid b/doc/user/tutorials/wgs-tutorial.html.textile.liquid
index e3e4310cbb..b64dc828bd 100644
--- a/doc/user/tutorials/wgs-tutorial.html.textile.liquid
+++ b/doc/user/tutorials/wgs-tutorial.html.textile.liquid
@@ -222,7 +222,7 @@ Once your workflow has finished, you can see how long it took the workflow to ru
If we click on the outputs of the workflow, we will see the output collection. It contains the GVCF, tabix index file, and HTML ClinVar report for each analyzed sample (e.g., set of FASTQs). You can open a report in the browser by selecting it from the listing. You can also download a file to your local machine by right-clicking a file and selecting "Download" from the context menu, or from the action menu available from the far right of each listing.
-Logs for the main process can be found back on the workflow process page. Selecting the "LOGS" button at the top navigates down to the logs. You can view the logs directly through that panel, or in the upper right-hand corner select the button with hover-over text "Go to Log collection".
+Logs for the main process can be found back on the workflow process page. Selecting the "LOGS" button at the top navigates down to the logs. You can view the logs directly through that panel, or in the upper right-hand corner select the button with hover-over text "Go to Log collection".
There are several logs available, so here is a basic summary of what some of the more commonly used logs contain. Let's first define a few terms that will help us understand what the logs are tracking.
@@ -238,6 +238,7 @@ node.json gives a high level overview about the instance such as name, price, an
* @crunch-run.txt@ and @crunchstat.txt@
** @crunch-run.txt@ has info about how the container's execution environment was set up (e.g., time spent loading the docker image) and timing/results of copying output data to Keep (if applicable)
** @crunchstat.txt@ has info about resource consumption (RAM, cpu, disk, network) by the container while it was running.
+* @usage_report.html@ can be viewed directly in the browser by clicking on it. It provides a summary and chart of the resource consumption derived from the raw data in @crunchstat.txt@. (Available starting with @arvados-cwl-runner@ 2.7.2).
* @container.json@
** Describes the container (unit of work to be done), contains CWL code, runtime constraints (RAM, vcpus) amongst other details
* @arv-mount.txt@
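
Reviewer note (not part of the commit): the new @--format text@ example ends with summary lines of the form @# Key: value@. As a minimal sketch, assuming that layout stays as shown in the diff, such lines could be collected into a dictionary to compare memory use against the request. The helper names @parse_summary@ and @megabytes@ are hypothetical and are not part of @crunchstat-summary@; the sample values are copied from the example output above.

```python
import re


def parse_summary(report_text):
    """Collect '# Key: value' summary lines from a text report into a dict.

    Assumes the summary layout shown in the documentation example; table
    rows and '#!!' recommendation lines are ignored.
    """
    summary = {}
    for line in report_text.splitlines():
        match = re.match(r"# ([^:]+):\s*(.+)$", line)
        if match:
            summary[match.group(1).strip()] = match.group(2).strip()
    return summary


def megabytes(value):
    """Convert a string such as '25926.11MB' to a float number of MB."""
    return float(value.replace("MB", ""))


# Sample lines copied from the example report in the diff.
sample = """\
# Elapsed time: 9h 40m 20s
# Max memory used: 25926.11MB
# Requested RAM: 50000.00MB
"""

summary = parse_summary(sample)
ram_fraction = megabytes(summary["Max memory used"]) / megabytes(summary["Requested RAM"])
print(f"RAM used: {ram_fraction:.1%} of request")
```

A low fraction like this one suggests the step's RAM request could be reduced, which is the kind of conclusion the report is meant to support.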
-----------------------------------------------------------------------
hooks/post-receive