Barometer post installation procedures

This document briefly describes how to validate the Barometer installation.

Automated post installation activities

The Barometer test suite in Functest is called barometercollectd and is part of the Features tier. The OPNFV deployment pipeline runs these tests automatically on the supported scenarios. The tests perform basic verification that each plugin is functional with its default configuration. Inside the Functest container, the detailed results can be found in /home/opnfv/functest/results/barometercollectd.log.

Barometer post configuration procedures

Each plugin's behavior (enabling or disabling it and configuring its capabilities) is controlled, as described in the User Guide, through the plugin's individual .conf file in the /etc/collectd/collectd.conf.d/ folder on the compute node(s). For any change to take effect, the collectd service must be stopped and then started again.
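The stop/start cycle can be wrapped in a small helper. This is a minimal sketch, assuming collectd runs as a systemd unit named collectd (the same unit queried with systemctl in the Apex steps of this document); the function name restart_collectd is illustrative and not part of Barometer.

```shell
# Sketch only: assumes collectd is a systemd unit named "collectd".
# restart_collectd is a hypothetical helper name, not part of Barometer.
restart_collectd() {
    # Stop, then start, so every plugin re-reads its .conf file.
    systemctl stop collectd || return 1
    systemctl start collectd || return 1
    # Prints the unit state and returns 0 only if the service is active.
    systemctl is-active collectd
}
```

Run restart_collectd as root on the compute node after editing a file under /etc/collectd/collectd.conf.d/.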

Platform components validation - Apex

The following steps describe how to perform simple “manual” testing of the Barometer components:

On the controller:

  1. Get a list of the available metrics:

    $ openstack metric list
    
  2. Take note of the ID of the metric of interest, and show the measures of this metric:

    $ openstack metric measures show <metric_id>
    
  3. Watch the measure list for updates to verify that metrics are being added:

    $ watch -n2 -d openstack metric measures show <metric_id>
    

More on testing and displaying metrics is shown below.
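The watch step on the controller can also be scripted when no terminal is available. The sketch below samples the newest measure twice and succeeds if it changed. It assumes that the openstack client accepts the generic "-f value" output formatter for this command and that measures are listed oldest first; the helper names and the 10-second default interval are hypothetical.

```shell
# Sketch only: latest_measure and metric_is_updating are hypothetical helpers.
# Assumes "-f value" prints one measure per line, oldest first.
latest_measure() {
    openstack metric measures show -f value "$1" | tail -n 1
}

metric_is_updating() {
    metric_id=$1
    interval=${2:-10}    # seconds between the two samples (assumed default)
    before=$(latest_measure "$metric_id")
    sleep "$interval"
    after=$(latest_measure "$metric_id")
    # Success only if the newest measure differs between the two samples.
    [ "$before" != "$after" ]
}
```

For example, `metric_is_updating <metric_id> 30` returns 0 if a new measure arrived within 30 seconds. The interval should be at least as long as the granularity of the metric's archive policy, otherwise no new point can appear between the samples.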

On the compute:

  1. Connect to any compute node and ensure that the collectd service is running. The log file collectd.log should contain no errors and should indicate that each plugin was successfully loaded. For example, from the Jump Host:

    $ opnfv-util overcloud compute0
    $ ls /etc/collectd/collectd.conf.d/
    $ systemctl status collectd
    $ vi /opt/stack/collectd.log
    

    The following plugins should be found loaded: aodh, gnocchi, hugepages, intel_rdt, mcelog, ovs_events, ovs_stats, snmp, virt

  2. On the compute node, induce an event monitored by the plugins; e.g. a corrected memory error:

    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
    $ cd mce-inject
    $ make
    $ modprobe mce-inject
    

    Modify the test/corrected script to include the following:

    CPU 0 BANK 0
    STATUS 0xcc00008000010090
    ADDR 0x0010FFFFFFF
    

    Inject the error:

    $ ./mce-inject < test/corrected
    
  3. Connect to the controller and query the monitoring services. Make sure the overcloudrc.v3 file has been copied to the controller (from the undercloud VM or from the Jump Host) in order to be able to authenticate for OpenStack services.

    $ opnfv-util overcloud controller0
    $ su
    $ source overcloudrc.v3
    $ gnocchi metric list
    $ aodh alarm list
    

    The output for the gnocchi and aodh queries should be similar to the excerpts below:

    +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
    | id                                   | archive_policy/name | name                                                                                                       | unit      | resource_id |
    +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
      [...]
    | 0550d7c1-384f-4129-83bc-03321b6ba157 | high                | overcloud-novacompute-0.jf.intel.com-hugepages-mm-2048Kb@vmpage_number.free                                | Pages     | None        |
    | 0cf9f871-0473-4059-9497-1fea96e5d83a | high                | overcloud-novacompute-0.jf.intel.com-hugepages-node0-2048Kb@vmpage_number.free                             | Pages     | None        |
    | 0d56472e-99d2-4a64-8652-81b990cd177a | high                | overcloud-novacompute-0.jf.intel.com-hugepages-node1-1048576Kb@vmpage_number.used                          | Pages     | None        |
    | 0ed71a49-6913-4e57-a475-d30ca2e8c3d2 | high                | overcloud-novacompute-0.jf.intel.com-hugepages-mm-1048576Kb@vmpage_number.used                             | Pages     | None        |
    | 11c7be53-b2c1-4c0e-bad7-3152d82c6503 | high                | overcloud-novacompute-0.jf.intel.com-mcelog-                                                               | None      | None        |
    |                                      |                     | SOCKET_0_CHANNEL_any_DIMM_any@errors.uncorrected_memory_errors_in_24h                                      |           |             |
    | 120752d4-385e-4153-aed8-458598a2a0e0 | high                | overcloud-novacompute-0.jf.intel.com-cpu-24@cpu.interrupt                                                  | jiffies   | None        |
    | 1213161e-472e-4e1b-9e56-5c6ad1647c69 | high                | overcloud-novacompute-0.jf.intel.com-cpu-6@cpu.softirq                                                     | jiffies   | None        |
      [...]
    
    +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
    | alarm_id                             | type  | name                                                             | state | severity | enabled |
    +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
    | fbd06539-45dd-42c5-a991-5c5dbf679730 | event | gauge.memory_erros(overcloud-novacompute-0.jf.intel.com-mcelog)  | ok    | moderate | True    |
    | d73251a5-1c4e-4f16-bd3d-377dd1e8cdbe | event | gauge.mcelog_status(overcloud-novacompute-0.jf.intel.com-mcelog) | ok    | moderate | True    |
      [...]
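
Scanning a long metric listing by eye is error-prone, so the check can be scripted. The sketch below reads the output of gnocchi metric list on standard input and reports plugins for which no metric name was found. It assumes that metric names embed the plugin as "-<plugin>-", as in the excerpt above (for example "...-hugepages-mm-2048Kb@..."); the helper name missing_plugin_metrics is hypothetical.

```shell
# Sketch only: missing_plugin_metrics is a hypothetical helper.
# Assumes metric names contain "-<plugin>-", as in the excerpt above.
missing_plugin_metrics() {
    listing=$(cat)    # "gnocchi metric list" output arrives on stdin
    status=0
    for plugin in "$@"; do
        # -e keeps grep from reading the leading "-" as an option.
        if ! printf '%s\n' "$listing" | grep -q -e "-${plugin}-"; then
            echo "no metrics found for plugin: $plugin"
            status=1
        fi
    done
    return "$status"
}
```

A possible invocation is `gnocchi metric list | missing_plugin_metrics hugepages mcelog cpu`. Only plugins that publish metrics are worth passing in; plugins that only raise events or write data elsewhere would be reported as missing.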
    

Barometer post installation verification for Compass4nfv

For the Fraser release, Compass4nfv integrated the barometer-collectd container of Barometer. As a result, on the compute node, collectd runs in a Docker container. On the controller node, Grafana and InfluxDB are installed and configured.

The following steps describe how to perform simple “manual” testing of the Barometer components after successfully deploying a Barometer scenario using Compass4nfv:

On the compute:

  1. Connect to any compute node and ensure that the collectd container is running.

    root@host2:~# docker ps | grep collectd
    

    You should see the container opnfv/barometer-collectd running.

  2. Testing with mce-inject is similar to the testing done for Apex.
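
The container check in step 1 can be turned into a command with a meaningful exit status, which is convenient in scripts. This is a minimal sketch; the helper name collectd_container_running is hypothetical, and the image name is taken from the text above.

```shell
# Sketch only: collectd_container_running is a hypothetical helper.
collectd_container_running() {
    # Print "<image> <name>" for each running container and keep only
    # those started from the opnfv/barometer-collectd image.
    # The exit status is grep's: 0 if at least one container matched.
    docker ps --format '{{.Image}} {{.Names}}' | grep 'opnfv/barometer-collectd'
}
```

For example, `collectd_container_running || echo "collectd container is not running"` prints a warning only when no matching container is found.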

On the controller:

  3. Connect to the controller and query the monitoring services. Make sure to log in to the lxc-utility container before using the OpenStack CLI. Please refer to this wiki for details: https://wiki.opnfv.org/display/compass4nfv/Containerized+Compass#ContainerizedCompass-HowtouseOpenStackCLI

    root@host1-utility-container-d15da033:~# source ~/openrc
    root@host1-utility-container-d15da033:~# gnocchi metric list
    root@host1-utility-container-d15da033:~# aodh alarm list

    The output for the gnocchi and aodh queries should be similar to the excerpts shown in the section above for Apex.

  4. Use a web browser to connect to Grafana at http://<serverip>:3000/, using the hostname or IP address of your Ubuntu server and port 3000. Log in with admin/admin. The collectd InfluxDB database appears under Data Sources, and metrics arrive in several dashboards, such as CPU Usage and Host Overview.
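
Before opening a browser, it can help to confirm from the command line that Grafana is answering on port 3000. The sketch below assumes that the deployed Grafana version exposes the /api/health endpoint and reports "database": "ok" when healthy; the helper name grafana_is_healthy is hypothetical.

```shell
# Sketch only: grafana_is_healthy is a hypothetical helper.
# Assumes Grafana serves /api/health and reports "database": "ok".
grafana_is_healthy() {
    server=$1    # hostname or IP address of the Grafana host
    # -s silences the progress meter; grep -q sets the exit status only.
    curl -s "http://${server}:3000/api/health" | grep -q '"database": *"ok"'
}
```

For example, `grafana_is_healthy <serverip> && echo "Grafana is up"`. This checks only that the Grafana server responds; it does not verify that the collectd data source or the dashboards are populated.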

For more details on the Barometer containers, Grafana and InfluxDB, please refer to the following documentation: https://wiki.opnfv.org/display/fastpath/Barometer+Containers#BarometerContainers-barometer-collectdcontainer and the Barometer Docker user guide (barometer-docker-userguide).