Getting Started with Callbacks

Introduction

GPFS callbacks allow automatic actions to be performed when specific events are triggered.

Callbacks are typically used for automation, such as reporting, alerts, or data migrations. For example, a callback could be set to send an email when disk space is low, or to write to a log when a node is shut down.

Basic Usage

Using GPFS’ callback functionality natively requires creating external scripts, which must parse any data passed by the triggered callback before acting on it. The ArcaPix API supports this style of callback.
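For comparison, the kind of external script a native callback needs might look something like the sketch below. It assumes the callback was registered with the parameters %eventName and %eventNode, in that order, so that GPFS passes their values as command-line arguments. The names parse_event_args and handle_event, and the log path, are hypothetical.

```python
# Names of the parameters, in the order they were registered with GPFS.
# This order is an assumption made for the sketch.
PARAM_NAMES = ["eventName", "eventNode"]


def parse_event_args(argv):
    """Map the positional arguments passed by GPFS onto parameter names."""
    values = argv[1:]  # argv[0] is the script path
    if len(values) != len(PARAM_NAMES):
        raise ValueError("expected %d arguments, got %d"
                         % (len(PARAM_NAMES), len(values)))
    return dict(zip(PARAM_NAMES, values))


def handle_event(argv, log_path):
    """Parse the arguments, then act on them (here: append to a log)."""
    event = parse_event_args(argv)
    with open(log_path, "a") as log:
        log.write("{eventName} on {eventNode}\n".format(**event))
    return event


# In a real script this would be called as, for example:
#   handle_event(sys.argv, "/var/log/gpfs/events/native.log")
```

All of this parsing is what a callable-based callback avoids.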

Alternatively, the ArcaPix API can set a callback from a Python function (‘callable’). When the GPFS callback is triggered, the ArcaPix API callback driver invokes the defined function.

E.G. to log the shutdown time of GPFS on a node:

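import datetime
from arcapix.fs.gpfs import Node

def logger():
    # Append a timestamped line each time GPFS shuts down on the node
    with open("/var/log/gpfs/events/shutdown.log", 'a') as log:
        log.write("{0}: GPFS has shutdown.\n".format(datetime.datetime.today()))

Node('mynode').onShutdown.new(logger)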

This eliminates the need for creating external scripts and parameter parsing.

Natively, GPFS only allows a callback to be tied to a node or nodegroup, on which it triggers when an event occurs.

Using the ArcaPix API, callable-based Callbacks can be set to trigger on objects. E.G. setting callback function email_alert to trigger when filesystem mmfs1 exceeds a soft quota:

>>> Filesystem('mmfs1').onSoftQuotaExceeded.new(email_alert)

Callable-callbacks are triggered on all nodes with which the object is associated. I.E. a filesystem callback will trigger in parallel on all nodes on which the filesystem is mounted.

It’s also possible to define multiple trigger events via the callbacks collection, E.G.

>>> Cluster().callbacks.new(my_func, ['shutdown', 'startup'])

See the Callback Functions documentation page for a full listing of which objects support which callback events.

Accessing and Modifying Existing Callbacks

The new method of an onEventType attribute (E.G. onShutdown.new) returns a Callback object representing the newly created callback, which can be used to look up and modify the callback’s properties.

E.G. to add the capability for the set callback to additionally trigger on startup events:

>>> cb = Cluster().onShutdown.new(logger)
>>> cb.add(event='startup')
>>> cb.events
['shutdown', 'startup']

Similarly, an object’s Callbacks collection can be used to look up all callbacks associated with that object. E.G.

>>> for cb in Cluster().callbacks:
...     print(cb.id)

A specific callback can be looked up via its associated ID:

>>> Cluster().callbacks['my_callback']

If an ID is not provided on instantiation, one will be generated. This will be based on the function’s name, followed by an underscore and a random alphanumeric string, such as logger_6381f180be69.

It is recommended to provide an ID rather than relying on an auto-generated ID. This makes identifying specific callbacks easier.
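To illustrate the shape of an auto-generated ID, the sketch below builds a string in the same form. It is not the ArcaPix API's actual implementation; the helper make_callback_id and the use of uuid are assumptions made purely for illustration.

```python
import uuid


def make_callback_id(func, suffix_length=12):
    """Build an ID of the form '<function name>_<random hex string>'."""
    suffix = uuid.uuid4().hex[:suffix_length]
    return "{0}_{1}".format(func.__name__, suffix)


def logger():
    pass


# The suffix is random, so the result differs on every call,
# which is why such IDs are hard to recognise later
print(make_callback_id(logger))
```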

Parameters and Variables

Native GPFS callbacks provide parameters, which are replaced with associated values when the callback is triggered. E.G. adding parameter %eventNode to a callback passes the name of the node on which the event was triggered to the callback script.

The ArcaPix API also supports parameters using the parms keyword. E.G. below, the parameter eventName is passed to the callback function my_func when the shutdown event is triggered.

>>> Callback('shutdown-logger', my_func, 'shutdown', parms=['eventName'])

When the callback function is a Python callable, any of its parameters that is named after a GPFS variable is picked up automatically, so it doesn’t need to be specified separately with the parms keyword. E.G.

>>> def logger(eventName):
...    return "A %s event occurred" % eventName

If both are specified, the variable will only be utilised once:

>>> Callback('shutdown-logger', logger, 'shutdown', parms=['eventName'])

Note: Because of this behaviour, avoid giving parameters the same names as GPFS variables unless they’re meant to be treated as such. Note also that variable names are case sensitive.

Inferred variables and keyword parameters can be used in tandem.

E.G. below, format_str is passed via the parms keyword, while eventName is handled as a GPFS variable (so doesn’t need to be passed via parms)

>>> def logger(format_str, eventName):
...    return format_str.format(eventName)

>>> Callback('shutdown-logger', logger, 'shutdown', parms=['A {0} event occurred'])

In such instances, the non-GPFS parms are assigned, in order, to the function’s non-GPFS parameters. Therefore the parms should match the function’s parameters, sans any GPFS variables, otherwise unexpected behaviour could occur.
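The matching rule can be pictured with the sketch below. It is an illustration of the behaviour described here, not the callback driver's actual code: the helper build_call_args is hypothetical, GPFS_VARIABLES is a small illustrative subset of variable names, and parms is assumed to contain only non-GPFS values.

```python
# A small, illustrative subset of GPFS variable names (not a complete list)
GPFS_VARIABLES = {"eventName", "eventNode", "fsName"}


def build_call_args(func, parms, gpfs_values):
    """Pair each function parameter with a value.

    Parameters named after a GPFS variable take their value from
    gpfs_values; the remaining parameters consume parms in order.
    """
    code = func.__code__
    names = code.co_varnames[:code.co_argcount]
    remaining = iter(parms)
    args = []
    for name in names:
        if name in GPFS_VARIABLES:
            args.append(gpfs_values[name])
        else:
            args.append(next(remaining))
    return args


def logger(format_str, eventName):
    return format_str.format(eventName)


args = build_call_args(logger, ["A {0} event occurred"],
                       {"eventName": "shutdown"})
print(logger(*args))  # A shutdown event occurred
```

If parms held fewer values than the function has non-GPFS parameters, the sketch would fail when it runs out of values, which is one form the "unexpected behaviour" could take.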

Note: Parameters will be passed to callbacks as strings. Therefore it is necessary for callback functions to cast variables to the required type as appropriate.
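A minimal sketch of such casting is shown below. The function check_usage and its parameters used_kb and limit_kb are hypothetical names chosen for the example, not GPFS variables.

```python
def check_usage(used_kb, limit_kb):
    """Return True when usage is at or above 90% of the limit."""
    # Values arrive as strings, so convert before doing arithmetic
    used = int(used_kb)
    limit = int(limit_kb)
    return used >= 0.9 * limit


# Simulate the string arguments a triggered callback would receive
print(check_usage("950", "1000"))  # True
print(check_usage("100", "1000"))  # False
```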

Warnings and Caveats

Variables

It is important to note that callbacks are serialised in their current state. Therefore, upon deserialisation, variables contain the same values they held when the parent Python application was running.

E.G. in the example Python program below, the variable dt maintains state across serialisation and deserialisation. Upon a GPFS filesystem being unmounted, the event is triggered, the callback is deserialised and the callback function is invoked.

#!/usr/bin/env python2.7

# file: callback.py

from datetime import datetime
from arcapix.fs.gpfs import Node

# Get the time the script was run
dt = datetime.now()

def logcallback():
    with open("/tmp/output.log", "a") as f:
        # Write the variable dt, restored from the serialised callback
        f.write(dt.strftime("INSTALLED: %A, %d. %B %Y %I:%M.%f\n"))

        # Get the time the callback fired
        tn = datetime.now()
        f.write(tn.strftime("CALLED: %A, %d. %B %Y %I:%M.%f\n"))

# Install the callback to fire later
Node('mynode').callbacks.new(logcallback, ['unmount'])

We then run the script, waiting 65 seconds between installing the callback and unmounting mmfs2.

$ date; cat /tmp/output.log; ./callback.py; sleep 65; sudo /usr/lpp/mmfs/bin/mmunmount mmfs2;
Tue Mar  8 11:57:05 GMT 2016

Tue Mar  8 11:58:17 GMT 2016: mmunmount: Unmounting filesystems ...

This installed the callback as a serialised object. The long string in parms is essentially the Python function logcallback dumped to text.

$ sudo /usr/lpp/mmfs/bin/mmlscallback logcallback_360613e4ce8d
        command         = /opt/arcapix/lib/python2.7/site-packages/arcapix/fs/gpfs/callbackdriver.py
        event           = unmount
        node            = mynode.localnet
        parms           =
Z0mefUs4K4to4Y0yPcit6qZZnnfTaaFvIkUQ3l4ZJgPbnQzs35zaukkz78kehCNzukhenn5denQeryCLsg1Z8NZFfRoZ4dpo0OPY6joyU962qHNip6FPU8vl9R/QVuoGr8d13tTV8Mren5JLzq03Wv2tqbtz+RX9JI70R+ubo8f864aaa1244f96617c122b1a796fb412f

When mmunmount is called, this triggers the callback, and the log file is written to:

$ cat /tmp/output.log
INSTALLED: Tuesday, 08. March 2016 11:57.683512
CALLED: Tuesday, 08. March 2016 11:58.321727
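The same freeze-at-install behaviour can be reproduced in isolation with the standard library. The sketch below uses pickle and base64 as stand-ins; the serialisation mechanism actually used by the callback driver is not described here, so treat this purely as an analogy.

```python
import base64
import pickle
from datetime import datetime

# State captured when the callback is "installed"
installed_at = datetime.now()

# Serialise to a text-safe string, similar in spirit to the parms value
payload = base64.b64encode(pickle.dumps({"dt": installed_at}))

# Later, when the event fires, the state is restored exactly as it was
restored = pickle.loads(base64.b64decode(payload))
print(restored["dt"] == installed_at)  # True
```

Any value that should reflect the moment the event fired, such as tn above, has to be computed inside the callback function instead.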

Output

Callback functions do not provide stdout. Data output from a callback function is written to the callback driver log file. It is advised to implement dedicated log files for callback functions in order to create granular logging of triggered callbacks.
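One way to set up a dedicated log file is with the standard logging module, as in the sketch below. The helper get_callback_logger, the function on_shutdown and the log path are hypothetical, and the message format is only a suggestion.

```python
import logging


def get_callback_logger(name, path):
    """Return a logger that appends to its own file."""
    log = logging.getLogger(name)
    log.setLevel(logging.INFO)
    if not log.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"))
        log.addHandler(handler)
    return log


def on_shutdown(eventNode):
    # Hypothetical log path; choose a location appropriate to your site
    log = get_callback_logger("shutdown_callback",
                              "/tmp/shutdown_callback.log")
    log.info("GPFS shut down on %s", eventNode)
```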

Permissions

Callbacks run as root. Therefore a poorly defined callback could be destructive.

Any files created by a callback (E.G. a log file) will be created with root privileges. It is recommended that a callback enforces correct ownership and permissions of created data - such as is described here [stackoverflow.com].
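A minimal sketch of one possible approach, using only the standard library, is given below. The helper append_with_permissions is hypothetical and the default mode is an arbitrary choice; changing ownership only succeeds when the process is allowed to do so, which generally means running as root.

```python
import os
import pwd


def append_with_permissions(path, text, mode=0o640, owner=None):
    """Append text to path, then enforce its mode and (optionally) owner."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, mode)
    with os.fdopen(fd, "a") as f:
        f.write(text)
    # The mode given to os.open is filtered by the umask, so set it explicitly
    os.chmod(path, mode)
    if owner is not None:
        user = pwd.getpwnam(owner)
        # Hand the file over to the named user and their primary group
        os.chown(path, user.pw_uid, user.pw_gid)
```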

Additionally it is recommended that callbacks drop privileges as soon as possible to a non-root user where it is possible to do so - such as is described here [antonym.org].
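As a rough sketch of what dropping privileges can look like, the hypothetical helper below switches the process to an unprivileged account. The choice of the nobody user is an assumption; a dedicated service account is likely to be more appropriate in practice.

```python
import os
import pwd


def drop_privileges(username="nobody"):
    """Switch the process to username. Returns False if not running as root."""
    if os.geteuid() != 0:
        return False  # nothing to drop
    user = pwd.getpwnam(username)
    # Order matters: groups must be changed while the process is still root
    os.setgroups([])
    os.setgid(user.pw_gid)
    os.setuid(user.pw_uid)
    # Make files created from now on private to the new user
    os.umask(0o077)
    return True
```

Once privileges are dropped they cannot be regained within the same process, so any work that needs root must happen before the call.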

Warning

Neither of the approaches referenced above is officially supported by the ArcaPix API. It is left as an exercise for the reader to implement such approaches within the remit of their own or their organisation’s requirements.

Single Instance

As mentioned previously, some events will trigger callbacks on multiple nodes at once. If your callback function sends email alerts, for example, this behaviour can be troublesome, since every node sends its own copy.

To get around this, you might have your callback create a ‘lock’. For example, you might have your callback create some folder on triggering, and remove said folder on completion. If an instance of the callback checks for the lock folder and it already exists, then it can be assumed another instance of the callback is already running, so the new instance can exit without executing.

Note

The lock needs to be created in a location where all nodes can access it - such as within the GPFS filesystem.

It may also be worth encoding the time in the folder name, so as to limit how many instances of the callback can run within a certain period of time.
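The lock-folder idea can be sketched as follows. The helpers acquire_lock and run_once are hypothetical. The sketch relies on os.mkdir failing when the folder already exists, so checking for the lock and creating it happen in a single step.

```python
import errno
import os


def acquire_lock(lock_dir):
    """Try to create lock_dir. Returns True if this instance owns the lock."""
    try:
        os.mkdir(lock_dir)  # fails if the directory already exists
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # another instance got there first
        raise


def run_once(lock_dir, action):
    """Run action only if no other instance currently holds the lock."""
    if not acquire_lock(lock_dir):
        return False
    try:
        action()
    finally:
        os.rmdir(lock_dir)  # release the lock on completion
    return True
```

To limit runs per time period instead, the folder name could include a timestamp rounded to the period (for instance the current hour) and the folder could be left in place rather than removed.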

Examples

Email alert

import smtplib
from email.mime.text import MIMEText

def email_alert(fsName):
    msg = MIMEText("Space is running low on %s" % fsName)

    msg['Subject'] = 'Low Disk Space!'
    msg['From'] = 'callbacks@mycompany.com'
    msg['To'] = 'alerts@mycompany.com'

    s = smtplib.SMTP('mailhost')
    s.sendmail('callbacks@mycompany.com',
                ['alerts@mycompany.com'], msg.as_string())
    s.quit()

Cluster().onLowDiskSpace.new('lowspace_email', email_alert)

Warning

By default, the lowDiskSpace event triggers every two minutes until the problem is resolved.

The trigger interval can be changed with the mmchconfig command.
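If changing the interval is not an option, the callback itself could limit how often it sends. The sketch below uses the modification time of a marker file for this; the helper should_send, the marker file and the one-hour default are all assumptions made for the example, and the check is not race-free across nodes.

```python
import os
import time


def should_send(marker_path, min_interval=3600, now=None):
    """Return True at most once per min_interval seconds."""
    now = time.time() if now is None else now
    try:
        last_sent = os.path.getmtime(marker_path)
    except OSError:
        last_sent = None  # marker does not exist yet
    if last_sent is not None and now - last_sent < min_interval:
        return False
    # Create the marker if needed, then record the send time on it
    with open(marker_path, "a"):
        pass
    os.utime(marker_path, (now, now))
    return True
```

A callback such as email_alert could call should_send first and return early when it yields False.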

Posting a Tweet

Posting a tweet via Python depends on the python-twitter module. To install the python-twitter module, run the following command:

$ pip install python-twitter

Posting a tweet also requires creation of a Twitter developer account and Twitter API keys. Refer to https://apps.twitter.com/ for further information.

import twitter

def tweet(eventName, eventNode):
    # set up api auth
    api = twitter.Api(consumer_key='CONSUMER_KEY',
                      consumer_secret='CONSUMER_SECRET_KEY',
                      access_token_key='ACCESS_TOKEN',
                      access_token_secret='SECRET_TOKEN')

    # parentheses allow the expression to continue onto the next line
    msg = ("A {0} event was triggered on {1}"
           .format(eventName, eventNode))

    # post status
    api.PostUpdate(msg)

Node('democluster').callbacks.new(tweet, ['shutdown', 'startup'])