Provides various useful utility functions built on top of the API



Takes a list of hosts, which can be nodes and/or nodeclasses, and expands nodeclasses into a list of individual nodes (without repetition)

Parameters:hosts (list) – list of names of nodes and/or nodeclasses
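
The expansion described above can be sketched as follows. This is only an illustration under stated assumptions: `expand_hosts` and the `NODECLASSES` mapping are hypothetical names, and nodeclass membership is assumed to be available as a plain dict rather than being queried from the cluster.

```python
# Hypothetical nodeclass membership; real data would come from the cluster.
NODECLASSES = {
    "nsdNodes": ["node1", "node2"],
    "clientNodes": ["node2", "node3"],
}


def expand_hosts(hosts, nodeclasses=NODECLASSES):
    """Expand nodeclass names into nodes, keeping first-seen order, no repeats."""
    seen = set()
    expanded = []
    for host in hosts:
        # A name that is not a known nodeclass is treated as a plain node.
        for node in nodeclasses.get(host, [host]):
            if node not in seen:
                seen.add(node)
                expanded.append(node)
    return expanded


print(expand_hosts(["nsdNodes", "node3", "clientNodes", "node4"]))
# → ['node1', 'node2', 'node3', 'node4']
```

Tracking seen names in a set keeps the de-duplication linear in the number of nodes while the list preserves the order in which hosts were given.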
arcapix.fs.gpfs.utils.change_migration_threshold(fs, pool, high, low=None, pre=None, update=False, index=0)

Set the migration threshold(s) for a given pool

  • fs (str) – name of the filesystem the pool belongs to
  • pool (str) – name of the pool whose threshold should be changed
  • high (int) – the upper limit at which migration should be triggered
  • low (int) – the lower limit at which migration should stop
  • pre (int) – pre-migration threshold
  • update (bool) – if True, check for any existing placement rules for the specified pool and update the migrate threshold of those. If False, or no relevant placement rules exist, a new one will be created.
  • index (int) – position in the placement policy at which the new migration rule should be inserted. Default = 0, the top of the policy.

Get the filesystem that a path belongs to

Parameters:path (str) – path to a file in a GPFS filesystem
Returns:matching filesystem, or None if one can’t be found
Return type:Filesystem
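
One plausible way to map a path to a filesystem is to pick the deepest mount point that contains it. The sketch below assumes a hypothetical `MOUNTS` table and returns a filesystem name rather than a Filesystem object; `find_filesystem_name` is an illustrative helper, not part of the library.

```python
import posixpath

# Hypothetical mount table: filesystem name -> mount point.
MOUNTS = {"mmfs1": "/mmfs1", "mmfs2": "/mmfs1/nested"}


def find_filesystem_name(path, mounts=MOUNTS):
    """Return the name of the filesystem whose mount point contains path, or None."""
    path = posixpath.normpath(path)
    best_name, best_mount = None, ""
    for name, mount in mounts.items():
        mount = posixpath.normpath(mount)
        # Match whole path components, so /mmfs10 does not match /mmfs1.
        inside = path == mount or path.startswith(mount.rstrip("/") + "/")
        # Prefer the deepest (longest) matching mount point.
        if inside and len(mount) > len(best_mount):
            best_name, best_mount = name, mount
    return best_name


print(find_filesystem_name("/mmfs1/data/file"))    # → mmfs1
print(find_filesystem_name("/mmfs1/nested/file"))  # → mmfs2
print(find_filesystem_name("/tmp/file"))           # → None
```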

Get the filesystem corresponding to a target.

Like get_filesystem_by_path, but also supports a filesystem name or /dev/<fsname>

Parameters:target (str) – a filesystem name or path
Return type:Filesystem object
Raises:ValueError if a matching filesystem can’t be found

Returns the default placement pool for the filesystem expanding any macros used

Parameters:filesystem (str) – Filesystem to find the default placement rule for
Returns:pool name
arcapix.fs.gpfs.utils.get_fileset_placement_pool(filesystem, fileset=None)

Polls the filesystem PlacementPolicy to try to figure out which pool files in ‘fileset’ are assigned to.

Fileset placement can be specified either as “FOR FILESET …” or “WHERE FILESET_NAME …”. If no specific placement rule exists, the default placement rule is used.

Any pool macros are resolved.

  • filesystem (str) – filesystem the fileset belongs to
  • fileset (str) – fileset to find placement pool for

If fileset is None, return default placement pool

Returns:pool name
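
The lookup order described above (fileset-specific rule first, default rule otherwise, macros resolved) can be illustrated with a simplified parser. This is a rough sketch, assuming one rule per line and simple m4-style `define(name, 'value')` macros; `placement_pool` and the sample `POLICY` text are hypothetical and do not reflect how the library itself inspects the policy.

```python
import re

# Hypothetical placement policy text, one rule per line.
POLICY = """\
define(fast_pool, 'ssd')
RULE 'projects' SET POOL fast_pool FOR FILESET ('projects', 'scratch')
RULE 'home' SET POOL 'sata' WHERE FILESET_NAME = 'home'
RULE 'default' SET POOL 'system'
"""

MACRO = re.compile(r"define\(\s*(\w+)\s*,\s*'([^']*)'\s*\)")
RULE = re.compile(r"RULE\s+'[^']*'\s+SET\s+POOL\s+('?)(\w+)\1(.*)", re.IGNORECASE)


def placement_pool(policy, fileset=None):
    """Return the pool for fileset, falling back to the default placement rule."""
    macros = dict(MACRO.findall(policy))
    default = None
    for line in policy.splitlines():
        match = RULE.match(line.strip())
        if not match:
            continue
        quoted, pool, condition = match.groups()
        if not quoted:
            # An unquoted pool name is treated as a macro and resolved.
            pool = macros.get(pool, pool)
        condition = condition.strip()
        if not condition:
            # A rule without a condition is the default placement rule.
            if default is None:
                default = pool
            continue
        names = re.findall(r"'([^']*)'", condition)
        is_fileset_rule = condition.upper().startswith(("FOR FILESET", "WHERE FILESET_NAME"))
        if fileset is not None and is_fileset_rule and fileset in names:
            return pool
    return default


print(placement_pool(POLICY, "scratch"))  # → ssd
print(placement_pool(POLICY, "home"))     # → sata
print(placement_pool(POLICY))             # → system
```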
class arcapix.fs.gpfs.utils.snapshot_rotation(fs, fmt)

Context manager to perform snapshot rotation.

>>> with snapshot_rotation('mmfs1', 'apsync-%Y%d%m%H%M%S') as sr:
...     apsync('mmfs1', sr.oldsnap.name, sr.newsnap.name)

Finds the most recent existing snapshot matching a given format. On context entry, creates a new snapshot according to the same format.

Snapshot objects for these snapshots can be accessed from the rotation object as attributes oldsnap and newsnap respectively.

On context exit, if no errors were raised, the older snapshot is deleted. Else, if there were errors, the newer snapshot is deleted.

The name format should be a valid ‘strptime’ format string. To use an alternate naming scheme, create a subclass which overrides the generate_name and match_name methods.

  • fs – filesystem name or object
  • fmt – a strptime compatible format string
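
The entry and exit behaviour described above follows the usual context-manager pattern. The sketch below assumes snapshots are tracked as names in an in-memory set; `RotationSketch` is a hypothetical stand-in and, unlike the real class, holds plain names rather than Snapshot objects.

```python
class RotationSketch(object):
    """Illustrative rotation over an in-memory set of snapshot names."""

    def __init__(self, snapshots, oldname, newname):
        self.snapshots = snapshots
        self.oldsnap = oldname
        self.newsnap = newname

    def __enter__(self):
        # On entry, the new snapshot is created.
        self.snapshots.add(self.newsnap)
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Success: the older snapshot is no longer needed.
            self.snapshots.discard(self.oldsnap)
        else:
            # Failure: roll back by removing the newer snapshot.
            self.snapshots.discard(self.newsnap)
        return False  # never swallow the exception


snaps = {"snap-old"}
with RotationSketch(snaps, "snap-old", "snap-new"):
    pass  # work that compares or syncs the two snapshots goes here
print(sorted(snaps))  # → ['snap-new']
```

Returning False from `__exit__` lets any error propagate to the caller after the rollback has happened.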

Generate snapshot name based on fmt.


Check if a name matches fmt.
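
Name generation and matching against a strptime-compatible format can be sketched with the standard `datetime` module. The functions below are free-standing illustrations that take `fmt` explicitly (the real methods belong to the rotation class), and `latest_matching` is a hypothetical helper showing why names are compared by parsed timestamp rather than as strings.

```python
from datetime import datetime

FMT = "apsync-%Y%d%m%H%M%S"  # same format as the usage example above


def generate_name(fmt, now=None):
    """Build a snapshot name from the format and a timestamp."""
    return (now or datetime.now()).strftime(fmt)


def match_name(fmt, name):
    """Return True if name can be parsed with the format."""
    try:
        datetime.strptime(name, fmt)
    except ValueError:
        return False
    return True


def latest_matching(fmt, names):
    """Return the matching name with the most recent timestamp, or None."""
    matching = [n for n in names if match_name(fmt, n)]
    if not matching:
        return None
    return max(matching, key=lambda n: datetime.strptime(n, fmt))


print(generate_name(FMT, datetime(2020, 1, 2, 3, 4, 5)))
# → apsync-20200201030405
print(match_name(FMT, "daily-1"))
# → False
```

Because this format places the day before the month, a plain string sort would not order names chronologically, so the sketch sorts on the parsed datetime instead.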


Used to parse the options passed to --policy-options.


SnapDiff – Find the differences between two snapshots

arcapix.fs.gpfs.utils.snapdiff.merge_diff(old, new)

Finds the difference between two lists of files. Takes iterables of tuples of the form ((inode, gen), snapid, path, size, type). Assumes the iterables are ordered by inode number.

Based on the merge-phase of a merge sort

O = [1, 2]; N = [2, 3]

O  N
1      - 1 deleted from N
2  2   - 2 present in both, check snapid and path for modification
   3   - 3 created in N

Acts as a generator, returning tuples of the form (diff_type, files, size, type)
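
The merge-phase approach described above can be sketched as a generator that walks both sorted inputs in step. This is an illustration only: `merge_diff_sketch` is a hypothetical name, and the diff type labels ("deleted", "modified", "created") and the contents of the files tuple are assumptions, not the values the library yields.

```python
def merge_diff_sketch(old, new):
    """Yield (diff_type, files, size, type) for two inputs sorted by (inode, gen)."""
    old, new = iter(old), iter(new)
    o = next(old, None)
    n = next(new, None)
    while o is not None or n is not None:
        if n is None or (o is not None and o[0] < n[0]):
            # Key only in the old list: the file was deleted.
            yield ("deleted", (o[2],), o[3], o[4])
            o = next(old, None)
        elif o is None or n[0] < o[0]:
            # Key only in the new list: the file was created.
            yield ("created", (n[2],), n[3], n[4])
            n = next(new, None)
        else:
            # Key in both lists: compare snapid and path.
            if o[1] != n[1] or o[2] != n[2]:
                yield ("modified", (o[2], n[2]), n[3], n[4])
            o = next(old, None)
            n = next(new, None)


old_files = [((1, 0), 10, "/a", 5, "F"), ((2, 0), 10, "/b", 7, "F")]
new_files = [((2, 0), 11, "/b", 9, "F"), ((3, 0), 11, "/c", 1, "F")]
for diff in merge_diff_sketch(old_files, new_files):
    print(diff)
# → ('deleted', ('/a',), 5, 'F')
# → ('modified', ('/b', '/b'), 9, 'F')
# → ('created', ('/c',), 1, 'F')
```

Since each input is consumed exactly once and never held in memory as a whole, this style of comparison scales to very large file lists.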


Parse line from work file.

Returns:tuple that can be passed to merge_diff
arcapix.fs.gpfs.utils.snapdiff.check_snapshot_order(filesystem, snap1, snap2, fsetName=None)

Check that two snapshots are in the right order

  • filesystem – Filesystem name or object that the snapshots belong to
  • snap1 (str) – name of the snapshot that should be older
  • snap2 (str) – name of the snapshot that should be newer
  • fsetName (str) – name of the fileset the snapshots belong to (if relevant)

Returns:True if snap1 is older than snap2, else False

arcapix.fs.gpfs.utils.snapdiff.check_snapshot_compatibility(fs, snap1, snap2, fset=None)

Checks that both snapshots belong to the same fileset (or are both global snapshots)

Returns:True if the snapshots can be snapdiffed
Return type:bool

Prints diff tuples, from snapdiff, with colours

>>> for f in snapDiff(...):
...     print_diff(f)
+ /path/to/new/file
Parameters:fdiff (FileDiff) – object to print
arcapix.fs.gpfs.utils.snapdiff.get_list_path(fs, snapshot, fsetName=None, exclude=None, storageDir=None)

Get the path for list file of files in a given snapshot.

Path is based on the snapshot scan arguments used to generate the lists


<storageDir> defaults to <fsmount>/.policytmp/snapdiff

arcapix.fs.gpfs.utils.snapdiff.snapDiff(fsName, old, new=None, fsetName=None, force=False, exclude=None, storageDir=None, **kwargs)

Finds the differences between the files in two snapshots.

  • fsName (str) – Name of the filesystem to scan snapshots of
  • old (str) – Name of the older snapshot to scan
  • new (str) – Name of the newer snapshot to scan. If None, all files in ‘old’ will be returned
  • fsetName (str) – Name of a fileset, for fileset snap diff
  • exclude (list) – list of exclude patterns
  • storageDir (str) – directory to store list files in (default=<fsmount>/snapdiff/)
  • **kwargs – additional options to pass through to policy.run. This can include nodes, threadLevel, etc.

Returns:generator of FileDiff



Get the filesystem a path belongs to

>>> from arcapix.fs.gpfs.utils import get_filesystem_by_path
>>> fs = get_filesystem_by_path('/mmfs1/data')
>>> print(fs.name)

Find differences between two snapshots

>>> from arcapix.fs.gpfs.utils.snapdiff import snapDiff, print_diff
>>> for diff in snapDiff('mmfs1', 'oldsnap', 'newsnap'):
...     print_diff(diff)
+ /path/to/new/file
- /path/to/deleted/file