Scanning

Use the ngscan tool to list (scan) remote objects in storage endpoints.

ngscan

Synopsis

ngscan ( -E RESTRICTION_ALIASES[:RESTRICTION_PATHS] |
       --endpoint-exclude=EXCLUSION_ALIASES[:EXCLUSION_PATHS] )+

ngscan [-r] NAME1 ... NAMEn

ngscan [-r] [--filelist-format=LF|NUL|quoted] -f FILELIST

Description

Lists remote objects, folders, and symbolic links in storage endpoints.

Options

--all-obj-instances
                list all object instances.
                Default: list only latest object instances.
--base-path-type=retrieve|store
                type of base path used at storage endpoints when listing
                remote objects:
                "retrieve" - use the retrieve base path;
                   "store" - use the store base path.
                Default: "retrieve".
--config-file=FILE
                path to a master configuration file.
                Default: /opt/arcapix/etc/ngenea.conf
-E, --endpoint=ALIASES[:PATHS]
                restrict the set of storage endpoints for listing remote
                objects to endpoints with aliases specified by extended glob
                pattern ALIASES.
                Optionally, restrict listed remote object pathnames at those
                endpoints to pathnames matching extended glob pattern PATHS.
                By default, restrict listed remote object pathnames to the
                root path.
                Compatible with the option: --no-recursion-remote
--endpoint-exclude=ALIASES[:PATHS]
                exclude remote object pathnames specified by extended glob
                pattern PATHS at storage endpoints with aliases specified by
                extended glob pattern ALIASES from listing.
                By default, exclude remote object pathnames at the root path.
                Compatible with the option: --no-recursion-remote
--ent-type=STRING
                list remote entities with specified types:
                `f' - regular files;
                `d' - directories;
                `l' - symbolic links.
                Separate these letters by `,' to specify multiple types.
                Default: "f,d,l".
-f FILELIST     process files and directories from a filelist file.
--filelist-format=LF|NUL|quoted
                format of a filelist file:
                "LF"     - filenames delimited by newlines; a filename cannot
                           contain newline characters;
                "NUL"    - filenames delimited by the NUL (0) byte;
                "quoted" - filenames possibly enclosed in single or double
                           quotes and delimited by newlines.
                Default: "LF".
                Compatible with the option: -f FILELIST
--format=STRING
                print each output line, which describes one remote object,
                folder, or symbolic link, according to the specified format
                string. The string can contain the following format
                specifiers:
                %{FIELD_NAME}       - print a specified field aligned to the
                                      right using default width;
                %-{FIELD_NAME}      - print a specified field aligned to the
                                      left using default width;
                %WIDTH{FIELD_NAME}  - print a specified field aligned to the
                                      right in a column of specified width;
                %-WIDTH{FIELD_NAME} - print a specified field aligned to the
                                      left in a column of specified width.
                File and object name fields:
                "fln"                        - normalized local file name;
                "fln_raw"          or "fr"   - local file name in one-to-one
                                               correspondence with an object;
                "name"                       - decoded object name without a
                                               base path prefix;
                "name_raw"         or "nr"   - raw object name without a base
                                               path prefix;
                "name_full"        or "nf"   - full decoded object name;
                "name_full_raw"    or "nfr"  - full raw object name;
                "name_no_uuid"     or "nnu"  - decoded object name without
                                               a base path prefix and
                                               UUID suffix;
                "name_no_uuid_raw" or "nnur" - raw object name without
                                               a base path prefix and
                                               UUID suffix.
                Standard file information fields:
                "type"                       - remote entity type
                                               (`f' - object, `d' - folder,
                                                `l' - symbolic link);
                "mode"                       - octal file mode;
                "owner"                      - owner (user) name;
                "group"                      - group name;
                "size"                       - size in bytes.
                Time fields:
                "atime"                      - last access time;
                "migtime"                    - migration time;
                "mtime"                      - last modification time;
                "ctime"                      - last status change time.
                Other fields:
                "hash_sha512"                - object content SHA-512 hash;
                "storage_alias"    or "sa"   - alias of a storage endpoint;
                "symlink_value"    or "sv"   - symbolic link value;
                "uuid"                       - object UUID.
                Metadata elements:
                "metadata.all.KEY"           - native or shadow metadata
                                               element with a specified key;
                "metadata.native.KEY"        - native metadata element with a
                                               specified key;
                "metadata.shadow.KEY"        - shadow metadata element with a
                                               specified key.
                Default: "%-{name} %{size}".
--help          display this help and exit.
--ignore-rmtlc  never read remote location xattrs to determine the names of
                remote objects for local files specified on the command line.
                Default: read remote location xattrs when the program runs
                         with superuser privileges.
--json[=pretty] output log messages and information about remote objects,
                folders, and symlinks in JSON format.
                If the option argument "pretty" is present, produce indented
                multiline output for JSON objects; otherwise, produce
                single-line output for JSON objects.
                Default: output log messages in ordinary text form and output
                         information about remote objects, folders, and
                         symlinks in text table form.
--list-shadow   additionally list shadow metadata objects that have names
                beginning with `.' and ending with `.xattr'.
--no-header     do not print column headers.
--no-recursion-remote
                disable recursive interpretation of restriction and exclusion
                extended glob patterns for remote object pathnames.
                The recursive interpretation means matching sub-directories at
                all nesting levels, whereas non-recursive interpretation means
                matching a single directory.
                Compatible with the options: -E, --endpoint; --endpoint-exclude
-o FILE         output a remote object list to a specified file.
                Default: output to stdout.
-P, --param-endpoint=ALIASES:PARAMETER=VALUE
                add a parameter with name PARAMETER and value VALUE to
                parameters read from configuration files for storage endpoints
                with aliases specified by extended glob pattern ALIASES.
                If PARAMETER already exists in a configuration file, it takes
                a new VALUE.
--print-metadata
                output all available native and shadow metadata of listed
                remote objects, folders, and symlinks.
                Implies the options: --print-native-metadata and
                                     --print-shadow-metadata
--print-native-metadata
                output all available metadata of listed remote objects,
                folders, and symlinks except for metadata stored as
                object content.
--print-shadow-metadata
                output all available metadata of listed remote objects and
                folders stored in shadow metadata objects.
--print-vendor-metadata
                output vendor (storage-specific) metadata of listed remote
                objects, folders, and symlinks.
-q, --quiet     suppress normal output (to stdout).
                Return exit status 0 if normal output would contain at least
                one line describing a remote entity on condition that no
                warnings were printed.
                Return exit status 1 if normal output would be empty.
-r, --recursion-local
                if program arguments specify directory names, process files in
                those directories and their sub-directories recursively.
                Default: process specified directories but not their content.
--skip-check-uuid
                disable verifying that UUID-like suffixes in remote object
                names are equal to UUID metadata of those remote objects.
                If this verification is disabled, guessed file names
                corresponding to remote object names may be incorrect (to
                obtain a file name corresponding to a remote object name, its
                UUID suffix has to be removed).
                When determining remote object names for local file names
                specified on the command line by reading their remote location
                xattrs, disable verifying that the UUID xattr of a local file
                and the UUID of a remote object fetched from its metadata
                are equal.
--sort[=FIELD1,...,FIELDn]
                sort an output remote object list by specified fields.
                FIELDi is a field name (see the description of the
                `--format=STRING' option) optionally followed by `-' for
                sorting in reverse order.
                Default: "storage_alias,name_full".
--time-style=rfc3339
                print time fields in RFC 3339 format with nanoseconds.
                Example: "2020-01-15 14:56:57.234567890+03:00".
-u, --unique    remove duplicate lines from an output remote object list.
-v, --verbose=LEVEL
                verbosity level:
                0 = remote object list entries and error and warning messages
                    (also used when this option is absent);
                2 = debug messages;
                3 = enable core dump;
                    print PID and current time with microsecond precision.
-V, --version   display version information and exit.
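
The -f and --filelist-format options above can be fed from standard tools. The following is a minimal sketch, not part of ngscan itself: the function names build_filelist and scan_filelist are hypothetical, and it assumes a find implementation that supports -print0 (GNU and BSD find do). A NUL-delimited list avoids trouble with file names that contain newlines or quotes.

```shell
# Hypothetical helper: write the name of every regular file under
# directory $1 to filelist $2, each name terminated by a NUL byte.
build_filelist() {
    find "$1" -type f -print0 > "$2"
}

# Hypothetical helper: run ngscan on the local files named in the
# NUL-delimited filelist $1.
scan_filelist() {
    ngscan --filelist-format=NUL -f "$1"
}
```

Keep the filelist outside the directory being walked so that it does not end up listing itself.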

Examples

List all instances of remote objects in all storage endpoints described in the default master configuration file:

ngscan --all-obj-instances -E '*'

List only the latest instances of remote objects in all storage endpoints described in the default master configuration file:

ngscan -E '*'

Use the master configuration file "templates/master-fs.conf" instead of the default:

ngscan --config-file=templates/master-fs.conf -E '*'

List remote objects in storage endpoints with the alias "fs" or "awss3":

ngscan -E 'fs|awss3'

List all remote objects in storage endpoints with aliases containing the substring "s3" but not with the alias "remote_blackpearl_ds3":

ngscan -E '*s3*' --endpoint-exclude=remote_blackpearl_ds3

List remote objects recursively (i.e. including all sub-folders) in the folder "path/to/dir" in a storage endpoint with the alias "fs":

ngscan -E fs:path/to/dir

List remote objects recursively in the folders "path/to/dir" and "some/other/path" in a storage endpoint with the alias "fs":

ngscan -E 'fs:path/to/dir|some/other/path'

List remote objects recursively in the folder "dir" in all storage endpoints but exclude remote objects (recursively) in the folder "dir/subdir" in a storage endpoint with the alias "remote_blackpearl_ds3":

ngscan -E '*:dir' --endpoint-exclude=remote_blackpearl_ds3:dir/subdir

List remote objects non-recursively (i.e. not including remote objects in sub-folders) in the folder "path/to/dir" in storage endpoints with aliases containing the substring "s3":

ngscan -E '*s3*:path/to/dir/' --no-recursion-remote
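
The same endpoint and path restriction can serve as an existence test in scripts by combining it with -q, which suppresses the listing and reports the result through the exit status. The sketch below is only an illustration: remote_path_exists is a hypothetical name, and the options section documents only exit status 0 (at least one entity, no warnings) and 1 (empty output), so any other status should be treated as "unknown". Both arguments are interpreted by ngscan as extended glob patterns.

```shell
# Hypothetical helper: succeed if ngscan -q finds at least one remote
# entity for path pattern $2 at endpoints matching alias pattern $1.
remote_path_exists() {
    ngscan -q -E "$1:$2"
}
```

A caller might write `remote_path_exists fs path/to/dir && echo found`.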

Selecting Information Fields to Print

Print the fields: full remote object name ("nf") aligned to the left, file mode ("mode"), size ("size"), and last modification time ("mtime"):

ngscan --format='%-{nf} %{mode} %{size} %{mtime}' -E '*'

Print the fields: remote object name without a base path ("name") aligned to the left in a column of width 80 and remote object size ("size") in a column of width 16:

ngscan --format='%-80{name} %16{size}' -E '*'

Print remote object names without column header lines:

ngscan --format='%-{name}' --no-header -E '*'
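
A single-field format without a header is convenient input for other tools. As a hedged sketch (total_size is a hypothetical helper, and awk is assumed to be available), the sizes of all listed regular files can be summed like this; awk accumulates in floating point, so totals beyond 2^53 bytes would lose precision.

```shell
# Hypothetical helper: print the total size in bytes of the listed
# regular files. All arguments are passed through to ngscan,
# for example: total_size -E '*'
total_size() {
    ngscan --ent-type=f --format='%{size}' --no-header "$@" |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```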

Sorting

Sort by remote object size in ascending order:

ngscan --sort=size -E '*'

Sort by remote object size in descending order:

ngscan --sort=size- -E '*'

Sort by file owner in ascending order, then by size in descending order, then by name in ascending order:

ngscan --sort=owner,size-,name --format='%-{owner} %-{name} %{size}' -E '*'
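
Descending sort combines naturally with head to report only the largest entries. This is a sketch under assumptions: largest_objects is a hypothetical helper, and the chosen format is just one reasonable layout.

```shell
# Hypothetical helper: print the $1 largest listed entities, largest first.
# Remaining arguments are passed through to ngscan,
# for example: largest_objects 10 -E '*'
largest_objects() {
    count=$1
    shift
    ngscan --sort=size- --format='%{size} %-{name}' --no-header "$@" |
        head -n "$count"
}
```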

Removing Duplicate Lines

Print all distinct file owners:

ngscan -u --format='%-{owner}' -E '*'

Print all distinct file owner / file mode pairs:

ngscan -u --format='%-{owner} %{mode}' -E '*'
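
The -u option reports distinct values but not how often each occurs. To count entries per owner, one possible approach is to omit -u and aggregate with awk instead. The helper name objects_per_owner is hypothetical, and the sketch assumes owner names contain no whitespace.

```shell
# Hypothetical helper: print "COUNT OWNER" for each distinct owner,
# ordered by owner name. Arguments are passed through to ngscan,
# for example: objects_per_owner -E '*'
objects_per_owner() {
    ngscan --format='%-{owner}' --no-header "$@" |
        awk '{ count[$1]++ } END { for (o in count) print count[o], o }' |
        sort -k2,2
}
```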

See Also