.. _space-walkthru:

#############################
Walkthrough: Creating a Space
#############################

Introduction
============

On this page, we're going to walk through creating a templated space using the PixStor Management REST API.

You can see a more thorough overview of the REST API, as well as examples of using it via cURL, on the :doc:`rest` page.

For this guide, we're going to work in Python, using the ``requests`` library.

We're going to assume the REST server is running on localhost, behind an NGINX proxy, which provides SSL termination.

For convenience, let's store the server url in a variable

.. code-block:: python

    url = "https://localhost"

Objective
---------

We want to create a Space for a new project we're working on, called 'Sleepy Snake'.

Here, we're assuming that the underlying filesystem is PixStor. So in PixStor terms, what we want to do is

- create an independent fileset
- on the 'mmfs1' filesystem
- we want the data to be placed in the 'sas1' pool
- we want the fileset to have a particular project layout
- we want the fileset to have a size (block quota) of 4GB

In PixStor Management terms, this means creating a templated space.

Authentication
==============

Before we can do anything else, we need to get an auth token.

The auth server url will be configured in APConfig (see :doc:`configuration`), so let's look up that url

.. code-block:: python

    from arcapix.config import config

    authserver = config['arcapix.auth.server.url']  # typically https://localhost

Now we can request an access token from the auth server

.. code-block:: python

    import requests

    payload = {'grant_type': 'password', 'username': 'myuser', 'password': 'mypassword'}

    resp = requests.post(authserver + '/oauth2/token', data=payload)

    assert resp.status_code == 200

    token = resp.json()['access_token']

.. tip::

    If the request raises ``SSLError("bad handshake ...")`` it's likely because of self-signed certificates.

    This can be resolved by adding ``verify=False`` to the request.

    For more information, see the "SSL Cert Verification" section of the ``requests`` documentation.

To make use of this access token, we have to use HTTP basic auth, with the token as the username and with an empty password.

For convenience, let's create a requests session and apply the auth to it, so that we don't have to explicitly pass auth to every future request

.. code-block:: python

    session = requests.Session()
    session.auth = (token, '')

    # if you get SSLErrors from the self-signed certificates, add the following
    # session.verify = False

From now on, we'll assume that all our requests are successful, but in practice you should always check status codes.

Auth Roles
----------

Before we try to create a space, we should make sure that we're actually *allowed* to create a space.

Our user/group will have associated with it a collection of authentication roles, and these roles determine what operations we're allowed to perform on what endpoints.

We don't have to check the actual roles, though. We can just do an ``OPTIONS`` request against the ``/spaces`` endpoint and check the ``Allow`` header

.. code-block:: python

    from __future__ import print_function

    resp = session.options(url + '/spaces/')

    print(resp.headers['Allow'])
    # HEAD, GET, POST, OPTIONS

Here, we can see that we do have permission to perform ``POST`` requests against the ``/spaces`` endpoint, so we can indeed create spaces!

.. note::

    Different endpoints may have different permissions - you may have permission to create spaces but not profiles, for example.

    In general it's a good idea to check ``OPTIONS`` before trying to create an object.

Collection+JSON Template
=========================

To create a Space, we need to know what fields should be provided with our POST request.

Fortunately, PixStor Management uses the Collection+JSON (C+J) format, which provides us with a template for creating new objects.

So let's check the template from the ``/spaces`` endpoint

.. code-block:: python

    resp = session.get(url + '/spaces/')

    print(resp.json()['collection']['template'])

.. code-block:: javascript

    {
        "data": [
            {"prompt": "space name", "name": "name", "value": ""},
            {"prompt": "path of the space relative to its exposers", "name": "relativepath", "value": ""},
            {"prompt": "templates applied to the space", "name": "templates", "value": ""},
            {"prompt": "exposers providing access to the space", "name": "exposers", "value": ""},
            {"prompt": "profile applied to the space", "name": "profile", "value": ""},
            {"prompt": "hard limit on the size of the space in blocks", "name": "size", "value": ""}
        ]
    }

So we need to provide a name, a relative path, templates, exposers, a profile, and a size.

Okay, so going back to the objective (above), let's call the Space ``sleepy-snake``, with relative path ``projects/sleepy_snake`` - that's relative to the exposers (filesystem). And we'll give it a size of 4GB.

.. warning::

    Space names can't contain whitespace - they can only contain letters, numbers, hyphens and underscores.

    If you try to use a name containing 'invalid' characters, the POST request will return a ``422 (Unprocessable Entity)`` error

    .. code-block:: python

        resp = session.post(url + '/spaces/', json={'name': 'sleepy snake', ...})

        print(resp.json())
        # {'collection': {'error': {'message': 'Insertion failure: 1 document(s) contain(s) error(s)', 'code': 422, 'title': 'Error'}}}

But what about the Profile and the Exposers?

Finding where to put things
===========================

It's not possible to create a filesystem or a pool via the REST API (yet).

However, PixStor Management is populated with objects based on what already exists - so we get an exposer for every filesystem, and a data store for every pool.

In addition, a special placement policy rule is created for each pool, resulting in corresponding 'default' profiles.

So we need to look up the exposer and profile for our filesystem and placement pool of choice.

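Before searching for a specific item, it can help to see which exposers and profiles have actually been pre-populated. The following is only a minimal sketch: ``item_names`` is a hypothetical helper (not part of PixStor Management), and the sample dictionary, including its hrefs and names, is made-up data that merely mimics the shape of the Collection+JSON responses shown in this guide.

```python
from __future__ import print_function


def item_names(collection):
    """Return the value of the 'name' field for every item in a C+J response."""
    names = []
    for item in collection['collection'].get('items', []):
        for field in item['data']:
            if field['name'] == 'name':
                names.append(field['value'])
    return names


# Made-up sample in the C+J response shape; real values come from the server
sample = {
    'collection': {
        'items': [
            {'href': '/profiles/1', 'data': [{'name': 'name', 'value': 'mmfs1-sas1'}]},
            {'href': '/profiles/2', 'data': [{'name': 'name', 'value': 'mmfs1-sata1'}]},
        ]
    }
}

print(item_names(sample))

# Against a live server, using the session created earlier, this would be e.g.
# print(item_names(session.get(url + '/exposers/').json()))
# print(item_names(session.get(url + '/profiles/').json()))
```

If the name you expect is missing from the list, the database may not have been populated yet.
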
Pool
----

We want our data to be placed in pool ``sas1``.

As mentioned above, the database should have been pre-populated with a datastore for pool ``sas1``, and with a profile to assign data to that datastore.

The naming scheme for the pre-populated default placement profiles is ``{filesystem}-{pool}``, so in our case, we want to find the profile named ``mmfs1-sas1``.

.. tip::

    It is possible to create your own profile with additional ILM steps (migration rules), and with placement controls, such as matching certain file types or file size ranges, etc. But that won't be covered in this walkthrough.

So how do we find the profile with a particular name?

Collection+JSON Queries
_______________________

Once again, the C+J helps us out by providing models for queries [#]_

.. code-block:: python

    resp = session.get(url + '/profiles/')

    print(resp.json()['collection']['queries'])

.. code-block:: javascript

    [
        {
            "prompt": "Search by Name",
            "href": "/profiles?where={\"name\":\"{name}\"}",
            "data": [
                {"prompt": "profile name", "name": "name", "value": ""}
            ],
            "rel": "search",
            "encoding": "uri-template"
        },
        ...
    ]

This shows us how to construct a query to search for a profile by name. The ``href`` gives the template for the url, and the data block tells us what parameter we need to replace in that href.

Here, we have to replace the name parameter ``{name}`` with the actual name we want to search for, giving us

.. code-block:: python

    "/profiles?where={\"name\":\"mmfs1-sas1\"}"

So let's perform this query in Python - we're using ``params`` for readability, but the result is the same

.. code-block:: python

    params = {'where': '{"name": "mmfs1-sas1"}'}
    resp = session.get(url + '/profiles/', params=params)

    resp.headers['X-Total-Count']
    # '1'

    print(resp.json())

.. code-block:: javascript

    {
        "collection": {
            "items": [
                {
                    "href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c",
                    "data": [
                        {"prompt": "current status of the item", "name": "status", "value": "ACTIVE"},
                        {"prompt": "unique identifier for the item", "name": "id", "value": "8f01ab22-cc0b-2056-ff8c-e1d829dc806c"},
                        {"prompt": "profile name", "name": "name", "value": "mmfs1-sas1"},
                        ...
                    ],
                    "links": [...]
                }
            ],
            "href": "/profiles?where={\"name\": \"mmfs1-sas1\"}",
            "links": [...],
            "template": {...},
            "queries": [...],
            "version": "1.0"
        }
    }

The full C+J response is quite long and unwieldy, so the above has been truncated.

.. tip::

    For a quick sanity check, we can look at the ``X-Total-Count`` header to see how many results the query returned.

    Profile names are unique, so logically, there should be only one.

Referencing Items
_________________

When referencing an item in a POST request, we can provide either the item's ``href`` or its ``id``.

The href is preferred, since it uniquely identifies an item - it's possible for items of different types to have the same id, whereas the href explicitly includes the id AND the type of item it is.

The href is also easier to extract from the C+J response.

In the response above, we see that we can get our profile item as ``resp.json()['collection']['items'][0]``. And at the very top of that item, we see a field for ``href``.

So we can get the profile href from our query like so

.. code-block:: python

    profile = resp.json()['collection']['items'][0]['href']
    # "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c"

Filesystem
----------

We want to create our space in the ``mmfs1`` filesystem, so we need to find the corresponding exposer.

Unlike profiles, there are multiple different types of exposer - including native, nfs, smb. So in addition to a name, we also have to query for the right exposer ``type``.

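The ``where`` strings used so far are hand-written JSON, which becomes easy to mistype once a query has more than one field. As a minimal sketch, the parameters could instead be built with ``json.dumps``. Here ``where_params`` is a hypothetical helper, not part of PixStor Management or ``requests``, and it assumes the server does not care about the order of keys inside the ``where`` document.

```python
from __future__ import print_function

import json


def where_params(**fields):
    """Build the ``params`` dict for a ``where`` query from keyword arguments."""
    # sort_keys only makes the generated string reproducible
    return {'where': json.dumps(fields, sort_keys=True)}


# Single-field query, equivalent to the profile lookup above
print(where_params(name='mmfs1-sas1'))

# Multi-field query: name plus exposer type (the type value is explained below)
print(where_params(name='mmfs1', type='gpfsnative'))

# Usage with the session created earlier:
# resp = session.get(url + '/profiles/', params=where_params(name='mmfs1-sas1'))
```

Passing the result as ``params`` lets ``requests`` take care of url-encoding the JSON document.
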
A PixStor filesystem is represented in PixStor Management as a ``GPFSNativeExposer``, which has ``type='gpfsnative'``.

So to find the ``mmfs1`` filesystem we can query the ``/exposers`` endpoint, again using the ``where`` url parameter

.. code-block:: python

    params = {'where': '{"type": "gpfsnative", "name": "mmfs1"}'}
    resp = session.get(url + '/exposers/', params=params)

    print(resp.json())

.. code-block:: javascript

    {
        "collection": {
            "items": [
                {
                    "href": "/exposers/d672fde7-ba69-5d16-0acd-5868d2a8f3b9",
                    "data": [
                        {"prompt": "current status of the item", "name": "status", "value": "ACTIVE"},
                        {"prompt": "specific type of the exposer", "name": "type", "value": "gpfsnative"},
                        {"prompt": "path at which the exposer is mounted", "name": "mountpoint", "value": "/mmfs1"},
                        {"prompt": "unique identifier for the item", "name": "id", "value": "d672fde7-ba69-5d16-0acd-5868d2a8f3b9"},
                        {"prompt": "exposer name", "name": "name", "value": "mmfs1"},
                        ...
                    ],
                    "links": [...]
                }
            ],
            "href": "/exposers?where={\"type\": \"gpfsnative\", \"name\": \"mmfs1\"}",
            "links": [...],
            "template": {...},
            "queries": [...],
            "version": "1.0"
        }
    }

Again, we expect exactly one result, and we can get the exposer href the same as before

.. code-block:: python

    exposer = resp.json()['collection']['items'][0]['href']
    # "/exposers/d672fde7-ba69-5d16-0acd-5868d2a8f3b9"

.. hint::

    If there are no results for the exposers query, it's possible the database hasn't been populated (yet).

    You can manually populate PixStor Management by running the command

    .. code-block:: console

        $ adminctl populate now

Making a Project Template
=========================

The last thing we want before we can create our space is a template - a pre-defined directory layout that we can apply to our new space, and to any future spaces we might create.

.. note::

    It can be confusing, but try not to mix up Template objects with the C+J creation template discussed above.

Template Model
--------------

The template we want to use doesn't exist yet, so we have to create it.

To do this, we create a 'model' of the directory layout we want

.. code-block:: console

    $ tree /mmfs1/project_template/
    /mmfs1/project_template/
    ├── assets  # <-- this is a dependent fileset
    │   ├── models
    │   └── rigs
    ├── flame
    ├── houdini
    ├── maya
    ├── mudbox
    ├── nuke
    ├── published
    └── rendering

Along with the directory layout, the model can include files and dependent filesets. We can even set up permissions, which the template will capture.

POSTing the Template
---------------------

Once we have our template model, we create the actual template via the REST interface.

As with spaces, we can look up the C+J 'template' for the fields we need to POST

.. code-block:: python

    resp = session.get(url + '/templates/')

    print(resp.json()['collection']['template'])

.. code-block:: javascript

    {
        "data": [
            {"prompt": "template name", "name": "name", "value": ""},
            {"prompt": "specific type of the template", "name": "type", "value": ""},
            {"prompt": "path to the template", "name": "template_location", "value": ""}
        ]
    }

We need to provide the name, the type (``filesystemtemplate`` in this case), and the location of the template's model.

To perform a POST request with C+J, we have to fill in the ``value`` fields in the C+J template we just looked up

.. code-block:: python

    template_data = {
        "template": {
            "data": [
                {"name": "name", "value": "project_template"},
                {"name": "type", "value": "filesystemtemplate"},
                {"name": "template_location", "value": "/mmfs1/project_template"}
            ]
        }
    }

(You don't need to include the prompt fields, but if you do include them, they will just be ignored)

We then POST this data to the ``/templates`` endpoint

.. code-block:: python

    resp = session.post(
        url + '/templates/',
        json=template_data,
        headers={"Content-Type": "application/vnd.collection+json"}
    )

    print(resp.status_code)
    # 202 (Accepted)

.. important::

    As shown above, we need to include the ``Content-Type: application/vnd.collection+json`` header so the REST server knows what JSON format it is receiving.

    We're using ``json=`` because ``data=`` would 'form encode' the data, resulting in a ``400 (Bad Request)`` error.

.. important::

    The trailing slash on the end of the url ``/templates/`` is **required**. Without it, the POST request will fail.

If there were no issues with the request, we should get back status code ``202 (Accepted)``.

Template Builder
----------------

A 202 status means the new template has been added to the database, and a task has been submitted to the job engine.

This builder task will copy the template model into the configured template store (see :doc:`configuration`).

The response from the POST request will include a ``Location`` header, which we can query to check the status of our template

.. code-block:: python

    print(resp.headers['Location'])
    # 'https://localhost/templates/e00e872b-f4c5-2557-795d-4ccf4715b602?projection={"status":1}'

Because of the way C+J is structured, it's not easy to grab just the ``status`` field from the response. Let's write a little helper function

.. code-block:: python

    def get_status(collection, item):
        # scan the item's data fields for the one named 'status'
        for data in collection['collection']['items'][item]['data']:
            if data['name'] == 'status':
                return data['value']
        raise KeyError("Status field not found")

Now let's check the status of our template

.. code-block:: python

    resp = session.get(resp.headers['Location'])

    print(get_status(resp.json(), 0))
    # 'ACTIVE'

The status will transition from ``NEW`` (data POSTed), to ``PENDING`` (task submitted), to ``INPROGRESS`` (task running), to ``ACTIVE``.

When the template reaches state ``ACTIVE``, we know that the builder job has completed successfully, and it's ready to use.

.. code-block:: python

    template = resp.json()['collection']['items'][0]['href']
    # "/templates/e00e872b-f4c5-2557-795d-4ccf4715b602"

.. note::

    Once the template has been ingested, its ``template_location`` field is updated (internally) to point to the location of the template within the template store.

    On subsequent GET requests, the ``template_location`` field will be returned as ``null``.

.. tip::

    Once the template is built, the original model can be modified or even deleted without affecting the template.

Checking the Builder Task
_________________________

Say we want to check the status of the builder task itself, rather than watching the template's status, or say the template enters state ``ERRORED``, implying that the builder task failed.

When we look up an item, included in the response (in the ``items`` block) is a ``links`` block.

.. code-block:: python

    print(resp.json()['collection']['items'][0]['links'])

.. code-block:: javascript

    [
        {
            "href": "/jobs?where={\"resource_type\": \"templates\", \"resource_id\": \"e00e872b-f4c5-2557-795d-4ccf4715b602\"}",
            "prompt": "Jobs",
            "name": "Jobs",
            "render": "link",
            "rel": "jobs"
        }
    ]

Here we see a link to a ``jobs`` query. Let's write another helper function

.. code-block:: python

    def get_link_href(collection, item, name):
        # scan the item's links for the one with the requested name
        for link in collection['collection']['items'][item]['links']:
            if link['name'] == name:
                return link['href']
        raise KeyError(name)

Visiting the ``Jobs`` link will return a collection of all jobs associated with our template

.. code-block:: python

    href = get_link_href(resp.json(), 0, 'Jobs')

    resp = session.get(url + href)

In this instance, there should be only one job returned, since we've only submitted one task for our template (the builder task).

If the job is no longer active (``COMPLETED`` or ``ERRORED``), we can check the job's logs to try and diagnose any issues.

In the job's ``links`` section, we should see links for ``stdout`` and ``stderr``

.. code-block:: python

    print(resp.json()['collection']['items'][0]['links'])

.. code-block:: javascript

    [
        {
            "href": "/templates/e00e872b-f4c5-2557-795d-4ccf4715b602",
            "prompt": "Resource",
            "name": "resource",
            "render": "link",
            "rel": "resource"
        },
        {
            "href": "/jobs/407ee461-5cc1-43e5-8f3e-a375f8f9b9c8/stderr",
            "prompt": "Stderr",
            "name": "stderr",
            "render": "link",
            "rel": "stderr"
        },
        {
            "href": "/jobs/407ee461-5cc1-43e5-8f3e-a375f8f9b9c8/stdout",
            "prompt": "Stdout",
            "name": "Stdout",
            "render": "link",
            "rel": "stdout"
        }
    ]

(We also get a link back to our template in the ``resource`` link)

Performing a GET request against the ``stderr`` link will return the log in plain text format

.. code-block:: python

    href = get_link_href(resp.json(), 0, "stderr")

    resp = session.get(url + href)

    print(resp.headers['Content-Type'])
    # text/plain

    print(resp.text)
    # 'DEBUG:...

.. note::

    Python logging is written to stderr by default, so it will typically end up in the ``stderr`` log.

    The ``stdout`` log would contain anything written to stdout, such as ``print`` statements. But since none of the PixStor Management tasks use print statements, the ``stdout`` log will usually be empty.

    However, this behaviour may vary depending on which job engine is being used.

Creating the Space
==================

Now, finally, we have everything we need to create our Space.

So, as we did for our project template above, we need to fill in the C+J template values

.. code-block:: python

    space_data = {
        "template": {
            "data": [
                {"name": "name", "value": "sleepy-snake"},
                {"name": "relativepath", "value": "projects/sleepy_snake"},
                {"name": "exposers", "value": exposer},
                {"name": "profile", "value": profile},
                {"name": "templates", "value": template},
                {"name": "size", "value": 4*1024*1024*1024}  # 4GB
            ]
        }
    }

.. tip::

    Here we only have one exposer and one template.

    If, instead, we wanted to create a space with multiple exposers (or templates) we would send their hrefs as a comma-separated list - e.g.

    .. code-block:: python

        "value": '/exposers/bb2873a8-c489-4530-a8a5-ece70598f3ea,/exposers/8b234c97-23a1-4f2b-9fce-1200d40e96c1'

Then we POST the data to the ``/spaces`` endpoint and wait for our new space to enter an ``ACTIVE`` state

.. code-block:: python

    from time import sleep

    resp = session.post(
        url + '/spaces/',
        json=space_data,
        headers={"Content-Type": "application/vnd.collection+json"}
    )

    checkurl = resp.headers['Location']

    # query the status every second for 10 seconds
    for _ in range(10):
        sleep(1)
        resp = session.get(checkurl)
        status = get_status(resp.json(), 0)
        if status == 'ACTIVE':
            break
    else:
        # the loop finished without seeing an ACTIVE status
        raise Exception("Space didn't become active after 10s - got status '%s'" % status)

And we're done!

Checking Our Work
-----------------

Database
________

First of all, let's check what our space looks like in the PixStor Management database.

The ``Location`` header for our space uses a projection to only return the ``status`` field, not the other data fields, so we can't use that.

Let's try doing a query for our space

.. code-block:: python

    params = {'where': '{"name": "sleepy-snake"}'}
    resp = session.get(url + '/spaces/', params=params)

    print(resp.json())

.. code-block:: javascript

    {
        "collection": {
            "href": "/spaces/",
            "items": [
                {
                    "href": "/spaces/de95cf28-fdf9-5455-b1d9-831cf8a1869b",
                    "data": [
                        {"prompt": "current status of the item", "name": "status", "value": "ACTIVE"},
                        {"prompt": "space name", "name": "name", "value": "sleepy-snake"},
                        {"prompt": "path of the space relative to its exposers", "name": "relativepath", "value": "projects/sleepy_snake"},
                        {"prompt": "unique identifier for the item", "name": "id", "value": "de95cf28-fdf9-5455-b1d9-831cf8a1869b"},
                        {"prompt": "hard limit on the size of the space in blocks", "name": "size", "value": 4294967296},
                        ...
                    ],
                    "links": [
                        {
                            "href": "/templates/?where=id==\"e00e872b-f4c5-2557-795d-4ccf4715b602\"",
                            "prompt": "templates applied to the space",
                            "name": "templates",
                            "render": "link",
                            "rel": "templates collection"
                        },
                        {
                            "href": "/exposers/?where=id==\"d672fde7-ba69-5d16-0acd-5868d2a8f3b9\"",
                            "prompt": "exposers providing access to the space",
                            "name": "exposers",
                            "render": "link",
                            "rel": "exposers collection"
                        },
                        {
                            "href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c",
                            "prompt": "profile applied to the space",
                            "name": "profile",
                            "render": "link",
                            "rel": "profile item"
                        },
                        {
                            "href": "/jobs?where={\"resource_type\": \"spaces\", \"resource_id\": \"de95cf28-fdf9-5455-b1d9-831cf8a1869b\"}",
                            "prompt": "Jobs",
                            "name": "Jobs",
                            "render": "link",
                            "rel": "jobs"
                        }
                    ]
                }
            ],
            "links": [...],
            "template": {...},
            "queries": [...],
            "version": "1.0"
        }
    }

Looks good.

Notice that the related items - exposers, templates, profile - appear in the links section. This allows us to look up those items without having to figure out their urls.

.. note::

    It is possible to have multiple spaces with the same name, but *only* if they have different profiles.

    In general, you should avoid creating multiple spaces with the same name.

Filesystem
__________

Now, let's check that a fileset was actually created for our Space

.. code-block:: console

    $ mmlsfileset mmfs1
    Filesets in file system 'mmfs1':
    Name                           Status    Path
    ...
    sas1-sleepy-snake              Linked    /mmfs1/projects/sleepy_snake
    sata1-sleepy-snake-8b234c97    Linked    /mmfs1/projects/sleepy_snake/assets

The first one, ``sas1-sleepy-snake``, is the fileset created for our space.

Notice that the fileset doesn't have exactly the name we specified for the space - it has the name of its placement pool stuck on the front.

This prefix is used for matching the space (fileset) to the right pool for the space's profile (placement policy rule)

.. code-block:: console

    $ mmlspolicy mmfs1 -L
    ...
    RULE 'sas1-placement' SET POOL 'sas1' WHERE FILESET_NAME LIKE 'sas1-%'
    RULE 'default' SET POOL 'sata1'

The second fileset shown, ``sata1-sleepy-snake-8b234c97``, is the dependent fileset installed by our template.

Instead of the space's placement pool, this name is prefixed with the pool that the original, model fileset was assigned to - in this case ``sata1``. The name also includes the name of our space (plus a random suffix).

Let's check the full template was installed

.. code-block:: console

    $ tree /mmfs1/projects/sleepy_snake
    /mmfs1/projects/sleepy_snake
    ├── assets
    │   ├── models
    │   └── rigs
    ├── flame
    ├── houdini
    ├── maya
    ├── mudbox
    ├── nuke
    ├── published
    └── rendering

Great!

Finally, the ``size`` value we specified should have been translated into a block quota

.. code-block:: console

    $ mmlsquota -j sas1-sleepy-snake mmfs1
                             Block Limits                                |     File Limits
    Filesystem type         KB      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
    mmfs1      FILESET     109 3865470566 4294967296          0     none |       10       0        0        0     none

Perfect! Our space is now ready to use.

Exercise: Create a Snapshot
===========================

Now that you've made yourself a space, why not practice what you've learned by creating a snapshot of that space?

Hints
-----

- The endpoint you're looking for is ``/snapshots``
- To create a space snapshot, the only fields you need to POST are ``name``, ``type``, and ``space``
- The type you want to use is ``gpfsspacesnapshot``

Solution
--------

Click the (+) icon on the right to reveal the solution...

.. toggle::

    .. code-block:: python

        # look up the href of the space we created earlier
        params = {'where': '{"name": "sleepy-snake"}'}
        resp = session.get(url + '/spaces/', params=params)

        space = resp.json()['collection']['items'][0]['href']

        snapshot_data = {
            "template": {
                "data": [
                    {"name": "name", "value": "snake-snap"},
                    {"name": "type", "value": "gpfsspacesnapshot"},
                    {"name": "space", "value": space},
                ]
            }
        }

        resp = session.post(
            url + '/snapshots/',
            json=snapshot_data,
            headers={"Content-Type": "application/vnd.collection+json"}
        )

Full Code
=========

The full code for this walkthrough can be found here:

.. toctree::
    :maxdepth: 1

    walkthru-code

----

.. rubric:: Footnotes

.. [#] uri-templates are not part of the standard Collection+JSON spec