Nuxeo Load notes for UCB
Loading a collection into Nuxeo basically consists of 2 steps:
- load files
- load metadata
Both make use of the pynux library, which is a Python wrapper for the Nuxeo REST API.
This all assumes you're working on nuxeo-stg.cdlib.org, which has Nuxeo and pynux installed, and that you have sufficient permissions.
At its most basic, loading files entails using pynux's pifolder command to load a folder of content. For example:
pifolder --leaf_type SampleCustomPicture \
--input_path /apps/content/new_path/UCM/MercedMugbook \
--target_path /asset-library/UCM/ \
--folderish_type SampleCustomPicture
This assumes that the /asset-library/UCM/MercedMugbook folder does not exist on Nuxeo yet. If the folder does already exist, use the --skip_root_folder_creation option like so:
pifolder --leaf_type SampleCustomPicture \
--input_path /apps/content/new_path/UCM/MercedMugbook \
--target_path /asset-library/UCM/ \
--folderish_type SampleCustomPicture \
--skip_root_folder_creation
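If you end up running pifolder for many folders, it can help to build the command in Python rather than retyping it. The following is only a sketch: the helper name pifolder_command and its root_exists parameter are made up for illustration, and the only flags used are the ones shown in the commands above.

```python
import shlex


def pifolder_command(input_path, target_path, leaf_type, folderish_type,
                     root_exists=False):
    """Return the pifolder argument list for one folder of content."""
    cmd = [
        "pifolder",
        "--leaf_type", leaf_type,
        "--input_path", input_path,
        "--target_path", target_path,
        "--folderish_type", folderish_type,
    ]
    if root_exists:
        # The folder is already on Nuxeo, so do not try to create it again.
        cmd.append("--skip_root_folder_creation")
    return cmd


cmd = pifolder_command(
    "/apps/content/new_path/UCM/MercedMugbook",
    "/asset-library/UCM/",
    "SampleCustomPicture",
    "SampleCustomPicture",
    root_exists=True,
)
# Print the command for review before running it.
print(shlex.join(cmd))
```

Once the printed command looks right, the same list can be handed to subprocess.run(cmd, check=True).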
In practice, it is often necessary to do a bit of prep work on the "raw" files that we receive from contributors in order to get them ready for the above ingest process, for example removing extraneous/duplicate files, normalizing filenames, etc. Not sure if you'll be doing this as well?
In any case, we hard link the files into a new, organized directory structure rather than copying them, in order to save space. We've named the scripts that do this *relink.py, for example: uci-oral-histories-relink.py
In general, the "raw" files are in /apps/content/raw_path and the organized/hardlinked files are in /apps/content/new_path.
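As a rough idea of what a relink script does, here is a minimal sketch. It is not one of the actual *relink.py scripts: the function names, the list of files to skip, and the filename normalization rule are all assumptions made up for illustration. Note that hard links only work when the raw and organized directories are on the same filesystem.

```python
import os
import re
from pathlib import Path


def normalize_name(name):
    """Hypothetical normalization rule: turn runs of whitespace into underscores."""
    return re.sub(r"\s+", "_", name.strip())


def relink(raw_dir, new_dir, skip_names=(".DS_Store", "Thumbs.db")):
    """Hard link every file under raw_dir into new_dir, mirroring subdirectories."""
    raw_dir, new_dir = Path(raw_dir), Path(new_dir)
    linked = []
    for src in sorted(raw_dir.rglob("*")):
        # Skip directories and extraneous files (the skip list is an assumption).
        if not src.is_file() or src.name in skip_names:
            continue
        rel = src.relative_to(raw_dir)
        dest = new_dir / rel.parent / normalize_name(src.name)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if not dest.exists():
            # A hard link is a second name for the same data, so it uses no extra space.
            os.link(src, dest)
        linked.append(dest)
    return linked
```

A real script would usually need collection-specific rules for which files to keep and how to rename them.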
The above instructions assume that the folder of content you're dealing with is all simple objects, i.e. a directory of image files with no nested components. Complex objects can get pretty hairy to load programmatically, depending on how they're structured. Let's assume you don't have to deal with that for now -- but let me know if you do...
Loading metadata basically entails using the pynux-utils update_nuxeo_properties function to "update" the metadata on an existing object in Nuxeo.
Take the halberstadt.py script, for example. This does the following:
- iterates over a directory of XML files, each of which contains the metadata for an object in this collection
- parses the metadata and transforms it into a ucldc-friendly Python dictionary
- determines the Nuxeo path of the object in question
- passes the Python dict and the path to update_nuxeo_properties
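The steps above can be sketched roughly as follows. This is not the halberstadt.py script itself. The XML element names, the property names in FIELD_MAP, and the rule that an object's Nuxeo path is the target folder plus the XML filename stem are all assumptions for illustration. The update argument stands in for update_nuxeo_properties, whose exact signature is not shown here, so check pynux before wiring it in.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical mapping from XML element names to Nuxeo property names.
FIELD_MAP = {
    "title": "dc:title",
    "identifier": "ucldc_schema:localidentifier",
}


def xml_to_properties(xml_path):
    """Parse one metadata file into a flat dict of Nuxeo properties."""
    root = ET.parse(xml_path).getroot()
    props = {}
    for tag, prop in FIELD_MAP.items():
        text = root.findtext(tag)
        if text and text.strip():
            props[prop] = text.strip()
    return props


def nuxeo_path_for(xml_path, target_folder):
    """Assumed rule: the object is named after the XML file, minus its extension."""
    return f"{target_folder.rstrip('/')}/{Path(xml_path).stem}"


def load_metadata(xml_dir, target_folder, update):
    """Call update(props, path) once per XML file and return the paths handled."""
    updated = []
    for xml_path in sorted(Path(xml_dir).glob("*.xml")):
        props = xml_to_properties(xml_path)
        path = nuxeo_path_for(xml_path, target_folder)
        update(props, path)
        updated.append(path)
    return updated
```

Passing the update step in as a callable makes a dry run easy: hand in a function that just prints the dict and path, and check the output before touching Nuxeo.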