The following is some documentation I wrote up for York. I’m publishing it here because putting all the pieces together took me some time, and maybe someone else will find it useful. The export/import process is present in the dspace documentation, but it’s not terribly verbose. Hopefully, this post can serve as a useful adjunct to the official stuff.
Also of note is the fact that Dspace 1.5.0 introduced a bug in the ItemExport application where the handle of exported items is not properly removed. Attempts to import these items result in failure and a message:
ERROR: duplicate key violates unique constraint "handle_handle_key"
This documentation will cover the process both with and without this bug. For export/import in 1.5.0, follow all instructions. For newer/older versions, omit steps 5 through 7, which exist only to work around the bug. I’m given to understand that version 1.6 will contain functionality for moving items/collections from within the UI, so these instructions will hopefully someday be obsolete.
The basic procedure for moving a collection in Dspace is: export the items in the collection to file, delete the old items & collection, re-import the items into the new location. These steps are carried out as follows:
1. Locate the collection to be exported. Note down its handle, as that is what you’ll be using to identify it in the export. If you don’t know it, the handle is the last part of the URL to the collection in question (everything after /handle/). For example, for http://pi.library.yorku.ca/dspace/handle/10315/2753, the handle is 10315/2753.
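If you are scripting this, the handle can be pulled out of the URL with plain shell parameter expansion. This is only a sketch: handle_from_url is a name I made up for illustration, and it assumes the URL contains a /handle/ segment like the example above.

```shell
# Hypothetical helper (not part of Dspace): print the handle found in a URL.
handle_from_url() {
    url=$1
    case "$url" in
        */handle/*) ;;                      # looks like a handle URL, carry on
        *) echo "no /handle/ segment in: $url" >&2
           return 1 ;;
    esac
    handle=${url#*/handle/}                 # drop everything up to and including "/handle/"
    handle=${handle%/}                      # drop a single trailing slash, if present
    printf '%s\n' "$handle"
}
```

Calling handle_from_url with the example URL from step 1 prints the bare handle, which is the value to pass to -i in the next step.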
2. Log in to the Dspace server and go to the [dspace]/bin/ directory. Use dsrun to execute the item exporter as follows:
# ./dsrun org.dspace.app.itemexport.ItemExport -t COLLECTION -i <handle of collection to export> -d <destination directory> -n 1
Where:
-t COLLECTION – specifies that you are exporting every item in the collection. Note that this does not actually export the collection itself, but every item inside it. If you need to export a single item, you can replace COLLECTION with ITEM.
-i <handle> – specify the handle that you noted down in step 1. You don’t need to enclose it in quotes; the bare handle is fine.
-d <destination> – where you want the export files to be created. I usually put these in the dspace user home directory. Note that this directory must exist already, or the export will fail.
-n 1 – the exported items will be created in a series of sequentially numbered directories within the destination path. This switch specifies the starting number for these directories. I almost always use 1 here. However, if you are exporting multiple collections and combining them into one import, you may need to change this.
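Since the export fails when the destination directory is missing, it can be worth wrapping the command in a small pre-flight check. The following is a rough sketch, not anything that ships with Dspace: export_collection and the DSRUN variable are names I invented, and the function assumes you are running it from [dspace]/bin unless DSRUN points elsewhere.

```shell
# Hypothetical wrapper around the export command from step 2.
# DSRUN is an assumed override hook: set DSRUN=echo to print the command
# instead of running it. By default ./dsrun in the current directory is used.
export_collection() {
    handle=$1
    dest=$2
    start=${3:-1}                           # starting directory number, defaults to 1

    if [ ! -d "$dest" ]; then
        echo "destination directory does not exist: $dest" >&2
        return 1
    fi

    ${DSRUN:-./dsrun} org.dspace.app.itemexport.ItemExport \
        -t COLLECTION -i "$handle" -d "$dest" -n "$start"
}
```

For example, export_collection 10315/2753 /home/dspace/export would run the same command as above with -n 1, and refuse to start if /home/dspace/export has not been created yet (the path is only an illustration).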
3. Go back into the Dspace interface and delete the collection. Deleting the collection will also delete any items inside it. Yes, this is scary, but it’s necessary because you’re going to be importing those same items again, and we need to make room for them. Lest you worry too much, remember that your export created an archive of the files in the collection and you can always re-import them back to where they came from if you have regrets.
4. (If you’re not running 1.5.0, skip to step 8.)
5. Because of the aforementioned bug in 1.5.0, when the items were deleted in step 3, their handles didn’t get deleted in the database. We’re going to have to do this manually. Inside each of the numbered directories in your export folder, there should be a file called handle. It contains the handle of the item in that particular subdirectory. Note down the handle for each item you exported.
6. Connect to the dspace database either using some sort of GUI tool (I like pgAdmin) or the command line. Within the dspace database, there should be a table called handle. Look for the item’s handle in the column ‘handle’ and delete the row. From the command line, it would be something like:
# psql
dspace=> DELETE FROM handle WHERE handle='<item handle here>';
7. Do step 6 for each item you exported. The database should be clear for importing now.
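For a large collection, steps 5 through 7 get tedious by hand. Here is a sketch of how the DELETE statements could be generated from the export folder instead. It assumes each numbered directory holds a file called handle containing nothing but the item’s handle, as described in step 5; print_handle_deletes is a hypothetical name, and anything that doesn’t look like a plain handle is skipped rather than written into SQL.

```shell
# Hypothetical helper: print one DELETE statement per exported item.
# Only needed for the 1.5.0 workaround (steps 5 through 7).
print_handle_deletes() {
    export_dir=$1
    for f in "$export_dir"/*/handle; do
        [ -f "$f" ] || continue             # the glob matched nothing, or not a file
        h=$(tr -d '[:space:]' < "$f")       # strip the trailing newline and any whitespace
        case "$h" in
            ''|*[!0-9A-Za-z./_-]*)          # empty, or contains unexpected characters
                echo "skipping unexpected handle in $f" >&2
                continue ;;
        esac
        printf "DELETE FROM handle WHERE handle='%s';\n" "$h"
    done
}
```

One way to use it is to redirect the output to a file, read through it to make sure it only lists the items you exported, and then run that file with psql -f against the dspace database.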
8. If you’re moving the items to a new collection, create it now and make a note of the collection’s handle. If putting them into an existing collection, just note the handle.
9. From the [dspace]/bin directory, execute the following:
# ./dsrun org.dspace.app.itemimport.ItemImport -a -e <eperson> -c <destination collection handle> -s <export archive directory> -m <filename>
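Where:
-a – tells the importer to add the items as new items (rather than replace or delete existing ones).
-e <eperson> – the e-mail address of your Dspace account (the eperson the import will be run as)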
-c <destination collection handle> – the handle of the destination collection, as gathered in step 8.
-s <export archive directory> – the same directory you specified in step 2.
-m <filename> – Dspace will create a “map file” during this process that tells you which exported files (as designated by their numbered directories) were mapped to which item handles. Give it a sensible file name and keep it around! It’s good for several things which are outside the scope of this document. This file must not already exist, or the import will fail. If you previously tried an import which failed halfway through, a map file may already exist. If no items were successfully imported, the map file should be empty and you can freely delete it. If it does contain information, you will need to leave it in place and run the command again with the same map file name, adding the -R switch to resume the import.
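The map file rules above boil down to three cases, which can be checked before running the import. This is just a sketch of that decision; mapfile_state is a made-up helper name and the three words it prints are my own labels, not anything Dspace produces.

```shell
# Hypothetical helper: report what to do about the map file before importing.
mapfile_state() {
    if [ ! -e "$1" ]; then
        echo new        # no map file yet: run the import command as shown in step 9
    elif [ ! -s "$1" ]; then
        echo empty      # left over from a failed run that imported nothing: safe to delete
    else
        echo resume     # some items were imported: keep the file and add -R
    fi
}
```

If it prints resume, re-run the command from step 9 with the same -m file name and the -R switch added; if it prints empty, delete the file first and start over.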
10. That’s it! Assuming no errors occurred, you should now be able to see your items in their new locations.