Batch Upload

Overview

This feature will allow multiple geospatial datasets to be uploaded simultaneously. It could be either a desktop tool linked to a GeoNode instance or a web-based interface within the GeoNode, using Flash or other technology. The goal is to let users transfer large amounts of data into the GeoNode through as user-friendly a process as possible, which will encourage the storing and sharing of data through GeoNodes. The core use case: user X has a large amount of data (Y) to upload, and the current upload system is so time-consuming that it discourages GeoNode use.

Use Cases

1) A major international organization setting up a new GeoNode instance will have a large amount of data to upload to the system. The current upload method is extremely time-consuming and creates a disincentive to share all the data that the organization is willing to share. Batch uploading simplifies the process and removes this significant obstacle to sharing data through the GeoNode system. Reducing the time needed for manual uploading lets more effort go to other crucial aspects of data sharing, such as metadata and proper data management.

2) A hazard analysis is completed for Haiti, producing many gigabytes of geospatial data that must be distributed as widely as possible to support planning for reconstruction after the 2010 earthquake. The team has limited time for data management, so the massive time commitment needed to upload their data to a GeoNode results in the data staying on an individual's hard drive. By allowing batch uploading of data, with the relevant workflow, the data can be easily uploaded and distributed to the many actors who need the information for their planning procedures.

Specification

The core is to allow a user to select a series of files and/or directories on their computer, hit ‘upload’, and have them ingested into a GeoNode on which they have a user account. They will need to be logged in so that the correct user metadata is associated with the data. Possible features include:

  • Simultaneous uploading of many datasets comprising different geospatial data types, initially Shapefiles and GeoTIFFs. Future data types could include KML, spreadsheets, GeoRSS, etc.
  • Allow for shapefiles to be uploaded as .zip archives.
  • Uploading of shapefiles without manually selecting each component file (e.g. .shx, .prj, .dbf).
  • When files are uploaded in batch they are linked to the user's GeoNode profile. This will allow for automated metadata completion based on the user's profile information.
  • When files are batch uploaded the system will check for existing file metadata (and import it). When metadata is missing it will prompt the user to complete it. The metadata completion will be done either during the batch upload process or through a prompt on the user's GeoNode profile.
  • If a desktop solution is developed, it will allow configuration of which GeoNode instance and user credentials to use.
  • Expose a server endpoint over FTP, Samba and/or WebDAV, so that existing file transfer protocols can be used and users can simply drag their data to a mounted folder on their desktop.
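
Selecting a directory without picking each shapefile component by hand implies grouping files by base name before upload. The following is a minimal Python sketch of that step, not GeoNode code: the function name `collect_datasets` and the extension lists are assumptions made for illustration, and .zip archives are not handled here.

```python
import os
from collections import defaultdict

# Assumed extension lists for this sketch; a real tool may accept more.
SHAPEFILE_PARTS = (".shp", ".shx", ".dbf", ".prj")
GEOTIFF_EXTS = (".tif", ".tiff")


def collect_datasets(root):
    """Walk `root` and return a list of (kind, files) tuples.

    Files sharing a base name with a .shp file are grouped into one
    shapefile dataset; each GeoTIFF is a dataset on its own.
    """
    by_base = defaultdict(dict)  # base path -> {extension: full path}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            full_path = os.path.join(dirpath, name)
            by_base[os.path.join(dirpath, stem)][ext.lower()] = full_path

    datasets = []
    for base in sorted(by_base):
        parts = by_base[base]
        # Sidecar files without a matching .shp are skipped.
        if ".shp" in parts:
            files = [parts[e] for e in SHAPEFILE_PARTS if e in parts]
            datasets.append(("shapefile", files))
        for ext in GEOTIFF_EXTS:
            if ext in parts:
                datasets.append(("geotiff", [parts[ext]]))
    return datasets
```

Calling `collect_datasets` on a user-selected directory yields one entry per dataset, which an upload client could then send to the GeoNode one at a time or in parallel.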
Technical Details

JavaScript alone is not sufficient for web upload; Flash or HTML5 will be needed. Even so, most web-based tools handle more than 2 gigabytes of data poorly. In the long term we should probably have both desktop and web-based tools.
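
If both clients exist, the size guideline above could be applied as a simple routing rule. A hedged Python sketch follows; the names `WEB_UPLOAD_LIMIT_BYTES`, `total_size` and `suggest_client` are hypothetical, and reading "2 gigabytes" as 2 GiB is an assumption.

```python
import os

# Assumed threshold: the "2 gigabytes" figure, read here as 2 GiB.
WEB_UPLOAD_LIMIT_BYTES = 2 * 1024 ** 3


def total_size(paths):
    """Return the combined size in bytes of the given files."""
    return sum(os.path.getsize(p) for p in paths)


def suggest_client(batch_bytes, limit=WEB_UPLOAD_LIMIT_BYTES):
    """Suggest "web" for batches within the limit, "desktop" otherwise."""
    return "web" if batch_bytes <= limit else "desktop"
```

A client could call `suggest_client(total_size(files))` before starting a transfer and point the user to the desktop tool when the batch is too large for the browser.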

Estimated Costs

$16,000 – $25,000 for a basic implementation.