Factsheet

Managing Datasets

Datasets, like other information or corporate records created or received by a public office, can be public records under the Public Records Act 2005 and need to be managed accordingly. This factsheet provides some guidance for the management of datasets in public sector organisations.


What are datasets?

A dataset is structured, encoded information found in lists, tables, spreadsheets or databases. Data may be numeric, spatial, spectral, statistical or structured text (including bibliographic data and database reports).

Datasets are most commonly found in tables, spreadsheets and databases: 

  1. Tables are the simplest of the three types of datasets. They consist of an ordered arrangement of any number of rows and columns.
  2. Spreadsheets consist of interactive tables in which a data item may include a formula and may be dynamically linked to another data item, so that a change in one causes a change in the other.
  3. Databases are best referred to as database systems. A database system has three components: the database itself (the actual content); a Database Management System (the software between the data and the user); and the database application, which incorporates the user interface and the functionality that enables the user to search through and process the content of the database, as well as the programs that support the system in processing inputs and outputs.
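
As a rough illustration of the three components, the sketch below uses Python's built-in sqlite3 module. The table name staff, its columns, the sample rows and the function find_by_team are hypothetical and chosen only for this example. Here the stored rows are the database itself, SQLite plays the part of the Database Management System, and the small search function stands in for the database application.

```python
import sqlite3

# The database itself: the actual content (held in memory for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, team TEXT)")
conn.executemany(
    "INSERT INTO staff (name, team) VALUES (?, ?)",
    [("Aroha", "Records"), ("Ben", "Finance"), ("Mere", "Records")],
)
conn.commit()


def find_by_team(connection, team):
    """Database application: a simple search offered to the user.

    The SQL is passed to SQLite (the Database Management System),
    which sits between this function and the stored data.
    """
    rows = connection.execute(
        "SELECT name FROM staff WHERE team = ? ORDER BY name", (team,)
    )
    return [name for (name,) in rows]


records_team = find_by_team(conn, "Records")
```

Preserving only the stored rows would lose the search logic held in the application layer, which is one reason all three components need to be considered together.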

Why is managing datasets important?

  • Datasets are often created and managed to provide evidence of an organisation’s core functions and are necessary for business continuity, accountability, and evidence-based decisions.
  • Datasets can be a type of public record under the Public Records Act 2005 and so need to be managed accordingly.
  • Datasets can be a rich source for quantitative economic and social research, as records of past actions and snapshots of a particular point in time.
  • Datasets present challenges to conventional long-term management strategies:
    • While the content can be just text, which is easy to preserve, it can be hard or impossible to understand this content without good-quality contextual documentation or metadata.
    • Meaningful access to information in databases can rely on complicated relationships between data.
    • Datasets often have unique security and confidentiality requirements because of the types and volume of content that they hold.

Planning for effective dataset management

Create a strategy or plan for the stewardship and preservation of your datasets, from their creation through to disposal, considering all possible uses for the data. Here are the steps:

1. Assign responsibility

Ensure that responsibility for managing each dataset is assigned to someone in your organisation.

2. Create appropriate metadata

Datasets, like all records, require metadata to ensure they provide evidence of business activity and can be accessed for as long as they are required. Identify relevant standards for data/metadata content and format. The Archives New Zealand Electronic Recordkeeping Metadata Standard contains minimum requirements for recordkeeping metadata.
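
One simple way to keep metadata with a dataset is a small descriptive record stored alongside the data file. The sketch below is illustrative only: the field names (title, creator, date_created, format, description), the sample values and the function build_metadata are assumptions made for this example and are not taken from the Electronic Recordkeeping Metadata Standard, which should be consulted for the actual minimum requirements.

```python
import json
from datetime import date


def build_metadata(title, creator, created, file_format, description):
    """Return a minimal descriptive record for a dataset.

    The field names are hypothetical; map them to whichever
    metadata standard your organisation has adopted.
    """
    record = {
        "title": title,
        "creator": creator,
        "date_created": created.isoformat(),
        "format": file_format,
        "description": description,
    }
    # Refuse to produce a record with empty fields.
    missing = [key for key, value in record.items() if not value]
    if missing:
        raise ValueError(f"Missing metadata fields: {', '.join(missing)}")
    return record


metadata = build_metadata(
    title="Staff training register",
    creator="Human Resources",
    created=date(2009, 3, 1),
    file_format="text/csv",
    description="One row per completed training course.",
)

# Text that could be saved next to the dataset as a sidecar file.
sidecar_text = json.dumps(metadata, indent=2, sort_keys=True)
```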

3. Make multiple (back-up) copies of valuable datasets

Store some of them off-site and in different systems. Your vital records, business continuity, or disposal documentation may assist with identifying your valuable datasets.
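
A copy is only useful if it matches the original. The sketch below, which assumes the dataset is a single file, copies it to several locations and compares SHA-256 checksums to confirm that each copy is identical. The function names sha256_of and back_up are hypothetical, and the destination folders would in practice be separate storage systems, including an off-site one.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copies content and file timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Recording the checksum with the dataset's documentation also allows the copies to be re-checked later, not just at the time of copying.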

4. Plan for data migration

Plan the transition of datasets to new storage media and software systems in advance. Include budgetary planning for new storage and software technologies, file format migrations, and timeframes to complete the work. Storing datasets on new technologies before existing storage media become obsolete may help to prevent information loss.
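
As one possible example of a file format migration, the sketch below exports a table from an SQLite database to CSV, a plain-text format, and returns the number of rows written so that it can be checked against the source. The function export_table_to_csv and the readings table are hypothetical. A real migration would also need to carry across data types, relationships between tables and the accompanying metadata, none of which CSV records by itself.

```python
import csv
import io
import sqlite3


def export_table_to_csv(connection, table, out_file):
    """Write every row of `table` to out_file as CSV; return the row count."""
    # Table names cannot be passed as SQL parameters, so check the
    # name against the database's own list of tables first.
    known = {
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"Unknown table: {table}")

    cursor = connection.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out_file)
    writer.writerow([column[0] for column in cursor.description])  # header row
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Usage with a small in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (site TEXT, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [("A", 1.5), ("B", 2.0)])

buffer = io.StringIO()
exported = export_table_to_csv(conn, "readings", buffer)
source_rows = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

Comparing exported with source_rows is a basic completeness check; it does not prove that the values themselves survived unchanged.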

5. Plan for transitions in data stewardship

If the data will eventually be turned over to a formal repository or other custodial environment, ensure that it meets the requirements of the new environment and that the new steward has agreed to take it on.

6. Tailor plans for preservation and access to the expected use

For example, gene-sequence data used daily by thousands of researchers worldwide may need a different preservation and access infrastructure to an internal human resources database used to manage staff information.

7. Pay attention to security

Be aware of what you must do to maintain the integrity of your datasets and prevent unauthorised access.

8. Identify all relevant legislation

Ensure your approaches to stewardship, access and disposal comply with all relevant legislation, such as the Public Records Act 2005 and the Privacy Act 1993. Be aware that datasets may also be accessible under the Official Information Act 1982. Sector-specific legislation may apply to the datasets as well.

9. Know the value and retention requirements

Datasets may be of long-term or short-term value. Make sure that they are covered by a current disposal authority authorised by the Chief Archivist.


More Information

Archives New Zealand. S8 Electronic Recordkeeping Metadata Standard. June 2008.

http://continuum.archives.govt.nz/recordkeeping-publications.html#standards

The Common Data Format website.   

http://cdf.gsfc.nasa.gov/

The Data Documentation Initiative (DDI).    

http://www.ddialliance.org/

Digital Preservation Europe (DPE). Database Preservation. March 2009.

http://www.digitalpreservationeurope.eu/publications/briefs/english.php#25

The Dutch National Archive. From digital volatility to digital permanence. Preserving databases. December 2003.

http://www.digitaleduurzaamheid.nl/bibliotheek/docs/volatility-permanence-databases-en.pdf

Electronic Resource Preservation and Access Network (ERPANET). Conference Proceedings on Long-term Preservation of Databases. April 2003.

http://www.erpanet.org/events/2003/bern/

The Statistical Data and Metadata Exchange (SDMX) standard website. 

http://sdmx.org/

Contact us:

For recordkeeping advice and assistance, please contact Archives New Zealand at rkadvice@archives.govt.nz