Formatting datasets for upload

From Fenix

Revision as of 14:05, 6 April 2009 by Grita (Talk | contribs)

Dataset file format

  • In the Workstation, each dataset file must be uploaded together with a metadata file. Dataset files should be prepared and saved in .csv format, i.e. as comma-delimited text files. In this format each row of data starts on a new line, and each column value is separated from the next by a comma. Users should format their datasets using the Excel templates provided in Appendix 1 and save the files as .csv by selecting the Save As>CSV (Comma delimited) (*.csv) option. (See a sample of a .csv dataset file below.)
  • Please note that, due to current formatting restrictions in the Workstation, you should avoid re-opening a .csv dataset file in Excel once it has been saved. Doing so results in some loss of formatting, particularly of dates and of any codes beginning with 0. This problem should be resolved in subsequent releases.
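For users who prepare datasets with a script rather than with the Excel templates, the comma-delimited layout described above can be sketched in Python with the standard csv module. This is only an illustration: the column names and values are hypothetical and are not taken from Appendix 1. Codes and dates are kept as text, so leading zeros and the YYYY-MM-DD form are written exactly as given.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Serialise a header row and data rows as comma-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)   # first row: column headers
    writer.writerows(rows)    # following rows: data content
    return buffer.getvalue()


# Hypothetical column names and values, for illustration only.
header = ["species_code", "date", "quantity"]
rows = [
    ["0123", "2009-04-06", "15"],
    ["0456", "2009-04-07", "8"],
]

csv_text = rows_to_csv_text(header, rows)
print(csv_text)
```

To save the result to disk, the same writer can be pointed at a file opened with open(path, "w", newline="").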


What to do if you need to re-open or edit a previously saved .csv file in Excel

  • Currently some formatting issues still need to be resolved in the Workstation. This means that when a previously saved .csv dataset file is re-opened in Excel, some of the pre-defined formatting may be lost. It is therefore advisable to open .csv dataset files in a text editor (e.g. Notepad or Textpad) to make any changes. However, should you need to re-open a previously formatted .csv file in Excel, you will have to re-enter any codes beginning with 0 and reformat the date column as YYYY-MM-DD. To reformat the dates, select the date column, right-click on it and select Format Cells>Number tab>date format yyyy-mm-dd.
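As an alternative to a text editor, a small change can also be made with a script. The sketch below is an assumption-laden illustration, not part of the Workstation: the function name edit_cell and the column names are hypothetical. It relies on the fact that Python's csv module reads every value as text, so codes beginning with 0 and dates in YYYY-MM-DD form pass through unchanged.

```python
import csv
import io


def edit_cell(csv_text, row_index, column, new_value):
    """Change one cell and return the updated CSV text.

    row_index counts data rows from 0 (the header row is not counted).
    All values are handled as strings, so nothing is reformatted.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames
    rows = list(reader)
    rows[row_index][column] = new_value

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical content, for illustration only.
original = "species_code,date\n0123,2009-04-06\n"
updated = edit_cell(original, 0, "date", "2009-04-07")
print(updated)
```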


Dataset types

  • Several dataset types can be loaded in the Workstation. Refer to Appendix 1 for the specification of the column headers for each dataset type and to download the .xls templates used to prepare the datasets for upload. In a dataset file, the first row should contain the column headers and the following rows the data content. Column headers must be identical for all datasets of the same type and must follow the given order.
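The header rule above (same names, same order) can be checked before upload. The following is a minimal sketch, assuming the expected headers have been copied from the relevant template in Appendix 1; the header names used here are hypothetical placeholders and the function name headers_match is not part of the Workstation.

```python
import csv
import io


def headers_match(csv_text, expected_headers):
    """Return True if the first row equals the expected headers, in order."""
    reader = csv.reader(io.StringIO(csv_text))
    actual = next(reader, [])  # empty list if the file has no rows
    return actual == list(expected_headers)


# Hypothetical headers; the real ones are specified in Appendix 1.
EXPECTED = ["species_code", "date", "quantity"]

good = "species_code,date,quantity\n0123,2009-04-06,15\n"
wrong_order = "date,species_code,quantity\n2009-04-06,0123,15\n"

print(headers_match(good, EXPECTED))
print(headers_match(wrong_order, EXPECTED))
```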


Image:Sample_dataset.png


Metadata files

  • All dataset files must be accompanied by a metadata file containing information about the data. The metadata file is in .xml format and is generated automatically by the metadata editor (see Running the Metadata Editor). No naming conventions currently apply to dataset files. However, each metadata file must have the same name as the dataset it refers to, with the ending "_Metadata" added.
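The naming rule can be illustrated with a short sketch. It assumes that "_Metadata" is appended to the dataset name without its .csv extension and that the metadata file then carries the .xml extension; the function name and the example file name are hypothetical.

```python
from pathlib import Path


def metadata_file_name(dataset_file_name):
    """Build the metadata file name for a dataset file.

    Assumption: the dataset's base name (without extension) is kept,
    "_Metadata" is appended, and the file is saved as .xml.
    """
    stem = Path(dataset_file_name).stem
    return f"{stem}_Metadata.xml"


# Hypothetical dataset name, for illustration only.
print(metadata_file_name("landings_2008.csv"))
```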