Difference between revisions of "File formats"

From WeSISpedia

Revision as of 15:01, 18 March 2019

This page describes the file formats suitable and accepted for upload to WeSIS. Two templates, one for monadic and one for dyadic data, are stored in "Seafile > WeSIS > Data Templates" (each as a CSV and an XLSX file).

NOTE: THIS PAGE IS STILL UNDER CONSTRUCTION!

General remarks

When preparing the file, you may use the first row, and only the first row (!), for any kind of additional information (coder information, project notes, date of last update, etc.). This line must start with the hash character # (as if it were a comment in R).
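
The rule above could be handled on the reading side roughly as follows. This is a minimal sketch, not part of the WeSIS specification: the helper name `read_wesis_csv`, the comma delimiter and the sample content are assumptions made for illustration. The first line is treated as a note only if it starts with #, and everything after it is parsed as regular CSV.

```python
import csv
import io


def read_wesis_csv(text, delimiter=","):
    """Split an upload file into its optional note line and its data rows.

    Hypothetical helper: the name and the default delimiter are assumptions.
    """
    lines = text.splitlines()
    note = None
    # Only the first row may carry free-form information, marked by '#'.
    if lines and lines[0].startswith("#"):
        note = lines[0][1:].strip()
        lines = lines[1:]
    # The remaining lines are parsed as CSV, with the next line as the header.
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter=delimiter)
    return note, list(reader)


# Illustrative content only; the note text and the data row are made up.
sample = (
    "# coder: A. Example, project notes go here\n"
    "cow_code,country_name,year\n"
    "2,United States of America,2000\n"
)
note, rows = read_wesis_csv(sample)
print(note)
print(rows[0]["country_name"])
```

Keeping the note separate from the data rows means that a stray # line further down the file is not silently skipped but shows up as a data row, where it can be caught.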

Many coding rules have been agreed upon and must be followed. Please familiarize yourself with these rules and check the site regularly to see whether further decisions have been added. Among other things, the rules cover country codes, date-time formats and technical variable names, as well as other aspects that affect data collection. They ensure consistency and reduce errors.

Both templates include

  1. mandatory and standardized columns for the data per se,
  2. mandatory and standardized columns for metadata, and
  3. (unlimited) optional columns for additional information regarding the data.

Standards for optional column names will evolve according to the agreed principle of "first come, first served". If you want to use optional columns:

  • Check the page on optional column names to see whether another project has already made a suggestion that fits your needs.
  • If a name already exists and its description suits the information you want to file under it, use the existing name.
  • If no existing name suits the information you want to store, create a new name and describe the purpose of the column on that page so that others may use it later.

Mandatory columns for the monadic template

{|class="wikitable sortable" border="1"
!Column name!!Column number!!Mandatory?!!Standardized?!!Type!!Description!!Comments
|-
|cow_code||1||yes||yes||Numeric||Country code according to the COW scheme||See the list of country codes used for WeSIS and WeSISpedia.
|-
|country_name||2||yes||yes||String||Country name||See the list of country codes used for WeSIS and WeSISpedia.
|-
|year||3||yes||yes||Numeric||The year ''value'' refers to||Note that a date, such as an election date or the date a policy was introduced, is itself a ''value''. The ''year'' column may therefore "double" the ''value''.
|-
|technical_variable_name||4||yes||partly||String||Technical name of an indicator||Make sure to follow the naming convention for technical names in WeSIS and WeSISpedia.
|-
|value||5||yes||yes||Numeric, string or date time||The actual value of the indicator||Always use the dot (.) as the decimal separator for numeric data. For dates follow the date time format.
|-
|unit||6||yes||yes||String||The unit of ''value''||For dates the unit is "date", otherwise name the actual unit of the value (e.g. "per 1000 inhabitants", "as % of GDP" etc.)
|-
|scale||7||yes||yes||String||The scale of ''value''||Only the following values are allowed here:
* Binary
* Multinominal
* Ordinal
* Metric
* String
* Date
|-
|source||8||yes||partly||String||The source of ''value''||If ''value'' derives from a common source, please check whether there is already a harmonized abbreviation.
|-
|publication_date||9||yes||yes||Date||Date of data collection and/or upload to WeSIS||This date refers to the data collection, the last check or the upload, i.e. it does ''not'' refer to ''year'' or the date of ''value''. The common date time format still applies.
|-
|category||10||yes||partly||String||Technical name of a category an indicator belongs to||Category refers to any upper level an indicator belongs to ("parent topic"). In WeSISpedia these are called "Subcategories" of the main topics (Y, X1, X2) and are assigned by the projects.
|-
|label||11||yes||no|| ||Label of the (technical) category name||This is the easy-to-read label for ''category''.
|-
|data_quality||12||yes||yes||TBC||Currently, this is a placeholder for the quality rating, as the decision is still pending.|| 
|-
|data_quality_confidence||13||yes||yes||TBC||Currently, this is a placeholder for the quality rating, as the decision is still pending.|| 
|}
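
The column list above lends itself to an automated check before upload. The following is a minimal sketch under stated assumptions rather than an official WeSIS validator: the function names `check_header` and `check_row` are hypothetical, and it assumes that the mandatory columns occupy positions 1 to 13 in the listed order, that scale values are matched case-sensitively, and that values on the Metric scale must parse as numbers with a dot as the decimal separator.

```python
# Column names and their order are taken from the table above.
MANDATORY_COLUMNS = [
    "cow_code", "country_name", "year", "technical_variable_name",
    "value", "unit", "scale", "source", "publication_date",
    "category", "label", "data_quality", "data_quality_confidence",
]
# Allowed scale values as listed above (spelling kept as on the page).
ALLOWED_SCALES = {"Binary", "Multinominal", "Ordinal", "Metric", "String", "Date"}


def check_header(header):
    """Return a list of problems with the column order; empty if none.

    Assumption: optional columns may only follow the 13 mandatory ones.
    """
    problems = []
    for position, expected in enumerate(MANDATORY_COLUMNS, start=1):
        found = header[position - 1] if position <= len(header) else None
        if found != expected:
            problems.append(
                f"column {position}: expected {expected!r}, found {found!r}"
            )
    return problems


def check_row(row):
    """Return a list of problems found in one data row (a dict of strings)."""
    problems = []
    scale = row.get("scale")
    if scale not in ALLOWED_SCALES:
        problems.append(f"scale {scale!r} is not an allowed value")
    if scale == "Metric":
        # Assumption: metric values are plain numbers; '1,5' fails here
        # because the decimal separator must be a dot.
        try:
            float(str(row.get("value") or ""))
        except ValueError:
            problems.append(f"value {row.get('value')!r} is not a number")
    if scale == "Date" and row.get("unit") != "date":
        problems.append("for dates the unit must be 'date'")
    return problems


print(check_header(MANDATORY_COLUMNS + ["my_optional_column"]))
print(check_row({"scale": "Metric", "value": "1,5", "unit": "as % of GDP"}))
```

Returning a list of problems instead of raising on the first one lets a coder fix a whole file in one pass. Checks for the country codes and the date time format are left out because their rules are defined on other pages.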

Long format: Dyadic

Explanation of mandatory columns.

Explanation of metadata columns.

Explanation of optional columns.