File formats

From WeSISpedia

Revision as of 23:13, 16 March 2020

This page describes the file formats accepted for upload to WeSIS. Two templates, one for monadic and one for dyadic data, are stored in "Seafile > WeSIS > Data Templates" (each as a CSV and an XLSX file).

NOTE: Both templates are approved and ready for use. How to judge data quality is still under discussion, however, so how to enter that judgement into the template has yet to be confirmed. Even so, make sure to retain both data quality columns (data_quality and data_quality_confidence).


General remarks

When preparing the file, the first row, and only the first row (!), can be used for any kind of additional information (coder information, project notes, date of last update, etc.). That line must start with the hash character # (as if it were a comment in R).
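
The first-row rule can be handled when reading a file. The following is a minimal Python sketch, not an official WeSIS tool: it assumes a comma-separated file (the delimiter is not specified on this page), and the function name `read_template` and the sample content are hypothetical.

```python
import csv
import io


def read_template(text):
    """Split template text into its optional note line and its data rows.

    Assumption: values are comma-separated; this page does not state the delimiter.
    """
    lines = text.splitlines()
    note = None
    # Only the first row may carry free-form information, marked by a leading '#'.
    if lines and lines[0].startswith("#"):
        note = lines[0][1:].strip()
        lines = lines[1:]
    # The next line is treated as the header row with the column names.
    rows = list(csv.DictReader(io.StringIO("\n".join(lines))))
    return note, rows


# Hypothetical sample: a note line, a shortened header and one data row.
sample = (
    "# coder: J. Doe, last updated 2020-03-16\n"
    "cow_code,country_name,year\n"
    "255,Germany,2010\n"
)
note, rows = read_template(sample)
print(note)
print(rows[0]["country_name"])
```

Because only the first row is treated as a note, any later line starting with # would be read as data, which matches the rule that additional information is confined to the first row.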

Many coding rules have been agreed upon and must be followed. Please familiarize yourself with these rules and check the site regularly for further decisions. The rules cover, among other things, country codes, date time formats and technical variable names, as well as other aspects that affect data collection; they ensure consistency and reduce errors.

Both templates include

  1. mandatory and standardized columns for the data per se,
  2. mandatory and standardized columns for metadata, and
  3. (unlimited) optional columns for additional information regarding the data.

Standards for optional column names will evolve according to the agreed principle of "first come, first served". If you want to use optional columns:

  • Check the page on optional column names to see whether another project has already made a suggestion that fits your needs.
  • If a name already exists and its description suits your information, use the existing name.
  • If no name suits the information you want to store, create a new name and describe the purpose of the column on that page so others can use it later.

Mandatory columns for the monadic template

Column name | Column number | Standardized? | Type | Description | Comments
cow_code | 1 | yes | Numeric | Country code according to the COW scheme | See the list of country codes used for WeSIS and WeSISpedia.
country_name | 2 | yes | String | Country name | See the list of country codes used for WeSIS and WeSISpedia.
year | 3 | yes | Numeric | The year that value refers to | Note that a date, such as an election date or the date a policy was introduced, is itself a value; year may thus "double" (repeat) part of value.
technical_variable_name | 4 | partly | String | Technical name of an indicator | Make sure to follow the naming convention for technical names in WeSIS and WeSISpedia.
value | 5 | yes | Numeric, string or date | The actual value of the indicator | Always use the dot (.) as the decimal separator for numeric data. For dates, follow the date time format.
unit | 6 | yes | String | The unit of value | For dates the unit is "date"; otherwise name the actual unit of the value (e.g. "per 1000 inhabitants", "as % of GDP").
scale | 7 | yes | String | The scale of value | Only one of six possible values is allowed here: Binary, Multinominal, Ordinal, Metric, String, Date.
source | 8 | partly | String | The source of value | If value derives from a common source, please check whether a harmonized abbreviation already exists.
publication_date | 9 | yes | Date | Date of data collection and/or upload to WeSIS | This date refers to the data collection, the last check or the upload, i.e. it does not refer to year or to the date of value. The common date time format still applies.
category | 10 | partly | String | Name of the category an indicator belongs to | Category refers to the upper level an indicator belongs to ("parent topic"). In WeSISpedia these are called "categories" of the main topics (Y, X1, X2).
label | 11 | no | String | Label of the (technical) category name | This is an easy-to-read label of the indicator. It is the same as "indicator name" in WeSISpedia and label in the Quick info box of an indicator page in WeSISpedia.
data_quality | 12 | yes | TBC | TBC | Currently a placeholder for the quality rating, as the decision is still pending.
data_quality_confidence | 13 | yes | TBC | TBC | Currently a placeholder for the quality rating, as the decision is still pending.
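
The column order and some of the rules in the table above can be checked before upload. The following Python sketch is illustrative only and is not the validation WeSIS itself performs. It assumes that optional columns follow the 13 mandatory ones, that scale values are matched case-sensitively, and that numeric values carry the scale "Metric"; the function name `check_monadic` and the sample rows are hypothetical.

```python
MONADIC_COLUMNS = [
    "cow_code", "country_name", "year", "technical_variable_name", "value",
    "unit", "scale", "source", "publication_date", "category", "label",
    "data_quality", "data_quality_confidence",
]
ALLOWED_SCALES = {"Binary", "Multinominal", "Ordinal", "Metric", "String", "Date"}


def check_monadic(header, rows):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumption: mandatory columns occupy positions 1 to 13 in the documented
    # order, and optional columns come after them.
    if header[:len(MONADIC_COLUMNS)] != MONADIC_COLUMNS:
        problems.append("mandatory columns missing or out of order")
        return problems
    for i, row in enumerate(rows, start=1):
        if row["scale"] not in ALLOWED_SCALES:
            problems.append(f"row {i}: scale {row['scale']!r} is not allowed")
        # Numeric data must use the dot as the decimal separator.
        if row["scale"] == "Metric" and "," in row["value"]:
            problems.append(f"row {i}: value {row['value']!r} uses a comma, not a dot")
    return problems


# Hypothetical rows: one valid, one with an unknown scale, one with a decimal comma.
header = MONADIC_COLUMNS + ["my_note"]  # "my_note" stands in for an optional column
good = dict.fromkeys(header, "")
good.update(scale="Metric", value="3.5")
bad_scale = dict(good, scale="Interval")
bad_decimal = dict(good, value="3,5")
print(check_monadic(header, [good, bad_scale, bad_decimal]))
```

The check deliberately stops at the header if the mandatory columns are not in place, since row-level checks would be unreliable in that case.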

Mandatory columns for the dyadic template

Column name | Column number | Standardized? | Type | Description | Comments
cow_code_sender | 1 | yes | Numeric | Country code of the "sender" according to the COW scheme | "Sender" is used here synonymously with any node in undirected networks. See the list of country codes used for WeSIS and WeSISpedia.
country_name_sender | 2 | yes | String | Country name of the "sender" | See the list of country codes used for WeSIS and WeSISpedia.
cow_code_receiver | 3 | yes | Numeric | Country code of the "receiver" according to the COW scheme | "Receiver" is used here synonymously with any node in undirected networks. See the list of country codes used for WeSIS and WeSISpedia.
country_name_receiver | 4 | yes | String | Country name of the "receiver" | See the list of country codes used for WeSIS and WeSISpedia.
year | 5 | yes | Numeric | The year that value refers to | Note that a date, such as an election date or the date a policy was introduced, is itself a value; year may thus "double" (repeat) part of value.
technical_variable_name | 6 | partly | String | Technical name of an indicator | Make sure to follow the naming convention for technical names in WeSIS and WeSISpedia.
value | 7 | yes | Numeric, string or date time | The actual value of the indicator | Always use the dot (.) as the decimal separator for numeric data. For dates, follow the date time format.
unit | 8 | yes | String | The unit of value | For dates the unit is "date"; otherwise name the actual unit of the value (e.g. "per 1000 inhabitants", "as % of GDP").
scale | 9 | yes | String | The scale of value | Only one of six possible values is allowed here: Binary, Multinominal, Ordinal, Metric, String, Date.
source | 10 | partly | String | The source of value | If value derives from a common source, please check whether a harmonized abbreviation already exists.
publication_date | 11 | yes | Date | Date of data collection and/or upload to WeSIS | This date refers to the data collection, the last check or the upload, i.e. it does not refer to year or to the date of value. The common date time format still applies.
category | 12 | partly | String | Name of the category an indicator belongs to | Category refers to the upper level an indicator belongs to ("parent topic"). In WeSISpedia these are called "categories" of the main topics (Y, X1, X2).
label | 13 | no | String | Label of the (technical) category name | This is an easy-to-read label of the indicator. It is the same as "indicator name" in WeSISpedia and label in the Quick info box of an indicator page in WeSISpedia.
data_quality | 14 | yes | TBC | TBC | Currently a placeholder for the quality rating, as the decision is still pending.
data_quality_confidence | 15 | yes | TBC | TBC | Currently a placeholder for the quality rating, as the decision is still pending.
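
As an illustration of the dyadic layout, the following Python sketch writes one dyadic row to CSV text with the 15 mandatory columns in the documented order. It is a sketch under assumptions rather than an official exporter: the comma delimiter, the function name `write_dyadic`, the indicator name and the row values are all hypothetical.

```python
import csv
import io

DYADIC_COLUMNS = [
    "cow_code_sender", "country_name_sender",
    "cow_code_receiver", "country_name_receiver",
    "year", "technical_variable_name", "value", "unit", "scale", "source",
    "publication_date", "category", "label",
    "data_quality", "data_quality_confidence",
]


def write_dyadic(rows, note=None):
    """Serialize dyadic rows to CSV text, with an optional '#' note as first row."""
    out = io.StringIO()
    if note is not None:
        out.write(f"# {note}\n")
    # Columns missing from a row are written as empty strings.
    writer = csv.DictWriter(out, fieldnames=DYADIC_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical row; the remaining mandatory columns are left empty for brevity,
# whereas a real upload would need them filled in.
row = {
    "cow_code_sender": 255, "country_name_sender": "Germany",
    "cow_code_receiver": 220, "country_name_receiver": "France",
    "year": 2010,
    "technical_variable_name": "example_indicator",  # hypothetical name
    "value": "12.5", "unit": "as % of GDP", "scale": "Metric",
}
print(write_dyadic([row], note="project notes go here"))
```

For undirected networks the same layout applies; which node is entered as "sender" and which as "receiver" then carries no meaning.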

Misc

Contributors: Sebastian Haunss, Nils Düpont, Gabriela León, Gabriella Skitalinska, and Nate Breznau

Revisions:

  • CRC internal release of templates on March 06, 2019