File formats
Nils Duepont
Revision as of 15:14, 18 March 2019
This page describes the file formats that are suitable and accepted for upload to WeSIS. Two templates, one for monadic and one for dyadic data, are stored in "Seafile > WeSIS > Data Templates" (each as a CSV and an XLSX file).
NOTE: THIS PAGE IS STILL UNDER CONSTRUCTION!
General remarks
When preparing the file, the first row, and only the first row (!), can be used for any kind of additional information (coder information, project notes, date of last update etc.). This line must start with the hash character # (as if it were a comment in R).
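The rule about the first row can be illustrated with a minimal Python sketch. It assumes a comma-delimited CSV file; the function name `read_wesis_csv` and the sample content are hypothetical and only serve as illustration, they are not part of WeSIS.

```python
import csv
import io


def read_wesis_csv(text):
    """Split file content into the optional note line and the data rows.

    Assumption: the file is comma-delimited and the header row directly
    follows the optional '#' line.
    """
    buf = io.StringIO(text, newline="")
    first = buf.readline()
    note = None
    if first.startswith("#"):
        # Only the very first line may carry free-form information.
        note = first[1:].strip()
    else:
        # No note line: rewind so the header row is not lost.
        buf.seek(0)
    rows = list(csv.DictReader(buf))
    return note, rows


# Hypothetical sample content, shortened to three columns for brevity.
sample = (
    "# coder: A. Example; last updated 2019-03-18\n"
    "cow_code,country_name,year\n"
    "2,United States of America,2000\n"
)
note, rows = read_wesis_csv(sample)
```

Here `note` holds the text after the # character and `rows` holds one dictionary per data row, keyed by column name. A # line anywhere other than the first row is not treated specially and would be read as data.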
Many coding rules have been agreed upon and shall be followed. Please be aware of these rules and regularly check the site to see if further decisions have been added. The rules cover, among other things, country codes, date-time formats, technical variable names and other aspects of the data collection; they ensure consistency and reduce errors.
Both templates include
- mandatory and standardized columns for the data per se,
- mandatory and standardized columns for metadata, and
- (unlimited) optional columns for additional information regarding the data.
Standards for optional column names will evolve according to the agreed principle of "first come, first served". If you want to use optional columns,
- Check the page on optional column names to see whether another project has already made a suggestion that fits your needs.
- If a name already exists and you find the description suitable to file your information under this heading, please apply the existing one.
- If there is no name that suits the information you want to store, create a new name and describe the purpose of the column on the page so others may use it later on.
Mandatory columns for the monadic template
Column name | Column number | Mandatory? | Standardized? | Type | Description | Comments |
---|---|---|---|---|---|---|
cow_code | 1 | yes | yes | Numeric | Country code according to the COW scheme | See the list of country codes used for WeSIS and WeSISpedia. |
country_name | 2 | yes | yes | String | Country name | See the list of country codes used for WeSIS and WeSISpedia. |
year | 3 | yes | yes | Numeric | The year that value refers to | Note that a date, such as an election date or the date a policy was introduced, is itself a value; year may therefore duplicate ("double") information already contained in value. |
technical_variable_name | 4 | yes | partly | String | Technical name of an indicator | Make sure to follow the naming convention for technical names in WeSIS and WeSISpedia. |
value | 5 | yes | yes | Numeric, string or date time | The actual value of the indicator | Always use the dot (.) as the decimal separator for numeric data. For dates follow the date time format. |
unit | 6 | yes | yes | String | The unit of value | For dates the unit is "date", otherwise name the actual unit of the value (e.g. "per 1000 inhabitants", "as % of GDP" etc.) |
scale | 7 | yes | yes | String | The scale of value | Only five possible values are allowed here: |
source | 8 | yes | partly | String | The source of value | If value derives from a common source please check if there is already a harmonized abbreviation. |
publication_date | 9 | yes | yes | Date | Date of data collection and/or upload to WeSIS | This date refers to the data collection, the last check or the upload; it does not refer to year or to a date stored in value. The common date time format applies nonetheless. |
category | 10 | yes | partly | String | Technical name of a category an indicator belongs to | Category refers to any upper level an indicator belongs to ("parent topic"). In WeSISpedia these are called "Subcategories" of the main topics (Y, X1, X2) and are assigned by the projects. |
label | 11 | yes | no | String | Label of the (technical) category name | This is the easy-to-read label for category. |
data_quality | 12 | yes | yes | TBC | TBC | Currently, this is a placeholder for the quality rating as the decision is still pending. |
data_quality_confidence | 13 | yes | yes | TBC | TBC | Currently, this is a placeholder for the quality rating as the decision is still pending. |
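As a rough illustration of the table above, the following Python sketch checks that a header row carries the thirteen mandatory monadic columns. It assumes that "Column number" means the literal position in the file and that optional columns follow the mandatory ones; neither assumption is confirmed on this page. The function name `check_header` is hypothetical.

```python
# Mandatory monadic columns in the order given by "Column number" above.
MONADIC_COLUMNS = [
    "cow_code", "country_name", "year", "technical_variable_name", "value",
    "unit", "scale", "source", "publication_date", "category", "label",
    "data_quality", "data_quality_confidence",
]


def check_header(header, mandatory=MONADIC_COLUMNS):
    """Return a list of problems; an empty list means the header looks fine.

    Assumption: mandatory columns come first and in table order, and any
    further (optional) columns are appended after them.
    """
    problems = []
    for position, expected in enumerate(mandatory, start=1):
        found = header[position - 1] if position <= len(header) else None
        if found != expected:
            problems.append(
                f"column {position}: expected {expected!r}, found {found!r}"
            )
    return problems


# A header with one hypothetical optional column appended passes the check.
ok = check_header(MONADIC_COLUMNS + ["my_optional_note"])

# Swapping the first two columns is reported for both positions.
swapped = check_header(["country_name", "cow_code"] + MONADIC_COLUMNS[2:])
```

The same function could be reused for the dyadic template by passing a different list as `mandatory`.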
Mandatory columns for the dyadic template
Column name | Column number | Mandatory? | Standardized? | Type | Description | Comments |
---|---|---|---|---|---|---|
cow_code_sender | 1 | yes | yes | Numeric | Country code of the "sender" according to the COW scheme | In undirected networks, "sender" simply denotes one of the two nodes. See the list of country codes used for WeSIS and WeSISpedia. |
country_name_sender | 2 | yes | yes | String | Country name of the "sender" | See the list of country codes used for WeSIS and WeSISpedia. |
cow_code_receiver | 3 | yes | yes | Numeric | Country code of the "receiver" according to the COW scheme | In undirected networks, "receiver" simply denotes the other of the two nodes. See the list of country codes used for WeSIS and WeSISpedia. |
country_name_receiver | 4 | yes | yes | String | Country name of the "receiver" | See the list of country codes used for WeSIS and WeSISpedia. |
year | 5 | yes | yes | Numeric | The year that value refers to | Note that a date, such as an election date or the date a policy was introduced, is itself a value; year may therefore duplicate ("double") information already contained in value. |
technical_variable_name | 6 | yes | partly | String | Technical name of an indicator | Make sure to follow the naming convention for technical names in WeSIS and WeSISpedia. |
value | 7 | yes | yes | Numeric, string or date time | The actual value of the indicator | Always use the dot (.) as the decimal separator for numeric data. For dates follow the date time format. |
unit | 8 | yes | yes | String | The unit of value | For dates the unit is "date", otherwise name the actual unit of the value (e.g. "per 1000 inhabitants", "as % of GDP" etc.) |
scale | 9 | yes | yes | String | The scale of value | Only five possible values are allowed here: |
source | 10 | yes | partly | String | The source of value | If value derives from a common source please check if there is already a harmonized abbreviation. |
publication_date | 11 | yes | yes | Date | Date of data collection and/or upload to WeSIS | This date refers to the data collection, the last check or the upload; it does not refer to year or to a date stored in value. The common date time format applies nonetheless. |
category | 12 | yes | partly | String | Technical name of a category an indicator belongs to | Category refers to any upper level an indicator belongs to ("parent topic"). In WeSISpedia these are called "Subcategories" of the main topics (Y, X1, X2) and are assigned by the projects. |
label | 13 | yes | no | String | Label of the (technical) category name | This is the easy-to-read label for category. |
data_quality | 14 | yes | yes | TBC | TBC | Currently, this is a placeholder for the quality rating as the decision is still pending. |
data_quality_confidence | 15 | yes | yes | TBC | TBC | Currently, this is a placeholder for the quality rating as the decision is still pending. |
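A few of the row-level rules from the dyadic table can be sketched in Python as follows. This is only an illustration under assumptions: rows are dictionaries of strings as produced by a CSV reader, and a value such as "3,5" is taken to be a number written with a decimal comma. The function name `check_dyadic_row` and the sample rows are hypothetical.

```python
import re

# Matches strings such as "3,5" or "-12,75" that look like decimal-comma numbers.
DECIMAL_COMMA = re.compile(r"^-?\d+,\d+$")


def check_dyadic_row(row):
    """Return a list of problems found in one dyadic data row."""
    problems = []
    # The two country codes and the year are typed "Numeric" in the table.
    for name in ("cow_code_sender", "cow_code_receiver", "year"):
        cell = row.get(name, "").strip()
        if not cell.isdecimal():
            problems.append(f"{name} is not numeric: {cell!r}")
    # Numeric values must use the dot as decimal separator.
    if DECIMAL_COMMA.match(row.get("value", "").strip()):
        problems.append("value uses a comma as decimal separator; use a dot")
    return problems


# Hypothetical rows, reduced to the columns that are checked here.
good_row = {
    "cow_code_sender": "2",
    "cow_code_receiver": "20",
    "year": "2000",
    "value": "3.5",
}
bad_row = dict(good_row, year="two thousand", value="3,5")

good_problems = check_dyadic_row(good_row)
bad_problems = check_dyadic_row(bad_row)
```

Since value may also hold strings, the decimal-comma check only flags cells that consist of digits around a single comma; free text containing commas is left alone.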