Data quality

Note: This page is currently under construction.

In the social sciences, data quality is usually discussed under the headings of validity and reliability, in connection with concepts, measurement, and data. Few authors, however, discuss transparency in connection with data quality, and there is no consistent understanding of what data quality means in the social sciences. In their well-known book, King et al. (1994, 24) set out five guidelines for "improving data quality", encompassing

valid measures,
reliable data collections,
replicable analyses,
a thorough documentation of the data generating process, and
collecting "data on as many of [a theory’s] observable implications as possible".

Today, however, far more data is available, and this poses new challenges: counterfeit data, typos, and unintended mistakes in data collection, for example, are problems that go beyond invalid and unreliable measures. For this reason, we opted for a broader concept that takes the data quality definitions commonly used in information systems design as its starting point (cf. Pipino et al. 2002).

References

King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
Pipino, Leo L., Yang W. Lee, and Richard Y. Wang. 2002. "Data quality assessment." Communications of the ACM 45 (4): 211–18.
