DH201 Post #7: Big Data, Second Draft

OK, here’s a more coherent entry for my Digital Humanities Encyclopedia project.

The term big data is descriptive of the phenomenon it names. Big data refers to streams of bits, numbers, texts, images, interactions, stimuli, or other information that are too massive in quantity to process using single computers or even small clusters of machines. In the digital humanities, practitioners combine data (big and not so big) with technical tools to ask and answer humanist questions. One example is the project written up in “Quantitative Analysis of Culture Using Millions of Digitized Books.” In this project, the authors analyzed a subset of the Google Books corpus to shed light on issues like cultural forgetfulness, censorship, and changes in language use.

  • Videos:
  • Academic (but engaging) articles on the phenomenon:

  • Image (linked to a Smithsonian article): “big data is getting bigger at a stunning rate”

Network graph of people on Twitter connecting to the topics of Big Data, Infochimps, or Hadoop

  • Tags: when talking about big data, people may use any of the following terms: big data, research data, data deluge, big science, fourth paradigm.
  • Twitter hashtag: #bigdata is the suggested hashtag to use when tweeting about big data. It’s already being used aplenty in Twitter-land.
