Week 1: Wednesday

Data Literacy

What is Data?

  • What do you think?
  • The quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media. (Oxford Languages dictionary)
  • Things known or assumed as facts, making the basis of reasoning or calculation. (Oxford Languages dictionary)
  • The symbolic representation of reality (me)

All Data has a type

  • integer (whole number) = 1
  • float (decimal number) = 1.0
  • string (text) = "text"
  • boolean (true or false) = true
  • date (date) = 1990-01-01 (YYYY-MM-DD)
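
A minimal Python sketch of the five types listed above. The variable names are illustrative only, and the last two lines show one reason the type matters: the same characters behave differently as numbers and as text.

```python
from datetime import date

# One example value per type from the list above.
examples = {
    "integer": 1,
    "float": 1.0,
    "string": "text",
    "boolean": True,
    "date": date.fromisoformat("1990-01-01"),  # YYYY-MM-DD
}

# Python's own name for each type.
type_names = {label: type(value).__name__ for label, value in examples.items()}

# Same characters, different behavior depending on the type.
as_numbers = 1 + 1        # arithmetic on integers
as_strings = "1" + "1"    # concatenation of text

print(type_names)
print(as_numbers, as_strings)
```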

Why are these data types so important? Binary

  • ones and zeroes
  • 8 bits = 1 byte
  • What limits the size of the data we can store?
  • ASCII (American Standard Code for Information Interchange, 1963; 128 characters) vs. UTF-8 (1993; up to 1,112,064 characters)
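
The points above can be tried out in a few lines of Python. This is only a sketch: the name "Sébastien" is borrowed from the manuscript example later in these notes, and the variable names are made up for illustration.

```python
# One byte is 8 bits, so it can hold 2**8 different values.
values_per_byte = 2 ** 8

# 'A' is character number 65 in ASCII; written as 8 bits that is 01000001.
ascii_bytes = "A".encode("ascii")
bits_for_a = format(ascii_bytes[0], "08b")

# A name with an accented letter (\u00e9 is "é") falls outside ASCII's 128 characters.
name = "S\u00e9bastien"
try:
    name.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

# UTF-8 stores the accented letter in two bytes, so 9 characters become 10 bytes.
utf8_bytes = name.encode("utf-8")

print(values_per_byte, bits_for_a, fits_in_ascii, len(name), len(utf8_bytes))
```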

Complex Data Types

  • list/array = [1, 2, 3]
  • dictionary/object = {"key": "value"}
  • tuple = (1, 2, 3)
  • set = {1, 2, 3}
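
A short Python sketch of how these four types differ in practice. The variable names are illustrative; other languages use different names for the same ideas.

```python
numbers_list = [1, 2, 3]          # list: ordered, changeable, allows duplicates
numbers_list.append(3)            # now [1, 2, 3, 3]

record = {"key": "value"}         # dictionary: look values up by key
looked_up = record["key"]

numbers_tuple = (1, 2, 3)         # tuple: ordered, but cannot be changed
try:
    numbers_tuple[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

numbers_set = set(numbers_list)   # set: unordered, duplicates collapse

print(numbers_list, looked_up, tuple_changed, numbers_set)
```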

How do we store data?

  • CSV (comma separated values)
  • TSV (tab separated values)
  • JSON (JavaScript Object Notation)
  • databases (SQL, NoSQL)
  • text data (txt, doc, PDF)

CSV

ID,Name,Birthday,Age
1,John,01/01/1990,31
2,Jane,02/02/1990,31
3,Jack,03/03/1990,31
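
One way to read the sample above with Python's standard csv module. This is a sketch, not the only approach. It assumes the birthdays are month/day/year, which the sample dates cannot confirm because day and month are always equal.

```python
import csv
import io
from datetime import datetime

csv_text = """ID,Name,Birthday,Age
1,John,01/01/1990,31
2,Jane,02/02/1990,31
3,Jack,03/03/1990,31
"""

# DictReader uses the first row as column names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every value arrives as a string, so the types have to be restored by hand.
people = [
    {
        "ID": int(row["ID"]),
        "Name": row["Name"],
        # Assumption: month/day/year order.
        "Birthday": datetime.strptime(row["Birthday"], "%m/%d/%Y").date(),
        "Age": int(row["Age"]),
    }
    for row in rows
]

print(rows[0])
print(people[0])
```

CSV keeps no record of types, which is why the conversion step is needed.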

TSV

ID  Name    Birthday    Age
1   John    01/01/1990  31
2   Jane    02/02/1990  31
3   Jack    03/03/1990  31
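
The same csv module can read TSV by changing the delimiter. The sketch below assumes real tab characters between fields (written as \t). The extra row with "Doe, Jim" is a made-up example showing that a comma inside a value does not split the field.

```python
import csv
import io

# Assumption: fields are separated by real tab characters (\t).
tsv_text = (
    "ID\tName\tBirthday\tAge\n"
    "1\tJohn\t01/01/1990\t31\n"
    "2\tJane\t02/02/1990\t31\n"
    "3\tJack\t03/03/1990\t31\n"
)

tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
header, records = tsv_rows[0], tsv_rows[1:]

# Hypothetical row: the comma inside the name stays part of one field.
extra_line = "4\tDoe, Jim\t04/04/1990\t31\n"
fields = next(csv.reader(io.StringIO(extra_line), delimiter="\t"))

print(header)
print(fields)
```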

JSON

[{
  "ID": 1,
  "Name": "John",
  "Birthday": "01/01/1990",
  "Age": 31
},
{
  "ID": 2,
  "Name": "Jane",
  "Birthday": "02/02/1990",
  "Age": 31
}]
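
A sketch of loading the sample above with Python's standard json module. Unlike CSV, the numbers come back as numbers. JSON has no date type, so the birthday stays a string.

```python
import json

json_text = """
[{
  "ID": 1,
  "Name": "John",
  "Birthday": "01/01/1990",
  "Age": 31
},
{
  "ID": 2,
  "Name": "Jane",
  "Birthday": "02/02/1990",
  "Age": 31
}]
"""

people = json.loads(json_text)   # a list of dictionaries

# Check which Python type each value of the first record received.
first = people[0]
value_types = {key: type(value).__name__ for key, value in first.items()}

print(value_types)
```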

Text

  • John is 31 years old. He was born on 01/01/1990.
  • Jane is 31 years old. She was born on 02/02/1990.
  • Jack is 31 years old. He was born on 03/03/1990.

Different types of Data storage

  • Structured data
  • Semi-structured data
  • Unstructured data

Structured Data

ID  Name  Birthday    Age
1   John  01/01/1990  31
2   Jane  02/02/1990  31
3   Jack  03/03/1990  31

Semi-structured Data

[{
  "ID": 1,
  "Name": "John",
  "Birthday": "01/01/1990",
  "Age": 31,
  "Family": {
    "Father": "Jack",
    "Mother": "Jill"
  }
},
{
  "ID": 2,
  "Name": "Jane",
  "Birthday": "02/02/1990",
  "Age": 31,
  "Family": {
    "Father": "Jack",
    "Mother": "Jill"
  }
}]
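
The nested "Family" entry is what makes this semi-structured: it has no obvious column in a flat table. The sketch below reads a nested value and then flattens the record into a single row. The flatten helper and the dotted column names are illustrative choices, not a standard.

```python
record = {
    "ID": 1, "Name": "John", "Birthday": "01/01/1990", "Age": 31,
    "Family": {"Father": "Jack", "Mother": "Jill"},
}

# Nested values are reached one key at a time.
father = record["Family"]["Father"]

def flatten(item, prefix=""):
    """Turn nested dictionaries into one flat row with dotted column names."""
    flat = {}
    for key, value in item.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

flat_row = flatten(record)

print(father)
print(flat_row)
```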

Unstructured Data

  • John is 31 years old. He was born on 01/01/1990. His father is Jack and his mother is Jill.
  • Jane is 31 years old. She was born on 02/02/1990. Her father is Jack and her mother is Jill.
  • Jack is 31 years old. He was born on 03/03/1990. His father is Jack and his mother is Jill.
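
Getting fields back out of sentences like these takes pattern matching. The sketch below uses a regular expression written for exactly this wording. The reworded sentence at the end is a made-up example showing how easily such a pattern breaks.

```python
import re

sentences = [
    "John is 31 years old. He was born on 01/01/1990. His father is Jack and his mother is Jill.",
    "Jane is 31 years old. She was born on 02/02/1990. Her father is Jack and her mother is Jill.",
    "Jack is 31 years old. He was born on 03/03/1990. His father is Jack and his mother is Jill.",
]

pattern = re.compile(
    r"(?P<Name>\w+) is (?P<Age>\d+) years old\. "
    r"(?:He|She) was born on (?P<Birthday>\d{2}/\d{2}/\d{4})\. "
    r"(?:His|Her) father is (?P<Father>\w+) and (?:his|her) mother is (?P<Mother>\w+)\."
)

extracted = []
for sentence in sentences:
    match = pattern.fullmatch(sentence)
    if match:                      # sentences worded differently are skipped
        row = match.groupdict()
        row["Age"] = int(row["Age"])
        extracted.append(row)

# Hypothetical rewording: same facts, but the pattern no longer matches.
reworded = "Born 01/01/1990, John (31) is the son of Jack and Jill."
reworded_matches = pattern.fullmatch(reworded) is not None

print(extracted)
print(reworded_matches)
```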

Doing things with Data

Structured          Semi-structured     Unstructured
ex. CSV             ex. JSON            ex. Text
Easy to manipulate  Easy to manipulate  Hard to manipulate
Not flexible        Flexible            Very flexible
Very fast           Fast                Slow

Making Data Work for You Example, part I

My notes on a Manuscript

Making Data Work for You Example, part II

Berthier, Sébastien to Henri III, King of France--, 29 january 1585--, Istanbul--, 3--, r
    Corfu: Primarily dealing with some issues around Corfu and Candie, and the Pope offering to support the Candians with 50 galleys (7a)
    Crete: Primarily dealing with some issues around Corfu and Candie, and the Pope offering to support the Candians with 50 galleys (7a)
    Spain: "sus la derniere rechereche notamment de l'espagnol continues de faison par les entremettements dicelles sur la sugjet des lettres du Mariglian a Osman Bassa don’t …" expectation of the continuation of the suspension of arms with Spain will be accomplished. ... Mariglian offered to come [as] ambassador in this Porte. (7b)
    Margliani, Giovanni: "sus la derniere rechereche notamment de l'espagnol continues de faison par les entremettements dicelles sur la sugjet des lettres du Mariglian a Osman Bassa don’t …" expectation of the continuation of the suspension of arms with Spain will be accomplished. Mariglian offered to come [as] ambassador in this Porte. (7b)
    Özdemiroğlu Osman Pasha: "sus la derniere rechereche notamment de l'espagnol continues de faison par les entremettements dicelles sur la sugjet des lettres du Mariglian a Osman Bassa don’t …" expectation of the continuation of the suspension of arms with Spain will be accomplished. 

Making Data Work for You Example, part III

{"src": {
   "author": "Berthier, Sébastien",
   "recipient": "Henri III, King of France",
   "title": "Berthier, Sébastien to Henri III, King of France, 29 january 1585",
   "date": "1585-01-29",
   "auth_loc_town": "Istanbul",
   "pages": " 3 r-"},
   "notes": [{
    "text": "Primarily dealing with some issues around Corfu and Candie, and the Pope offering to support the Candians with 50 galleys (7a)",
    "tag": "Corfu"},
   {"text": "Primarily dealing with some issues around Corfu and Candie, and the Pope offering to support the Candians with 50 galleys (7a)",
    "tag": "Crete"},
   {"text": "'sus la derniere rechereche notamment de l'espagnol continues de faison par les entremettements dicelles sur la sugjet des lettres du Mariglian a Osman Bassa don’t …' expectation of the continuation of the suspension of arms with Spain will be accomplished. ... Mariglian offered to come [as] ambassador in this Porte. (7b)",
    "tag": "Spain"},
   {"text": "'sus la derniere rechereche notamment de l'espagnol continues de faison par les entremettements dicelles sur la sugjet des lettres du Mariglian a Osman Bassa don’t …' expectation of the continuation of the suspension of arms with Spain will be accomplished. Mariglian offered to come [as] ambassador in this Porte. (7b)",
    "tag": "Margliani, Giovanni"}]
  }
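
Once the notes are in this shape, a few lines of code can reorganize them. The sketch below groups the notes by tag. It uses the same "src"/"notes" layout as the record above, but with fewer source fields and abridged note texts to keep it short. The variable names are illustrative.

```python
import json
from collections import defaultdict

# Same layout as the record above; note texts are abridged for this sketch.
document = json.loads("""
{"src": {
   "author": "Berthier, Sébastien",
   "recipient": "Henri III, King of France",
   "date": "1585-01-29",
   "auth_loc_town": "Istanbul"},
 "notes": [
   {"text": "Pope offering to support the Candians with 50 galleys (7a)", "tag": "Corfu"},
   {"text": "Pope offering to support the Candians with 50 galleys (7a)", "tag": "Crete"},
   {"text": "Mariglian offered to come [as] ambassador in this Porte. (7b)", "tag": "Spain"},
   {"text": "Mariglian offered to come [as] ambassador in this Porte. (7b)", "tag": "Margliani, Giovanni"}
 ]}
""")

# Build an index: tag -> list of note texts carrying that tag.
notes_by_tag = defaultdict(list)
for note in document["notes"]:
    notes_by_tag[note["tag"]].append(note["text"])

# The ISO date (YYYY-MM-DD) makes the year easy to pull out.
year = int(document["src"]["date"][:4])

print(sorted(notes_by_tag))
print(year, notes_by_tag["Spain"])
```

With many letters stored this way, the same loop could collect every note about Spain across the whole archive.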

What can be data?

  • What do you think?