Databases

We will use several kinds of databases throughout the semester. Broadly, we define a database as a collection of data with some organizational structure. Ideally, this structure makes it easy to search, add, remove, and update the stored information. Databases fall into two broad categories: relational and non-relational.

Relational databases

Relational databases store data in tables. Each table has a number of named columns, each of which stores a specific type of data (e.g. integer, floating point number, string, or date). The data are then stored in rows, with each row holding one value for each column. An entry in a row can reference a row in another table (a foreign key) to indicate a relation between the two.
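As a minimal sketch of these ideas, the following uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and rows are hypothetical and chosen only for illustration; the author_id column plays the role of the reference from a row in one table to a row in another.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Each column has a name and a declared type.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        year      INTEGER,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    )
    """
)

# Each row supplies one value per column (hypothetical example data).
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada Example')")
conn.executemany(
    "INSERT INTO books (title, year, author_id) VALUES (?, ?, ?)",
    [("First Book", 2001, 1), ("Second Book", 2005, 1)],
)

# Follow the reference from books to authors with a join.
rows = conn.execute(
    """
    SELECT books.title, authors.name
    FROM books JOIN authors ON books.author_id = authors.id
    ORDER BY books.year
    """
).fetchall()
print(rows)
```

With foreign keys enabled, inserting a book whose author_id matches no row in authors should be rejected, which is how the database keeps the relation consistent.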

We will discuss the following relational databases:

  1. PostgreSQL
  2. SQLite

Non-relational databases

Non-relational (or NoSQL) databases come in many forms. Some are key-value stores that associate a value with a key (for example, a user profile with a username), some store documents (e.g. a web page or JSON document), and others represent graphs.
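To make the three shapes concrete, here is a rough sketch that uses plain Python data structures as stand-ins. It is not the API of any particular database, and all names and values are hypothetical.

```python
import json

# Key-value: a value is stored and looked up by its key only.
kv_store = {}
kv_store["alice"] = {"email": "alice@example.com", "joined": "2024-01-15"}
profile = kv_store["alice"]

# Document: a self-describing, possibly nested record, here serialized as JSON.
document = {
    "title": "Course notes",
    "tags": ["databases", "intro"],
    "author": {"username": "alice"},
}
serialized = json.dumps(document)
restored = json.loads(serialized)

# Graph: nodes plus edges that connect pairs of nodes.
nodes = {"alice", "bob", "carol"}
edges = {("alice", "bob"), ("bob", "carol")}
bob_edges = sorted(edge for edge in edges if "bob" in edge)

print(profile["email"], restored["author"]["username"], bob_edges)
```

The point of the sketch is the difference in access pattern: lookup by key, retrieval of a whole nested record, and queries over connections between records.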

We will discuss the following non-relational databases:

  1. Neo4j, a graph database
  2. MongoDB, a document database

Which to choose?

Although there are performance considerations, the most important factor in deciding which database to use is how naturally your data fit the database's data model. For example, if you are trying to record stock price information, a typical relational database might be a good choice, since each datum likely consists of a time, a trading symbol, and a price. However, if you are trying to represent the friendship network of a social network, a graph database might be a more natural fit.
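The two examples can be sketched side by side. The following is only an illustration under assumed names and made-up data: the stock prices go into a single SQLite table, while the friendship network is held as an adjacency mapping in plain Python, standing in for what a graph database would manage for you.

```python
import sqlite3
from collections import defaultdict

# Stock prices: every datum is (time, symbol, price), so one table fits well.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (ts TEXT NOT NULL, symbol TEXT NOT NULL, price REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("2024-01-02T09:30", "AAA", 10.0),  # hypothetical symbols and prices
        ("2024-01-02T09:31", "AAA", 10.5),
        ("2024-01-02T09:30", "BBB", 20.0),
    ],
)
max_prices = conn.execute(
    "SELECT symbol, MAX(price) FROM prices GROUP BY symbol ORDER BY symbol"
).fetchall()

# Friendship network: the interesting questions are about connections,
# so the data are stored as a mapping from each person to their friends.
friendships = [("ann", "bob"), ("bob", "cat"), ("cat", "dan")]
friends = defaultdict(set)
for a, b in friendships:
    friends[a].add(b)
    friends[b].add(a)


def friends_of_friends(person):
    """Return people two hops away who are not already direct friends."""
    result = set()
    for friend in friends[person]:
        result |= friends[friend]
    return result - friends[person] - {person}


print(max_prices)
print(friends_of_friends("ann"))
```

The same friends-of-friends question can be answered in a relational database by joining a friendship table with itself, but each additional hop needs another join, which is one reason a graph model can feel more natural for this kind of data.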