Agenda
A database is a structured collection of data.
Database management systems (DBMSs) provide the tools for managing, querying, and analyzing data stored in the database.
PostgreSQL is a popular DBMS.
A PostgreSQL database stores collections of data called tables. Above is an example of a table called film
.
Each table is made up of rows and columns.
- Each row represents a single object/instance/record in the table
- Each column represents a property/attribute/field of that object. Columns have data types such as integer, string, date, or boolean.
PostgreSQL is an object-relational DBMS, meaning that in addition to storing its data as objects, it allows for relationships between objects across tables.
Here is another table in the same database called actor
:
Every table has a column that serves as the primary key, a unique value assigned to each row in the table.
Q: What is the primary key in the tables above?
Answer
actor_id
is the primary key for theactor
table.film_id
is thep primary key for thefilm
table.
Relationshps can be established between two tables through a join table. This table is called film_actor
:
When the primary keys of another table are used in a join table, they are called foreign keys.
Q: I want to know what movie
actor_id = 3
is in. How would I find it?Answer
- Look at the
film_actor
table and find the row whereactor_id = 3
.- Take note of the
film_id
.- Then, in the
film
table, find the row with thefilm_id
you found earlier.- Then look at the
title
column!
An Entity Relation Diagram (ERD) illistrates the properties of tables (a.k.a. "entities") and their relationships with other tables/entities.
SQL (Structured Querying Language) is the language used by many relational DBMSs to create, modify, and retrieve data.
SQL is written in statements that are composed of keywords and clauses.
SELECT actor_id, first_name
FROM actor
WHERE last_name = 'DAVIS';
Keywords determine the operation to be performed. Clauses modify the operation.
- Common keywords:
SELECT
,FROM
,INSERT INTO
,UPDATE
,DELETE
- Common clauses:
WHERE
,GROUP BY
,ORDER BY
Statements can also include functions, operators, and identifiers.
Q: What does the query above look for?
Answer
The first names of actors who have the last name 'DAVIS'
- SQL is NOT case sensitive. However, it is a standard practice to write keywords in all-caps to distinguish them from values.
- SQL strings MUST be written with single quotes
''
to avoid unwanted errors. - SQL queries MUST end in a semicolon
;
to avoid unwanted errors.
For practice, follow along with these instructions.
-
Go to the command line and connect to your database by running the command
psql
.- Now you are connected to PostgreSQL!
- To exit from PostgreSQL, use
control d
.
-
Run the command
\l
within yourpsql
command line to see existing databases. -
Create a database called
marcy
by running the commandCREATE DATABASE marcy;
- NOTE the semi-colon is important for ending the execution of this command! Makes sure to include it.
- You have successfully created a database if your
psql
command line outputsCREATE DATABASE
.
-
Next, connect to your database by running the command
\c marcy
.- Successfully connecting to your database will output
You are now connected to database "marcy" as user "[your username]"
.
- Successfully connecting to your database will output
Creating a lessons
table with the columns id
, title
, and instructor
:
CREATE TABLE lessons(
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
instructor TEXT NOT NULL
);
id
will be the primary key. TheSERIAL
data type allows you to automatically generate unique integer numbers (IDs, identity, auto-increment, sequence) for a column.title
andinstructor
will be of typeTEXT
and are required fields (NOT NULL
).
Dropping the lessons
table:
DROP TABLE lessons;
Inserting a new record/row/object into the lessons
table with a title
value of 'Data Structures'
and a instructor
value of 'Ben'
. The id
is automatically generated.
INSERT INTO lessons (title, instructor)
VALUES('Data Structure', 'Ben');
Get all data from the title
and instructor
columns in the lessons
table:
SELECT title, instructor
FROM lessons;
Get all data from all columns from the lessons
table:
SELECT *
FROM lessons;
Get only all data from all coluns in the lessons
table, but only from the rows where the value in the instructor
column is 'Ben'
:
SELECT *
FROM lessons
WHERE instructor='Ben';
Same as above, but now we are also including rows where the value in the instructor
column is 'Gonzalo'
:
SELECT *
FROM lessons
WHERE instructor='Ben' OR instructor='Gonzalo';
Renaming queries with AS
:
SELECT title AS "Course Title"
FROM lessons;
Counting the number of rows in the lessons
table:
SELECT COUNT(*)
FROM lessons
Counting the number of lessons taught by 'Ben'
:
SELECT COUNT(*)
FROM lessons
WHERE instructor='Ben'
Collapsing columns with GROUP BY
:
SELECT COUNT(*), instructor
FROM lessons
WHERE instructor='ben'
GROUP BY instructor;
UPDATE lessons
SET instructor = 'Gonzalo'
WHERE title = 'Data Structures';
DELETE FROM lessons
WHERE title = 'Data Structures';