This repository contains the code I wrote while taking Harvard University's well-known course, CS50AI. Showcased here are the projects I completed in each chapter of the course. I really enjoyed taking it. The CS50AI lectures are available online for free, and I'd highly recommend watching them: Brian Yu does a fantastic job explaining the concepts covered in the course. You can find the playlist here.
Note: These files can also serve as a resource for anyone who hits a dead end while taking the course. Feel free to look through my code to gain perspective on how to approach a problem. However, do not blatantly copy and paste the code, as that would violate CS50's Academic Honesty Rules.
Below, I have added a brief background of each project.
Degrees (BFS)
- The assignment is about finding the shortest path between two nodes
- The database comes from IMDb, and the task is to determine how one actor is connected to another through the movies they starred in together
- The solution is based on Breadth-First Search (BFS), since the task requires the shortest path between nodes (a rough sketch of the search follows the sample run below)
```
$ python degrees.py large
Loading data...
Data loaded.
Name: Emma Watson
Name: Jennifer Lawrence
3 degrees of separation.
1: Emma Watson and Daniel Radcliffe starred in Harry Potter and the Chamber of Secrets
2: Daniel Radcliffe and James McAvoy starred in Victor Frankenstein
3: James McAvoy and Jennifer Lawrence starred in Dark Phoenix
```
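A minimal sketch of how the BFS over the actor graph can work. It assumes a `neighbors_for_person(person_id)` helper that yields `(movie_id, person_id)` pairs, similar in spirit to the one in the distribution code; treat the exact names as illustrative rather than the project's API.

```python
from collections import deque

def shortest_path(source, target, neighbors_for_person):
    """Return a list of (movie_id, person_id) steps from source to target, or None."""
    if source == target:
        return []
    frontier = deque([source])          # explore people in the order they were found
    came_from = {source: None}          # person_id -> (parent_id, movie_id)

    while frontier:
        person = frontier.popleft()
        for movie_id, neighbor in neighbors_for_person(person):
            if neighbor in came_from:
                continue
            came_from[neighbor] = (person, movie_id)
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = []
                while came_from[neighbor] is not None:
                    parent, movie = came_from[neighbor]
                    path.append((movie, neighbor))
                    neighbor = parent
                return list(reversed(path))
            frontier.append(neighbor)
    return None  # the two actors are not connected
```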
Tic-Tac-Toe (Minimax)
- The assignment is about writing an AI algorithm to play Tic-Tac-Toe optimally
- The solution is based on the Minimax decision rule, which is well suited to games that pit two opponents against each other
  - The algorithm's objective is to pick the action with the best utility out of all possible outcomes
  - The algorithm relies on calculating the prospective moves that the opponent might make
- The `tictactoe.py` file (where the solution lies) consists of many small functions that implement the game of Tic-Tac-Toe (determining the winner, etc.); a condensed sketch of the Minimax part is shown below
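A condensed sketch of the Minimax recursion. It leans on helpers named like those in the assignment's template (`player`, `actions`, `result`, `terminal`, `utility`), which are assumed here rather than defined:

```python
import math

# Assumes player(board), actions(board), result(board, action),
# terminal(board) and utility(board) are defined elsewhere (e.g. in tictactoe.py).

def minimax(board):
    """Return an optimal action for the player whose turn it is on `board`."""
    if terminal(board):
        return None

    def max_value(state):
        if terminal(state):
            return utility(state)
        return max(min_value(result(state, a)) for a in actions(state))

    def min_value(state):
        if terminal(state):
            return utility(state)
        return min(max_value(result(state, a)) for a in actions(state))

    if player(board) == "X":   # X maximizes utility, O minimizes it
        return max(actions(board), key=lambda a: min_value(result(board, a)))
    return min(actions(board), key=lambda a: max_value(result(board, a)))
```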
Knights (Propositional Logic & Inference)
- The assignment is about solving puzzles using propositional logic
- Using the given `logic.py` module, the puzzles first need to be represented
  - We first need to define the base knowledge in each knowledge base, e.g. that knaves only lie and knights only tell the truth
  - Then, given the statements of the symbols (e.g. Symbol A says "We're both knaves" and Symbol B says nothing), we need to represent them using logic
  - This involves using biconditionals to show that if A is a knight, his words are true, and if not, they are lies
- Using connectives such as and (∧), or (∨), and the biconditional (↔), the answer can then be derived by inference (see the sketch after the sample output below)
```
$ python puzzle.py
Puzzle 0
    A is a Knave
Puzzle 1
    A is a Knave
    B is a Knight
Puzzle 2
    A is a Knave
    B is a Knight
Puzzle 3
    A is a Knight
    B is a Knave
    C is a Knight
```
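As an illustration, the puzzle mentioned above (A says "We're both knaves", B says nothing) could be encoded roughly like this, assuming `logic.py` exposes `Symbol`, `And`, `Or`, `Not`, `Biconditional`, and `model_check` as in the course's distribution code:

```python
from logic import And, Biconditional, Not, Or, Symbol, model_check

AKnight = Symbol("A is a Knight")
AKnave = Symbol("A is a Knave")
BKnight = Symbol("B is a Knight")
BKnave = Symbol("B is a Knave")

knowledge = And(
    # Base knowledge: each character is exactly one of knight or knave.
    Or(AKnight, AKnave), Not(And(AKnight, AKnave)),
    Or(BKnight, BKnave), Not(And(BKnight, BKnave)),
    # A says "We're both knaves": the statement is true iff A is a knight.
    Biconditional(AKnight, And(AKnave, BKnave)),
)

for symbol in (AKnight, AKnave, BKnight, BKnave):
    if model_check(knowledge, symbol):
        print(symbol)   # A is a Knave, B is a Knight
```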
Minesweeper (Propositional Logic & Inference)
- The assignment is about solving Minesweeper by drawing inferences from every available state
- Each piece of knowledge is represented as a sentence consisting of a set of cells and a count of how many mines that set contains
- If one set is a subset of another, a new sentence can be inferred: the cells in the larger set but not in the smaller one contain exactly the difference of the two mine counts, which lets us rule cells in or out as mines (see the toy example below)
- It is important to keep trying to derive new inferences from the available knowledge every time something new is given or found
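A toy version of that subset rule, with sentences reduced to a plain `(cells, count)` pair rather than the project's `Sentence` class:

```python
def infer_from_subset(cells_a, count_a, cells_b, count_b):
    """If sentence A's cells are a proper subset of sentence B's, then the cells
    in B but not in A must contain exactly count_b - count_a mines."""
    if cells_a < cells_b:
        return cells_b - cells_a, count_b - count_a
    return None

# {A, B} contains 1 mine and {A, B, C} contains 2 mines  =>  {C} contains 1 mine.
print(infer_from_subset({"A", "B"}, 1, {"A", "B", "C"}, 2))  # ({'C'}, 1)
```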
PageRank (Markov Models)
- The assignment is about using probability to determine the PageRank of HTML pages
- The `pagerank.py` file has two functions: `sample_pagerank` and `iterative_pagerank`
  - The Random Surfer Model (`sample_pagerank`) uses a transition model to represent a state in the Markov Chain and chooses among a page's links at random
  - The Iterative Algorithm (`iterative_pagerank`) uses a recursive mathematical expression to compute what the PageRank should be (a simplified sketch follows the sample output below)
- It is important to note that the resulting vector needs to be normalized in `iterative_pagerank`, as the probabilities might otherwise sum to more than 1

```
$ python pagerank.py corpus2
PageRank Results from Sampling (n = 10000)
  ai.html: 0.1888
  algorithms.html: 0.1021
  c.html: 0.1239
  inference.html: 0.1327
  logic.html: 0.0282
  programming.html: 0.2290
  python.html: 0.1232
  recursion.html: 0.0720
PageRank Results from Iteration
  ai.html: 0.1887
  algorithms.html: 0.1066
  c.html: 0.1239
  inference.html: 0.1291
  logic.html: 0.0264
  programming.html: 0.2296
  python.html: 0.1239
  recursion.html: 0.0717
```
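A simplified sketch of what `iterative_pagerank` does, not the project's exact code. It assumes `corpus` maps each page to the set of pages it links to, and uses a typical damping factor of 0.85:

```python
def iterative_pagerank(corpus, damping=0.85, tolerance=0.001):
    """Repeatedly apply the PageRank formula until the values stop changing."""
    n = len(corpus)
    ranks = {page: 1 / n for page in corpus}
    # A page with no outgoing links is treated as linking to every page.
    links = {p: (targets or set(corpus)) for p, targets in corpus.items()}

    while True:
        new_ranks = {}
        for page in corpus:
            incoming = sum(ranks[p] / len(links[p]) for p in corpus if page in links[p])
            new_ranks[page] = (1 - damping) / n + damping * incoming
        # Normalize so the distribution sums to 1 (see the note above).
        total = sum(new_ranks.values())
        new_ranks = {p: r / total for p, r in new_ranks.items()}
        if all(abs(new_ranks[p] - ranks[p]) < tolerance for p in corpus):
            return new_ranks
        ranks = new_ranks
```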
Heredity (Bayesian Networks)
- The assignment is about using a Bayesian network that models how a certain gene is inherited and making inferences about a population
- We are given information about people, who their parents are, and whether they exhibit a trait caused by the gene. The AI then infers a probability distribution for each person
- The `heredity.py` file has base probabilities for people who do not have parents listed. Using those base probabilities, we can make inferences about their children based on the chances that they inherited zero, one, or two copies of the gene and whether they exhibit the trait (a sketch of the inheritance math follows the sample output below)

```
$ python heredity.py data/family2.csv
Arthur:
  Gene:
    2: 0.0147
    1: 0.0344
    0: 0.9509
  Trait:
    True: 0.0000
    False: 1.0000
Hermione:
  Gene:
    2: 0.0608
    1: 0.1203
    0: 0.8189
  Trait:
    True: 0.0000
    False: 1.0000
Molly:
  Gene:
    2: 0.0404
    1: 0.0744
    0: 0.8852
  Trait:
    True: 0.0768
    False: 0.9232
Ron:
  Gene:
    2: 0.0043
    1: 0.2149
    0: 0.7808
  Trait:
    True: 0.0000
    False: 1.0000
Rose:
  Gene:
    2: 0.0088
    1: 0.7022
    0: 0.2890
  Trait:
    True: 1.0000
    False: 0.0000
```
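The inheritance part of the joint probability can be sketched like this; the 0.01 mutation probability is an assumption here (meant to mirror the constant in the course template):

```python
MUTATION = 0.01  # assumed chance that a passed-on gene mutates

def pass_probability(parent_genes):
    """Probability that a parent passes one copy of the gene to their child."""
    if parent_genes == 2:
        return 1 - MUTATION      # always carries the gene, unless it mutates away
    if parent_genes == 1:
        return 0.5               # mutation cancels out: 0.5*(1-m) + 0.5*m = 0.5
    return MUTATION              # can only pass the gene on via mutation

def child_gene_probability(mother_genes, father_genes, child_genes):
    """Probability the child has 0, 1, or 2 copies given the parents' copies."""
    from_mother = pass_probability(mother_genes)
    from_father = pass_probability(father_genes)
    if child_genes == 2:
        return from_mother * from_father
    if child_genes == 1:
        return from_mother * (1 - from_father) + (1 - from_mother) * from_father
    return (1 - from_mother) * (1 - from_father)

# e.g. both parents carry one copy: the child has a 50% chance of exactly one copy
print(child_gene_probability(1, 1, 1))  # 0.5
```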
Crossword (Constraint Satisfaction: Node & Arc Consistency with Backtracking Search)
- The assignment is about solving a crossword using a backtracking search that incorporates node and arc consistency
- We are given three crossword grids and three word collections
- `generate.py` consists of a class built on top of the `Variable` and `Crossword` classes from `crossword.py`
  - It provides some base methods, but the rest (starting with `enforce_node_consistency`) had to be implemented
- To implement the backtracking search, the assignment of words to grid variables first had to be made consistent with respect to nodes and edges (arcs)
  - Node consistency is a unary constraint requiring every grid variable's domain to contain only words of the right length (a grid variable of size 4 cannot fit the word "Hello")
  - Arc consistency is a binary constraint requiring every grid variable's domain to contain only words that are distinct from other variables' words and consistent in terms of characters (a grid variable has to have an identical character with another variable wherever they share a grid cell); a sketch of the character check is shown below
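The character part of the arc constraint boils down to something like AC-3's `revise` step, sketched here; `overlaps` is assumed to map a pair of variables to the character indices where they cross, in the spirit of the distribution code's `Crossword.overlaps`:

```python
def revise(domains, x, y, overlaps):
    """Drop any word from x's domain that has no compatible word in y's domain.
    Returns True if x's domain changed."""
    overlap = overlaps.get((x, y))
    if overlap is None:
        return False                      # the two variables never cross
    i, j = overlap                        # x's i-th letter must equal y's j-th letter
    revised = False
    for word_x in set(domains[x]):
        if not any(word_x[i] == word_y[j] for word_y in domains[y]):
            domains[x].remove(word_x)
            revised = True
    return revised
```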
Shopping (KNN Classifier)
- The assignment is about using the provided data to train a nearest-neighbor classifier that predicts whether a user is going to make a purchase
- The provided data set has evidence attributes such as `Administrative`, `Informational`, etc. Those attributes constitute the `evidence` we use to train the model
  - The data also has a `Revenue` column, which indicates whether the user bought something
- We first parse the data into a buffer in `load_data`, then use it to train a model with scikit-learn's `KNeighborsClassifier` (a sketch of the training and evaluation flow follows the sample output below)
- Finally, we need to benchmark the model
  - For this we use `sensitivity`, the proportion of actual positive results that were accurately predicted, and `specificity`, the proportion of actual negative results that were accurately predicted
  - In other words, we compare the predictions against the actual positive values and the actual negative values separately

```
$ python shopping.py shopping.csv
Correct: 4076
Incorrect: 856
True Positive Rate: 38.20%
True Negative Rate: 90.56%
```
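The flow looks roughly like the sketch below. The 0.4 test split and `n_neighbors=1` are illustrative choices here, not necessarily what the project file uses:

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def train_and_evaluate(evidence, labels, test_size=0.4):
    """Fit a nearest-neighbor model and report sensitivity and specificity."""
    X_train, X_test, y_train, y_test = train_test_split(
        evidence, labels, test_size=test_size
    )
    model = KNeighborsClassifier(n_neighbors=1)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    # Sensitivity: fraction of actual positives that were predicted positive.
    positives = [(a, p) for a, p in zip(y_test, predictions) if a == 1]
    sensitivity = sum(p == 1 for _, p in positives) / len(positives)
    # Specificity: fraction of actual negatives that were predicted negative.
    negatives = [(a, p) for a, p in zip(y_test, predictions) if a == 0]
    specificity = sum(p == 0 for _, p in negatives) / len(negatives)
    return model, sensitivity, specificity
```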
Nim (Reinforcement Learning)
- The assignment is about training a model using Reinforcement Learning, meaning the AI repeatedly plays against itself and either rewards itself for a good action or punishes itself for a bad one
- In particular, the concept used is Q-Learning, where losing results in a reward of -1 and winning results in a reward of 1
  - The update formula is `Q(s, a) <- Q(s, a) + alpha * (new value estimate - old value estimate)`, where `Q(s, a)` is the value of taking action `a` in state `s` (a sketch of this update follows the sample game below)
  - `alpha` is the learning rate, which determines how heavily new estimates are weighted against what has already been learned
- As you can see from the example below, the result is not always a perfect model that never loses: unlike Minimax, it does not explore all possible states, but it is much less computationally demanding
```
$ python play.py
Playing training game 1
Playing training game 2
Playing training game 3
...
Playing training game 9999
Playing training game 10000
Done training

Piles:
Pile 0: 1
Pile 1: 3
Pile 2: 5
Pile 3: 7

Your Turn
Choose Pile: 1
Choose Count: 3

Piles:
Pile 0: 1
Pile 1: 0
Pile 2: 5
Pile 3: 7

AI's Turn
AI chose to take 7 from pile 3.

Piles:
Pile 0: 1
Pile 1: 0
Pile 2: 5
Pile 3: 0

Your Turn
Choose Pile: 2
Choose Count: 5

Piles:
Pile 0: 1
Pile 1: 0
Pile 2: 0
Pile 3: 0

AI's Turn
AI chose to take 1 from pile 0.

GAME OVER
Winner is Human

# I actually won here, which never happened to me before
# But it shows that reinforcement learning is not ideal
# as the model could not train for ALL possible outcomes
```
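The update formula from the list above translates almost directly into code. This is a sketch with the Q-table as a plain dict keyed by `(state, action)`; the `alpha=0.5` default is illustrative:

```python
def update_q(q, state, action, reward, best_future, alpha=0.5):
    """Q(s, a) <- Q(s, a) + alpha * (new value estimate - old value estimate).

    `q` maps (state, action) pairs to learned values; `best_future` is the best
    Q-value reachable from the state that results from taking `action`.
    """
    key = (tuple(state), action)
    old = q.get(key, 0)
    new_estimate = reward + best_future   # immediate reward plus expected future reward
    q[key] = old + alpha * (new_estimate - old)

# After losing by taking the last object from piles [1, 0, 0, 0]:
q = {}
update_q(q, [1, 0, 0, 0], (0, 1), reward=-1, best_future=0)
print(q)  # {((1, 0, 0, 0), (0, 1)): -0.5}
```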
Traffic (Convolutional Neural Network)
- The assignment is about using the provided images to train a neural network that classifies road signs
- The images were provided by the German Traffic Sign Recognition Benchmark (GTSRB)
- To read the images we use OpenCV-Python, and to build the network we use TensorFlow Keras
- To train the network as efficiently as possible, we applied the concepts of convolutional and pooling layers
  - A convolutional layer generalizes the image by applying a kernel matrix that filters the image down to a smaller set of features
  - A pooling layer serves a similar purpose by keeping one pixel out of each neighborhood of pixels, giving a more general view of the image. The particular type used is Max-Pooling, which takes the highest value in each square
- The final structure of the neural network is [Convolutional, Max-Pooling, Convolutional, Max-Pooling, Flattening, Hidden Layer x3, Output Layer] (a Keras sketch of this stack follows the training log below)
- Note: I have not included the images (ppm files) provided by the GTSRB as they took too long to upload here. They can be downloaded from here.
```
$ python traffic.py gtsrb/
Epoch 1/10
497/497 [==============================] - 3s 7ms/step - loss: 2.8686 - accuracy: 0.3052
Epoch 2/10
497/497 [==============================] - 4s 7ms/step - loss: 1.1733 - accuracy: 0.6533
Epoch 3/10
497/497 [==============================] - 4s 8ms/step - loss: 0.6370 - accuracy: 0.8115
Epoch 4/10
497/497 [==============================] - 4s 8ms/step - loss: 0.4136 - accuracy: 0.8793
Epoch 5/10
497/497 [==============================] - 4s 8ms/step - loss: 0.2983 - accuracy: 0.9171
Epoch 6/10
497/497 [==============================] - 4s 8ms/step - loss: 0.2715 - accuracy: 0.9251
Epoch 7/10
497/497 [==============================] - 4s 8ms/step - loss: 0.2239 - accuracy: 0.9393
Epoch 8/10
497/497 [==============================] - 4s 8ms/step - loss: 0.1857 - accuracy: 0.9497
Epoch 9/10
497/497 [==============================] - 4s 8ms/step - loss: 0.1619 - accuracy: 0.9581
Epoch 10/10
497/497 [==============================] - 4s 8ms/step - loss: 0.1507 - accuracy: 0.9604
331/331 - 1s - loss: 0.1485 - accuracy: 0.9654
```
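A Keras sketch of the layer stack described above. The 30×30 input size and 43 categories are assumptions meant to match the GTSRB setup in the course template; the filter counts and hidden-layer sizes are purely illustrative:

```python
import tensorflow as tf

# Assumed to match the course template: 30x30 RGB inputs, 43 sign categories.
IMG_WIDTH, IMG_HEIGHT, NUM_CATEGORIES = 30, 30, 43

def get_model():
    """Conv -> MaxPool -> Conv -> MaxPool -> Flatten -> 3 hidden layers -> output."""
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                               input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(NUM_CATEGORIES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```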
Parser (Context-Free Grammar)
- The assignment is about parsing a sentence to determine its structure
- First, the task requires pre-processing the sentence to convert it into a list of words
- Second, the task requires a set of context-free grammar rules describing how sentences can be structured (the most challenging part)
  - Building a set of rules that would let us parse all the sentences took me a few hours
  - I was able to derive that all 10 sentences fall into two types:
    - a sentence that starts with a noun phrase and ends with a verb (adverbs can come before or after the verb)
    - a sentence that starts with another sentence (of the first type) and ends with a conjunction followed by a verb or another sentence
- Third, the task requires a list of noun phrase chunks, i.e. noun phrases that do not have another noun phrase within them
  - The context-free grammar rules used here do not allow nested noun phrases, so it is reasonable to simply count the noun phrases (the nltk.tree documentation was helpful); a miniature example is sketched below
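A miniature version of the idea: a few CFG rules parsed with nltk, and noun phrase chunks extracted from the resulting tree. The grammar and sentence here are deliberately tiny and are not the rule set or vocabulary used in the project:

```python
import nltk

# A deliberately tiny grammar, just to show the shape of the rules.
grammar = nltk.CFG.fromstring("""
    S -> NP VP | S Conj VP | S Conj S
    NP -> N | Det N
    VP -> V | V NP | VP Adv | Adv VP
    Det -> "the"
    N -> "holmes" | "door"
    V -> "opened" | "smiled"
    Adv -> "quietly"
    Conj -> "and"
""")
parser = nltk.ChartParser(grammar)

sentence = "holmes opened the door quietly".split()
for tree in parser.parse(sentence):
    tree.pretty_print()
    # Noun phrase chunks: NP subtrees that contain no other NP inside them.
    np_subtrees = list(tree.subtrees(lambda t: t.label() == "NP"))
    chunks = [phrase for phrase in np_subtrees
              if not any(sub.label() == "NP"
                         for sub in phrase.subtrees() if sub is not phrase)]
    print([" ".join(chunk.leaves()) for chunk in chunks])  # ['holmes', 'the door']
```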