nlp-AI-course

Pre-course test codes

01 The Spiral Memory Problem - see Peter Norvig's solution here.
02 Number Finding
03 Grammar - see solution in lecture 01.

See Gao's solutions here.

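Main problems that AI addresses
- two dimensions: "human-like" vs. "rational"
- how to automate "intelligence"
Why NLP problems are hard
- text = logic
- unstructured data
- diversity & heterogeneity
Paradigms of AI problem solving -- Part 1
- search-based (BFS & DFS; map application & decision tree)
- rule-based, moving from rule-driven to data-driven
- based on mathematical analysis -- (Lecture 2)
- based on probability -- (Lecture 2)
- based on machine learning -- (Lecture 3)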
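
The search-based paradigm can be illustrated with a minimal sketch. The graph below is a made-up toy map (the station names and links are assumptions, not course data), and `search` is a hypothetical helper that switches between BFS and DFS by changing only which end of the frontier is expanded next.

```python
from collections import deque

# Hypothetical toy map: each node lists its directly connected neighbours.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def search(graph, start, goal, strategy="bfs"):
    """Return one path from start to goal, or None if the goal is unreachable."""
    frontier = deque([[start]])   # each entry is a full path from start
    visited = {start}             # nodes already placed on the frontier
    while frontier:
        # BFS expands the oldest path (queue); DFS expands the newest (stack).
        path = frontier.popleft() if strategy == "bfs" else frontier.pop()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None

bfs_path = search(GRAPH, "A", "E", "bfs")
dfs_path = search(GRAPH, "A", "E", "dfs")
```

On an unweighted graph, BFS returns a path with the fewest hops; DFS returns some valid path, which is not necessarily the shortest.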

Assignment 01 -- implement a simple rule-based Chinese dialogue system

Week 2

Language Models
- from rule-based to probability-based
- unigram (1-gram) and bigram (2-gram) models
- using regular expressions
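
As a rough sketch of the topics above, the following builds unigram and bigram counts from a tiny made-up English corpus, using a regular expression for tokenization. The corpus, the function names, and the choice of add-one smoothing are all assumptions for illustration; the assignment itself uses the Wikipedia corpus and may call for different preprocessing.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out alphabetic tokens with a regex."""
    return re.findall(r"[a-z]+", text.lower())

def train(tokens):
    """Count unigrams and adjacent-token bigrams."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def sentence_prob(sentence, unigrams, bigrams):
    """Product of smoothed bigram probabilities over the sentence."""
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams)
    return prob

# Toy corpus (an assumption); sentence boundaries are ignored for simplicity.
CORPUS = "the cat sat on the mat. the dog sat on the log."
UNIGRAMS, BIGRAMS = train(tokenize(CORPUS))

likely = sentence_prob("the cat sat", UNIGRAMS, BIGRAMS)
unlikely = sentence_prob("cat the sat", UNIGRAMS, BIGRAMS)
```

A word order seen in the corpus should score higher than a scrambled one, which is the basic property a probability-based model adds over hand-written rules.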

Assignment 02 -- implement language models using the Wikipedia corpus

Week 3

Simple Machine Learning Models
Heuristic Search
- from BFS, DFS to best-first search
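
To show how best-first search differs from BFS and DFS, here is a small sketch of greedy best-first search, which always expands the frontier node that a heuristic rates as closest to the goal. The stations, coordinates, and links are invented for illustration and are not the Beijing Subway data; the straight-line heuristic is likewise an assumption.

```python
import heapq
import math

# Hypothetical stations with made-up (x, y) coordinates and connections.
COORDS = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (2, 0), "E": (2, 1)}
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

def straight_line(a, b):
    """Heuristic: Euclidean distance between two stations."""
    (x1, y1), (x2, y2) = COORDS[a], COORDS[b]
    return math.hypot(x1 - x2, y1 - y2)

def best_first_search(start, goal):
    """Greedy best-first search: expand the path whose end looks closest to goal."""
    frontier = [(straight_line(start, goal), [start])]  # (priority, path) heap
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in LINKS[node]:
            if neighbour not in visited:
                priority = straight_line(neighbour, goal)
                heapq.heappush(frontier, (priority, path + [neighbour]))
    return None

route = best_first_search("A", "E")
```

On this toy map the greedy route takes three hops although a two-hop route exists, because the heuristic alone ignores the cost already paid. Adding the path cost so far to the priority turns this sketch into A* search.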

Assignment 03 -- implement a search agent using the Beijing Subway data

Week 4

Dynamic Programming
- three steps for solving a DP problem:
  - 1. identify the overlapping subproblems;
  - 2. store the solutions to the subproblems;
  - 3. parse the stored results to build the final solution
Basic NLP methods
- edit distance
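
The first two DP steps can be sketched on edit distance as follows. This is a minimal memoized version that assumes unit cost for insertion, deletion, and substitution (Levenshtein distance); it returns only the distance and leaves the third step, recovering the actual sequence of edits, to Assignment 04.

```python
from functools import lru_cache

def edit_distance(source, target):
    """Levenshtein distance with unit-cost insert, delete and substitute."""

    @lru_cache(maxsize=None)  # step 2: cache every subproblem's answer
    def dist(i, j):
        # Step 1: dist(i, j) is the distance between source[:i] and target[:j];
        # the same (i, j) pair is reached along many different edit paths.
        if i == 0:
            return j          # insert the remaining j characters
        if j == 0:
            return i          # delete the remaining i characters
        substitution_cost = 0 if source[i - 1] == target[j - 1] else 1
        return min(
            dist(i - 1, j) + 1,                      # delete from source
            dist(i, j - 1) + 1,                      # insert into source
            dist(i - 1, j - 1) + substitution_cost,  # substitute or match
        )

    return dist(len(source), len(target))

example = edit_distance("kitten", "sitting")
```

The recursive form is fine for short strings; for long inputs an iterative table avoids Python's recursion limit.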

Assignment 04 -- finish the edit distance solution parsing

Week 5

Word Embedding
- from edit distance to word embedding (why?)
- Word2Vec
Named Entity Recognition & Dependency Parsing

Week 6

Keywords Extraction
- from word2vec to vectorizing chunks of text
- TF-IDF (Term Frequency-Inverse Document Frequency)
- Word cloud
- scikit-learn & simple classification model
  - cosine similarity
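
A plain-Python sketch of TF-IDF weighting and cosine similarity is given below. The three documents are invented, and the formula used (relative term frequency times the log of inverse document frequency) is one common textbook variant; scikit-learn's `TfidfVectorizer` applies a smoothed formula by default, so its numbers will differ.

```python
import math
from collections import Counter

# Made-up mini corpus for illustration only.
DOCS = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets closed",
]

def tf_idf_vectors(docs):
    """Return one sparse {word: weight} vector per document."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

VECTORS = tf_idf_vectors(DOCS)
pets = cosine_similarity(VECTORS[0], VECTORS[1])
unrelated = cosine_similarity(VECTORS[0], VECTORS[2])
```

Words that occur in fewer documents get larger weights, so they are natural keyword candidates, and documents sharing such words end up with higher cosine similarity.
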
Search Engine
- boolean search
- from the naive search (TF-IDF) to PageRank
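
Boolean search is commonly built on an inverted index that maps each word to the set of documents containing it. The sketch below assumes whitespace tokenization and three invented headlines; the helper names are hypothetical.

```python
from collections import defaultdict

# Made-up headlines standing in for a news corpus.
NEWS = [
    "Markets rally as tech stocks climb",
    "Tech firms report record profits",
    "Heavy rain floods city markets",
]

def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_and(index, *words):
    """Documents containing every query word (boolean AND)."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()

def search_or(index, *words):
    """Documents containing at least one query word (boolean OR)."""
    return set().union(*(index.get(word.lower(), set()) for word in words))

INDEX = build_index(NEWS)
both = search_and(INDEX, "tech", "markets")
either = search_or(INDEX, "tech", "markets")
```

Boolean search only decides whether a document matches; ranking the matches is where TF-IDF scores and PageRank come in.
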
Group project 01 - keywords extraction & search using the news corpus

Week 8 - 10

Basic ML Methods
- Supervised learning:
  - linear & logistic regressions (gradient descent, MSE, loss function, cross-entropy)
  - KNN
  - SVM (kernel function, support vector)
  - Naïve Bayes Classifier
  - Decision Tree
  - Random Forest & gradient boosting (XGBoost)
- Unsupervised learning:
  - Clustering (hierarchical, k-means)
  - Embedding cluster
- Semi-supervised & active learning
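
As one concrete instance of the supervised methods listed above, this sketch fits a one-variable linear regression by batch gradient descent on the MSE loss. The data points, learning rate, and epoch count are assumptions chosen so that the toy problem converges; real data usually needs rescaling and tuning.

```python
def mse(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for illustration).
XS = [0, 1, 2, 3, 4]
YS = [1, 3, 5, 7, 9]
W, B = fit_line(XS, YS)
```

Logistic regression follows the same loop with a sigmoid on the output and cross-entropy in place of MSE.
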
Model evaluation:
- Underfitting vs. Overfitting
- precision, recall, F1, F2, MSE, loss function
- bias & variance
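
The classification metrics above can be computed directly from the counts of true positives, false positives, and false negatives. The labels below are made up, and `f_beta` is a hypothetical helper covering both F1 (beta = 1) and F2 (beta = 2, which weights recall more heavily).

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Made-up labels: 2 true positives, 1 false positive, 2 false negatives.
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0]
Y_PRED = [1, 1, 0, 0, 1, 0, 0, 0]
PRECISION, RECALL = precision_recall(Y_TRUE, Y_PRED)
F1 = f_beta(PRECISION, RECALL, beta=1)
F2 = f_beta(PRECISION, RECALL, beta=2)
```

Because recall is lower than precision in this example, F2 comes out below F1.
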
Preprocessing data:
- class balance, noise, collinearity, normalization/rescaling

Week 11 - 12

Neural Networks
- the rise of NN
- architecture of NN (layers, activation function, backpropagation)
- loss function, cross-entropy, (stochastic) gradient descent
- implementing a simple NN
- TensorFlow, Keras, PyTorch
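
To make the pieces above concrete, here is a rough from-scratch sketch of a network with one hidden layer, sigmoid activations, cross-entropy loss, and stochastic gradient descent via backpropagation. It is not the course's implementation: the class name, layer sizes, learning rate, and the logical-OR toy data are all assumptions, and it uses only the standard library rather than the frameworks listed.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """2 inputs -> `hidden` sigmoid units -> 1 sigmoid output."""

    def __init__(self, hidden=3, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return (hidden activations, output probability)."""
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def loss(self, data):
        """Mean binary cross-entropy over (input, target) pairs."""
        total = 0.0
        for x, t in data:
            _, y = self.forward(x)
            total -= t * math.log(y) + (1 - t) * math.log(1 - y)
        return total / len(data)

    def train_step(self, data, learning_rate=0.5):
        """One epoch of stochastic gradient descent with backpropagation."""
        for x, t in data:
            h, y = self.forward(x)
            # Gradient of cross-entropy w.r.t. the output unit's pre-activation.
            delta_out = y - t
            for j, hj in enumerate(h):
                # Chain rule back through the hidden sigmoid (uses w2 before update).
                delta_hidden = delta_out * self.w2[j] * hj * (1 - hj)
                self.w2[j] -= learning_rate * delta_out * hj
                for i, xi in enumerate(x):
                    self.w1[j][i] -= learning_rate * delta_hidden * xi
                self.b1[j] -= learning_rate * delta_hidden
            self.b2 -= learning_rate * delta_out

# Toy data: logical OR (an assumption chosen because it is easy to fit).
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

net = TinyNet()
loss_before = net.loss(OR_DATA)
for _ in range(1000):
    net.train_step(OR_DATA)
loss_after = net.loss(OR_DATA)
```

Comparing `loss_before` with `loss_after` is a quick sanity check that the gradients point the right way. Frameworks such as PyTorch automate the `train_step` arithmetic through automatic differentiation.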


xinweixu1/nlp-AI-course