xinweixu1/nlp-AI-course

nlp-AI-course

Pre-course test code

  • 01 The Spiral Memory Problem - see Peter Norvig's solution here.
  • 02 Number Finding
  • 03 Grammar - see solution in lecture 01.

See Gao's solutions here.

Table of Contents

Week 0 Math Basics

  • Representation
  • Calculus
  • Logic
  • Linear Algebra
  • Probability
  • Graph
  • Dynamic Programming

Week 1 Basics

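  • Overview of the main topics in artificial intelligence
  • Main problems that artificial intelligence addresses
    • two dimensions: "human-like" vs. "rational"
    • how to automate "intelligence"
  • Why natural language processing problems are hard
    • text = logic
    • unstructured data
    • diversity & heterogeneity
  • Paradigms for solving AI problems -- Part 1
    • search-based (BFS & DFS; map application & decision tree)
    • rule-based, moving from rule-driven to data-driven
    • based on mathematical analysis -- (Lecture 2)
    • probability-based -- (Lecture 2)
    • machine-learning-based -- (Lecture 3)
  • Assignment 01 -- implement a simple rule-based Chinese dialogue system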

Week 2

  • Language Models
    • from rule-based to probability-based
    • unigram (1-gram) and bigram (2-gram) models
    • using regular expressions
  • Assignment 02 -- implement language models using the Wikipedia corpus
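
As a rough illustration of the probability-based approach, here is a minimal bigram model sketch. It is not the assignment solution: the toy `corpus` string stands in for the Wikipedia corpus, and the helper names (`tokenize`, `train_bigram`, `bigram_prob`, `sentence_prob`) and the choice of add-one smoothing are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z']+", text.lower())


def train_bigram(tokens):
    """Count single words (unigrams) and adjacent word pairs (bigrams)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Estimate P(word | prev) with add-one (Laplace) smoothing (an assumed choice)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def sentence_prob(sentence, unigrams, bigrams):
    """Multiply the smoothed bigram probabilities along the sentence."""
    tokens = tokenize(sentence)
    vocab_size = len(unigrams)
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams, vocab_size)
    return prob


# Toy corpus, assumed for illustration only.
corpus = "the cat sat on the mat . the dog sat on the log ."
unigrams, bigrams = train_bigram(tokenize(corpus))

likely = sentence_prob("the cat sat", unigrams, bigrams)
unlikely = sentence_prob("cat the sat", unigrams, bigrams)
print(likely > unlikely)
```

A word order seen in the corpus scores higher than a scrambled one, which is the property a rule-based grammar cannot provide without hand-written rules.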

Week 3

  • Simple Machine Learning Models
  • Heuristic Search
    • from BFS, DFS to best-first search
  • Assignment 03 -- implement a search agent using the Beijing Subway data
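
The step from BFS/DFS to best-first search can be sketched as follows. This is a minimal greedy best-first search under stated assumptions: the six-station `GRAPH`, the `COORDS` positions, and the straight-line-distance heuristic are all made up for the example and are not the Beijing Subway data.

```python
import heapq
import math

# Hypothetical toy network: station -> neighbouring stations.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
# Made-up (x, y) positions used only by the heuristic.
COORDS = {"A": (0, 0), "B": (1, 2), "C": (1, -1), "D": (2, 1), "E": (3, -1), "F": (4, 0)}


def straight_line(a, b):
    """Heuristic: Euclidean distance between two stations."""
    (x1, y1), (x2, y2) = COORDS[a], COORDS[b]
    return math.hypot(x1 - x2, y1 - y2)


def best_first_search(start, goal):
    """Always expand the frontier path whose last station looks closest to the goal."""
    frontier = [(straight_line(start, goal), [start])]
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in GRAPH[node]:
            if nxt not in visited:
                heapq.heappush(frontier, (straight_line(nxt, goal), path + [nxt]))
    return None  # goal unreachable


route = best_first_search("A", "F")
print(route)
```

BFS and DFS differ from this only in which frontier entry is expanded next (oldest or newest); replacing the queue or stack with a priority queue ordered by a heuristic gives best-first search. The greedy variant shown here is not guaranteed to return the shortest route.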

Week 4

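  • Dynamic Programming
    • three steps for solving a DP problem:
        1. check whether the subproblems overlap (recur);
        2. store the solutions to the subproblems;
        3. read back the stored results and construct the solution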
  • Basic NLP methods
    • edit distance
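
Edit distance is a convenient example of the three DP steps listed above. The sketch below assumes the standard Levenshtein formulation with a cost of 1 for each insertion, deletion, and substitution; the course may use a different cost scheme, and the function name `edit_distance` is chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b, computed bottom-up."""
    m, n = len(a), len(b)
    # dp[i][j] holds the distance between a[:i] and b[:j].
    # Storing it means each overlapping subproblem is solved only once.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1  # assumed substitution cost of 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    # The full solution is read from the last cell of the table.
    return dp[m][n]


print(edit_distance("kitten", "sitting"))
```

The same table can be walked backwards from `dp[m][n]` to recover the actual sequence of edits, which is what the third step refers to.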

Week 5

  • Word Embedding
    • from edit distance to word embedding (why?)
    • Word2Vec
  • Named Entity Recognition & Dependency Parsing

Week 6

  • Keywords Extraction
    • from word2vec to vectorizing chunks of text
    • TF-IDF (Term Frequency-Inverse Document Frequency)
    • Word cloud
    • scikit-learn & simple classification model
      • cosine similarity
  • Search Engine
    • boolean search
    • from the naive search (TF-IDF) to PageRank
  • Group project 01 - keywords extraction & search using the news corpus
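
To make TF-IDF and cosine similarity concrete, here is a small standard-library sketch. The three `docs` are invented stand-ins for the news corpus, the helper names are hypothetical, and the weighting used (relative term frequency times `log(N / df)`, with no smoothing) is one common variant, not necessarily the one used in the course or in scikit-learn.

```python
import math
from collections import Counter

# Invented mini-corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stock markets fell on monday",
]


def tf_idf_vectors(documents):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [d.split() for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def keywords(vector, k=2):
    """Take the k highest-weighted terms as keywords."""
    return sorted(vector, key=vector.get, reverse=True)[:k]


vectors = tf_idf_vectors(docs)
print(keywords(vectors[0]))
print(cosine_similarity(vectors[0], vectors[1]) > cosine_similarity(vectors[0], vectors[2]))
```

A term that occurs in every document, such as "on" here, gets weight zero, so it contributes neither to the keywords nor to the similarity. Ranking documents by cosine similarity to a query vector is the naive search that the Search Engine item starts from.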

Week 8 - 10

  • Basic ML Methods
    • Supervised learning:
      • linear & logistic regressions (gradient descent, MSE, loss function, cross-entropy)
      • KNN
      • SVM (kernel function, support vector)
      • Naïve Bayes Classifier
      • Decision Tree
      • Random Forest; gradient boosting (XGBoost)
    • Unsupervised learning:
      • Clustering (hierarchical, k-means)
      • Embedding cluster
    • Semi-supervised & active learning
  • Model evaluation:
    • Underfitting vs. Overfitting
    • precision, recall, F1, F2, MSE, loss function
    • bias & variance
  • Preprocessing data:
    • balance, noise, collinearity, normalization/rescaling

Week 11 - 12

  • Neural Networks
    • the rise of NN
    • architecture of NN (layers, activation function, backpropagation)
    • loss function, cross-entropy, (stochastic) gradient descent
    • implementing a simple NN
    • TensorFlow, Keras, PyTorch
