Introduction to BERT

In this lab, we will use the BERT model for several NLP tasks.

Why BERT?

BERT (Bidirectional Encoder Representations from Transformers) was first introduced towards the end of 2018 and quickly became a hot topic in NLP. The main reasons are:

  • It demonstrates a very sophisticated knowledge of language, achieving human-level performance on certain tasks.
  • It can be applied to a variety of tasks.
  • It offers the benefits of pre-training and fine-tuning. BERT has been pre-trained on a very large text corpus, and we can leverage its understanding of language by taking the pre-trained model and fine-tuning it on our own application (e.g., classification, entity recognition, question answering, etc.). This makes it possible to achieve highly accurate results on other target tasks with minimal design work (a short sketch follows this list).
  • It also highlights the benefits of self-supervised learning.

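To make the pre-training / fine-tuning idea concrete, here is a minimal sketch (not the lab's actual training code) of how a pre-trained BERT checkpoint can be loaded and topped with a classification head using the Transformers library; the checkpoint name and example sentence are only illustrative:

```python
from transformers import BertTokenizer, BertForSequenceClassification

# Reuse the pre-trained encoder weights; only the small classification head on top
# is initialized from scratch, and the whole model is then fine-tuned on the target task.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("This sentence is grammatically correct.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2]): one (not yet fine-tuned) score per class
```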
Before we start

In this lab, given how computationally expensive BERT is, you cannot run the experiments on your local machine, so you will use Google Colab. All you need to do is upload all of the notebooks in this folder from your local machine to your Google Drive account. Then, when you want to run a specific notebook, simply go to your Drive and open the notebook with Google Colaboratory.

Important note: every time you start a Google Colaboratory session, make sure you are using a GPU. Go to Edit -> Notebook settings, then under Hardware accelerator, select GPU.
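If you want to double-check from inside the notebook, the following small snippet (assuming PyTorch, which is installed by default on Colab) confirms that a GPU is visible:

```python
import torch

print(torch.cuda.is_available())      # should print True once the GPU runtime is active
print(torch.cuda.get_device_name(0))  # the exact GPU model varies from session to session
```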

This lab

In this lab, you will try to get a high-level understanding of BERT. The lab is based on the awesome Transformers library from Hugging Face 🤗 and consists of the following steps:

  • Intro to BERT: in this document (link), we present a high-level overview of BERT and go through the components we need for the rest of the lab. Please take a bit of time to read intro to bert.pdf.
  • BERT lab1: investigating the BERT vocabulary (see the tokenization sketch after this list).
  • BERT lab2: applying BERT to sentence classification on the CoLA dataset. The task is to predict whether a sentence is grammatically correct.
  • BERT lab3: applying BERT to sentence classification on the Wikipedia Personal Attacks dataset. The task is to predict whether a comment contains a personal attack.
  • BERT lab4: applying BERT to question answering. The task is to predict the span of the answer within a reference text that contains it (see the question-answering sketch below).
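As a preview of lab 1, here is a small sketch (the checkpoint name and example word are only illustrative) of how the BERT WordPiece vocabulary can be inspected with the Transformers tokenizer:

```python
from transformers import BertTokenizer

# "bert-base-uncased" ships a WordPiece vocabulary of 30,522 tokens.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(len(tokenizer.vocab))              # 30522
print(tokenizer.tokenize("embeddings"))  # split into sub-word pieces: ['em', '##bed', '##ding', '##s']
print(tokenizer.convert_tokens_to_ids(["[CLS]", "[SEP]", "[MASK]"]))  # ids of the special tokens
```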

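And as a preview of lab 4, the high-level pipeline API gives a quick feel for span-based question answering. This is only a sketch: it downloads a publicly available BERT checkpoint already fine-tuned on SQuAD, whereas the lab notebook may fine-tune the model itself:

```python
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)
result = qa(
    question="What does BERT stand for?",
    context="BERT (Bidirectional Encoder Representations from Transformers) was introduced in 2018.",
)
print(result["answer"])  # the predicted answer span, copied verbatim from the context
```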
Going deeper. Some resources:

Papers

Blogs & Tutorials
