πŸ‘¨πŸ»β€πŸ«πŸš§ Experiment: Training a Neural Network with Wikipedia Articles

trbndev/wikibert

πŸ‘¨πŸ»β€πŸ« Wikibert

This is an experimental weekend project for testing purposes.

Overview

Wikibert is a project that demonstrates text generation using Wikipedia data. It consists of two main components:

  • A Python script (get_data.py) for fetching and saving Wikipedia pages.
  • A Jupyter notebook (wikibert.ipynb) for training a text generation model using TensorFlow.

get_data.py

The get_data.py script performs the following tasks:

  • Fetches a Wikipedia page and its linked pages recursively up to a specified depth.
  • Saves the content of each page into individual text files in a data folder.
  • Sanitizes filenames to ensure compatibility with the file system.

wikibert.ipynb

The wikibert.ipynb notebook includes the following steps:

  • Loads and preprocesses text data from the saved Wikipedia pages.
  • Builds and trains a GRU-based RNN model for text generation using TensorFlow.
  • Saves model checkpoints during training.
  • Generates text using the trained model.
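As a rough sketch of the preprocessing and model steps above, assuming a character-level setup (the notebook may tokenize differently). All function names, layer sizes, and the checkpoint path are illustrative assumptions rather than values taken from wikibert.ipynb.

```python
from pathlib import Path


def load_corpus(data_dir="data"):
    """Concatenate every .txt file in data_dir into a single string."""
    files = sorted(Path(data_dir).glob("*.txt"))
    return "\n".join(f.read_text(encoding="utf-8") for f in files)


def build_vocab(text):
    """Return (char -> id mapping, id -> char list) for the characters in text."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}, chars


def make_examples(ids, seq_length):
    """Split ids into (input, target) pairs; each target is its input shifted by one."""
    examples = []
    for start in range(0, len(ids) - seq_length, seq_length):
        chunk = ids[start:start + seq_length + 1]
        examples.append((chunk[:-1], chunk[1:]))
    return examples


def build_model(vocab_size, embedding_dim=256, rnn_units=512):
    """Build an Embedding -> GRU -> Dense model plus a checkpoint callback.

    The sizes are placeholder assumptions, not tuned values.
    """
    import tensorflow as tf  # imported lazily so the helpers above run without it

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim),
        tf.keras.layers.GRU(rnn_units, return_sequences=True),
        tf.keras.layers.Dense(vocab_size),  # one logit per character
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    # Saves the weights at the end of every epoch (hypothetical path).
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath="checkpoints/ckpt_{epoch}.weights.h5",
        save_weights_only=True,
    )
    return model, checkpoint
```

Training would then pass the encoded examples and the callback to model.fit, for example model.fit(inputs, targets, epochs=..., callbacks=[checkpoint]). Generation typically feeds a seed string through the model and samples the next character from the output logits, one step at a time.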

Usage

  1. Run get_data.py to fetch and save Wikipedia pages.
  2. Open wikibert.ipynb in Jupyter Notebook or Google Colab.
  3. Follow the cells in the notebook to train the text generation model and generate text.

Requirements

  • Python 3.x
  • wikipedia-api library
  • TensorFlow
  • Jupyter Notebook (for wikibert.ipynb)

Installation

Install the required libraries using pip:

pip install wikipedia-api tensorflow
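To confirm that the installation worked, a small check like the one below may help. It is only a sketch and the helper name missing_modules is hypothetical. Note that the wikipedia-api package is imported under the module name wikipediaapi.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["wikipediaapi", "tensorflow"])
    print("Missing modules:", ", ".join(missing) if missing else "none")
```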

License

This project is licensed under the MIT License. See the LICENSE file for details.
