Speller is a high-performance program that checks the spelling of words in a text file against a dictionary. Built with a hash table for efficient lookups, it identifies misspelled words while providing detailed performance metrics for key operations like loading the dictionary, checking words, and freeing memory.
This project demonstrates core principles of data structures, algorithms, and memory management in C.
- Efficient Spell-Checking: Uses a hash table for rapid word lookups.
- Customizable Dictionaries: Accepts any dictionary file in plain text format.
- Performance Metrics: Reports the time taken for:
- Loading the dictionary.
- Checking words.
- Determining dictionary size.
- Unloading memory.
- Optimized Hashing: Implements the
djb2
hash function for balanced performance and distribution.
Before running the program, ensure your system has:
- A C compiler (
clang
orgcc
). - The
make
build tool.
Follow the instructions below to install Clang and Make based on your operating system.
- Update your package list:
sudo apt update
- Install Clang and Make:
sudo apt install clang make
- Verify Clang installation:
clang --version
- Install Xcode Command Line Tools:
xcode-select --install
- Verify Clang installation:
clang --version
- Download the LLVM installer from the official LLVM website.
- Run the installer and follow the on-screen instructions.
- Add the LLVM
bin
directory (e.g.,C:\Program Files\LLVM\bin
) to your system'sPATH
environment variable:- Open "Environment Variables" in Windows settings.
- Edit the
Path
variable and add the directory.
- Verify Clang installation:
clang --version
Clone the project repository to your local machine:
git clone https://github.com/Shahir-47/speller.git
cd speller/speller
Use the provided Makefile
to build the program:
make speller
This will generate an executable file named speller
.
Run the program using the following syntax:
./speller [DICTIONARY] text
- DICTIONARY (optional): Path to the dictionary file. Defaults to
dictionaries/large
if not provided. - text (required): Path to the text file to be spell-checked.
./speller texts/lalaland.txt
./speller dictionaries/small texts/shakespeare.txt
MISSPELLED WORDS
[...]
AHHHHHHHHHHHHHHHHHHHHHHHHHHHT
[...]
Shangri
[...]
fianc
[...]
WORDS MISSPELLED: 15
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 1024
TIME IN load: 0.12
TIME IN check: 0.45
TIME IN size: 0.01
TIME IN unload: 0.04
TIME IN TOTAL: 0.62
├── speller/
│ ├── dictionaries/
│ │ ├── small
│ │ └── large
│ ├── texts/
│ │ ├── (various test text files)
│ ├── dictionary.c
│ ├── dictionary.h
│ ├── speller.c
│ ├── Makefile
└── README.md
dictionaries/
: Contains dictionary files, includingsmall
for debugging andlarge
with 143,091 words.texts/
: Test text files for running the program.dictionary.c
: Implements dictionary functions likeload
,check
, andunload
.dictionary.h
: Header file with function prototypes and constants.speller.c
: Main program that integrates dictionary functions and benchmarks.Makefile
: Automates the build process.
load
: Reads the dictionary file into a hash table.check
: Checks if a word exists in the hash table.hash
: Maps a word to a specific bucket in the hash table.size
: Returns the number of words loaded in the dictionary.unload
: Frees all memory used by the dictionary.
- Number of Buckets (
N
): 24,000 for optimized performance. - Hash Function: Based on the djb2 algorithm, designed for speed and uniqueness.
The texts/
directory contains various test files, such as:
lalaland.txt
shakespeare.txt
tolstoy.txt
To test the program:
./speller dictionaries/small texts/shakespeare.txt
- A list of misspelled words from the text.
- Performance benchmarks for dictionary loading, word checking, and unloading.
Developed by Shahir Ahmed.
This project is licensed under the MIT License.