bayes_fss

Feature subset selection for Naive Bayes classification.

Purpose

This is a small tool for automatically improving the performance of a Naive Bayes classifier. Given a labelled dataset made up of categorical features, it builds several classifiers, each using a distinct feature subset, and evaluates them through cross-validation to find the best-performing one. When an optimal feature subset is found, processing stops and a summary of the achieved performance is displayed.
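
To make the idea concrete, here is a minimal Python sketch of one possible search of this kind: greedy backward elimination scored by cross-validated accuracy. It is not the tool's implementation. The function names, the add-one smoothing, the five interleaved folds and the choice of accuracy as the score are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels, subset):
    """Count class frequencies and per-class value frequencies for the chosen columns."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (column, label) -> value frequencies
    domains = defaultdict(set)           # column -> values seen in training
    for row, label in zip(rows, labels):
        for col in subset:
            value_counts[col, label][row[col]] += 1
            domains[col].add(row[col])
    return class_counts, value_counts, domains

def predict(model, row, subset):
    """Return the label with the highest log-posterior under the naive assumption."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for col in subset:
            # Add-one smoothing (an assumption), with one extra slot for unseen values.
            num = value_counts[col, label][row[col]] + 1
            den = n + len(domains[col]) + 1
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cv_accuracy(rows, labels, subset, folds=5):
    """Accuracy over interleaved folds: row i belongs to fold i % folds."""
    correct = 0
    for k in range(folds):
        train_idx = [i for i in range(len(rows)) if i % folds != k]
        test_idx = [i for i in range(len(rows)) if i % folds == k]
        model = train([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx], subset)
        correct += sum(predict(model, rows[i], subset) == labels[i]
                       for i in test_idx)
    return correct / len(rows)

def backward_search(rows, labels, n_features):
    """Drop one feature at a time for as long as the score strictly improves."""
    subset = list(range(n_features))
    best = cv_accuracy(rows, labels, subset)
    improved = True
    while improved and len(subset) > 1:
        improved = False
        best_subset = subset
        for col in subset:
            candidate = [c for c in subset if c != col]
            score = cv_accuracy(rows, labels, candidate)
            if score > best:
                best, best_subset, improved = score, candidate, True
        subset = best_subset
    return subset, best
```

Here `rows` is a list of tuples of categorical values and `labels` is the matching list of classes. The search stops as soon as no single removal improves the score, so it finds a local optimum rather than a guaranteed global one.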

Example

Look at the sample dataset in the test/data directory. It uses six features to predict the quality of a car: number of doors, price, etc. We can check the performance of the Naive Bayes classifier that uses all these features as follows:

$ bayes_fss --search=none test/data/cars.tsv
{
   "subset": ["buying","maint","doors","persons","lug_boot","safety"],
   "accuracy": 92.956522,
   "precision": 77.187566,
   "recall": 58.949352,
   "F1": 64.245426,
   "subsets_evaluated": 1,
   "interrupted": false
}

Can we do better? Let's check:

$ bayes_fss --search=backward-join test/data/cars.tsv
{
   "subset": [["buying","maint"],["doors","lug_boot","safety"],"persons"],
   "accuracy": 97.159420,
   "precision": 94.168320,
   "recall": 86.875141,
   "F1": 90.011069,
   "subsets_evaluated": 53,
   "interrupted": false
}

This output means that performance improves when the buying and maint features are merged into a single feature, and the doors, lug_boot and safety features into another, so that only three features remain: the two merged ones and persons.
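
As an illustration of what merging means, the Python sketch below replaces each group of columns with a single compound value per row. Naive Bayes treats features as independent given the class, so a joined feature lets the classifier model the combination of values directly. The `join_features` helper and the sample row values are hypothetical; only the grouping is taken from the output above.

```python
def join_features(rows, groups):
    """Replace each group of columns with one compound value per row."""
    joined = []
    for row in rows:
        joined.append(tuple(
            # A list names columns to merge; a plain string is a feature kept as-is.
            tuple(row[name] for name in group) if isinstance(group, list) else row[group]
            for group in groups
        ))
    return joined

# One made-up row using the column names of the sample dataset.
rows = [
    {"buying": "low", "maint": "high", "doors": "4",
     "persons": "4", "lug_boot": "big", "safety": "high"},
]
# The "subset" value reported by the backward-join search above.
groups = [["buying", "maint"], ["doors", "lug_boot", "safety"], "persons"]

print(join_features(rows, groups))
# [(('low', 'high'), ('4', 'big', 'high'), '4')]
```

Each row now carries three values instead of six, and a classifier trained on it estimates one probability per combination, such as ('low', 'high'), rather than one per individual value.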

For full details, see the PDF manual in the doc directory, or type man bayes_fss after installation.

Building

You need a C11 compiler, which typically means GCC or Clang on Unix. You can then invoke the usual:

$ make && sudo make install

There are no dependencies other than the C standard library.
