Skip to content

LeonS-creator/NOAH

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

README.md for NOAH


NOAH: Negative Outlier Analysis Hub

Introduction

NOAH (Negative Outlier Analysis Hub) is a Python toolkit designed to facilitate the sampling of negative instances based on both syntactic and semantic similarity measures. This tool is intended for researchers and developers working in machine learning, particularly those involved in natural language processing and recommendation systems, who require robust methods for identifying dissimilar or outlier examples in their datasets.

Features

  • Syntactic Similarity Sampling: Utilizes SpaCy to generate syntactic embeddings and sample negatives.
  • Semantic Similarity Sampling: Leverages the power of Sentence Transformers for deep semantic analysis.
  • Easy Integration: Designed to be easily integrated with existing machine learning pipelines.
  • Customizable: Allows for customization of similarity thresholds and sampling strategies.

Requirements

To run NOAH, you will need:

  • Python 3.6+
  • SpaCy
  • Sentence Transformers
  • NumPy
  • TensorFlow (optional, for handling large datasets efficiently)

Installation

Install NOAH by cloning this repository and installing the required packages:

git clone https://github.com/LeonS-creator/NOAH.git

How It Works

NOAH uses NLP models to encode sentences and calculates their similarity. You can choose between syntactic and semantic models based on your specific needs:

  • Syntactic Model: Uses SpaCy for faster, rule-based vectorizations.
  • Semantic Model: Utilizes Sentence Transformers for deeper, context-aware embeddings.

Contributing

We welcome contributions to NOAH! If you have suggestions for improvements or want to contribute code, please:

  1. Fork the repository.
  2. Create a new branch for your changes.
  3. Submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Negative Outlier Analysis Hub

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages