Adding my text analysis package in pt-br and eng #7

Open
wants to merge 1 commit into base: master
20 changes: 12 additions & 8 deletions README.md
@@ -1,27 +1,31 @@
# package_name
# analisador_texto_pt_br_e_eng

Description.
The package package_name is used to:
-
-
The package analisador_texto_pt_br_e_eng is used to:
- Analyzes texts in Portuguese or English, identifying the most common words, removing stopwords, and performing other text analyses.
- Uses the spaCy library for natural language processing in both languages.

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install package_name

```bash
pip install package_name
pip install analisador_texto_pt_br_e_eng
```

## Usage

```python
from package_name.module1_name import file1_name
file1_name.my_function()
from analisador_texto_pt_br_e_eng.analisador_texto_pt_br import pt_br
pt_br.analisar_texto("Seu texto em português aqui.")
```

```python
from analisador_texto_pt_br_e_eng.analisador_texto_eng import eng
eng.analyze_text("Your English text here.")
```
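
Both usage examples load a large spaCy model at import time (`pt_core_news_lg` and `en_core_web_lg`, as seen in the modules added by this PR), so those models must be downloaded before the imports will succeed. A minimal sketch of the one-time setup using spaCy's standard download helper (not part of the package itself):

```python
import spacy

# One-time downloads of the models the analyzers load at import time;
# equivalent to running `python -m spacy download <model>` in a shell.
spacy.cli.download("pt_core_news_lg")
spacy.cli.download("en_core_web_lg")
```
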
## Author
My_name
Alexsandro

## License
[MIT](https://choosealicense.com/licenses/mit/)
File renamed without changes.
15 changes: 15 additions & 0 deletions analisador_texto_pt_br_e_eng/analisador_texto_pt_br/pt_br.py
@@ -0,0 +1,15 @@
import spacy
from collections import Counter


# Load the spaCy Portuguese model (must be downloaded beforehand).
nlp = spacy.load("pt_core_news_lg")

def analisar_texto(texto):
    # Lowercase the text, then drop stopwords and punctuation.
    doc = nlp(texto.lower())
    tokens_filtrados = [token.text for token in doc if not token.is_stop and not token.is_punct]
    contagem_palavras = Counter(tokens_filtrados)
    return {
        'total_palavras': len([token for token in doc if not token.is_punct]),
        'palavras_filtradas': len(tokens_filtrados),
        'palavras_mais_comuns': contagem_palavras.most_common(5)
    }
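
A minimal sketch of exercising this module directly, assuming the package and the `pt_core_news_lg` model are installed; the sample sentence is illustrative and the exact counts depend on spaCy's Portuguese stopword list:

```python
from analisador_texto_pt_br_e_eng.analisador_texto_pt_br import pt_br

# "o" and "e" are Portuguese stopwords: they count towards
# 'total_palavras' but are excluded from 'palavras_filtradas'.
resultado = pt_br.analisar_texto("O gato dorme e o gato brinca.")
print(resultado['total_palavras'])          # e.g. 7 (punctuation excluded)
print(resultado['palavras_filtradas'])      # e.g. 4 (stopwords also excluded)
print(resultado['palavras_mais_comuns'][0]) # e.g. ('gato', 2)
```
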
14 changes: 14 additions & 0 deletions analisador_texto_pt_br_e_eng/analyze_text_eng/eng.py
@@ -0,0 +1,14 @@
import spacy
from collections import Counter

# Load the spaCy English model (must be downloaded beforehand).
nlp = spacy.load("en_core_web_lg")

def analyze_text(text):
    # Lowercase the text, then drop stopwords and punctuation.
    doc = nlp(text.lower())
    filtered_tokens = [token.text for token in doc if not token.is_stop and not token.is_punct]
    word_count = Counter(filtered_tokens)
    return {
        'total_words': len([token for token in doc if not token.is_punct]),
        'filtered_words': len(filtered_tokens),
        'most_common_words': word_count.most_common(5)
    }
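
The English module mirrors the Portuguese one with translated key names. A minimal sketch, assuming the `en_core_web_lg` model is installed; the import path below follows the directory added in this diff (`analyze_text_eng`), while the README example uses `analisador_texto_eng`:

```python
from analisador_texto_pt_br_e_eng.analyze_text_eng import eng

# "the" and "and" are English stopwords: counted in 'total_words'
# but excluded from 'filtered_words'.
result = eng.analyze_text("The black cat sleeps and the white cat plays.")
print(result['total_words'])           # e.g. 9 (punctuation excluded)
print(result['filtered_words'])        # e.g. 6 (stopwords also excluded)
print(result['most_common_words'][0])  # e.g. ('cat', 2)
```
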
Empty file.
Empty file.
Empty file.
Empty file.
2 changes: 2 additions & 0 deletions requirements.txt
@@ -0,0 +1,2 @@
spacy >= 3.0
setuptools >= 42
10 changes: 5 additions & 5 deletions setup.py
@@ -7,14 +7,14 @@
    requirements = f.read().splitlines()

setup(
    name="package_name",
    name="Analisador_Texto_pt_br_e_ingles",
    version="0.0.1",
    author="my_name",
    author_email="my_email",
    description="My short description",
    author="Alexsandro",
    author_email="alecsbezerra@gmail.com",
    description="Analisador de texto em português e inglês usando spaCy.",
    long_description=page_description,
    long_description_content_type="text/markdown",
    url="my_github_repository_project_link"
    url="https://github.com/alexxs2/package-template",
    packages=find_packages(),
    install_requires=requirements,
    python_requires='>=3.8',