Pipeline components
melisa-qordoba edited this page Sep 23, 2020 · 8 revisions
The replaCy pipeline API is inspired by spaCy pipelines.
A pipeline component has the signature List[Span] -> List[Span]: each component takes a list of spans and returns a list of spans, which is then passed to the next component.
Any function with this signature can be added to the replaCy pipeline, which makes it easy to write and use custom replaCy extensions.
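As a minimal sketch of what such a function could look like, the component below keeps only spans whose text is at least three characters long. The name drop_short_spans and the length threshold are hypothetical and not part of replaCy; the demo uses simple stand-in objects with a .text attribute so that it runs without loading a spaCy model.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:  # imported only for type hints; not needed at runtime
    from spacy.tokens import Span


def drop_short_spans(spans: "List[Span]") -> "List[Span]":
    """Hypothetical component: keep spans whose text has at least 3 characters."""
    return [span for span in spans if len(span.text) >= 3]


# Stand-in objects with a .text attribute, used here instead of real spaCy spans
demo_spans = [SimpleNamespace(text="a"), SimpleNamespace(text="have")]
kept = drop_short_spans(demo_spans)
```

Because the function takes a list of spans and returns a list of spans, it should fit the component signature described above.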
By default, the replaCy pipeline consists of sorter, filter and joiner.
If replaCy is instantiated with a kenlm model (see: ranking):
- sorter: sorts suggestions
- filter: filters suggestions according to max_count properties (see: filtering)
- joiner: joins spans into text
If kenlm is not passed, sorter and filter do nothing.
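The way components hand their output to one another can be illustrated with a small fold over a list of functions. This is only a sketch of the chaining idea, not replaCy's actual implementation; run_pipeline, sort_by_length and keep_first_two are hypothetical names, and plain strings stand in for spans.

```python
from functools import reduce
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # stands in for spaCy's Span in this sketch


def run_pipeline(
    spans: List[T], components: Sequence[Callable[[List[T]], List[T]]]
) -> List[T]:
    """Apply each component in order, feeding its output to the next one."""
    return reduce(lambda current, component: component(current), components, spans)


# Hypothetical components operating on plain strings for demonstration
def sort_by_length(items: List[str]) -> List[str]:
    return sorted(items, key=len)


def keep_first_two(items: List[str]) -> List[str]:
    return items[:2]


result = run_pipeline(["ccc", "a", "bb"], [sort_by_length, keep_first_two])
# result == ["a", "bb"]
```

With an empty component list, the input is returned unchanged, which mirrors the idea that a component that "does nothing" simply passes its spans through.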
import en_core_web_sm
from replacy import ReplaceMatcher
from replacy.db import load_json
nlp = en_core_web_sm.load()
replaCy = ReplaceMatcher(nlp, load_json('path to match dict(s)'))
# List the names of the components currently in the pipeline
replaCy.pipe_names
Example:
import en_core_web_sm
from replacy import ReplaceMatcher
from replacy.db import load_json
from spacy.util import filter_spans
nlp = en_core_web_sm.load()
replaCy = ReplaceMatcher(nlp, load_json('path to match dict(s)'))
# Add spaCy's filter_spans (which removes overlapping spans) before the joiner
replaCy.add_pipe(filter_spans, name="filter_spans", before="joiner")
Check the list of custom replaCy extensions.