
Commit

Merge branch 'master' of github.com:wikilinks/nel
andychisholm committed Apr 27, 2016
2 parents a366d15 + 913e879 commit ea08134
Showing 3 changed files with 14 additions and 15 deletions.
Empty file added doc.requirements.txt
8 changes: 4 additions & 4 deletions docs/guides/models.md
@@ -18,7 +18,7 @@ First, we must extract redirect mappings to properly resolve inter-article links
sift build-corpus --save redirects WikipediaRedirects latest json
```

-The extracted rediect mappings are now stored under the `redirects` directory.
+The extracted redirect mappings are now stored under the `redirects` directory.

Next, we perform full plain-text extraction over the Wikipedia dump, mapping links to their current Wikipedia target.
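
For intuition, resolving a link target through the extracted redirect mappings is essentially a dictionary lookup. A minimal sketch with made-up titles (not sift's actual output schema):

```python
# Hypothetical redirect mapping: source title -> canonical target title.
redirects = {'Obama': 'Barack Obama', 'POTUS': 'President of the United States'}

def resolve(title):
    """Follow a redirect if one exists, so links accrue to one canonical entity."""
    return redirects.get(title, title)

assert resolve('Obama') == 'Barack Obama'         # redirected
assert resolve('Barack Obama') == 'Barack Obama'  # already canonical
```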

@@ -32,17 +32,17 @@ Our wikipedia corpus is now in the standard format for __sift__ corpora from which we can extract models

We will now extract two simple count driven models from this corpus which are useful in entity linking.

-The first of model, "EntityCounts" is simply the total number of linkings for an entity over the corpus.
+The first model, "EntityCounts" simply collects the total count of inlinks for each entity over the corpus.

-We use this statistic as a proxy for the prior probability of an entity and expect that entities with a higher prior are more likely to linked.
+We use this statistic as a proxy for the prior probability of an entity and expect that entities with higher counts are more likely to be linked.

```
sift build-doc-model --save ecounts EntityCounts processed redis --prefix models:ecounts[wikipedia]:
```
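
As a rough sketch of how a linker might consume this model, assuming counts are stored as plain string values under keys of the form `models:ecounts[wikipedia]:<entity>` (the actual key encoding nel uses may differ):

```python
import redis

r = redis.Redis()

def entity_prior(entity, total_links):
    """Approximate P(entity) as its inlink count over all links in the corpus."""
    count = r.get('models:ecounts[wikipedia]:' + entity)
    return (int(count) if count is not None else 0) / total_links

# e.g. entity_prior('Barack Obama', total_links) -> a small probability
```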

The second model, "EntityNameCounts" collects the number of times a given anchor text string is used to link an entity.

-This statistic helps us model the conditional probability of an entity given the name used in text.
+This statistic helps us model the conditional probability of an entity given the name used to reference it in text.

```
sift build-doc-model --save necounts EntityNameCounts processed --lowercase redis --prefix models:necounts[wikipedia]:
```
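
To make the conditional-probability reading concrete, here is a minimal sketch. It assumes each name maps to a redis hash of entity-to-count fields under `models:necounts[wikipedia]:<name>`; that layout is an assumption for illustration, not nel's documented schema. Since the model above is built with `--lowercase`, names are lowercased before lookup:

```python
import redis

r = redis.Redis()

def name_posteriors(name):
    """Approximate P(entity | name) from anchor-text link counts."""
    key = 'models:necounts[wikipedia]:' + name.lower()
    counts = {e.decode(): int(c) for e, c in r.hgetall(key).items()}
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()} if total else {}

# e.g. name_posteriors('obama') might give {'Barack Obama': 0.93, ...}
```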
21 changes: 10 additions & 11 deletions docs/index.md
@@ -4,21 +4,20 @@ __nel__ is a fast, accurate and highly modular framework for linking entities in text

Out of the box, __nel__ provides:

-- named entity recognition (DIY, or plug-in a NER system like Stanford, spaCy or Schwa)
-- in-document coreference clustering
-- candidate generation
-- multiple disambiguation features
+- named entity recognition
+- coreference clustering and candidate generation
+- multiple entity disambiguation feature models
- a supervised learning-to-rank framework for entity disambiguation
- a supervised nil detection system with configurable confidence thresholds
-- nil clustering
-- support for evaluation and error analysis of linking system output
+- basic nil clustering for out-of-KB entities
+- support for evaluating linker performance and running error analysis

-__nel__ is completely modular, it can:
+__nel__ is modular, it can:

-- link entities to any knowledge base you like (not limited to just Wikipedia or Freebase)
-- update, rebuild and redeploy linking models as a knowledge base changes over time
-- retrain recognition and disambiguation models on your own corpus of documents
-- easily adapt a linking pipeline to meet performance and accuracy tradeoffs
+- link entity mentions to any knowledge base you like (not just Wikipedia and Freebase!)
+- update, rebuild and redeploy models as a knowledge base changes over time
+- retrain recognition and disambiguation classifiers on your own corpus of documents
+- adapt linking pipelines to meet performance, precision and recall tradeoffs

__nel__ is flexible, you can run it:

