documentation update

GEUS-Glaciology-and-Climate · Dec 23, 2021 · 2c73bc1
1 parent 9f878a0
commit 2c73bc1

Showing 7 changed files with 64 additions and 28 deletions.
diff --git a/README.md b/README.md
@@ -14,7 +14,7 @@ There are four key objects within Biblyser:
 
 
 ## Quick start
-Biblyser can either be installed with pip or clones from the Github repository.
+Biblyser can either be installed with pip or cloned from the Github repository.
 
 ```python
 pip install biblyser
@@ -27,7 +27,7 @@ git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
 When cloning from the Github repository, you will need to create a conda environment and install Biblyser's package dependencies using pip.
 
 ```python
-pip install pybyliometrics, habanero, scholarly, gender_guesser, pandas, beautifulsoup4
+pip install pybliometrics habanero scholarly gender_guesser pandas numpy
 ```
 
 Try running one of the example scripts from the repository to see that it works. To access the Scopus API through the pybliometrics package, you will need to configure your API key.
@@ -43,7 +43,7 @@ An API key or Insttoken is needed to use the Scopus API. An API key can be gener
 After this initial set-up, no editing of the example scripts should be needed - the scripts should run as they are. If they don't, there is likely an issue with your python environment.
 
 
-## Name.py
+## name.py
 The Name object holds attributes about an individual to aid in searching for associated publications. This can be initialised using an individual's full name, with job title and gender as optional inputs, and additional keyword inputs for Orcid ID, Scopus ID, Google Scholar ID, and h-index. 
 
 ```python
@@ -61,7 +61,7 @@ n = Name('Jane Emily Doe',
 Various name and initial formats are computed from the Name object, which maximises the chance of finding all associated publications. The gender of each name can either be provided during initialisation, or guessed using `gender_guesser`. The gender definition is used later on to analyse gender distribution in a **BibCollection**.
 
 
-## Organisation.py
+## organisation.py
 The Organisation object holds a collection of **Name** objects which represent a group of authors, department, or organisation. The GEUS G&K organisation can be fetched either from the GEUS G&K Pure portal (only retrieves registered authors) or from the staff directory webpage (all G&K members). This information is fed directly into an Organisation object.
 
 ```python
@@ -112,7 +112,7 @@ df = org.asDataFrame()
 ```
 
 
-## Bib.py
+## bib.py
 A Bib object holds the relevant information associated with a single publication, namely:
 
 + DOI
@@ -142,12 +142,12 @@ Bib attributes are populated using the Scopus API provided by [pybliometrics](ht
 Authorship of a publication can be queried within the Bib object, including queries by organisation and (guessed) gender.
 
 
-## BibCollection.py
+## bibcollection.py
 A BibCollection object holds a collection of **Bib** objects, i.e. a database of all associated or selected publications. A BibCollection can be initialised from an **Organisation** (for which the BibCollection will search for all publications linked to each name in the organisation), a list of **Bib** objects, or a list of doi strings.
 
 ```python
 from biblyser.organisation import Organisation
-from biblyser.bibCollection import BibCollection
+from biblyser.bibcollection import BibCollection
 
 
 #BibCollection from an Organisation
@@ -199,7 +199,7 @@ df = bibs.asDataFrame()
 ```
 
 ## Computing gender metrics
-Genders of each author within the Bib object are firstly guessed, and if the guessed gender is not certian then a gender database is used to check if the author and an associated gender exists. This database is an Organisation object, retaining all information about each author's name and gender. If a name is not found in the database then the user is prompted to manually define the gender, and then retains this new addition. 
+The gender of each author within the Bib object is first guessed; if the guessed gender is not certain, a gender database is checked for the author and an associated gender. This database is an **Organisation** object, retaining all information about each author's name and gender. If a name is not found in the database, the user is prompted to define the gender manually, and the database retains this new addition. 
 
 ```python
 import copy
@@ -211,6 +211,25 @@ gdb = copy.copy(org)
 bibs.getAllGenders(gdb)
 ```
 
+The computed gender metrics can be used to determine a diversity index for an individual or organisation. This diversity index is based on the gender and affiliation/country composition in all publication authorships. Generally, this is determined from publications in the last five years, but can be changed as an optional parameter. 
+
+```python
+from biblyser.bibcollection import calcDivIdx
+
+calcDivIdx('Penelope How',     #Name
+           5,                  #Years to calculate
+           scopus=True,        #Bibs from scopus
+           scholar=False,      #from scholar
+           crossref=False,     #from crossref
+           check=True)         #User check bibs?
+```
+
+An example script for calculating the diversity index is available in the Github repository [here](https://github.com/GEUS-Glaciology-and-Climate/Biblyser/blob/main/biblyser/examples/getDiv.py), which can be run from the command line. 
+
+```python
+python getDiv calcDivIdx --name "Penelope How"
+```
+
 ## Further development we are working on
 + Incorporation of other search APIs for publications, such as [Web Of Science](https://pypi.org/project/wos/)
 + Fetch journal impact factor

diff --git a/docs/source/diversityindex.rst b/docs/source/diversityindex.rst
@@ -1,4 +1,22 @@
 Diversity Index
 ===============
 
-The diversity index is a metric for evaluating diversity in an individual's co-authorship. 
+The computed bibcollection metrics can be used to determine a diversity index for an individual or organisation. This diversity index is based on the gender and affiliation/country composition in all publication authorships. Generally, this is determined from publications in the last five years, but can be changed as an optional parameter. 
+
+.. code-block:: python
+
+   from biblyser.bibcollection import calcDivIdx
+
+   calcDivIdx('Penelope How',    #Name
+              5,                 #Years to calculate
+              scopus=True,       #Bibs from scopus
+              scholar=False,     #from scholar
+              crossref=False,    #from crossref
+              check=True)        #User check bibs?
+
+An example script for calculating the diversity index is available in the Github repository `here <https://github.com/GEUS-Glaciology-and-Climate/Biblyser/blob/main/biblyser/examples/getDiv.py>`_, which can be run from the command line. 
+
+.. code-block:: python
+
+   python getDiv calcDivIdx --name "Penelope How"
+
diff --git a/docs/source/guide.rst b/docs/source/guide.rst
@@ -8,7 +8,7 @@ The Name object holds attributes about an individual to aid in searching for ass
 
 .. code-block:: python
 
-   from Name import Name
+   from biblyser.name import Name
 	
    # With fullname string
    n = Name('Jane Emily Doe')
@@ -29,7 +29,7 @@ The Organisation object holds a collection of **Name** objects which represent a
 
 .. code-block:: python
 
-   from Organisation import Organisation, fetchWebInfo
+   from biblyser.organisation import Organisation, fetchWebInfo
 
    def fetchWebInfo(url, parser, fid, classtype, classid):
        '''Get all up-to-date information (e.g. names, titles) from a 
@@ -92,7 +92,7 @@ A Bib object can either be initiated from a doi string, a title string, or from
 
 .. code-block:: python
 
-   from Bib import Bib
+   from biblyser.bib import Bib
 
    # Bib object from doi string
    pub = Bib(doi='10.5194/tc-11-2691-2017') 		
@@ -114,8 +114,8 @@ A BibCollection object holds a collection of **Bib** objects, i.e. a database of
 
 .. code-block:: python
 
-   from Organisation import Organisation
-   from BibCollection import BibCollection
+   from biblyser.organisation import Organisation
+   from biblyser.bibcollection import BibCollection
 
    # BibCollection from an Organisation
    names = ['Penelope How', 'Nanna B. Karlsson', 'Kenneth D. Mankoff']

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -6,6 +6,8 @@
 Biblyser
 ==========
 
+**Biblyser** is an object-oriented Python workflow for computing and analysing bibliometrics for an individual or organisation.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:

diff --git a/docs/source/installation.rst b/docs/source/installation.rst
@@ -4,24 +4,21 @@ Installation
 Quickstart
 ----------
 
-Clone `this repository <https://github.com/GEUS-Glaciology-and-Climate/Biblyser>`_ into your local directory
+Biblyser can either be installed with pip or cloned from `this repository <https://github.com/GEUS-Glaciology-and-Climate/Biblyser>`_ into your local directory.
 
 .. code-block:: python
 
-   git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
-
-Create a conda environment with the required package dependencies, either using the environment file provided in the repository.
+   pip install biblyser
 
 .. code-block:: python
 
-   conda env create --file environment.yml
-
+   git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
 
-Or by installing the packages into your conda environment with pip
+When cloning the repository, you will need to create a Python environment with the required package dependencies, which can be installed with pip.
 
 .. code-block:: python
 
-   pip install pybyliometrics, habanero, scholarly, gender_guesser, pandas
+   pip install pybliometrics habanero scholarly gender_guesser pandas numpy
 
 
 Scopus API configuration

diff --git a/docs/source/modules.rst b/docs/source/modules.rst
@@ -1,7 +1,7 @@
 Modules
 =======
 
-Name
+name
 ----
 
 .. automodule:: name
@@ -10,7 +10,7 @@ Name
    :show-inheritance:
 
 
-Organisation
+organisation
 ------------
 
 .. automodule:: organisation
@@ -19,7 +19,7 @@ Organisation
    :show-inheritance:
 
 
-Bib
+bib
 ---
 
 .. automodule:: bib
@@ -28,7 +28,7 @@ Bib
    :show-inheritance:
 
 
-BibCollection
+bibcollection
 -------------
 
 .. automodule:: bibcollection

diff --git a/setup.py b/setup.py
@@ -29,8 +29,8 @@
         "Bug Tracker": "https://github.com/GEUS-Glaciology-and-Climate/Biblyser/issues",
     },
     keywords="publications citations academia science bibliometrics",
-#    package_dir={"": "Biblyser"},
-#    packages=setuptools.find_packages(where="Biblyser"),
+#    package_dir={"": "biblyser"},
+    #packages=setuptools.find_packages(where="biblyser"),
     packages=setuptools.find_packages(),
     classifiers=[
         "Programming Language :: Python :: 3",