documentation update
PennyHow committed Dec 23, 2021
1 parent 9f878a0 commit 2c73bc1
Showing 7 changed files with 64 additions and 28 deletions.
35 changes: 27 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ There are four key objects within Biblyser:


## Quick start
Biblyser can either be installed with pip or clones from the Github repository.
Biblyser can either be installed with pip or cloned from the Github repository.

```python
pip install biblyser
Expand All @@ -27,7 +27,7 @@ git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
When cloning from the GitHub repository, you will need to create a conda environment and install Biblyser's dependencies into it using pip.

```python
pip install pybliometrics habanero scholarly gender_guesser pandas beautifulsoup4
pip install pybliometrics habanero scholarly gender_guesser pandas numpy
```

Try running one of the example scripts from the repository to see that it works. To access the Scopus API through the pybliometrics package, you will need to configure your API key.
Expand All @@ -43,7 +43,7 @@ An API key or Insttoken is needed to use the Scopus API. An API key can be gener
After this initial set-up, no editing of the example scripts should be needed - the scripts should run as they are. If they don't, there is likely an issue with your python environment.


## Name.py
## name.py
The Name object holds attributes about an individual to aid in searching for associated publications. This can be initialised using an individual's full name, with job title and gender as optional inputs, and additional keyword inputs for Orcid ID, Scopus ID, Google Scholar ID, and h-index.

```python
Expand All @@ -61,7 +61,7 @@ n = Name('Jane Emily Doe',
Various name and initial formats are computed from the Name object, which maximises the chance of finding all associated publications. The gender of each name can either be provided during initialisation, or guessed using `gender_guesser`. The gender definition is used later on to analyse gender distribution in a **BibCollection**.


## Organisation.py
## organisation.py
The Organisation object holds a collection of **Name** objects which represent a group of authors, department, or organisation. The GEUS G&K organisation can be fetched either from the GEUS G&K Pure portal (only retrieves registered authors) or from the staff directory webpage (all G&K members). This information is fed directly into an Organisation object.

```python
Expand Down Expand Up @@ -112,7 +112,7 @@ df = org.asDataFrame()
```


## Bib.py
## bib.py
A Bib object holds the relevant information associated with a single publication, namely:

+ DOI
Expand Down Expand Up @@ -142,12 +142,12 @@ Bib attributes are populated using the Scopus API provided by [pybliometrics](ht
Authorship of a publication can be queried within the Bib object, including queries by organisation and (guessed) gender.


## BibCollection.py
## bibcollection.py
A BibCollection object holds a collection of **Bib** objects, i.e. a database of all associated or selected publications. A BibCollection can be initialised from an **Organisation** (for which the BibCollection will search for all publications linked to each name in the organisation), a list of **Bib** objects, or a list of doi strings.

```python
from biblyser.organisation import Organisation
from biblyser.bibCollection import BibCollection
from biblyser.bibcollection import BibCollection


#BibCollection from an Organisation
Expand Down Expand Up @@ -199,7 +199,7 @@ df = bibs.asDataFrame()
```

## Computing gender metrics
The gender of each author within the Bib object is first guessed, and if the guessed gender is not certain then a gender database is checked for the author and an associated gender. This database is an Organisation object, retaining all information about each author's name and gender. If a name is not found in the database then the user is prompted to define the gender manually, and the database retains this new addition.
The gender of each author within the Bib object is first guessed, and if the guessed gender is not certain then a gender database is checked for the author and an associated gender. This database is an **Organisation** object, retaining all information about each author's name and gender. If a name is not found in the database then the user is prompted to define the gender manually, and the database retains this new addition.

```python
import copy
Expand All @@ -211,6 +211,25 @@ gdb = copy.copy(org)
bibs.getAllGenders(gdb)
```
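
The lookup order described in this section (guess first, fall back to the gender database, then prompt the user and store the answer) can be sketched as follows. This is a minimal illustration rather than Biblyser's implementation: `resolve_gender`, the plain-dict database, and the injected `guess` and `ask` callables are hypothetical stand-ins for `gender_guesser`, the Organisation-based database, and the interactive prompt.

```python
from typing import Callable, Dict

# Labels treated as a confident guess (an assumption made for this sketch)
CERTAIN = {"male", "female"}


def resolve_gender(name: str,
                   database: Dict[str, str],
                   guess: Callable[[str], str],
                   ask: Callable[[str], str]) -> str:
    """Return a gender label for `name`, adding to `database` if the user is asked."""
    # Step 1: accept the automatic guess when it is unambiguous
    guessed = guess(name)
    if guessed in CERTAIN:
        return guessed
    # Step 2: otherwise fall back to the stored name/gender database
    if name in database:
        return database[name]
    # Step 3: as a last resort ask the user, and retain the new entry
    answer = ask(name)
    database[name] = answer
    return answer
```

Passing `guess` and `ask` in as callables keeps the sketch free of external dependencies and of blocking prompts; in practice `ask` could wrap `input()`.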

The computed gender metrics can be used to determine a diversity index for an individual or organisation. This diversity index is based on the gender and affiliation/country composition in all publication authorships. Generally, this is determined from publications in the last five years, but can be changed as an optional parameter.

```python
from biblyser.bibcollection import calcDivIdx

calcDivIdx('Penelope How', #Name
5, #Years to calculate
scopus=True, #Bibs from scopus
scholar=False, #from scholar
crossref=False, #from crossref
check=True) #User check bibs?
```
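
The formula behind the diversity index is not given here. As a rough illustration of its stated inputs only, the sketch below filters a list of publications to a recent window of years and tallies the gender and country composition across all authorships. The record layout, the `composition` function, and the window convention are assumptions made for this example; they are not part of Biblyser.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: {"year": int, "authors": [(gender, country), ...]}
Publication = Dict[str, object]


def composition(pubs: Iterable[Publication],
                current_year: int,
                years: int = 5) -> Tuple[Counter, Counter]:
    """Tally gender and country over authorships in the last `years` years."""
    genders: Counter = Counter()
    countries: Counter = Counter()
    for pub in pubs:
        # Keep publications with current_year - years < year <= current_year
        if not (current_year - years < pub["year"] <= current_year):
            continue
        # Count every authorship, so a frequent co-author is counted repeatedly
        for gender, country in pub["authors"]:
            genders[gender] += 1
            countries[country] += 1
    return genders, countries
```

Any index built on these tallies would depend on how the two compositions are weighted and combined, which this sketch deliberately leaves open.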

An example script for calculating the diversity index is available in the GitHub repository [here](https://github.com/GEUS-Glaciology-and-Climate/Biblyser/blob/main/biblyser/examples/getDiv.py), which can be run from the command line.

```python
python getDiv calcDivIdx --name "Penelope How"
```

## Further development we are working on
+ Incorporation of other search APIs for publications, such as [Web Of Science](https://pypi.org/project/wos/)
+ Fetch journal impact factor
Expand Down
20 changes: 19 additions & 1 deletion docs/source/diversityindex.rst
@@ -1,4 +1,22 @@
Diversity Index
===============

The diversity index is a metric for evaluating diversity in an individual's co-authorship.
The computed bibcollection metrics can be used to determine a diversity index for an individual or organisation. This diversity index is based on the gender and affiliation/country composition in all publication authorships. Generally, this is determined from publications in the last five years, but can be changed as an optional parameter.

.. code-block:: python

   from biblyser.bibcollection import calcDivIdx

   calcDivIdx('Penelope How',   # Name
              5,                # Years to calculate
              scopus=True,      # Bibs from scopus
              scholar=False,    # from scholar
              crossref=False,   # from crossref
              check=True)       # User check bibs?

An example script for calculating the diversity index is available in the GitHub repository `here <https://github.com/GEUS-Glaciology-and-Climate/Biblyser/blob/main/biblyser/examples/getDiv.py>`_, which can be run from the command line.

.. code-block:: python

   python getDiv calcDivIdx --name "Penelope How"
10 changes: 5 additions & 5 deletions docs/source/guide.rst
Expand Up @@ -8,7 +8,7 @@ The Name object holds attributes about an individual to aid in searching for ass

.. code-block:: python
from Name import Name
from biblyser.name import Name
# With fullname string
n = Name('Jane Emily Doe')
Expand All @@ -29,7 +29,7 @@ The Organisation object holds a collection of **Name** objects which represent a

.. code-block:: python
from Organisation import Organisation, fetchWebInfo
from biblyser.organisation import Organisation, fetchWebInfo
def fetchWebInfo(url, parser, fid, classtype, classid):
'''Get all up-to-date information (e.g. names, titles) from a
Expand Down Expand Up @@ -92,7 +92,7 @@ A Bib object can either be initiated from a doi string, a title string, or from

.. code-block:: python
from Bib import Bib
from biblyser.bib import Bib
# Bib object from doi string
pub = Bib(doi='10.5194/tc-11-2691-2017')
Expand All @@ -114,8 +114,8 @@ A BibCollection object holds a collection of **Bib** objects, i.e. a database of

.. code-block:: python
from Organisation import Organisation
from BibCollection import BibCollection
from biblyser.organisation import Organisation
from biblyser.bibcollection import BibCollection
# BibCollection from an Organisation
names = ['Penelope How', 'Nanna B. Karlsson', 'Kenneth D. Mankoff']
Expand Down
2 changes: 2 additions & 0 deletions docs/source/index.rst
Expand Up @@ -6,6 +6,8 @@
Biblyser
==========

**Biblyser** is an object-oriented Python workflow for computing and analysing bibliometrics for an individual or organisation.

.. toctree::
:maxdepth: 2
:caption: Contents:
Expand Down
13 changes: 5 additions & 8 deletions docs/source/installation.rst
Expand Up @@ -4,24 +4,21 @@ Installation
Quickstart
----------

Clone `this repository <https://github.com/GEUS-Glaciology-and-Climate/Biblyser>`_ into your local directory
Biblyser can either be installed with pip or cloned from `this repository <https://github.com/GEUS-Glaciology-and-Climate/Biblyser>`_ into your local directory.

.. code-block:: python
git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
Create a conda environment with the required package dependencies, either using the environment file provided in the repository.
pip install biblyser
.. code-block:: python
conda env create --file environment.yml
git clone https://github.com/GEUS-Glaciology-and-Climate/Biblyser
Or by installing the packages into your conda environment with pip
When cloning the repository, you will need to create a Python environment with the required package dependencies, which can be installed with pip.

.. code-block:: python
pip install pybliometrics habanero scholarly gender_guesser pandas
pip install pybliometrics habanero scholarly gender_guesser pandas numpy
Scopus API configuration
Expand Down
8 changes: 4 additions & 4 deletions docs/source/modules.rst
@@ -1,7 +1,7 @@
Modules
=======

Name
name
----

.. automodule:: name
Expand All @@ -10,7 +10,7 @@ Name
:show-inheritance:


Organisation
organisation
------------

.. automodule:: organisation
Expand All @@ -19,7 +19,7 @@ Organisation
:show-inheritance:


Bib
bib
---

.. automodule:: bib
Expand All @@ -28,7 +28,7 @@ Bib
:show-inheritance:


BibCollection
bibcollection
-------------

.. automodule:: bibcollection
Expand Down
4 changes: 2 additions & 2 deletions setup.py
Expand Up @@ -29,8 +29,8 @@
"Bug Tracker": "https://github.com/GEUS-Glaciology-and-Climate/Biblyser/issues",
},
keywords="publications citations academia science bibliometrics",
# package_dir={"": "Biblyser"},
# packages=setuptools.find_packages(where="Biblyser"),
# package_dir={"": "biblyser"},
#packages=setuptools.find_packages(where="biblyser"),
packages=setuptools.find_packages(),
classifiers=[
"Programming Language :: Python :: 3",
Expand Down
