Merge pull request #315 from sbillinge/sidocs
doc: simon tweaks on the docs before release
sbillinge authored Jan 1, 2025
2 parents 5359054 + 0977b7f commit 913d824
Showing 6 changed files with 130 additions and 55 deletions.
6 changes: 3 additions & 3 deletions doc/source/examples/examples.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,8 @@ Examples
Landing page for diffpy.utils examples.

.. toctree::
parsers_example
diffraction_objects_example
transforms_example
resample_example
parsers_example
tools_example
transforms_example
diffraction_objects_example
82 changes: 56 additions & 26 deletions doc/source/examples/parsers_example.rst
Expand Up @@ -9,49 +9,65 @@ This example will demonstrate how diffpy.utils lets us easily process and serial
Using the parsers module, we can load file data into simple and easy-to-work-with Python objects.

1) To begin, unzip :download:`parser_data<./example_data/parser_data.zip>` and take a look at ``data.txt``.
Our goal will be to extract and serialize the data table as well as the parameters listed in the header of this file.
This is a fairly standard format for 1D powder diffraction data.
Our goal will be to extract the data, and the parameters listed in the header, from this file and
load it into our program.

2) To get the data table, we will use the ``loadData`` function. The default behavior of this
function is to find and extract a data table from a file.::
function is to find and extract a data table from a file.

.. code-block:: python
from diffpy.utils.parsers.loaddata import loadData
data_table = loadData('<PATH to data.txt>')
While this will work with most datasets, on our ``data.txt`` file, we got a ``ValueError``. The reason for this is
due to the comments ``$ Phase Transition Near This Temperature Range`` and ``--> Note Significant Jump in Rw <--``
embedded within the dataset. To fix this, try using the ``comments`` parameter. ::
While this will work with most datasets, on our ``data.txt`` file, we got a ``ValueError``. The reason for this is
due to the comments ``$ Phase Transition Near This Temperature Range`` and ``--> Note Significant Jump in Rw <--``
embedded within the dataset. To fix this, try using the ``comments`` parameter.

.. code-block:: python
data_table = loadData('<PATH to data.txt>', comments=['$', '-->'])
This parameter tells ``loadData`` that any lines beginning with ``$`` and ``-->`` are just comments and
more entries in our data table may follow.
This parameter tells ``loadData`` that any lines beginning with ``$`` and ``-->`` are just comments and
more entries in our data table may follow.

Here are a few other parameters to test out:
Here are a few other parameters to test out:

* ``delimiter=','``: Look for a comma-separated data table. Useful for csv file types.
However, since ``data.txt`` is whitespace separated, running ::
However, since ``data.txt`` is whitespace separated, running

.. code-block:: python
loadData('<PATH to data.txt>', comments=['$', '-->'], delimiter=',')
returns an empty list.
returns an empty list.
* ``minrows=50``: Only look for data tables with at least 50 rows. Since our data table has far fewer than that many
rows, running ::
rows, running

.. code-block:: python
loadData('<PATH to data.txt>', comments=['$', '-->'], minrows=50)
returns an empty list.
returns an empty list.
* ``usecols=[0, 3]``: Only return the 0th and 3rd columns (zero-indexed) of the data table. For ``data.txt``, this
corresponds to the temperature and rw columns. ::
corresponds to the temperature and rw columns.

.. code-block:: python
loadData('<PATH to data.txt>', comments=['$', '-->'], usecols=[0, 3])
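For readers without ``data.txt`` at hand, the effect of ``comments`` and ``usecols`` can be sketched with plain NumPy. This is only an analogy under stated assumptions: ``numpy.loadtxt`` is a different function from ``loadData``, and the sample rows below are made up for illustration rather than taken from ``data.txt``.

```python
import io

import numpy as np

# Made-up sample in the spirit of data.txt: four whitespace-separated columns
# (temperature, scale, stretch, rw) with two comment lines embedded in the table.
sample = io.StringIO(
    "300 1.00 0.010 0.20\n"
    "$ Phase Transition Near This Temperature Range\n"
    "310 1.10 0.020 0.30\n"
    "--> Note Significant Jump in Rw <--\n"
    "320 1.20 0.030 0.40\n"
)

# numpy.loadtxt also accepts a list of comment markers and a column selection.
table = np.loadtxt(sample, comments=["$", "-->"], usecols=[0, 3])
print(table.shape)  # three data rows, two selected columns
```

Lines starting with either marker are skipped, so the rows before and after each comment end up in one table.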
3) Next, to get the header information, we can again use ``loadData``,
but this time with the ``headers`` parameter enabled. ::
but this time with the ``headers`` parameter enabled.

.. code-block:: python
hdata = loadData('<PATH to data.txt>', comments=['$', '-->'], headers=True)
4) Rather than working with separate ``data_table`` and ``hdata`` objects, it may be easier to combine them into a single
dictionary. We can do so using the ``serialize_data`` function. ::
dictionary. We can do so using the ``serialize_data`` function.

.. code-block:: python
from diffpy.utils.parsers.serialization import serialize_data
file_data = serialize_data('<PATH to data.txt>', hdata, data_table)
Expand All @@ -60,45 +76,59 @@ Using the parsers module, we can load file data into simple and easy-to-work-wit
# The entry is a dictionary containing data from hdata and data_table
data_dict = file_data['data.txt']
This dictionary ``data_dict`` contains all entries in ``hdata`` and an additional entry named
``data table`` containing ``data_table``. ::
This dictionary ``data_dict`` contains all entries in ``hdata`` and an additional entry named
``data table`` containing ``data_table``.

.. code-block:: python
here_is_the_data_table = data_dict['data table']
There is also an option to name columns in the data table and save those columns as entries instead. ::
There is also an option to name columns in the data table and save those columns as entries instead.

.. code-block:: python
data_table_column_names = ['temperature', 'scale', 'stretch', 'rw'] # names of the columns in data.txt
file_data = serialize_data('<PATH to data.txt>', hdata, data_table, dt_colnames=data_table_column_names)
data_dict = file_data['data.txt']
Now we can extract specific data table columns from the dictionary. ::
Now we can extract specific data table columns from the dictionary.

.. code-block:: python
data_table_temperature_column = data_dict['temperature']
data_table_rw_column = data_dict['rw']
5) When we are done working with the data, we can store it on disc for later use. This can also be done using the
``serialize_data`` function with an additional ``serial_file`` parameter.::
5) When we are done working with the data, we can store it on disk for later use. This can also be done using the
``serialize_data`` function with an additional ``serial_file`` parameter.

.. code-block:: python
parsed_file_data = serialize_data('<PATH to data.txt>', hdata, data_table, serial_file='<PATH to serialdata.json>')
The returned value, ``parsed_file_data``, is the dictionary we just added to ``serialdata.json``.
To extract the data from the serial file, we use ``deserialize_data``. ::
To extract the data from the serial file, we use ``deserialize_data``.
.. code-block:: python
from diffpy.utils.parsers.serialization import deserialize_data
parsed_file_data = deserialize_data('<PATH to serialdata.json>')
6) Finally, ``serialize_data`` allows us to store data from multiple text file in a single serial file. For one last bit
of practice, we will extract and add the data from ``moredata.txt`` into the same ``serialdata.json`` file.::
6) Finally, ``serialize_data`` allows us to store data from multiple text files in a single serial file. For one last bit
of practice, we will extract and add the data from ``moredata.txt`` into the same ``serialdata.json`` file.

.. code-block:: python
data_table = loadData('<PATH to moredata.txt>')
hdata = loadData('<PATH to moredata.txt>', headers=True)
serialize_data('<PATH to moredata.txt>', hdata, data_table, serial_file='<PATH to serialdata.json>')
The serial file ``serialdata.json`` should now contain two entries: ``data.txt`` and ``moredata.txt``.
The data from each file can be accessed using ::
The serial file ``serialdata.json`` should now contain two entries: ``data.txt`` and ``moredata.txt``.
The data from each file can be accessed using

.. code-block:: python
serial_data = deserialize_data('<PATH to serialdata.json>')
data_txt_data = serial_data['data.txt'] # Access data.txt data
moredata_txt_data = serial_data['moredata.txt'] # Access moredata.txt data
For more information, check out the :ref:`documentation<Parsers Documentation>` of the ``parsers`` module.
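As a rough picture of what ends up in the serial file, the layout described above (one entry per file name, holding the header entries plus a ``data table`` entry) can be mimicked with the standard ``json`` module. This is a sketch of the data layout only, not the implementation of ``serialize_data``; the helper ``add_entry``, the ``temperature_unit`` header key, and all values are hypothetical.

```python
import json
import os
import tempfile


def add_entry(serial_path, filename, hdata, data_table):
    """Hypothetical helper: merge one file's entry into a JSON serial file."""
    entries = {}
    if os.path.exists(serial_path):
        with open(serial_path) as f:
            entries = json.load(f)
    # Header entries plus the table under the 'data table' key, as in the text.
    entries[filename] = {**hdata, "data table": data_table}
    with open(serial_path, "w") as f:
        json.dump(entries, f)
    return entries[filename]


with tempfile.TemporaryDirectory() as tmp:
    serial_path = os.path.join(tmp, "serialdata.json")
    add_entry(serial_path, "data.txt", {"temperature_unit": "K"}, [[300, 0.2], [310, 0.3]])
    add_entry(serial_path, "moredata.txt", {"temperature_unit": "K"}, [[320, 0.4]])
    # Read everything back, analogous to the deserialization step.
    with open(serial_path) as f:
        serial_data = json.load(f)

print(sorted(serial_data))  # both file names are present as top-level keys
```

Keying the top level by file name is what lets a second call add ``moredata.txt`` without disturbing the ``data.txt`` entry.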
b
63 changes: 45 additions & 18 deletions doc/source/examples/resample_example.rst
Expand Up @@ -13,14 +13,29 @@ given enough datapoints.
1) To start, unzip :download:`parser_data<./example_data/parser_data.zip>`. Then, load the data table from ``Nickel.gr``
and ``NiTarget.gr``. These datasets are based on data from `Atomic Pair Distribution Function Analysis: A Primer
<https://global.oup.com/academic/product/atomic-pair-distribution-function-analysis-9780198885801?cc=us&lang=en&>`_.
::

from diffpy.utils.parsers.loaddata import loadData
nickel_datatable = loadData('<PATH to Nickel.gr>')
nitarget_datatable = loadData('<PATH to NiTarget.gr>')
.. code-block:: python
Each data table has two columns: first is the grid and second is the function value.
To extract the columns, we can utilize the serialize function ... ::
from diffpy.utils.parsers.loaddata import loadData
nickel_datatable = loadData('<PATH to Nickel.gr>')
nitarget_datatable = loadData('<PATH to NiTarget.gr>')
Each data table has two columns: first is the grid and second is the function value.
To extract the columns, we can utilize the serialize function ...

.. code-block:: python
from diffpy.utils.parsers.serialization import serialize_data
nickel_data = serialize_data('Nickel.gr', {}, nickel_datatable, dt_colnames=['grid', 'func'])
nickel_grid = nickel_data['Nickel.gr']['grid']
nickel_func = nickel_data['Nickel.gr']['func']
target_data = serialize_data('NiTarget.gr', {}, nitarget_datatable, dt_colnames=['grid', 'function'])
target_grid = target_data['NiTarget.gr']['grid']
target_func = target_data['NiTarget.gr']['function']
To extract the columns, we can utilize the serialize function ...

.. code-block:: python
from diffpy.utils.parsers.serialization import serialize_data
nickel_data = serialize_data('Nickel.gr', {}, nickel_datatable, dt_colnames=['grid', 'func'])
Expand All @@ -30,42 +45,52 @@ given enough datapoints.
target_grid = target_data['NiTarget.gr']['grid']
target_func = target_data['NiTarget.gr']['function']
... or you can use any other column extracting method you prefer.
... or you can use any other column extracting method you prefer.

2) If we plot the two on top of each other ::
2) If we plot the two on top of each other

.. code-block:: python
import matplotlib.pyplot as plt
plt.plot(target_grid, target_func, linewidth=3)
plt.plot(nickel_grid, nickel_func, linewidth=1)
they look pretty similar, but to truly see the difference, we should plot the difference between the two.
We may want to run something like ... ::
they look pretty similar, but to truly see the difference, we should plot the difference between the two.
We may want to run something like ...

.. code-block:: python
import numpy as np
difference = np.subtract(target_func, nickel_func)
... but this will only produce the right result if the ``target_func`` and ``nickel_func`` are on the same grid.
Checking the lengths of ``target_grid`` and ``nickel_grid`` shows that these grids are clearly distinct.
... but this will only produce the right result if the ``target_func`` and ``nickel_func`` are on the same grid.
Checking the lengths of ``target_grid`` and ``nickel_grid`` shows that these grids are clearly distinct.

3) However, we can resample the two functions to be on the same grid. Since both functions have grids spanning
``[0, 60]``, let us define a new grid ... ::
``[0, 60]``, let us define a new grid ...

.. code-block:: python
grid = np.linspace(0, 60, 6001)
... and use the diffpy.utils ``wsinterp`` function to resample on this grid.::
... and use the diffpy.utils ``wsinterp`` function to resample on this grid.

.. code-block:: python
from diffpy.utils.resampler import wsinterp
nickel_resample = wsinterp(grid, nickel_grid, nickel_func)
target_resample = wsinterp(grid, target_grid, target_func)
We can now plot the difference to see that these two functions are quite similar.::
We can now plot the difference to see that these two functions are quite similar.

.. code-block:: python
plt.plot(grid, target_resample)
plt.plot(grid, nickel_resample)
plt.plot(grid, target_resample - nickel_resample)
This is the desired result as the data in ``Nickel.gr`` is every tenth data point in ``NiTarget.gr``.
This also shows us that ``wsinterp`` can help us reconstruct a function from incomplete data.
This is the desired result as the data in ``Nickel.gr`` is every tenth data point in ``NiTarget.gr``.
This also shows us that ``wsinterp`` can help us reconstruct a function from incomplete data.
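The idea behind this kind of resampling can be sketched directly from the Whittaker-Shannon formula, in which each sample contributes a sinc function centered on its grid point. The function ``ws_interpolate`` below is a hypothetical, minimal NumPy version written for illustration, not the code of ``wsinterp``; it assumes a uniformly spaced input grid, and the Gaussian test signal is a stand-in that is only approximately band-limited.

```python
import numpy as np


def ws_interpolate(x, xp, fp):
    """Evaluate the Whittaker-Shannon series at x for samples fp on the uniform grid xp."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    xp = np.asarray(xp, dtype=float)
    fp = np.asarray(fp, dtype=float)
    spacing = xp[1] - xp[0]  # assumes xp is equally spaced
    # np.sinc(t) is sin(pi * t) / (pi * t), the kernel of the formula.
    weights = np.sinc((x[:, None] - xp[None, :]) / spacing)
    return weights @ fp


def signal(r):
    """Smooth stand-in signal centered in the [0, 60] range."""
    return np.exp(-0.5 * (r - 30.0) ** 2)


coarse_grid = np.linspace(0, 60, 601)   # plays the role of the sparse data
fine_grid = np.linspace(0, 60, 6001)    # ten times denser target grid
reconstructed = ws_interpolate(fine_grid, coarse_grid, signal(coarse_grid))
max_error = np.max(np.abs(reconstructed - signal(fine_grid)))
```

For this smooth signal ``max_error`` should be very small, which mirrors the observation that the sparse data carries enough information to recover the dense one.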

4) In order for our function reconstruction to be perfect up to a truncation error, we require that (a) the function is
a Fourier transform of a band-limited dataset and (b) the original grid has enough equally-spaced datapoints based on
Expand All @@ -79,7 +104,9 @@ given enough datapoints.
Thus, our original grid requires more than :math:`25.0 * 60.0 / \pi \approx 477.5` datapoints, that is, at least :math:`478`. Since our grid has :math:`601` datapoints, our
reconstruction was perfect as shown from the comparison between ``Nickel.gr`` and ``NiTarget.gr``.
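The arithmetic in this step can be checked in a few lines. The sketch below assumes the criterion has the general form :math:`(q_{max} - q_{min})(r_{max} - r_{min}) / \pi`, which reduces to the expression in the text when both lower limits are zero; the variable names are chosen here for illustration.

```python
import math

qmin, qmax = 0.0, 25.0   # qmax taken from the inequality in the text
rmin, rmax = 0.0, 60.0   # grid span [0, 60] used in step 3

# Minimum number of equally spaced points suggested by the criterion.
n_required = math.ceil((qmax - qmin) * (rmax - rmin) / math.pi)
n_available = 601        # number of datapoints quoted in the text

print(n_required, n_available >= n_required)
```

With these numbers the sparse grid has more points than the criterion asks for, consistent with the successful reconstruction.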

This computation is implemented in the function ``nsinterp``.::
This computation is implemented in the function ``nsinterp``.

.. code-block:: python
from diffpy.utils.resampler import nsinterp
qmin = 0
4 changes: 4 additions & 0 deletions doc/source/examples/transforms_example.rst
Expand Up @@ -16,6 +16,7 @@ This example will demonstrate how to use the functions in the
# Example: convert q to 2theta
from diffpy.utils.transforms import q_to_tth
wavelength = 0.71
q = np.array([0, 0.2, 0.4, 0.6, 0.8, 1])
tth = q_to_tth(q, wavelength)
Expand All @@ -32,6 +33,7 @@ This example will demonstrate how to use the functions in the
# Example: convert 2theta to q
from diffpy.utils.transforms import tth_to_q
wavelength = 0.71
tth = np.array([0, 30, 60, 90, 120, 180])
q = tth_to_q(tth, wavelength)
Expand All @@ -49,11 +51,13 @@ This example will demonstrate how to use the functions in the
# Example: convert d to q
from diffpy.utils.transforms import d_to_q
d = np.array([1.0, 0.8, 0.6, 0.4, 0.2])
q = d_to_q(d)
# Example: convert d to 2theta
from diffpy.utils.transforms import d_to_tth
wavelength = 0.71
d = np.array([1.0, 0.8, 0.6, 0.4, 0.2])
tth = d_to_tth(d, wavelength)
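For orientation, the textbook relations behind these conversions are :math:`q = 4\pi\sin(\theta)/\lambda` and :math:`q = 2\pi/d`. The NumPy functions below are hypothetical re-derivations for illustration (note the ``_sketch`` suffix), not the ``diffpy.utils.transforms`` implementations; they assume :math:`2\theta` is given in degrees and that the wavelength and ``d`` share the same length unit.

```python
import numpy as np


def tth_to_q_sketch(tth_deg, wavelength):
    """q = 4 pi sin(theta) / lambda, with 2theta given in degrees."""
    theta = np.radians(np.asarray(tth_deg, dtype=float)) / 2.0
    return 4.0 * np.pi * np.sin(theta) / wavelength


def q_to_tth_sketch(q, wavelength):
    """Invert the relation above; returns 2theta in degrees."""
    sin_theta = np.asarray(q, dtype=float) * wavelength / (4.0 * np.pi)
    return np.degrees(2.0 * np.arcsin(sin_theta))


def d_to_q_sketch(d):
    """q = 2 pi / d."""
    return 2.0 * np.pi / np.asarray(d, dtype=float)


wavelength = 0.71
tth = np.array([0.0, 30.0, 60.0, 90.0, 120.0])
q = tth_to_q_sketch(tth, wavelength)
tth_back = q_to_tth_sketch(q, wavelength)  # round trip back to the input angles
q_from_d = d_to_q_sketch(np.array([1.0, 0.8, 0.6, 0.4, 0.2]))
```

The round trip from :math:`2\theta` to :math:`q` and back is a quick sanity check that the two relations are inverses of each other for a fixed wavelength.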
26 changes: 20 additions & 6 deletions doc/source/index.rst
Expand Up @@ -4,14 +4,18 @@

.. |title| replace:: diffpy.utils documentation

diffpy.utils - Shared utilities for diffpy packages.
diffpy.utils - General utilities for analyzing diffraction data

| Software version |release|.
| Last updated |today|.
The diffpy.utils package provides general functions for extracting data from variously formatted text files as well as
some PDF-specific functionality. These include wx GUI utilities used by the PDFgui program and an interpolation function
based on the Whittaker-Shannon formula for resampling a bandlimited PDF or other profile function.
The diffpy.utils package provides a number of functions and classes designed to help
researchers analyze their diffraction data. It also includes some functionality for
carrying out PDF analysis. Examples are parsers for reading common format diffraction
data files, ``DiffractionObjects`` that allow you to do algebra on diffraction patterns,
tools for better capture and propagation of metadata,
diffraction-friendly interpolation routines, as well as some other tools used across
diffpy libraries.

Click :ref:`here<Utilities>` for a full list of utilities offered by diffpy.utils.

Expand All @@ -20,6 +24,7 @@ Examples
========
Illustrations of when and how one would use various diffpy.utils functions.

* :ref:`Manipulate and do algebra on diffraction data<Diffraction Objects Example>`
* :ref:`File Data Extraction<Parsers Example>`
* :ref:`Resampling and Data Reconstruction<Resample Example>`
* :ref:`Load and Manage User and Package Information<Tools Example>`
Expand All @@ -30,8 +35,9 @@ Authors

diffpy.utils is developed by members of the Billinge Group at
Columbia University and at Brookhaven National Laboratory including
Pavol Juhás, Christopher L. Farrow, the Billinge Group
and its community contributors.
Pavol Juhás, Christopher L. Farrow, Simon J. L. Billinge, Andrew Yang,
with contributions from many Billinge Group members and
members of the diffpy community.

For a detailed list of contributors see
https://github.com/diffpy/diffpy.utils/graphs/contributors.
Expand All @@ -43,6 +49,14 @@ Installation
See the `README <https://github.com/diffpy/diffpy.utils#installation>`_
file included with the distribution.

========
Citation
========

If you use this program for scientific research that leads to publication, we ask that you acknowledge use of the program by citing the following paper in your publication:

P. Juhás, C. L. Farrow, X. Yang, K. R. Knox and S. J. L. Billinge, Complex modeling: a strategy and software program for combining multiple information sources to solve ill posed structure and nanostructure inverse problems, Acta Crystallogr. A 71, 562-568 (2015).

=================
Table of contents
=================
4 changes: 2 additions & 2 deletions doc/source/license.rst
Expand Up @@ -14,10 +14,10 @@ OPEN SOURCE LICENSE AGREEMENT
Lawrence Berkeley National Laboratory
| Copyright (c) 2014, Australian Synchrotron Research Program Inc., ("ASRP")
| Copyright (c) 2006-2007, Board of Trustees of Michigan State University
| Copyright (c) 2008-2012, The Trustees of Columbia University in
the City of New York
| Copyright (c) 2014-2019, Brookhaven Science Associates,
Brookhaven National Laboratory
| Copyright (c) 2008-2025, The Trustees of Columbia University in
the City of New York

The "DiffPy-CMI" is distributed subject to the following license conditions:
