DIY Raman spectroscopy for biological research

Purpose

This repository accompanies the pub, "DIY Raman spectroscopy for biological research." It contains the data acquired and analyzed in this effort, Jupyter notebooks for calibration and figure generation, a Python script for applying calibration to raw data, and a summary of data from the spectral library.

Installation and Setup

This repository uses conda to manage software environments and installations. You can find operating system-specific instructions for installing miniconda here. After installing conda and mamba, run the following command to create the pipeline run environment.

mamba env create -n 2025-diyraman-bio --file envs/dev.yml
conda activate 2025-diyraman-bio

Developer Notes (click to expand/collapse)

Install your pre-commit hooks:
```
pre-commit install
```
This installs the pre-commit hooks defined in your config (./.pre-commit-config.yaml).
Export your conda environment before sharing:

As your project develops, the number of dependencies in your environment may increase. Whenever you install new dependencies (using either pip install or mamba install), you should update the environment file using the following command.
```
conda env export --from-history --no-builds > envs/dev.yml
```
--from-history only exports packages that were explicitly added by you (e.g., the packages you installed with pip or mamba) and --no-builds removes build specification from the exported packages to increase portability between different platforms.

Data

The data shared here is collected on the DIY spontaneous Raman system at Arcadia Science, which is the OpenRaman Starter Edition (532 nm excitation), originally created by Luc B (https://www.open-raman.org/about/). The input files are the raw data in CSV format collected from a range of biological research samples that are in the Arcadia sample library and calibration standards. The outputs include calibration equations for each acquisition day, calibrated data with additional pre-processing, quick look graphs for each sample's data, and performance metrics for the system.

Overview

Description of the folder structure

data/: Data for each sample presented in the pub. It contains folders for raw data (meaning as acquired on the spectrometer) and processed (meaning calibrated or otherwise modified) These are organized into subfolders for each date.
notebooks/: Jupyter notebooks for generating calibration correction equations and plotting processed data to generate the figures shown in the pub.
scripts/: Python script for applying calibration correction to acquired data and generating quick plots.
envs/: This repository uses conda to manage software installations and versions.
figures/: Base images used for the figures in the pub. The final pub figures were edited in Adobe Illustrator.
spectral_library/: The spectral library includes peak positions for all samples collected in this pub and additional notes for some samples.
LICENSE: License specifying the re-use terms for the code in this repository.
README.md: File outlining the contents of this repository and how to use them.
.github/, .vscode/, .gitignore, .pre-commit-config.yaml, Makefile, pyproject.toml: Files that control the development environment of the repository.

Typical workflow

The user typically starts by putting data they've collected with the spectrometer into the data/raw folder, organized by date. Ensure that you follow the file naming convention, which is detailed in step 3 of the generate_calibration notebook, and data are in CSV format. Make sure that you have both an acetonitrile and a neon spectrum in the dataset, ideally acquired with the same system configuration and acquisition parameters as your samples of interest. These are calibration standards.

Run the generate_calibration notebook, which uses acetonitrile and neon data to create calibration equations that can be used to correct all the sample data. In this notebook, you can select the acetonitrile and neon files to use. In general, select acetonitrile and neon files acquired with the same configuration (i.e. liquid or solid), that also match the sample files that you want to calibrate. We also typically use 1,000 or 10,000 ms exposure for acetonitrile, and 1,000 ms for neon. In both cases, 0 dB gain and 5 averaged acquisitions are typical and select files the least amount of processing during acquisition (i.e. "n" for blanking, baselining, and filtering). So the files you may want could be "YYYY-MM-DD_acetonitrile_n_n_n_solid_10000_0_5.csv" and "YYYY-MM-DD_neon_n_n_n_solid_1000_0_5.csv". This step generates several files in different folders:

data/processed/calibration:

coefficients files for linear calibration equations for the calibrants (CSV)

peaks files that have the peak parameters for the calibrants (CSV)

plotted spectra with fitted peaks marked for each calibrant (SVG)

data/processed/performance:

summary files that contain the mean error, stdev error, max difference, and min difference between each peak in the calibrant spectra compared to reference literature (CSV)

plots showing error across the spectrum for both calibrants (SVG)

Run the apply_calibration.py script. This applies the calibration correction equations generated in the previous to your sample data of interest. Select the calibration files for acetonitrile and neon that most closely match the acquisition parameters for your sample data. This generates calibrated data files (CSV) and plots of the calibrated data (PNG) in the data/processed/processed_data folder. These data can be compared to published references or other instrument data.

Compute Specifications

We executed this project on an Apple MacBook Pro machine running macOS Sonoma version 14.5. The machine has 36 GB memory and an Apple M3 Max chip, though this specific configuration is not required for running the calibration and analysis.

Contributing

See how we recognize feedback and contributions to our code.

For Developers

This section contains information for developers who are working off of this template. Please adjust or edit this section as appropriate when you're ready to share your repo.

GitHub templates

This template uses GitHub templates to provide checklists when making new pull requests. These templates are stored in the .github/ directory.

VSCode

This template includes recommendations to VSCode users for extensions, particularly the ruff linter. These recommendations are stored in .vscode/extensions.json. When you open the repository in VSCode, you should see a prompt to install the recommended extensions.

`.gitignore`

This template uses a .gitignore file to prevent certain files from being committed to the repository.

`pyproject.toml`

pyproject.toml is a configuration file to specify your project's metadata and to set the behavior of other tools such as linters, type checkers etc. You can learn more here

Linting

This template automates linting and formatting using GitHub Actions and the ruff linter. When you push changes to your repository, GitHub will automatically run the linter and report any errors, blocking merges until they are resolved.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DIY Raman spectroscopy for biological research

Purpose

Installation and Setup

Data

Overview

Description of the folder structure

Typical workflow

Compute Specifications

Contributing

For Developers

GitHub templates

VSCode

`.gitignore`

`pyproject.toml`

Linting

About

Releases 1

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.github		.github
.vscode		.vscode
data		data
envs		envs
figures		figures
notebooks		notebooks
scripts		scripts
spectral_library		spectral_library
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CITATION.cff		CITATION.cff
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml

License

Arcadia-Science/2025-diyraman-bio

Folders and files

Latest commit

History

Repository files navigation

DIY Raman spectroscopy for biological research

Purpose

Installation and Setup

Data

Overview

Description of the folder structure

Typical workflow

Compute Specifications

Contributing

For Developers

GitHub templates

VSCode

.gitignore

pyproject.toml

Linting

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

`.gitignore`

`pyproject.toml`

Packages