python beta
SpectraKit
A lightweight, pip-installable Python library for preprocessing and analyzing spectral data from IR, Raman, and NIR spectroscopy. Functional API, NumPy-native, two core dependencies.
SpectraKit processes vibrational spectroscopy data with a functional design: every function accepts and returns NumPy arrays. The only core dependencies are NumPy and SciPy; everything else is optional.
Install
pip install pyspectrakit          # core (numpy + scipy only)
pip install "pyspectrakit[all]"   # everything: IO, CLI, plotting, sklearn, baselines (quoted so shells such as zsh do not expand the brackets)
Core Modules
- Baseline correction: ALS, ArPLS, SNIP, polynomial, rubberband, all with an optional ConvergenceInfo return
- Smoothing: Savitzky-Golay, Whittaker (with wavenumber-aware penalty for non-uniform grids)
- Normalization: SNV, min-max, area, L2 vector norm
- Derivatives: Savitzky-Golay polynomial, Norris-Williams gap-segment
- Scatter correction: MSC, Extended MSC with Legendre polynomial basis
- Transforms: Kubelka-Munk, ATR correction, absorbance ↔ transmittance
- Peak analysis: Detection and integration with PeakResult containers
- Similarity: Cosine, Pearson, spectral angle, Euclidean, with batch 2D × 2D support
- I/O: JCAMP-DX, SPC, CSV, HDF5, Bruker OPUS (native parser, no external deps)
- Pipeline: Composable transform chains, scikit-learn compatible via SpectralTransformer
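As a rough illustration of two of the operations listed above, the following NumPy-only sketch applies SNV normalization and computes a batch cosine similarity between two 2D arrays. The function names `snv_sketch` and `cosine_similarity_sketch` are hypothetical and are not SpectraKit's API; the one-spectrum-per-row layout and the use of the population standard deviation are also assumptions.

```python
import numpy as np


def snv_sketch(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    # Population standard deviation (ddof=0); some references use ddof=1 instead.
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def cosine_similarity_sketch(a, b):
    """Cosine similarity between every row of `a` and every row of `b`."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    b = np.atleast_2d(np.asarray(b, dtype=float))
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Synthetic data for illustration only: 3 query spectra, 5 library spectra.
rng = np.random.default_rng(0)
queries = rng.random((3, 200))
library = rng.random((5, 200))

normalized = snv_sketch(queries)                      # shape (3, 200)
scores = cosine_similarity_sketch(queries, library)   # shape (3, 5)
```

Each row of `scores` holds one query's similarity to every library spectrum, which is one plausible reading of "batch 2D × 2D support".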
Quick Start
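import numpy as np

from spectrakit.smooth import smooth_savgol
from spectrakit.baseline import baseline_als
from spectrakit.normalize import normalize_snv
from spectrakit.pipeline import Pipeline

# Placeholder input, assumed for illustration: 10 random spectra of 500 points each, one per row.
spectra = np.random.default_rng(0).random((10, 500))

# Chain three steps: smoothing, ALS baseline correction, SNV normalization.
pipe = Pipeline()
pipe.add("smooth", smooth_savgol, window_length=11, polyorder=2)
pipe.add("baseline", baseline_als, lam=1e6, p=0.01)
pipe.add("normalize", normalize_snv)
processed = pipe.transform(spectra)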
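The `lam` and `p` arguments in the Quick Start match the usual parameter names in the asymmetric least squares (ALS) method of Eilers and Boelens, where `lam` sets the smoothness penalty and `p` the asymmetry weight. The sketch below implements that textbook formulation with SciPy sparse matrices. It is an assumption that SpectraKit's `baseline_als` follows the same scheme; `als_baseline_sketch`, `n_iter`, and the synthetic spectrum are hypothetical.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def als_baseline_sketch(y, lam=1e6, p=0.01, n_iter=10):
    """Estimate the baseline of a 1D spectrum with asymmetric least squares."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    d = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    penalty = lam * (d.T @ d)
    w = np.ones(n)
    for _ in range(n_iter):
        weights = sparse.diags(w, 0, format="csc")
        # Solve (W + lam * D^T D) z = W y for the current weights.
        z = spsolve((weights + penalty).tocsc(), w * y)
        # Points above the baseline get weight p, points below get 1 - p.
        w = np.where(y > z, p, 1.0 - p)
    return z


# Synthetic spectrum for illustration: linear baseline plus one Gaussian peak.
x = np.arange(500.0)
true_baseline = 0.5 + 0.002 * x
peak = np.exp(-0.5 * ((x - 250.0) / 10.0) ** 2)
y = true_baseline + peak

baseline = als_baseline_sketch(y, lam=1e6, p=0.01)
corrected = y - baseline
```

In this formulation a larger `lam` gives a stiffer baseline, while a smaller `p` keeps the fit closer to the lower envelope of the signal.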
Stats
- v1.7.1 — latest release
- 619 tests — including hypothesis property-based, golden reference regression, adversarial I/O
- 0 mypy strict errors, 0 ruff errors
- Python 3.10–3.13 supported
- MIT licensed