
SpectraKit

SpectraKit is a lightweight, pip-installable Python library for preprocessing and analyzing vibrational spectroscopy data (IR, Raman, and NIR). The API is functional: all functions accept and return NumPy arrays. The only core dependencies are NumPy and SciPy; everything else is optional.

Install

pip install pyspectrakit          # core (numpy + scipy only)
pip install "pyspectrakit[all]"   # everything: IO, CLI, plotting, sklearn, baselines (quotes keep zsh from globbing the brackets)

Core Modules

  • Baseline correction: ALS, ArPLS, SNIP, polynomial, rubberband — all with optional ConvergenceInfo return
  • Smoothing: Savitzky-Golay, Whittaker (with wavenumber-aware penalty for non-uniform grids)
  • Normalization: SNV, min-max, area, L2 vector norm
  • Derivatives: Savitzky-Golay polynomial, Norris-Williams gap-segment
  • Scatter correction: MSC, Extended MSC with Legendre polynomial basis
  • Transforms: Kubelka-Munk, ATR correction, absorbance ↔ transmittance
  • Peak analysis: Detection and integration with PeakResult containers
  • Similarity: Cosine, Pearson, spectral angle, Euclidean — batch 2D × 2D support
  • I/O: JCAMP-DX, SPC, CSV, HDF5, Bruker OPUS (native parser, no external deps)
  • Pipeline: Composable transform chains, scikit-learn compatible via SpectralTransformer
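
For readers new to these operations, a few of the listed methods reduce to short textbook formulas. The sketch below is plain NumPy and is not SpectraKit's code: the function names, the row-per-spectrum layout, and the choice of population standard deviation in SNV are all assumptions made for illustration.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own.

    Uses the population standard deviation (ddof=0); some implementations
    use ddof=1 instead, so results can differ by a constant factor.
    """
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def transmittance_to_absorbance(t):
    """A = -log10(T), with T given as a fraction in (0, 1]."""
    return -np.log10(np.asarray(t, dtype=float))


def absorbance_to_transmittance(a):
    """T = 10 ** (-A), the inverse of the conversion above."""
    return 10.0 ** (-np.asarray(a, dtype=float))


def kubelka_munk(r):
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2 R) for reflectance R in (0, 1]."""
    r = np.asarray(r, dtype=float)
    return (1.0 - r) ** 2 / (2.0 * r)


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a (m, n) and rows of b (k, n)."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    b = np.atleast_2d(np.asarray(b, dtype=float))
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T  # shape (m, k)
```

These helpers only illustrate the math behind the Normalization, Transforms, and Similarity entries; the library's own functions may differ in signature, edge-case handling, and defaults.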

Quick Start

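import numpy as np

from spectrakit.smooth import smooth_savgol
from spectrakit.baseline import baseline_als
from spectrakit.normalize import normalize_snv
from spectrakit.pipeline import Pipeline

# Placeholder input; replace with your own data.
# The (n_spectra, n_points) layout is an assumption, not taken from the SpectraKit docs.
spectra = np.random.default_rng(0).random((5, 500))

# Chain the steps in order: smooth, remove the baseline, then normalize.
pipe = Pipeline()
pipe.add("smooth", smooth_savgol, window_length=11, polyorder=2)
pipe.add("baseline", baseline_als, lam=1e6, p=0.01)
pipe.add("normalize", normalize_snv)
processed = pipe.transform(spectra)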
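
To show what a composable chain like the one in Quick Start does conceptually, here is a minimal NumPy-only sketch. `MiniPipeline` and the three step functions are hypothetical stand-ins written for this example; they mirror the `add`/`transform` calls shown above but are not SpectraKit's implementation, and the simple steps are deliberately cruder than Savitzky-Golay, ALS, or SNV.

```python
import numpy as np


class MiniPipeline:
    """Hypothetical stand-in: stores named steps and applies them in order."""

    def __init__(self):
        self._steps = []  # list of (name, function, keyword arguments)

    def add(self, name, func, **kwargs):
        self._steps.append((name, func, kwargs))
        return self

    def transform(self, spectra):
        out = np.asarray(spectra, dtype=float)
        for _name, func, kwargs in self._steps:
            out = func(out, **kwargs)  # each step maps array -> array
        return out


def moving_average(spectra, width=5):
    """Smooth each spectrum with a boxcar kernel (edges are zero-padded)."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), -1, spectra
    )


def subtract_minimum(spectra):
    """Crude offset removal: shift each spectrum so its minimum is zero."""
    return spectra - spectra.min(axis=-1, keepdims=True)


def scale_to_unit_max(spectra):
    """Scale each spectrum so its maximum is one."""
    return spectra / spectra.max(axis=-1, keepdims=True)


# Synthetic input: three noisy copies of a Gaussian band on a sloped offset.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
spectra = (
    np.exp(-((x - 0.5) ** 2) / 0.002)
    + 0.5
    + 0.3 * x
    + rng.normal(0.0, 0.01, size=(3, 200))
)

pipe = MiniPipeline()
pipe.add("smooth", moving_average, width=5)
pipe.add("offset", subtract_minimum)
pipe.add("scale", scale_to_unit_max)
processed = pipe.transform(spectra)  # same shape as the input: (3, 200)
```

Because every step takes an array and returns an array, steps can be reordered or swapped without touching the rest of the chain, which is the property the functional design described above relies on.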

Stats

  • v1.7.1 — latest release
  • 619 tests — including hypothesis property-based, golden reference regression, adversarial I/O
  • 0 mypy strict errors, 0 ruff errors
  • Python 3.10–3.13 supported
  • MIT licensed