Installation

Prerequisites

BactScout uses Pixi for reproducible environment and dependency management.

Install Pixi

# Option 1: official install script (Linux/macOS)
curl -fsSL https://pixi.sh/install.sh | bash

# Option 2: conda
conda install -c conda-forge pixi

# Option 3: Homebrew
brew install pixi

After installation, restart your terminal or refresh your shell configuration.
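A quick way to confirm that the shell can now find Pixi is sketched below. The helper name `have_cmd` is hypothetical and introduced only for illustration; it relies on the standard `command -v` lookup rather than anything specific to Pixi.

```shell
# Succeed if the named command is reachable on the current PATH.
# (have_cmd is an illustrative helper, not part of Pixi or BactScout.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pixi; then
    # Print the installed Pixi version.
    pixi --version
else
    echo "pixi not found on PATH; restart the terminal or re-source your shell profile" >&2
fi
```

If the second branch is taken, the Troubleshooting notes at the end of this page apply.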

Installation Steps

1. Clone the Repository

git clone https://github.com/ghruproject/bactscout.git
cd bactscout

2. Install Dependencies

pixi install

This installs all required dependencies, including:

  • fastp - Read quality control and trimming
  • sylph - Ultra-fast taxonomic profiling
  • stringMLST - MLST analysis
  • Python 3.11+ - Core runtime
  • typer - CLI framework
  • rich - Beautiful terminal output

3. Verify Installation

pixi run bactscout --help

Docker Installation

BactScout is also available as a Docker image:

# Build locally
docker build -t bactscout:latest -f docker/Dockerfile .

# Run with Docker
docker run -v /path/to/fastq:/input -v /path/to/output:/output \
  bactscout:latest pixi run bactscout qc /input -o /output

To run BactScout from a pre-built container instead, pull the image from Docker Hub and run it with your data mounted. The official image is published at: https://hub.docker.com/repository/docker/happykhan/bactscout/general

# Pull the latest image
docker pull happykhan/bactscout:latest

# Run BactScout to perform QC on FASTQ files mounted from the current directory
docker run --rm \
    --volume "$PWD":/data \
    --user "$(id -u):$(id -g)" \
    happykhan/bactscout:latest \
    bactscout qc /data/fastq -o /data/results

# Show available commands
docker run --rm happykhan/bactscout:latest bactscout --help

Supported platforms: linux/amd64, linux/arm64 (for Apple Silicon)
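To see which of the two supported platforms a given host corresponds to, the machine name reported by `uname -m` can be mapped to a Docker platform string. This is a minimal sketch; the function name `docker_platform` is hypothetical, and the mapping covers only the two platforms listed above.

```shell
# Map a machine name (as printed by `uname -m`) to a Docker platform string.
# (docker_platform is an illustrative helper, not part of Docker or BactScout.)
docker_platform() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        arm64|aarch64) echo "linux/arm64" ;;
        *)             echo "unsupported" ;;
    esac
}

platform="$(docker_platform "$(uname -m)")"
echo "Host maps to: $platform"

# If the host is supported, the platform can be requested explicitly, e.g.:
#   docker pull --platform "$platform" happykhan/bactscout:latest
```

Docker normally selects the matching platform on its own, so the explicit `--platform` flag is only needed when overriding that choice.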

Automatic Database Setup

BactScout automatically downloads and configures required databases on first run:

  • Sylph GTDB database - For taxonomic profiling
  • MLST databases - For sequence typing (E. coli, Salmonella, Klebsiella, Pseudomonas, Acinetobacter)

The Sylph database occupies the most disk space (~4 GB). No additional manual setup is required. To pre-download the databases, run:

pixi run bactscout preflight

This command performs all validation checks and downloads the necessary databases without running the full QC pipeline. It is useful for confirming that everything is set up correctly before processing large datasets, for example on infrastructure with limited internet access.
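Because the Sylph database alone needs roughly 4 GB, it may be worth checking free disk space before the first download. The sketch below makes two assumptions not stated in this guide: that the databases land on the same filesystem as the current directory, and that 5 GB is a sensible margin. The helper name `has_free_gb` is hypothetical.

```shell
# Succeed if the filesystem holding DIR has at least MIN_GB gigabytes free.
# (has_free_gb is an illustrative helper; the 5 GB threshold below is an assumption.)
has_free_gb() {
    dir="$1"
    min_gb="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kb="$(df -Pk "$dir" | awk 'NR==2 {print $4}')"
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

if has_free_gb . 5; then
    echo "Enough free space here to download the databases."
else
    echo "Less than 5 GB free here; consider freeing space first." >&2
fi
```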

Troubleshooting

Pixi not found after installation?

Restart your terminal or run:

source ~/.bashrc  # or ~/.zshrc for macOS

Permission denied installing Pixi?

Try installing to a different location:

curl -fsSL https://pixi.sh/install.sh | PIXI_HOME=~/.local/pixi bash
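After installing to a custom location, the shell still needs to find the binary. The sketch below assumes the installer places `pixi` in `$PIXI_HOME/bin`, and uses a hypothetical helper, `path_contains`, to avoid adding the same directory twice.

```shell
# Assumption: the install script puts the pixi binary in "$PIXI_HOME/bin".
PIXI_HOME="${PIXI_HOME:-$HOME/.local/pixi}"

# Succeed if DIR is already an entry in the colon-separated PATH.
# (path_contains is an illustrative helper, not part of Pixi.)
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Prepend the Pixi bin directory only if it is not already present.
if ! path_contains "$PIXI_HOME/bin"; then
    export PATH="$PIXI_HOME/bin:$PATH"
fi
```

Adding these lines to `~/.bashrc` or `~/.zshrc` makes the change persist across sessions.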

See Troubleshooting for more help.