Using Conda on Klone

Overview

Warning

As a best practice, please use our bioinformatics container(s) for your Conda workflows instead of installing conda packages directly on Klone. Containers provide a consistent and reproducible environment, reducing dependency issues and simplifying software management.

Conda (including Miniforge, Miniconda, and Anaconda) is a popular package manager for Python and other languages. However, conda installations and environments can quickly consume significant disk space. This guide explains how to properly install and configure conda on Klone to avoid storage limitations.

Storage Considerations

Klone has different storage locations with varying capacities:

  • Home directory (/mmfs1/home/<UW_NetID>): Only 10GB - NOT recommended for conda
  • Group storage (/mmfs1/gscratch/srlab/<UW_NetID>): 1.024TB shared - Recommended for conda
  • Temporary storage (/gscratch/scrubbed/<UW_NetID>): 200TB but files deleted after 30 days

Important: Always install conda in group storage (/mmfs1/gscratch/srlab/<UW_NetID>) to avoid hitting the 10GB home directory limit.

Fresh Conda Installation

Step 1: Download Miniforge

# Create your group storage directory (if needed) and navigate to it
mkdir -p /mmfs1/gscratch/srlab/${USER}
cd /mmfs1/gscratch/srlab/${USER}

# Download Miniforge (recommended over Anaconda for its smaller footprint)
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh

Step 2: Install to Group Storage

# Run the installer
bash Miniforge3-Linux-x86_64.sh

# When prompted for installation location, specify:
# /mmfs1/gscratch/srlab/${USER}/miniforge3

# Answer "yes" when asked to initialize conda

Step 3: Configure Your Shell

Add the following to your ~/.bashrc file:

# Initialize conda from group storage location
eval "$(/mmfs1/gscratch/srlab/${USER}/miniforge3/bin/conda shell.bash hook)"

Then reload your shell:

source ~/.bashrc
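
As a quick sanity check that the group storage installation is now the one on your PATH:

# Both commands should point at the group storage prefix
which conda
conda info --base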

Moving Existing Conda Installation

If you already have conda installed in your home directory and are running out of space:

Step 1: Check Current Installation

# Check where conda is currently installed
which conda
conda info --base

# Check current storage usage
hyakstorage

Step 2: Create New Installation Location

# Create directory in group storage
mkdir -p /mmfs1/gscratch/srlab/${USER}

Step 3: Move the Installation

# Deactivate any active conda environment
conda deactivate

# Move the entire conda installation
mv /mmfs1/home/${USER}/miniforge3 /mmfs1/gscratch/srlab/${USER}/

# Or if you have anaconda3:
# mv /mmfs1/home/${USER}/anaconda3 /mmfs1/gscratch/srlab/${USER}/

Step 4: Update Shell Configuration

Edit your ~/.bashrc file to remove the old conda initialization and add the new one:

# Remove or comment out old lines like:
# eval "$(/mmfs1/home/${USER}/miniforge3/bin/conda shell.bash hook)"

# Add new initialization:
eval "$(/mmfs1/gscratch/srlab/${USER}/miniforge3/bin/conda shell.bash hook)"

Reload your shell:

source ~/.bashrc

Step 5: Verify the Move

# Check new location
which conda
conda info --base

# Check that environments are accessible
conda env list

# Verify storage usage improvement
hyakstorage

Configuring Conda Environment Location

To ensure all conda environments are created in group storage:

Method 1: Set Default Environment Directory

# Create conda configuration directory
mkdir -p ~/.conda

# Create/edit conda configuration file
cat > ~/.conda/condarc << EOF
envs_dirs:
  - /mmfs1/gscratch/srlab/${USER}/miniforge3/envs
  - /mmfs1/gscratch/srlab/${USER}/conda_envs
pkgs_dirs:
  - /mmfs1/gscratch/srlab/${USER}/miniforge3/pkgs
  - /mmfs1/gscratch/srlab/${USER}/conda_pkgs
EOF
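
To confirm conda is reading the new configuration:

# Both lists should include the group storage paths
conda config --show envs_dirs pkgs_dirs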

Method 2: Always Specify Environment Location

When creating environments, explicitly specify the location:

# Create environment in group storage
conda create --prefix /mmfs1/gscratch/srlab/${USER}/conda_envs/myenv_name package_name

# Activate environment
conda activate /mmfs1/gscratch/srlab/${USER}/conda_envs/myenv_name

Best Practices

Environment Management

  • Use descriptive names: Name environments after projects or specific purposes
  • Create project-specific environments: Avoid conflicts by keeping environments separate
  • Regular cleanup: Remove unused environments to save space
  • Document dependencies: Keep track of package requirements for reproducibility (see the sketch below)
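
One lightweight way to document dependencies is to export only the packages you explicitly requested; "myenv" below is a placeholder environment name.

# Record only explicitly requested packages (more portable than a full export)
conda env export --name myenv --from-history > myenv-requirements.yml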

Space Management

# Check conda disk usage
du -sh /mmfs1/gscratch/srlab/${USER}/miniforge3

# Clean conda cache periodically
conda clean --all

# List environments and their sizes
conda env list
du -sh /mmfs1/gscratch/srlab/${USER}/miniforge3/envs/*
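
To remove an environment you no longer need (the path below is a placeholder for one of the prefixes listed by the commands above):

# Delete an unused prefix-based environment to reclaim space
conda env remove --prefix /mmfs1/gscratch/srlab/${USER}/conda_envs/old_env_name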

Backup and Sharing

# Export environment for sharing/backup
conda env export --name myenv > environment.yml

# Recreate environment from file
conda env create --file environment.yml --prefix /mmfs1/gscratch/srlab/${USER}/conda_envs/myenv_restored

Integration with SLURM Jobs

When using conda environments in SLURM jobs, ensure you activate the environment correctly:

#!/bin/bash
#SBATCH --job-name=my_conda_job
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --time=1:00:00

# Initialize conda
eval "$(/mmfs1/gscratch/srlab/${USER}/miniforge3/bin/conda shell.bash hook)"

# Activate your environment
conda activate /mmfs1/gscratch/srlab/${USER}/conda_envs/myenv_name

# Run your analysis
python my_script.py
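
Save the script (for example as my_conda_job.sh, a placeholder name) and submit it with sbatch; squeue shows its status in the queue.

# Submit the job and check on it
sbatch my_conda_job.sh
squeue -u ${USER}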

Troubleshooting

Environment Not Found

If conda can't find your environments after moving:

# Check conda configuration
conda config --show envs_dirs

# List all environments with full paths
conda env list

# Manually specify environment path
conda activate /full/path/to/environment
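
If your environments live in a custom directory such as /mmfs1/gscratch/srlab/<UW_NetID>/conda_envs, registering that directory lets conda find them by name again:

# Append the group storage environment directory to conda's search path
conda config --append envs_dirs /mmfs1/gscratch/srlab/${USER}/conda_envs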

Permission Issues

If you encounter permission errors:

# Check directory permissions
ls -la /mmfs1/gscratch/srlab/${USER}/

# Fix permissions if needed
chmod -R u+rwX /mmfs1/gscratch/srlab/${USER}/miniforge3

Storage Still Full

If your home directory is still full after moving conda:

# Check what's using space, including hidden files and directories
du -sh /mmfs1/home/${USER}/*
du -sh /mmfs1/home/${USER}/.[!.]*

# Common culprits to move to group storage:
# - .nextflow directory
# - .sra cache
# - Large data files
# - Git repositories
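
For caches like ~/.nextflow, a common pattern is to move the directory to group storage and leave a symlink behind so tools still find it. A sketch, assuming the directory exists in your home directory:

# Move the Nextflow cache to group storage and symlink it back
mv /mmfs1/home/${USER}/.nextflow /mmfs1/gscratch/srlab/${USER}/.nextflow
ln -s /mmfs1/gscratch/srlab/${USER}/.nextflow /mmfs1/home/${USER}/.nextflow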

See Also