Troubleshooting & FAQ

Common Issues

Out of GPU memory (CUDA out of memory)

Reduce memory usage by:

Setting batch_size=1
Offloading mosaicking to CPU: mosaic_device="cpu"

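from omnicloudmask import load_s2, predict_from_load_func

pred_paths = predict_from_load_func(
    scene_paths=scene_paths,  # your list of scene paths
    load_func=load_s2,
    batch_size=1,  # smallest batch, lowest GPU memory use
    mosaic_device="cpu",  # assemble the output mosaic on the CPU
)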

Slow CPU inference

Use inference_dtype="fp32" (not fp16/bf16) on CPU
Set PyTorch thread count:

import torch
torch.set_num_threads(4)  # adjust to your CPU core count
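If you would rather derive the thread count than hard-code it, the following is a minimal sketch. The helper `pick_thread_count` and its `reserve` parameter are hypothetical names, not part of OmniCloudMask or PyTorch. Note that `os.cpu_count()` reports logical cores, which may exceed the number of physical cores.

```python
import os


def pick_thread_count(reserve: int = 1) -> int:
    """Suggest a thread count, leaving `reserve` logical cores free.

    Hypothetical helper for illustration only.
    """
    # os.cpu_count() can return None when the count cannot be determined.
    logical_cores = os.cpu_count() or 1
    # Never go below one thread, even on small machines.
    return max(1, logical_cores - reserve)


num_threads = pick_thread_count()
```

The result can then be passed to `torch.set_num_threads(num_threads)` as in the snippet above. Whether leaving one core free helps depends on what else runs on the machine.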

Missing cloud detections

Clouds can go undetected where pixel values have been clipped by sensor saturation or preprocessing, because areas with no remaining texture cannot be classified. To resolve:

Identify saturated/clipped regions in your input
Set those regions to 0 (the default no-data value)
Re-run prediction
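The first two steps can be sketched with NumPy as below. The helper `zero_out_saturated` and the `saturation_value` threshold are assumptions for illustration; use the ceiling that applies to your sensor and preprocessing. Treating a pixel as saturated when any band reaches the ceiling is also an assumption, and a stricter rule may suit your data better.

```python
import numpy as np


def zero_out_saturated(input_array: np.ndarray, saturation_value: float) -> np.ndarray:
    """Return a copy with saturated pixels set to 0 in every band.

    Hypothetical helper. Expects (bands, height, width) order.
    """
    # A pixel counts as saturated if any band reaches the ceiling.
    saturated = (input_array >= saturation_value).any(axis=0)
    cleaned = input_array.copy()
    # The 2D boolean mask selects pixels; all bands are zeroed there.
    cleaned[:, saturated] = 0
    return cleaned


# Tiny synthetic example: one pixel is clipped at 10000 in the first band.
example = np.full((3, 2, 2), 1200, dtype=np.uint16)
example[0, 0, 1] = 10000
cleaned = zero_out_saturated(example, saturation_value=10000)
```

The cleaned array can then be passed to prediction in place of the original input.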

Model download errors

If Hugging Face downloads fail, try Google Drive as an alternative:

from omnicloudmask import predict_from_array

mask = predict_from_array(
    input_array,
    model_download_source="google_drive",
)

Or download models manually and specify the directory:

mask = predict_from_array(
    input_array,
    destination_model_dir="/path/to/models",
)

Input array shape errors

OmniCloudMask expects arrays with shape (3, height, width):

Dimension 0: 3 bands (Red, Green, NIR in that order)
Dimension 1: height in pixels
Dimension 2: width in pixels

If your data is in (height, width, bands) format, transpose it:

import numpy as np

# (height, width, bands) -> (bands, height, width)
input_array = np.transpose(input_array, (2, 0, 1))
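When the layout of incoming arrays is not known in advance, a small guard can normalize it before prediction. This is a minimal sketch; `to_band_first` is a hypothetical helper, not part of OmniCloudMask, and it assumes exactly three bands.

```python
import numpy as np


def to_band_first(input_array: np.ndarray) -> np.ndarray:
    """Return the array in (3, height, width) order, transposing if needed.

    Hypothetical helper for illustration only.
    """
    if input_array.ndim != 3:
        raise ValueError(f"Expected a 3D array, got shape {input_array.shape}")
    if input_array.shape[0] == 3:
        # Already (bands, height, width).
        return input_array
    if input_array.shape[-1] == 3:
        # (height, width, bands): move the band axis to the front.
        return np.transpose(input_array, (2, 0, 1))
    raise ValueError(f"No axis of length 3 in first or last position: {input_array.shape}")


channels_last = np.zeros((64, 48, 3), dtype=np.float32)
channels_first = to_band_first(channels_last)
```

An array whose first axis already has length 3 is returned unchanged, so an image that is exactly 3 pixels tall would be misread. That case is below the minimum image size anyway.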

Minimum image size

Images must be at least 32x32 pixels. For best results, use images of at least 96x96 pixels. Accuracy improves rapidly up to this size, then continues to improve more gradually with larger patches. See Spatial Context for detailed benchmarks on how patch size affects accuracy at different resolutions.
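The two size thresholds above can be checked before running prediction. The sketch below uses a hypothetical helper, `check_image_size`, with the 32 and 96 pixel limits taken from this section; the return labels are illustrative.

```python
MIN_SIZE = 32          # hard minimum stated above
RECOMMENDED_SIZE = 96  # size recommended above for best results


def check_image_size(height: int, width: int) -> str:
    """Classify an image size against the documented thresholds.

    Hypothetical helper for illustration only.
    """
    smallest_side = min(height, width)
    if smallest_side < MIN_SIZE:
        raise ValueError(
            f"Image is {height}x{width}; both sides must be at least {MIN_SIZE} pixels"
        )
    if smallest_side < RECOMMENDED_SIZE:
        return "below_recommended"
    return "ok"


status = check_image_size(64, 128)
```

For an array in (3, height, width) order, the call would be `check_image_size(*input_array.shape[1:])`.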

Performance Tips

If processing many files, use predict_from_load_func instead of predict_from_array: it preloads data during inference, which speeds up processing
See the Usage Guide for GPU optimization, batch size tuning, downscaling strategies, and CPU inference configuration

FAQ

What bands are required?

Red, Green, and NIR bands. The model was trained on these three bands from the CloudSEN12 dataset.

If you don’t have a NIR band, you can try passing Red, Green, and an empty third band - this has shown reasonable results in most cases, though NIR is recommended.
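One way to build such an input is sketched below. It assumes that "empty" means a band filled with zeros, which this page does not state explicitly. `stack_without_nir` is a hypothetical helper, not part of OmniCloudMask.

```python
import numpy as np


def stack_without_nir(red: np.ndarray, green: np.ndarray) -> np.ndarray:
    """Stack Red, Green and a zero-filled third band into (3, height, width).

    Hypothetical helper. Assumes an 'empty' band means all zeros.
    """
    if red.shape != green.shape or red.ndim != 2:
        raise ValueError("red and green must be 2D arrays of the same shape")
    # Placeholder for the missing NIR band, same shape and dtype as red.
    empty_band = np.zeros_like(red)
    return np.stack([red, green, empty_band], axis=0)


red = np.full((96, 96), 1500, dtype=np.uint16)
green = np.full((96, 96), 1300, dtype=np.uint16)
input_array = stack_without_nir(red, green)
```

The band order stays Red, Green, then the placeholder, matching the order expected for Red, Green, and NIR.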

How do I interpret confidence maps?

When export_confidence=True, the output has shape (4, height, width) with one channel per class:

Channel 0: Clear probability
Channel 1: Thick cloud probability
Channel 2: Thin cloud probability
Channel 3: Cloud shadow probability

Values range from ~0.001 to ~0.999 after softmax normalization.
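As a sketch of how these channels might be used, the example below takes the most likely class per pixel and flags pixels where the top probability is low. The helper `summarize_confidence` and the 0.6 threshold are illustrative assumptions, not library defaults.

```python
import numpy as np

# Channel order as listed above.
CLASS_NAMES = ("clear", "thick cloud", "thin cloud", "cloud shadow")


def summarize_confidence(confidence: np.ndarray, min_confidence: float = 0.6):
    """Return (class_map, uncertain) from a (4, height, width) confidence array.

    Hypothetical helper. The threshold is an arbitrary example value.
    """
    # Index of the highest-probability channel at each pixel.
    class_map = confidence.argmax(axis=0)
    # Pixels whose best class is still below the threshold.
    uncertain = confidence.max(axis=0) < min_confidence
    return class_map, uncertain


# Synthetic 1x2 example: a confident clear pixel and an ambiguous one.
confidence = np.array(
    [[[0.90, 0.30]], [[0.05, 0.10]], [[0.03, 0.40]], [[0.02, 0.20]]]
)
class_map, uncertain = summarize_confidence(confidence)
```

Here `CLASS_NAMES[class_map[0, 1]]` names the second pixel's most likely class, while `uncertain` marks it as a candidate for manual review or a more conservative mask.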

Which model version should I use?

Use the default (latest) version unless you have a specific reason to use an older one. Version 4.0+ uses segmentation-models-pytorch; versions 1-3 require fastai.