v0.12.0:4 - hotfix: torchaudio build fails without --no-build-isolation

The build was crashing inside torchaudio's setup.py with:

    ModuleNotFoundError: No module named 'torch'

PIP_CONSTRAINT was correctly pinning torch/torchvision in the install
target env, but pip's PEP 517 build isolation creates a SEPARATE, fresh
Python env just for the wheel build step, and that env has no torch in
it. torchaudio's setup.py imports torch to discover CUDA flags, so it
crashes. Pip also printed a deprecation warning indicating that build
isolation is being tightened, not relaxed.
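
The failure mode described above can be checked for directly. The sketch below is illustrative only and is not part of this commit: missing_modules is a hypothetical helper name, and it simply asks the current interpreter whether a module can be located. Run from the install target env it should find torch; run from an isolated build env it would not.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that this interpreter cannot locate.

    Hypothetical helper for illustration; it only inspects the import
    path of the interpreter it runs in, nothing else.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torchaudio's setup.py needs torch at build time, so an empty list here
    # means the import that crashed the build would succeed in this env.
    print("not importable:", missing_modules(["torch"]))
```

A check like this is only useful when it runs in the same environment the build will run in, which is the point of the isolation problem.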

Fix:
1. Pre-install torchaudio's build deps (setuptools, wheel, ninja,
   pybind11) into the main env, since pip no longer installs them
   once isolation is disabled.
2. Add --no-build-isolation to the torchaudio install so the build
   uses NGC's torch directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
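
As a rough sketch of the two-step fix, the snippet below assembles the same two pip invocations the Dockerfile runs. It is an illustration, not the commit's implementation: pip_install_cmd and install_plan are hypothetical names, while the package specifiers, the git URL, and the flags are taken from the diff.

```python
import shlex
import sys

# Values copied from the Dockerfile change in this commit.
BUILD_DEPS = ["setuptools>=61", "wheel", "ninja", "pybind11>=2.10"]
TORCHAUDIO_SRC = "git+https://github.com/pytorch/audio.git@v2.5.1"
COMMON_FLAGS = ["--break-system-packages", "--no-cache-dir"]


def pip_install_cmd(packages, no_build_isolation=False):
    """Build a pip install argv list (hypothetical helper)."""
    cmd = [sys.executable, "-m", "pip", "install", *COMMON_FLAGS]
    if no_build_isolation:
        # Build in the current env so setup.py can import the existing torch.
        cmd.append("--no-build-isolation")
    return cmd + list(packages)


def install_plan():
    """Step 1: build deps into the main env. Step 2: torchaudio from source."""
    return [
        pip_install_cmd(BUILD_DEPS),
        pip_install_cmd([TORCHAUDIO_SRC], no_build_isolation=True),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; this is a dry run.
    for cmd in install_plan():
        print(shlex.join(cmd))
```

The order matters: with isolation off, pip does not install the build requirements itself, so step 1 has to finish before step 2 starts.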
@@ -43,11 +43,24 @@ sys.stdout.write(f'torch=={torch.__version__}\ntorchvision=={torchvision.__versi
 # NGC PyTorch images don't include torchaudio (NVIDIA optimizes for
 # vision/text workloads). Stock torchaudio wheels are ABI-incompatible with
 # NGC's custom torch 2.10a, so the only working option is building from
-# source against the NGC torch already in the image. Pinning to v2.5.1 — the
-# last torchaudio tag that builds cleanly against torch 2.5–2.10 and is a
-# proven compatibility target.
+# source against the NGC torch already in the image.
+#
+# Build env knobs:
+# USE_CUDA=1 — build CUDA kernels (we have a GPU)
+# BUILD_SOX=0 — skip libsox (we only use audio decoding)
+# TORCH_CUDA_ARCH_LIST=... — build kernels for Hopper + Blackwell datacenter
+# + Blackwell consumer (sm_120 = GB10)
+# --no-build-isolation — CRITICAL: PEP 517 build isolation creates a
+# fresh env with no torch in it. torchaudio's
+# setup.py imports torch to discover the build
+# flags, so it crashes without this flag.
+# With it, the build uses NGC's torch directly.
 ENV USE_CUDA=1 BUILD_SOX=0 TORCH_CUDA_ARCH_LIST="9.0;10.0;12.0"
+# Pre-install torchaudio's build-time deps (PEP 517 would normally install
+# these in the isolated build env, but we just turned isolation off).
 RUN pip install --break-system-packages --no-cache-dir \
+    "setuptools>=61" wheel ninja "pybind11>=2.10"
+RUN pip install --break-system-packages --no-cache-dir --no-build-isolation \
     git+https://github.com/pytorch/audio.git@v2.5.1 \
     && python3 -c "import torchaudio; print('torchaudio built:', torchaudio.__version__)"
 
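
To make the TORCH_CUDA_ARCH_LIST value in the hunk above easier to read, here is a small sketch that maps each compute-capability entry to its sm_ name. arch_list_to_sm is a hypothetical helper written for this note; it assumes entries are semicolon-separated "major.minor" values, optionally carrying a "+PTX" suffix, and it is not how the build itself consumes the variable.

```python
def arch_list_to_sm(arch_list: str) -> list[str]:
    """Translate e.g. "9.0;10.0" into ["sm_90", "sm_100"].

    Hypothetical helper: assumes ';'-separated "major.minor" entries,
    each optionally followed by "+PTX".
    """
    sm_names = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing or doubled separator
        base = entry.removesuffix("+PTX")
        major, minor = base.split(".")
        sm_names.append(f"sm_{major}{minor}")
    return sm_names


if __name__ == "__main__":
    # The value set by the ENV line in the Dockerfile.
    print(arch_list_to_sm("9.0;10.0;12.0"))
```

For the value used here, 12.0 corresponds to sm_120, the GB10 target named in the Dockerfile comment.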
@@ -1,10 +1,10 @@
 import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
 
 export const v0_1_0 = VersionInfo.of({
-  version: '0.12.0:3',
+  version: '0.12.0:4',
   releaseNotes: {
     en_US:
-      'v0.12.0:3 — hotfix: deeper torchaudio fix. The Spark is ARM64 (Grace + GB10 Blackwell), and the NGC PyTorch container — the only base with a working torch for sm_120 ARM64 — does NOT ship torchaudio at all. Stock pip wheels are amd64-only and ABI-incompatible with NGC\'s custom torch anyway. Real fix: build torchaudio from source against NGC\'s torch (v2.5.1, the last torchaudio tag that compiles cleanly against torch 2.5–2.10) with TORCH_CUDA_ARCH_LIST set for Blackwell sm_120. Build adds ~3-5 min to the first WhisperX install (only first time — Docker layer cache reuses it after). Plus the constraints.txt approach from 0.12.0:2 to lock torch + torchvision + torchaudio against any later pip swap-out.',
+      'v0.12.0:4 — hotfix: torchaudio build was failing with "ModuleNotFoundError: No module named torch" during its setup.py. Root cause: pip\'s PEP 517 build isolation creates a fresh Python env for the build that doesn\'t see NGC\'s torch (which is what we need for ABI compat). Fix: add --no-build-isolation to the pip install so the build uses the existing torch, plus pre-install setuptools/wheel/ninja/pybind11 since pip won\'t auto-pull them when build isolation is off. Should now finally compile torchaudio v2.5.1 against NGC\'s torch 2.10 and proceed to the whisperx install.',
   },
   migrations: {
     up: async ({ effects }) => {},