Troubleshooting Open WebUI Docker Image Build Failures: A Comprehensive Guide
Hey guys,
We've been wrestling with some Docker image build failures for Open WebUI, and it's been a real head-scratcher. We're hoping to get some insights from the community to help us nail this down. So, let's dive into the details and see if we can figure this out together!
The Issue at a Glance
We're running into problems building the Docker image for Open WebUI, specifically from version v0.6.16 onwards. Up until v0.6.15, everything was smooth sailing, but something's changed, and we're not quite sure what. Our CI/CD pipeline is throwing errors during the build process, and despite our best efforts to troubleshoot, we're still stuck. Let's walk through the specifics, so you have a clear picture of what's happening.
Background Information
Before we jump into the nitty-gritty, let's lay out some groundwork. This will give you a solid understanding of our setup and the context of the issue.
Environment
- Operating System: Ubuntu 22.04
- Installation Method: Git Clone
- Open WebUI Version: v0.6.18 (issue started from v0.6.16)
- Ollama Version: Not applicable in this case
Steps Taken
Here’s a rundown of what we’ve done to try and resolve this. We’ve made sure to follow the best practices and have tried several approaches, but no luck so far.
- Searched Existing Issues: We’ve scoured the existing issues and discussions to see if anyone else has encountered the same problem. Nothing quite matched our scenario.
- Latest Versions: We’re using the latest version of Open WebUI to ensure we’re not dealing with a known bug in an older release.
- README.md: We've carefully read and followed all instructions in the README.md file.
- Logs and Configurations: We've gathered all relevant logs, configurations, and environment variables to provide a clear picture of our setup.
Configuration Details
To give you a comprehensive understanding, here’s a breakdown of our configuration and setup:
- GitLab CI/CD: We use GitLab CI/CD for our build process, which has been working flawlessly for other projects.
- Dockerfile: The Dockerfile we’re using hasn’t been modified since the last successful build (v0.6.15). This makes the issue even more perplexing.
- Network: We initially suspected network issues, but after thorough checks and adjustments (like increasing timeouts), the problem persists.
Expected vs. Actual Behavior
Expected Behavior
Our expectation is straightforward: the Docker image should build successfully using the Dockerfile in our project. This process worked without a hitch up to version v0.6.15.
Actual Behavior
Starting from version v0.6.16, the Docker image build fails consistently. This is the core of our problem, and it’s blocking our deployment pipeline.
Reproduction Steps
To help you understand and potentially reproduce the issue, here are the step-by-step instructions:
- Git Clone: Clone the project from our self-hosted GitLab repository.
- Build Image via CI/CD: Trigger the Docker image build process through GitLab CI/CD.
- Version v0.6.15 (Success): Notice that the image builds successfully for version v0.6.15.
- Version v0.6.16+ (Failure): Attempt to build the image for v0.6.16, v0.6.17, or v0.6.18. The build fails.
- Error Logs: Examine the error logs, which indicate a network-related issue during the npm ci phase.
Detailed Steps Breakdown
Let's break down each step to ensure clarity and reproducibility.
1. Git Clone
The first step is to clone the project from our GitLab repository. This ensures you have the exact same codebase we're working with. Use the following command:
git clone <our_gitlab_repository_url>
cd open-webui
Replace <our_gitlab_repository_url> with the actual URL of our repository.
2. Build Image via CI/CD
Next, trigger the Docker image build process through GitLab CI/CD. This involves pushing the code to the GitLab repository, which in turn triggers the CI/CD pipeline defined in the .gitlab-ci.yml file. This file contains the instructions for building the Docker image.
3. Version v0.6.15 (Success)
To verify that the issue started after v0.6.15, check out the v0.6.15 tag and trigger the build. This confirms that the build succeeds for this version.
git checkout v0.6.15
Then, push this tag to your GitLab repository to trigger the CI/CD pipeline. Observe that the build completes successfully.
4. Version v0.6.16+ (Failure)
Now, check out any version from v0.6.16 onwards (e.g., v0.6.18) and trigger the build again.
git checkout v0.6.18
Push this tag to your GitLab repository and observe that the build fails. This confirms the issue we're experiencing.
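Since v0.6.15 builds and v0.6.16 doesn't, one thing worth checking is which build inputs changed between the two tags. Here's a rough sketch of how that comparison could look. The function name changed_build_inputs is ours (hypothetical), and the file list is an assumption about which files matter for the npm ci step:

```shell
#!/bin/sh
# List dependency-related files that differ between two git tags.
# Usage: changed_build_inputs OLD_TAG NEW_TAG
# The file list below is an assumption; adjust it to your repository layout.
changed_build_inputs() {
  old="$1"
  new="$2"
  git diff --name-only "$old" "$new" -- package.json package-lock.json Dockerfile
}

# Example (run inside the cloned repository):
#   changed_build_inputs v0.6.15 v0.6.16
#   git diff v0.6.15 v0.6.16 -- package-lock.json | grep onnxruntime
```

If package-lock.json shows up in the output, the second command in the comment narrows it down to the onnxruntime entries.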
5. Error Logs
The key to understanding the issue lies in the error logs. Examine the logs from the failed build, which typically indicate a network-related problem during the npm ci phase. This is where the build process attempts to install the necessary Node.js dependencies.
The Dreaded Logs & Screenshots
Here's the crucial part – the error log extracted from our CI/CD pipeline. This is where the mystery deepens.
INFO[0910] Taking snapshot of full filesystem...
INFO[0911] COPY package.json package-lock.json RootCA.crt ./
INFO[0911] Taking snapshot of files...
INFO[0911] COPY onnxruntime-binaries/ /tmp/onnxruntime/
INFO[0911] Taking snapshot of files...
INFO[0912] RUN ONNXRUNTIME_NODE_INSTALL=skip npm ci --legacy-peer-deps && mkdir -p /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64 && cp -r /tmp/onnxruntime/* /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/
INFO[0912] Cmd: /bin/sh
INFO[0912] Args: [-c ONNXRUNTIME_NODE_INSTALL=skip npm ci --legacy-peer-deps && mkdir -p /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64 && cp -r /tmp/onnxruntime/* /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/]
INFO[0912] Running: [/bin/sh -c ONNXRUNTIME_NODE_INSTALL=skip npm ci --legacy-peer-deps && mkdir -p /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64 && cp -r /tmp/onnxruntime/* /app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/]
npm error code 1
npm error path /app/node_modules/onnxruntime-node
npm error command failed
npm error command sh -c node ./script/install
npm error Downloading "https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-linux-x64-gpu-1.20.1.tgz"...
npm error Extracting "libonnxruntime_providers_cuda.so" to "/app/node_modules/onnxruntime-node/bin/napi-v3/linux/x64"...
npm error node:events:496
npm error throw er; // Unhandled 'error' event
npm error ^
npm error
npm error TypeError: terminated
npm error at Fetch.onAborted (node:internal/deps/undici/undici:11132:53)
npm error at Fetch.emit (node:events:530:35)
npm error at Fetch.terminate (node:internal/deps/undici/undici:10290:14)
npm error at Object.onError (node:internal/deps/undici/undici:11253:38)
npm error at Request.onError (node:internal/deps/undici/undici:2094:31)
npm error at Object.errorRequest (node:internal/deps/undici/undici:1591:17)
npm error at TLSSocket.<anonymous> (node:internal/deps/undici/undici:6319:16)
npm error at TLSSocket.emit (node:events:530:35)
npm error at node:net:351:12
npm error at TCP.done (node:_tls_wrap:650:7)
npm error Emitted 'error' event on Readable instance at:
npm error at emitErrorNT (node:internal/streams/destroy:170:8)
npm error at emitErrorCloseNT (node:internal/streams/destroy:129:3)
npm error at process.processTicksAndRejections (node:internal/process/task_queues:90:21) {
npm error [cause]: SocketError: other side closed
npm error at TLSSocket.<anonymous> (node:internal/deps/undici/undici:6294:28)
npm error at TLSSocket.emit (node:events:530:35)
npm error at endReadableNT (node:internal/streams/readable:1698:12)
npm error at process.processTicksAndRejections (node:internal/process/task_queues:90:21) {
npm error code: 'UND_ERR_SOCKET',
npm error socket: {
npm error localAddress: '10.244.1.152',
npm error localPort: 58314,
npm error remoteAddress: '185.199.108.133',
npm error remotePort: 443,
npm error remoteFamily: 'IPv4',
npm error timeout: undefined,
npm error bytesWritten: 1102,
npm error bytesRead: 10486569
npm error }
npm error }
npm error }
npm error
npm error Node.js v22.16.0
npm notice
npm notice New major version of npm available! 10.9.2 -> 11.5.2
npm notice Changelog: https://github.com/npm/cli/releases/tag/v11.5.2
npm notice To update run: npm install -g npm@11.5.2
npm notice
npm error A complete log of this run can be found in: /root/.npm/_logs/2025-08-03T14_09_18_872Z-debug-0.log
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
Key Observations from the Logs
- Network Issues: The logs point to a network problem, specifically a SocketError: other side closed. This suggests that the connection to the server hosting the onnxruntime binaries is being terminated prematurely.
- npm ci Failure: The build fails during the npm ci command, which is responsible for installing the project dependencies.
- onnxruntime Download: The error occurs while downloading onnxruntime-linux-x64-gpu-1.20.1.tgz from GitHub. This indicates that the issue might be related to the download of this specific package.
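To make comparing failed runs easier, here's a small sketch that pulls the download URL, the undici error code, and the number of bytes received out of a saved build log. The function name summarize_failure is ours (hypothetical), and the patterns assume the log looks like the one above:

```shell
#!/bin/sh
# Summarize the failing download from a saved build log.
# Usage: summarize_failure build.log
summarize_failure() {
  log="$1"
  # URL the onnxruntime-node install script tried to fetch
  url=$(sed -n 's/.*Downloading "\([^"]*\)".*/\1/p' "$log" | head -n 1)
  # undici error code, e.g. UND_ERR_SOCKET
  code=$(sed -n "s/.*code: '\([A-Z_]*\)'.*/\1/p" "$log" | head -n 1)
  # bytes received before the connection dropped
  bytes=$(sed -n 's/.*bytesRead: \([0-9][0-9]*\).*/\1/p' "$log" | head -n 1)
  echo "url=$url"
  echo "code=$code"
  echo "bytes_read=$bytes"
}
```

In our log the connection closed after 10486569 bytes, which is only a few hundred bytes over 10 MiB (10485760 bytes). If that number is the same on every failed run, a size limit on a proxy or gateway between the runner and GitHub would be worth ruling out, though that's a guess on our part.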
Our Troubleshooting Efforts
We didn’t just stop at reading the logs. We rolled up our sleeves and tried several fixes, but none have worked so far. Here’s what we’ve attempted:
Network Checks
Since the error logs suggested a network issue, we started by thoroughly checking our network configuration. We ensured that there were no firewall rules or proxy settings interfering with the download process.
Increased Timeouts
We suspected that the download might be timing out due to slow network speeds, so we increased the timeout values in our Dockerfile and CI/CD configuration. Unfortunately, this didn’t resolve the issue.
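A longer timeout doesn't help when the remote side closes the connection mid-transfer, so retrying the whole step is another thing to try. Below is a minimal sketch of a generic retry wrapper in POSIX sh. The function name retry and its argument order are our own (hypothetical), not part of npm or Docker:

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Runs the command until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    # Stop as soon as the command succeeds
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (not executed here):
#   retry 3 10 npm ci --legacy-peer-deps
```

This only helps if the failure is sporadic. If the connection drops at the same point every time, retries will fail the same way.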
Skipping GPU Installation
To isolate the problem, we tried skipping the installation of the GPU-related components of onnxruntime. We set the ONNXRUNTIME_NODE_INSTALL environment variable to skip, but the build still failed: as the log above shows, the install script still tried to download the GPU archive.
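The log shows the install script running even with ONNXRUNTIME_NODE_INSTALL=skip set, so the installed onnxruntime-node version apparently doesn't honor that variable. It may read a different name such as ONNXRUNTIME_NODE_INSTALL_CUDA, but that's an assumption to verify against node_modules/onnxruntime-node/script/install. A blunter option is npm's --ignore-scripts flag, which skips every package's install scripts. Here's a sketch, with a hypothetical function name and the paths passed in as parameters:

```shell
#!/bin/sh
# Install dependencies without running any package install scripts,
# then copy locally provided onnxruntime binaries into place.
# Usage: install_deps_with_prebuilt_onnx APP_DIR PREBUILT_DIR
#   e.g. install_deps_with_prebuilt_onnx /app /tmp/onnxruntime
install_deps_with_prebuilt_onnx() {
  app_dir="$1"
  prebuilt_dir="$2"
  # Same target directory as in our Dockerfile's RUN step
  target="$app_dir/node_modules/onnxruntime-node/bin/napi-v3/linux/x64"
  # --ignore-scripts prevents onnxruntime-node's download script from running
  (cd "$app_dir" && npm ci --legacy-peer-deps --ignore-scripts) || return 1
  mkdir -p "$target" && cp -r "$prebuilt_dir"/. "$target"/
}
```

The catch is that --ignore-scripts also skips install scripts that other dependencies might need, so the resulting image would have to be tested carefully.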
Using a Cache
We also tried leveraging Docker’s caching mechanism to reduce the reliance on external downloads. We configured our CI/CD pipeline to cache the node_modules directory, but the issue persisted.
Potential Culprits and Next Steps
Despite our efforts, the Docker image build continues to fail. We’re scratching our heads, but here are some potential culprits we’ve identified:
GitHub Rate Limiting
One possibility is that we’re hitting GitHub’s rate limits for downloading packages. This could explain why the connection is being terminated prematurely.
Intermittent Network Issues
Another possibility is that there are intermittent network issues between our build environment and GitHub’s servers. This could cause the download to fail sporadically.
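One way to tell a network problem apart from an npm problem is to fetch the same archive with curl from inside the build environment. Here's a minimal sketch, assuming curl is available on the runner. The function name probe_download is ours (hypothetical):

```shell
#!/bin/sh
# Download a URL with curl and print the resulting file size in bytes.
# Usage: probe_download URL OUTPUT_FILE
probe_download() {
  url="$1"
  out="$2"
  # -f: fail on HTTP errors   -L: follow redirects
  # -C -: resume a partial file   --retry: retry transient failures
  curl -fL -C - --retry 5 --retry-delay 5 -o "$out" "$url" || return 1
  wc -c < "$out" | tr -d ' '
}

# Example (not executed here), using the URL from the log:
#   probe_download "https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-linux-x64-gpu-1.20.1.tgz" /tmp/onnx.tgz
```

If curl also stops partway through, the problem sits in the network path rather than in npm or the install script.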
Changes in onnxruntime Package
It’s also possible that there have been changes in the onnxruntime package or its hosting on GitHub that are causing the download to fail.
Docker Version Incompatibility
Although less likely, there might be some incompatibility issues between our Docker version and the newer versions of Open WebUI.
Call for Help!
We're officially reaching out to the community for help! We've hit a wall, and we're hoping someone else might have some insights or suggestions. Have you encountered a similar issue? Do you have any ideas on what might be causing this? Any guidance would be greatly appreciated!
Specific Questions
To help focus the discussion, here are some specific questions we have:
- Has anyone else experienced similar Docker build failures with Open WebUI versions v0.6.16 and later?
- Are there any known issues with downloading onnxruntime binaries from GitHub?
- Could there be a problem with our Dockerfile or CI/CD configuration that we’re overlooking?
- Are there any alternative ways to install onnxruntime or work around this issue?
We're open to any and all suggestions. Thanks in advance for your help!