{"id":19,"date":"2025-02-15T20:34:51","date_gmt":"2025-02-15T20:34:51","guid":{"rendered":"https:\/\/homelab.computer\/?p=19"},"modified":"2025-02-19T20:50:03","modified_gmt":"2025-02-19T20:50:03","slug":"qubes-os-setting-up-a-qube-for-gpu-pass-through","status":"publish","type":"post","link":"https:\/\/homelab.computer\/index.php\/2025\/02\/15\/qubes-os-setting-up-a-qube-for-gpu-pass-through\/","title":{"rendered":"Qubes OS: AI Integration for Productivity Environments"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><strong><em>This guide explains how to enable GPU pass-through using an Nvidia GPU and how to install Ollama.<\/em><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Qubes OS does not have built-in AI features. However, you can integrate AI functionality into common productivity applications by running large language models (LLMs) locally on your system.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There are advantages and disadvantages to this approach. On the positive side, the model runs on your own hardware, giving you full control over your data. The downside is that it requires significant system resources, especially for larger models. While you can technically run an LLM without GPU acceleration, performance will be poor, which limits the usable model size and drastically increases processing times.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">GPU pass-through for AI integration is simpler to configure than pass-through for gaming. In this scenario, you won&#8217;t need a video output from the GPU, which eliminates the need to configure Xorg or use a dedicated display for the GPU. 
This also means that a qube designed exclusively for AI integration cannot be used for gaming.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you also want to play video games, see this <a href=\"https:\/\/homelab.computer\/index.php\/2025\/01\/12\/qubes-os-gpu-pass-through-for-gaming\/\" data-type=\"post\" data-id=\"29\">guide<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Before proceeding with GPU pass-through, you must first hide the GPU from dom0. See this <a href=\"https:\/\/homelab.computer\/index.php\/2025\/01\/05\/qubes-os-hiding-a-pcie-device-from-dom0\/\" data-type=\"post\" data-id=\"10\">guide<\/a> for hiding the GPU.<\/p>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Setting up the qube<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Create a standalone qube. Don\u2019t use a minimal template unless you are familiar with them.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"650\" height=\"396\" src=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-1.jpg\" alt=\"\" class=\"wp-image-22\" srcset=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-1.jpg 650w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-1-300x183.jpg 300w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-1-550x335.jpg 550w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-1-160x97.jpg 160w\" sizes=\"auto, (max-width: 650px) 100vw, 650px\" \/><figcaption class=\"wp-element-caption\">Qubes OS: Create qube dialog<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the qube settings, disable memory balancing, set the kernel to <em>provided by qube<\/em>, and set the mode to HVM.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"862\" height=\"735\" src=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2.jpg\" alt=\"\" 
class=\"wp-image-23\" srcset=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2.jpg 862w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2-300x256.jpg 300w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2-768x655.jpg 768w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2-550x469.jpg 550w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-2-160x136.jpg 160w\" sizes=\"auto, (max-width: 862px) 100vw, 862px\" \/><figcaption class=\"wp-element-caption\">Qubes OS: Qube settings menu<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the Devices tab, add the GPU you want to pass through to the qube.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"853\" height=\"713\" src=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3.jpg\" alt=\"\" class=\"wp-image-24\" srcset=\"https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3.jpg 853w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3-300x251.jpg 300w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3-768x642.jpg 768w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3-550x460.jpg 550w, https:\/\/homelab.computer\/wp-content\/uploads\/2025\/01\/gpu-qube-3-160x134.jpg 160w\" sizes=\"auto, (max-width: 853px) 100vw, 853px\" \/><figcaption class=\"wp-element-caption\">Qubes OS: Qube settings menu, Devices tab<\/figcaption><\/figure>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Install software requirements<\/strong><\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>sudo apt install software-properties-common apt-transport-https curl git<\/code><\/p>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Install Nvidia CUDA driver<\/strong><\/p>\n\n\n\n<p 
class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>curl -fsSL <a href=\"https:\/\/developer.download.nvidia.com\/compute\/cuda\/repos\/debian12\/x86_64\/cuda-keyring_1.1-1_all.deb\">https:\/\/developer.download.nvidia.com\/compute\/cuda\/repos\/debian12\/x86_64\/cuda-keyring_1.1-1_all.deb<\/a> -o cuda-keyring_1.1-1_all.deb<br>sudo dpkg -i cuda-keyring_1.1-1_all.deb<br>sudo add-apt-repository contrib<br>sudo apt update<br>sudo apt install nvidia-kernel-dkms cuda-drivers<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">At this point, reboot the qube and run <strong>nvidia-smi<\/strong> to confirm that the GPU is available and ready for use.<\/p>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Install Docker<\/strong><\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>sudo curl -fsSL https:\/\/download.docker.com\/linux\/debian\/gpg -o \/etc\/apt\/keyrings\/docker.asc<br>sudo chmod a+r \/etc\/apt\/keyrings\/docker.asc<br>echo \"deb [arch=$(dpkg --print-architecture) signed-by=\/etc\/apt\/keyrings\/docker.asc] https:\/\/download.docker.com\/linux\/debian \\<br>$(. 
\/etc\/os-release &amp;&amp; echo \"$VERSION_CODENAME\") stable\" | sudo tee \/etc\/apt\/sources.list.d\/docker.list &gt; \/dev\/null<br>sudo apt update<br>sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin<br>sudo usermod -aG docker user<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Exit the shell and log back in to apply the new group membership.<\/p>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Install Ollama<\/strong><\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>curl -fsSL https:\/\/ollama.com\/install.sh | sh<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ollama should now be installed and ready for use. Test it by running a model.<\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>ollama run llama3<\/code><\/p>\n\n\n\n<p class=\"has-medium-font-size wp-block-paragraph\"><strong>Allowing other qubes to use Ollama<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Using <strong>qubes.ConnectTCP<\/strong> is a good way to allow other qubes to access the Ollama qube. 
The two main advantages are that you can run Ollama offline and that, because the connection goes through a local port, integration is easier.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In dom0, the file <strong>\/etc\/qubes\/policy.d\/30-user-networking.policy<\/strong> controls which qubes can access Ollama.<\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>qubes.ConnectTCP +11434 browser-qube @default allow target=ollama-qube<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this example, browser-qube is allowed to access ollama-qube on port 11434.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the qube that is connecting to Ollama, you can use systemd to automatically initiate the connection at boot.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In <strong>\/rw\/config<\/strong>, create the following two files:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>ollama@.service<\/strong><\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>[Unit]<br>Description=Ollama service<br><br>[Service]<br>ExecStart=qrexec-client-vm '' qubes.ConnectTCP+11434<br>StandardInput=socket<br>StandardOutput=inherit<br>Restart=always<br>RestartSec=3<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>ollama.socket<\/strong><\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>[Unit]<br>Description=Ollama socket<br>[Socket]<br>ListenStream=127.0.0.1:11434<br>Accept=true<br>[Install]<br>WantedBy=sockets.target<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Add the following code to <strong>\/rw\/config\/rc.local<\/strong>:<\/p>\n\n\n\n<p class=\"has-background has-small-font-size wp-block-paragraph\" style=\"background-color:#eaeaea\"><code>cp -r \/rw\/config\/ollama* \/lib\/systemd\/system\/<br>systemctl daemon-reload<br>systemctl start ollama.socket<\/code><\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\">This code will automatically start the Ollama connection when the qube boots.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When the qube boots, systemd listens on localhost:11434 and forwards each connection to port 11434 on ollama-qube. To applications running in the qube, it looks like Ollama is running on localhost:11434, and any application that can use Ollama will work out of the box.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You can repeat this for as many qubes as you need: add each one to the policy file and set up the systemd files.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to get AI integration in everyday applications, like VS Code and web browsers, by integrating the Ollama API in Qubes OS.<\/p>\n<a class=\"read-more-link\" href=\" https:\/\/homelab.computer\/index.php\/2025\/02\/15\/qubes-os-setting-up-a-qube-for-gpu-pass-through\/ \">Read more<\/a>","protected":false},"author":1,"featured_media":66,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,2],"tags":[],"class_list":["post-19","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-pcie-pass-through","category-qubes-os"],"_links":{"self":[{"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/posts\/19","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/comments?post=19"}],"version-history":[{"count":14,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/posts\/19\/revisions"}],"predecessor-version":[{"id":78,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/
wp\/v2\/posts\/19\/revisions\/78"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/media\/66"}],"wp:attachment":[{"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/media?parent=19"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/categories?post=19"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/homelab.computer\/index.php\/wp-json\/wp\/v2\/tags?post=19"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}