[Code] Running open-source models with ollama serve

How to use open-source models with Ollama

I wanted to run open-source models such as gpt-oss and gemma.

I found a way to do it by installing Ollama.

ollama serve

# 1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 2. Start the server (in a separate terminal)
ollama serve

# 3. Download the model
ollama pull gpt-oss:20b

# 4. Install the Python packages
pip install ollama python-dotenv pydantic tqdm

Run the commands above in order.

Note that once ollama serve is running, that terminal only shows logs and accepts no input, so open another terminal window or tab to run ollama pull, pip install, your Python scripts, and so on.
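If you would rather not keep a second terminal open, one option is to start the server from Python as a background process and send its logs to a file. This is only a minimal sketch: start_in_background and the log file name are hypothetical names made up for illustration, not part of Ollama, and the usage lines assume the ollama binary is on PATH.

```python
import subprocess


def start_in_background(command, log_path):
    """Start `command` without blocking and append its output to `log_path`."""
    with open(log_path, "ab") as log_file:
        # The child process gets its own copy of the file handle,
        # so the parent can close its copy right after launching.
        process = subprocess.Popen(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    return process


# Hypothetical usage (assumes the `ollama` binary is on PATH):
# server = start_in_background(["ollama", "serve"], "ollama_serve.log")
# ... run ollama pull / your Python scripts from this same terminal ...
# server.terminate()  # stop the server when you are done
```

The logs that would have filled the terminal end up in the file instead, so you can still inspect them with a pager or tail.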

Below is the output you should see when running each step in order.

$ curl -fsSL https://ollama.com/install.sh | sh
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
WARNING: systemd is not running
WARNING: Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

The API is served on localhost, port 11434.
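To check from Python that something is actually answering on that port, a small probe like the one below may help. It is a sketch that assumes the server replies to a plain GET on its root URL with HTTP 200; is_server_up is a hypothetical helper name, not an Ollama function.

```python
import urllib.request


def is_server_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at `base_url`."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


# Hypothetical usage:
# print(is_server_up())
```

If this returns False, the most likely cause is that ollama serve is not running in the other terminal.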

$ ollama serve

(output omitted)

time=2025TZ level=INFO source=runner.go: msg="experimental Vul support disabled. To enable, set OLLAMA_VULKAN=1"
time=20255Z level=INFO source=types.go: msg="inference compute" id=GPU- filter_id="" library=CUDA com name=CUDA0 description="NVIDIA A5000" libdirs=ollama,cuda_v12 driver=1 pci_id=00 type=discrete total="24.0 GiB" available="23.6 GiB"

The log shows that the NVIDIA A5000 GPU was detected and will be used for inference.
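The log line is a series of key=value pairs, so the GPU name and memory figures can be pulled out programmatically if you want to check them in a script. A rough sketch using only the standard library follows; parse_log_fields is a hypothetical helper, and the input is the already shortened line from the output above.

```python
import shlex


def parse_log_fields(line):
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, separator, value = token.partition("=")
        if separator:  # skip stray tokens that have no '='
            fields[key] = value
    return fields


# Log line copied from the output above (it was already shortened there).
line = (
    'time=20255Z level=INFO source=types.go: msg="inference compute" '
    'id=GPU- filter_id="" library=CUDA com name=CUDA0 '
    'description="NVIDIA A5000" libdirs=ollama,cuda_v12 driver=1 '
    'pci_id=00 type=discrete total="24.0 GiB" available="23.6 GiB"'
)
gpu = parse_log_fields(line)
print(gpu["description"], gpu["total"], gpu["available"])
# prints: NVIDIA A5000 24.0 GiB 23.6 GiB
```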

$ ollama pull gpt-oss:20b
pulling manifest
pulling e7b260: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  13 GB
pulling fa671078: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 7.2 KB
pulling f603547: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  11 KB
pulling d8bab3: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏   18 B
pulling 776b23: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success

After that, you can call gpt-oss from Python.

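import ollama

# Excerpt from a class: the model name is stored on the instance.
self.model = "gpt-oss:20b"
...

# Inside a method: send the system and user prompts to the local Ollama server.
response = ollama.chat(
    model=self.model,
    messages=[
        {"role": "system", "content": self.SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ],
    options={
        "temperature": self.temperature,
    },
)

# The reply text is stored under message -> content.
response_text = response['message']['content']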
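The excerpt above comes from inside a class, so it does not run on its own. As a standalone sketch of the same call, something like the following should work. The ask function, its default system prompt, and the chat_fn parameter are hypothetical additions for illustration; chat_fn only exists so the helper can be tried with a stub when no server is available.

```python
def ask(user_prompt, system_prompt="You are a helpful assistant.",
        model="gpt-oss:20b", temperature=0.0, chat_fn=None):
    """Send one system + user message pair and return the reply text."""
    if chat_fn is None:
        # Imported lazily so the helper can be exercised with a stub
        # even where the ollama package or server is unavailable.
        import ollama
        chat_fn = ollama.chat

    response = chat_fn(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        options={"temperature": temperature},
    )
    return response["message"]["content"]


# Hypothetical usage (needs `ollama serve` running and the model pulled):
# print(ask("Explain what Ollama does in one sentence."))
```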

When I ran it, gpt-oss:20b took up about 13,000 MiB (13 GB of RAM).

Officially, about 16 GB is said to be required.
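MiB, GiB, and GB are easy to mix up, so here is the quick arithmetic for the numbers in this post: the observed footprint of about 13,000 MiB, compared against the 23.6 GiB that the ollama serve log reported as available on the GPU. It is only a rough check and the variable names are my own.

```python
MIB = 1024 ** 2  # bytes in one mebibyte
GIB = 1024 ** 3  # bytes in one gibibyte

model_bytes = 13_000 * MIB   # footprint observed above
gpu_free_bytes = 23.6 * GIB  # "available" value from the ollama serve log

model_gib = model_bytes / GIB
model_gb = model_bytes / 1e9
fits = model_bytes < gpu_free_bytes

print(f"{model_gib:.1f} GiB = {model_gb:.1f} GB, fits on GPU: {fits}")
# prints: 12.7 GiB = 13.6 GB, fits on GPU: True
```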

$ ollama list
NAME           ID     SIZE     MODIFIED
gpt-oss:20b    17e    13 GB    5 minutes ago

To see the models you have pulled so far, run ollama list.
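The same information can be read from Python through the server's HTTP API. The sketch below assumes the server exposes GET /api/tags and that each entry in its "models" list carries "name" and "size" (in bytes) fields, which is my understanding of the Ollama REST API; summarize_models and fetch_tags are hypothetical helper names.

```python
import json
import urllib.request


def summarize_models(tags_payload):
    """Turn an /api/tags style payload into (name, size in GB) pairs."""
    return [
        (entry["name"], round(entry["size"] / 1e9, 1))
        for entry in tags_payload.get("models", [])
    ]


def fetch_tags(base_url="http://localhost:11434", timeout=5.0):
    """Fetch the list of local models from a running Ollama server."""
    url = f"{base_url}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical usage (requires `ollama serve` to be running):
# for name, size_gb in summarize_models(fetch_tags()):
#     print(name, size_gb, "GB")
```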
