Ollama

Ollama runs open models locally and exposes an OpenAI-compatible API, so you can point LibreChat at your own machine. Download a model with the ollama run <model> command, and browse the available models in the Ollama Library.

Configuration

Ollama ignores the API key but still expects the field to be present, so set it to any placeholder value. Point baseURL at your Ollama server. Add the endpoint under endpoints.custom in librechat.yaml:

    - name: "Ollama"
      apiKey: "ollama"
      # use 'host.docker.internal' instead of localhost if running LibreChat in a docker container
      baseURL: "http://localhost:11434/v1/" 
      models:
        default: [
          "llama2",
          "mistral",
          "codellama",
          "dolphin-mixtral",
          "mistral-openorca"
          ]
        # fetching list of models is supported but the `name` field must start
        # with `ollama` (case-insensitive), as it does in this example.
        fetch: true
      titleConvo: true
      titleModel: "current_model"
      summarize: false
      summaryModel: "current_model"
      modelDisplayLabel: "Ollama"
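
The two constraints on this entry (apiKey must be present, and name must start with ollama for fetch to work) can be checked mechanically. The following is a minimal sketch in Python, not part of LibreChat: check_ollama_endpoint is a hypothetical helper that takes the endpoint entry as an already-parsed dictionary. Treating an empty apiKey the same as a missing one is an assumption.

```python
def check_ollama_endpoint(entry: dict) -> list[str]:
    """Return warnings for one custom endpoint entry (hypothetical helper)."""
    warnings = []

    # Ollama ignores the key's value, but the field is still expected to exist.
    # Assumption: an empty string is treated the same as a missing field.
    if not entry.get("apiKey"):
        warnings.append("apiKey is missing; set any placeholder value")

    # Fetching the model list requires a name starting with "ollama"
    # (case-insensitive), per the comment in the config above.
    models = entry.get("models", {})
    name = str(entry.get("name", ""))
    if models.get("fetch") and not name.lower().startswith("ollama"):
        warnings.append("fetch is enabled but name does not start with 'ollama'")

    return warnings


entry = {
    "name": "Ollama",
    "apiKey": "ollama",
    "baseURL": "http://localhost:11434/v1/",
    "models": {"default": ["llama2", "mistral"], "fetch": True},
}
print(check_ollama_endpoint(entry))  # prints []
```

An entry named "Local" with fetch enabled and no apiKey would produce both warnings.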

Notes

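Setting titleModel to "current_model" makes title generation reuse the conversation's model instead of loading a second one. This lets Ollama keep only one model loaded per conversation.
The default array above is a sample list of popular models. With fetch: true set, LibreChat retrieves the full list from the server.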
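
To see which models your server reports, you can query it directly. This is a minimal sketch, assuming the Ollama server serves an OpenAI-style model listing at the models path under baseURL and that the response follows the OpenAI list shape. It is not necessarily the request LibreChat itself makes when fetch is true, and the function names are illustrative.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL."""
    return base_url.rstrip("/") + "/models"


def model_ids(response: dict) -> list[str]:
    """Extract ids from an OpenAI-style list response: {"data": [{"id": ...}]}."""
    return [item["id"] for item in response.get("data", [])]


def list_models(base_url: str, api_key: str = "ollama") -> list[str]:
    """Query the server. Requires a running Ollama instance at base_url."""
    request = urllib.request.Request(
        models_url(base_url),
        # The key is a placeholder; Ollama ignores its value.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return model_ids(json.load(reply))


# Offline demonstration with a hand-written sample response.
sample = {
    "object": "list",
    "data": [
        {"id": "llama3:latest", "object": "model"},
        {"id": "mistral:latest", "object": "model"},
    ],
}
print(model_ids(sample))  # prints ['llama3:latest', 'mistral:latest']
```

With a server running, list_models("http://localhost:11434/v1/") returns the live list. Use host.docker.internal in place of localhost when calling from inside a Docker container.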

Ollama -> llama3

Once stop is removed from the default parameters, the issue described below no longer occurs.

If llama3 keeps generating without stopping, add stop sequences to the addParams block:

    - name: "Ollama"
      apiKey: "ollama"
      baseURL: "http://host.docker.internal:11434/v1/"
      models:
        default: [
          "llama3"
        ]
        fetch: false # use only the models listed in default
      titleConvo: true
      titleModel: "current_model"
      summarize: false
      summaryModel: "current_model"
      modelDisplayLabel: "Ollama"
      addParams:
          "stop": [
              "<|start_header_id|>",
              "<|end_header_id|>",
              "<|eot_id|>",
              "<|reserved_special_token"
          ]
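
To make the effect of addParams concrete, the sketch below builds an OpenAI-style chat request body with the same stop list. It assumes that addParams entries end up as top-level fields of the request body. How LibreChat applies them internally is not shown in this document, and build_chat_body is a hypothetical helper.

```python
import json
from typing import Optional

# The same stop sequences as in the addParams block above.
LLAMA3_STOP = [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
    "<|reserved_special_token",
]


def build_chat_body(model: str, messages: list, add_params: Optional[dict] = None) -> dict:
    """Build a chat completion body, merging extra parameters at the top level."""
    body = {"model": model, "messages": messages}
    # Assumption: each addParams key becomes a top-level request field.
    body.update(add_params or {})
    return body


messages = [{"role": "user", "content": "Hello"}]
body = build_chat_body("llama3", messages, {"stop": LLAMA3_STOP})
print(json.dumps(body, indent=2))
```

Because the merge applies to every request sent through the endpoint, a stop list set this way reaches every model served by it, not only llama3.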

If you run only llama3 in Ollama, it is fine to set stop at the config level through addParams. When running multiple models, add stop sequences from the frontend instead, through conversation parameters and presets, and omit addParams:

    - name: "Ollama"
      apiKey: "ollama"
      baseURL: "http://host.docker.internal:11434/v1/" 
      models:
        default: [
          "llama3:latest",
          "mistral"
          ]
        fetch: false # use only the models listed in default
      titleConvo: true
      titleModel: "current_model"
      modelDisplayLabel: "Ollama"

Set the stop sequences in the conversation parameters (and save them as a preset):