
Leave get_model, error: Model not found, uid: custom-glm4-chat-1-0 #1205

Closed
3 tasks
shuifuture opened this issue Sep 24, 2024 · 2 comments
Labels
community_support Issue handled by community members

Comments

@shuifuture

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the issue

My configuration:

```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: xinference
  type: openai_chat # or azure_openai_chat
  model: custom-glm4-chat
  model_supports_json: true # recommended if this is available for your model
  max_tokens: 4000
  request_timeout: 180.0
  api_base: http://0.0.0.0:9997/v1
  api_version: 2024-02-15-preview
  organization: <organization_id>
  deployment_name: <azure_model_deployment_name>
  tokens_per_minute: 150_000 # set a leaky bucket throttle
  requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when Azure suggests wait times
  concurrent_requests: 25 # the number of parallel in-flight requests that may be made
  temperature: 0.4 # temperature for sampling
  top_p: 1 # top-p sampling
  n: 1 # number of completions to generate
```
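One thing worth checking (a sketch, not part of the original report): the error mentions uid `custom-glm4-chat-1-0` while the config sets `model: custom-glm4-chat`, so the name GraphRAG sends may not match a model id the server actually registered. Assuming the Xinference server exposes the OpenAI-compatible `/v1/models` endpoint at the configured `api_base`, a check like this could confirm which ids are being served (the helper names here are illustrative, not from GraphRAG or Xinference):

```python
import json
from urllib.request import urlopen

def list_model_ids(api_base: str = "http://0.0.0.0:9997/v1") -> list[str]:
    """Query the OpenAI-compatible /v1/models endpoint and return the
    model ids the server advertises (assumes the standard response shape
    {"data": [{"id": ...}, ...]})."""
    with urlopen(f"{api_base}/models") as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]

def config_matches(configured: str, served_ids: list[str]) -> bool:
    """True only if the `model:` value from settings.yaml is exactly one
    of the ids the server serves."""
    return configured in served_ids

# Usage (requires the Xinference server to be running):
#   ids = list_model_ids()
#   if not config_matches("custom-glm4-chat", ids):
#       print(f"'custom-glm4-chat' not served; server has: {ids}")
```

If the server reports an id like `custom-glm4-chat-1-0` instead of `custom-glm4-chat`, updating `model:` in the config to the served id would be the likely fix.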

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
@shuifuture shuifuture added the triage label Sep 24, 2024
@natoverse
Collaborator

Routing to #657

@natoverse natoverse closed this as not planned Oct 1, 2024
@natoverse natoverse added the community_support label and removed the triage label Oct 1, 2024