
Support Azure GPT-4 Turbo endpoint for image generation and maximum number of tokens #1623

Open · plopezamaya opened this issue Jun 11, 2024 · 1 comment



Image generation is supported only for the openai provider with a model name equal to gpt-4-turbo.

Some examples of this can be seen in v3.0.79:

- backend/danswer/chat/process_message.py:

            elif tool_cls.__name__ == ImageGenerationTool.__name__:
                dalle_key = None
                if llm and llm.config.api_key and llm.config.model_provider == "openai":
                    dalle_key = llm.config.api_key
                else:
                    llm_providers = fetch_existing_llm_providers(db_session)
                    openai_provider = next(
                        iter(
                            [
                                llm_provider
                                for llm_provider in llm_providers
                                if llm_provider.provider == "openai"
                            ]
                        ),
                        None,
                    )
                    if not openai_provider or not openai_provider.api_key:
                        raise ValueError(
                            "Image generation tool requires an OpenAI API key"
                        )
                    dalle_key = openai_provider.api_key
                tools.append(ImageGenerationTool(api_key=dalle_key))
  - It can also be seen in web/src/app/admin/assistants/AssistantEditor.tsx:
function checkLLMSupportsImageGeneration(provider: string, model: string) {
  return provider === "openai" && model === "gpt-4-turbo";
}
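
A minimal sketch of how this check could be relaxed, assuming a hypothetical base_model field that stores the underlying model independently of the deployment name (no such field exists in v3.0.79); written in Python to mirror the backend:

# Sketch only: base_model is a hypothetical field holding the underlying
# model (e.g. "gpt-4-turbo") independently of the Azure deployment name.
IMAGE_GEN_BASE_MODELS = {"gpt-4-turbo"}

def check_llm_supports_image_generation(
    provider: str, model: str, base_model: str | None = None
) -> bool:
    # Direct OpenAI models keep the existing behavior
    if provider == "openai" and model in IMAGE_GEN_BASE_MODELS:
        return True
    # Azure deployments match on the base model, whatever the deployment name
    return provider == "azure" and base_model in IMAGE_GEN_BASE_MODELS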

For Azure deployments the model name is not always gpt-4-turbo, as it can follow custom deployment naming conventions. Therefore it will be recognized neither for image generation nor for the maximum number of tokens obtained via the litellm.model_cost function:

def get_llm_max_tokens(
    model_map: dict,
    model_name: str,
    model_provider: str = GEN_AI_MODEL_PROVIDER,
) -> int:
    """Best effort attempt to get the max tokens for the LLM"""
    if GEN_AI_MAX_TOKENS:
        # This is an override, so always return this
        return GEN_AI_MAX_TOKENS

    try:
        model_obj = model_map.get(f"{model_provider}/{model_name}")
        if not model_obj:
            model_obj = model_map[model_name]

        if "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]

        if "max_tokens" in model_obj:
            return model_obj["max_tokens"]

        raise RuntimeError("No max tokens found for LLM")
    # ... (except clause and fallback elided from this excerpt)
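
The miss is easy to reproduce against litellm's public model_cost map; a deployment-specific name simply has no entry (values shown are illustrative):

import litellm

model_map = litellm.model_cost  # public dict bundled with litellm

# A known base model resolves fine:
print(model_map.get("gpt-4-turbo", {}).get("max_input_tokens"))  # e.g. 128000

# A custom Azure deployment name has no entry, so both lookups in
# get_llm_max_tokens above miss:
print(model_map.get("azure/prd-myprojectapi-gpt4-turbo"))  # None
print(model_map.get("prd-myprojectapi-gpt4-turbo"))        # None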

One idea could be to have the model type set in the UI (e.g. gpt-4-turbo) and leave the model name for the deployment. The model type could then be used for the image generation check and for the cost/token lookup, while the model name is used to query the endpoint.
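
A minimal sketch of that idea, assuming a new model_type parameter that is not part of the current signature:

def get_llm_max_tokens_sketch(
    model_map: dict,
    model_name: str,
    model_provider: str = GEN_AI_MODEL_PROVIDER,
    model_type: str | None = None,  # hypothetical: base model chosen in the UI
) -> int:
    """Best effort attempt to get the max tokens for the LLM"""
    if GEN_AI_MAX_TOKENS:
        # This is an override, so always return this
        return GEN_AI_MAX_TOKENS

    # Try the deployment name first, then fall back to the base model type
    for name in filter(None, (model_name, model_type)):
        model_obj = model_map.get(f"{model_provider}/{name}") or model_map.get(name)
        if model_obj and "max_input_tokens" in model_obj:
            return model_obj["max_input_tokens"]
        if model_obj and "max_tokens" in model_obj:
            return model_obj["max_tokens"]

    raise RuntimeError("No max tokens found for LLM")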

@plopezamaya (Author)

Also note that when deploying models to Azure, the model name will not always be gpt-4-turbo-2024-04-09 or one of the litellm model names. This means that for Azure providers, or custom providers, there should be a Model Base Name field.
A deployment on Azure with the name prd-myprojectapi-gpt35-turbo-eu-west-3 would then have the following properties:

  - Model Name: prd-myprojectapi-gpt35-turbo-eu-west-3
  - Model Base Name: gpt-35-turbo-1106
  - All other configuration options already present

This would allow Azure deployments/endpoints to be used while the maximum number of tokens is resolved dynamically from the base model, instead of a single maximum being set via GEN_AI_MAX_TOKENS, as in the sketch below.
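
For example, a sketch assuming the usual AZURE_API_KEY / AZURE_API_BASE / AZURE_API_VERSION environment variables are set and that the base model has an entry in litellm's map:

import litellm

deployment_name = "prd-myprojectapi-gpt35-turbo-eu-west-3"  # Model Name
model_base_name = "gpt-35-turbo-1106"                       # Model Base Name

# The token budget is resolved from the base model, not the deployment name
model_obj = (
    litellm.model_cost.get(f"azure/{model_base_name}")
    or litellm.model_cost.get(model_base_name)
)
max_input_tokens = model_obj["max_input_tokens"] if model_obj else None

# The request itself still targets the Azure deployment endpoint
response = litellm.completion(
    model=f"azure/{deployment_name}",
    messages=[{"role": "user", "content": "ping"}],
)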
