IRS Endpoints

Each IRS service is created by defining an IRS endpoint. The way you configure the endpoint determines the code that runs when a new request arrives at it.

The runIRSServer method, which starts the IRS server, takes a list of configuration objects, one for each IRS endpoint you want defined. Once started, the server listens for incoming requests at those endpoints. For each request, it extracts the intent and related information from the user input and returns them in the response. To define multiple IRS endpoints, provide multiple endpoint configurations.

Below are the configurations that you can provide for an IRS endpoint.

Configuring an IRS Endpoint

You can configure an IRS endpoint by providing a configuration object with the following properties:

  • endpoint: The server endpoint at which the intent recognition requests should be handled.
  • dataSource: The data source from which the IRS data, intents data, and API keys data are fetched. Check the Data Sources page to learn more.
  • enableAuth: A boolean value indicating whether to enable authentication for the IRS endpoint. If set to true, the endpoint requires an API key in the request headers, and the provided data source must be configured to supply the API keys data. By default, it is set to false. Check the Authentication page to learn more.
  • verbose: A boolean value. If set to true, the response includes additional information, which may cover usage details such as the number of input and output tokens and characters. By default, it is set to false.
  • model: An object containing the configuration of the LLMs used as the primary model, query expansion model, and response evaluation model. Check the Configuring LLM Models section to learn more.

If you're writing your IRS endpoint configuration in TypeScript, you can use the IRSEndpointConfig type to define the configuration object.

import { type IRSEndpointConfig } from '@oconva/intento';
 
// irsDataSource is assumed to be defined elsewhere (see the Data Sources page).
const irsEndpointConfig: IRSEndpointConfig = {
  endpoint: 'irs',
  dataSource: irsDataSource,
  enableAuth: true,
  verbose: true,
  model: {
    primaryModel: {
      name: 'gemini15flash',
      config: {
        temperature: 0.9,
      },
    },
    queryExpansionModel: {
      name: 'gemini15flash',
      config: {
        temperature: 0.9,
      },
    },
    responseEvaluationModel: {
      name: 'gemini15flash',
      config: {
        temperature: 0.9,
      },
    },
  },
};

Configuring LLM Models

You can choose which LLM is used as the primary model, the query expansion model, and the response evaluation model. The model property of the IRS endpoint configuration takes an object with the following properties:

  • primaryModel: The model to be used for the intent recognition task.
  • queryExpansionModel: The model to be used for reformulating the query to produce more accurate results.
  • responseEvaluationModel: The model to be used for evaluating the response generated by the primary model.

Each of these properties takes an object with the following properties:

  • name: Name of the LLM model to use.
  • config: Configuration for the LLM model. This can include parameters like temperature, max output tokens, and safety settings.
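As a rough sketch of the shape described above, the snippet below declares local stand-in types for the three model roles and a hypothetical helper, buildModelSelection, that applies one shared setting to every role unless that role is overridden. Neither the types nor the helper come from Intento; only the property names (primaryModel, queryExpansionModel, responseEvaluationModel, name, config, temperature) are taken from this page.

```typescript
// Local stand-in types that mirror the shape described on this page.
// They are assumptions for illustration, not Intento's exported types.
// Only `temperature` is modelled because it is the only config key shown here.
interface ModelSettings {
  name: string;
  config: { temperature: number };
}

interface ModelSelection {
  primaryModel: ModelSettings;
  queryExpansionModel: ModelSettings;
  responseEvaluationModel: ModelSettings;
}

// Hypothetical helper: use `shared` for every role that has no override.
function buildModelSelection(
  shared: ModelSettings,
  overrides: Partial<ModelSelection> = {},
): ModelSelection {
  return {
    primaryModel: overrides.primaryModel ?? shared,
    queryExpansionModel: overrides.queryExpansionModel ?? shared,
    responseEvaluationModel: overrides.responseEvaluationModel ?? shared,
  };
}

// Same model for recognition and query expansion, a different one for evaluation.
const model = buildModelSelection(
  { name: 'gemini15flash', config: { temperature: 0.9 } },
  { responseEvaluationModel: { name: 'gemini15pro', config: { temperature: 0.2 } } },
);
```

The resulting object has the same shape as the model property in the endpoint configuration example, so it could be assigned there.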

Supported LLM Models

Intento requires LLMs that are capable of producing responses in JSON format.

By default, Intento uses the Gemini 1.5 Flash (gemini15flash) model provided by Google for all tasks. This model is optimized for speed and accuracy and performs well on a wide range of tasks.

Currently, the model name in an IRS endpoint configuration must be one of the following:

name: "gemini15flash" | "gemini15pro" | "gpt4o"
  • gemini15flash: The Gemini 1.5 Flash model provided by Google. This model is optimized for speed and accuracy. Remember to set the GOOGLE_GENAI_API_KEY environment variable to use this model.
  • gemini15pro: The Gemini 1.5 Pro model provided by Google. This model is optimized for generating high-quality text. Remember to set the GOOGLE_GENAI_API_KEY environment variable to use this model.
  • gpt4o: The GPT-4o model provided by OpenAI. This model is optimized for generating human-like text. Remember to set the OPENAI_API_KEY environment variable to use this model.
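Because each model depends on an environment variable, it can help to check for missing keys before starting the server. The sketch below is one possible way to do that. The helper missingApiKeys is hypothetical and not part of Intento; the model names and variable names are the ones listed above. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Model names and environment variables as listed on this page.
type ModelName = 'gemini15flash' | 'gemini15pro' | 'gpt4o';

const REQUIRED_ENV_VAR: Record<ModelName, string> = {
  gemini15flash: 'GOOGLE_GENAI_API_KEY',
  gemini15pro: 'GOOGLE_GENAI_API_KEY',
  gpt4o: 'OPENAI_API_KEY',
};

// Hypothetical helper (not part of Intento): returns the names of the
// environment variables that the given models need but that are unset or empty.
// Each variable name appears at most once in the result.
function missingApiKeys(
  models: ModelName[],
  env: Record<string, string | undefined>,
): string[] {
  const missing: string[] = [];
  for (const modelName of models) {
    const key = REQUIRED_ENV_VAR[modelName];
    if (!env[key] && missing.indexOf(key) === -1) {
      missing.push(key);
    }
  }
  return missing;
}
```

In a Node.js application you would typically call it as missingApiKeys(['gemini15flash', 'gpt4o'], process.env) and stop with a clear error message if the returned list is not empty.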

Providing IRS Endpoint to Server

To define IRS endpoints, provide a list of their configurations to the runIRSServer method.

Under the hood, the IRS server fetches the IRS data from the data source, sets up the required prompts with that data, and sets up the API key store. The server then starts listening for incoming requests at the defined IRS endpoints.

import { runIRSServer } from '@oconva/intento';
 
// Run server with the IRS endpoint configurations.
runIRSServer({
  endpointConfigs: [irsEndpointConfig],
});
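When several endpoint configurations are passed to the server, a quick check for repeated endpoint values can catch copy-and-paste mistakes early. This page does not say how Intento handles duplicate endpoints, so the sketch below is only a precaution. The helper duplicateEndpoints and the HasEndpoint type are hypothetical and not part of Intento; the example endpoint names other than 'irs' are made up.

```typescript
// Minimal stand-in for the part of an endpoint configuration this check needs.
// It is an assumption for illustration, not Intento's IRSEndpointConfig type.
interface HasEndpoint {
  endpoint: string;
}

// Hypothetical helper: returns each endpoint value that appears more than once,
// listed once, in the order the repetition is first detected.
function duplicateEndpoints(configs: HasEndpoint[]): string[] {
  const seen: string[] = [];
  const duplicates: string[] = [];
  for (const { endpoint } of configs) {
    if (seen.indexOf(endpoint) === -1) {
      seen.push(endpoint);
    } else if (duplicates.indexOf(endpoint) === -1) {
      duplicates.push(endpoint);
    }
  }
  return duplicates;
}

// Example with made-up endpoint names; 'irs' is repeated by mistake.
const exampleConfigs: HasEndpoint[] = [
  { endpoint: 'irs' },
  { endpoint: 'support' },
  { endpoint: 'irs' },
];
const repeated = duplicateEndpoints(exampleConfigs); // ['irs']
```

If the returned list is empty, the configurations can be passed on in the endpointConfigs list as shown above.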