Dialogue System

Sherpa.ai's conversational dialogue system stands out because of its:

  • Technology and years of evolution, which allow natural language conversations to be held between human and machine
  • Wide variety of results across many areas of knowledge, or domains
  • Ability to build your own domain to personalize your Questions and Answers (Q&A)
  • Results that are adapted to context
  • Requests in both audio (speech) and text format: Sherpa.ai Conversational AI allows users to interact with the platform through either text or audio (speech).

    • Audio (speech): Request in audio (speech) format
    • Text: Request in text format
  • Responses in both audio (speech) and text format that are true, accurate, and natural:

    • Audio (speech): Response in audio (speech) format
    • Text: Text specifically to be displayed
    • Speech: Text ready to be processed by a TTS (text to speech) system
    • Data: Text and audio (speech) responses in JSON format, organized hierarchically to include a URL that plays the audio (speech) of the text returned
  • Support for Spanish and English, which can be configured via the Accept-Language header
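
As a minimal sketch of the language setting mentioned in the last item, the helper below builds request headers carrying Accept-Language. The function name build_headers and the choice to accept region-qualified tags such as "es-ES" are assumptions made for illustration; the document only states that Spanish and English are selected through this header.

```python
# Languages named in the text above: Spanish and English.
SUPPORTED_LANGUAGES = {"es", "en"}


def build_headers(language: str) -> dict:
    """Return request headers that select the response language.

    Hypothetical helper: accepting tags such as "es-ES" is an assumption.
    """
    tag = language.strip()
    # Validate on the primary subtag, so "en-US" is checked as "en".
    primary = tag.lower().split("-")[0]
    if primary not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return {"Accept-Language": tag}
```

A client would merge the returned dictionary into the headers of each request it sends.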

Text/Audio requests served in Text/Audio responses

The above image displays examples of the following cases:

  • Device with screen and keyboard: The user writes their question or request and the response is displayed on the screen.
  • Device with screen only: The user verbally asks a question or makes a request using the device's microphone, and the answer is displayed on the screen.
  • Device without screen or keyboard: The user verbally asks a question or makes a request using the device's microphone, and Sherpa.ai's response is played through the device's speaker.

Sherpa.ai's response is in JSON format, organized hierarchically to include a URL that plays the speech of the text returned. Therefore, the possibilities of Sherpa.ai's conversational system are not limited to the three cases above; the response could also appear on a screen while simultaneously playing through the device's speaker.
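
The following sketch shows how a client might read such a response and choose its output according to the device. The field names text, speech, and audio.url, the sample body, and the example.com URL are all hypothetical; the actual response schema is not given in this document.

```python
import json
from dataclasses import dataclass

# Hypothetical response body; the real field names may differ.
SAMPLE_BODY = """
{
  "text": "Here is the route you can take.",
  "speech": "Here is the route you can take.",
  "audio": {"url": "https://example.com/audio/reply.mp3"}
}
"""


@dataclass
class Reply:
    text: str       # text to be displayed
    speech: str     # text ready for a TTS system
    audio_url: str  # URL that plays the spoken response


def parse_reply(body: str) -> Reply:
    """Extract the assumed fields from a JSON response body."""
    data = json.loads(body)
    return Reply(data["text"], data["speech"], data["audio"]["url"])


def present(reply: Reply, has_screen: bool, has_speaker: bool) -> list:
    """Return the output steps a client would take on this device."""
    steps = []
    if has_screen:
        steps.append(f"display: {reply.text}")
    if has_speaker:
        steps.append(f"play: {reply.audio_url}")
    return steps
```

Calling present(parse_reply(SAMPLE_BODY), has_screen=True, has_speaker=True) yields both a display step and a play step, which corresponds to the combined case of showing the response while playing it.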

Natural Language Conversations

Sherpa.ai's dialogue system is not limited to a simple question-and-answer mechanism; instead, it allows real conversations to be held between human and machine.

This conversational mechanism is achieved because all of Sherpa.ai's responses make reference to the context. Each following request references the conversation so that the system knows to continue in the same context. That way, Sherpa.ai is able to request information from the user in order to refine its responses.

Situation: Requesting a flight


I want a flight to Berlin for two people.
» Would you tell me the date and departure city of your flight?
This weekend
» Please, tell me your departure city.
Madrid
» Check that the details are correct and click Search Flights (details about the flight are shown).

The reference to the conversation is always sent in the headers of the requests made to the server. The server provides that reference in all of its responses.
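
The round trip described above can be sketched as a small session object that stores the reference from each response and sends it back with the next request. The header name X-Conversation-Reference, the assumption that the first request carries no reference, and the fake_server stand-in are all hypothetical; this document does not name the actual header.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical header name; the real one is not given in this document.
CONVERSATION_HEADER = "X-Conversation-Reference"

Headers = Dict[str, str]
# A transport takes (text, request headers) and returns (reply, response headers).
Transport = Callable[[str, Headers], Tuple[str, Headers]]


class ConversationSession:
    """Keeps the conversation reference between requests."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._reference: Optional[str] = None

    def send(self, text: str) -> str:
        headers: Headers = {}
        # Assumption: the first request has no reference to send yet.
        if self._reference is not None:
            headers[CONVERSATION_HEADER] = self._reference
        reply, response_headers = self._transport(text, headers)
        # The server returns the reference in every response; keep the latest.
        self._reference = response_headers.get(CONVERSATION_HEADER, self._reference)
        return reply


def fake_server(text: str, headers: Headers) -> Tuple[str, Headers]:
    """Stand-in for the real server, used only to exercise the session."""
    reference = headers.get(CONVERSATION_HEADER, "conv-1")
    return f"echo: {text}", {CONVERSATION_HEADER: reference}
```

With session = ConversationSession(fake_server), each call to session.send continues the same conversation, as in the flight dialogue above. A real client would replace fake_server with a function that performs the HTTP request.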

Conversations with Client-Delegated Actions

Sherpa.ai cannot always complete an action itself, and may delegate the finalization of an action to the client.

As illustrated in the following example, Sherpa.ai is in charge of finding the fastest route, based on the destination indicated by the user. However, presenting that route depends on the applications that the user has installed on their device. Therefore, the final action is delegated to the client. For example, the client would open Google Maps to bring up the route that Sherpa.ai has calculated.

Situation: Traveling by car


I want to go to Madrid.
» How do you want to go? By plane, by car, walking or by public transportation?
By car.
» Here is the route you can take. (Maps app is opened)
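
On the client side, handling a delegated action such as the one in this dialogue could be sketched as a dispatch table keyed by action name. The next_action field, the action name open_maps, and the payload shape are hypothetical names chosen for illustration; they are not taken from Sherpa.ai's actual responses.

```python
from typing import Callable, Dict


def open_maps(payload: dict) -> str:
    # A real client would launch the installed maps application here.
    return f"opening maps for route to {payload['destination']}"


# Hypothetical mapping from action names to client-side handlers.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "open_maps": open_maps,
}


def handle_next_action(response: dict) -> str:
    """Run the client-side handler for a delegated action, if there is one."""
    action = response.get("next_action")
    if action is None:
        return "no delegated action"
    handler = HANDLERS.get(action["name"])
    if handler is None:
        # The client has no application able to complete this action.
        return f"unsupported action: {action['name']}"
    return handler(action.get("payload", {}))
```

Keeping the handlers in a table lets each client register only the actions its device can actually perform and fall back cleanly on the rest.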

You can see the complete list of client-delegated actions in the Next Actions Table.

Take a look at our domain examples to get an idea of the scope.