How to build a Translation and Transcription Agent

An AI-powered workflow that translates and/or transcribes user messages or audio files into a chosen language and formality, then emails the polished result to the user.

Challenge

Manual translation and transcription of messages and audio files is slow, error-prone, and often lacks consistency in tone and formality.

Industry

Logistics

Department

Content Creation, Operations

Integrations

Gmail, OpenAI

Workflow Overview

This flow is designed to:

  • Accept a message (typed or as an audio file),

  • Translate it into a user-specified language,

  • Adjust the translation to a desired formality level (e.g., Message, Email, Report),

  • Display the result,

  • And optionally email the finished translation to the user.

Step-by-Step Node Breakdown

1. User Inputs

  • Original Message (in-0)
    Where the user types the message to be translated.

  • Message File (doc-0)
    Lets the user upload an audio file. The file will be transcribed and then translated.

  • Desired Language (in-1)
    User specifies the target language for translation (e.g., Danish, Spanish).

  • Formality Level (in-2)
    User selects the desired formality for the output (Message, Email, or Report).

  • Email Address (in-3)
    User provides their email address to receive the finished translation.
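The five inputs above can be pictured as a single record. The sketch below is illustrative only: the `FlowInputs` class, its field names, and the validation rules are assumptions and not part of the workflow builder. Only the node IDs in the comments and the three formality levels come from the description above. The email check is skipped when no address is given, on the assumption that emailing is optional.

```python
from dataclasses import dataclass
from typing import Optional

# The three formality levels named in the node description.
FORMALITY_LEVELS = ("Message", "Email", "Report")


@dataclass
class FlowInputs:
    """Hypothetical container mirroring the five input nodes."""
    original_message: str = ""          # in-0: typed message
    message_file: Optional[str] = None  # doc-0: path to an uploaded audio file
    desired_language: str = ""          # in-1: e.g. "Danish"
    formality_level: str = "Message"    # in-2: Message, Email, or Report
    email_address: str = ""             # in-3: where the result is sent

    def validate(self) -> list:
        """Return a list of problems; an empty list means the inputs look usable."""
        problems = []
        if not self.original_message.strip() and not self.message_file:
            problems.append("Provide a typed message or an audio file.")
        if not self.desired_language.strip():
            problems.append("Specify a target language.")
        if self.formality_level not in FORMALITY_LEVELS:
            problems.append("Formality must be Message, Email, or Report.")
        # Assumption: the address is optional, so only check it when present.
        if self.email_address and "@" not in self.email_address:
            problems.append("The email address looks malformed.")
        return problems
```

Calling `validate()` before running the flow would surface missing inputs early instead of letting a later node fail.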

2. Translation and Transcription (llm-0)

  • This is a Large Language Model (LLM) node that:

    • Receives the original message, uploaded file (if any), target language, and formality level.

    • If a file is uploaded, it first transcribes the audio.

    • Translates the message (or transcription) into the specified language.

    • If the audio is a conversation, it summarizes the conversation instead of translating word-for-word.
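The branching described above (use the transcript when a file is present, translate, and summarize conversations) can be sketched as a prompt builder. This is an assumption about how such a node might be instructed, not the node's actual prompt. The function name `build_translation_prompt` and the instruction wording are hypothetical, and the transcription itself is assumed to have happened before this function is called.

```python
from typing import Optional


def build_translation_prompt(message: str, transcript: Optional[str], language: str) -> str:
    """Assemble instructions for a translation step (hypothetical wording)."""
    # Use whichever sources are present: the typed message, the transcript, or both.
    parts = [p.strip() for p in (message, transcript or "") if p and p.strip()]
    if not parts:
        raise ValueError("Nothing to translate: no message and no transcript.")
    source = "\n\n".join(parts)
    lines = [
        f"Translate the text below into {language}.",
        "If the text is a conversation between several speakers, summarize it "
        "in the target language instead of translating word-for-word.",
        "",
        "Text:",
        source,
    ]
    return "\n".join(lines)
```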

3. Formality Adjustment (llm-1)

  • Another LLM node that:

    • Takes the translated text from the previous step.

    • Adjusts the translation to match the user’s selected formality level (e.g., making it suitable for an email).

    • Outputs the final, polished translation.
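A minimal sketch of how the formality step might be phrased follows. The style notes attached to each level are illustrative guesses, not the node's real configuration, and `build_formality_prompt` is a hypothetical helper. Only the three level names come from the workflow description.

```python
# Hypothetical style notes for each formality level; the wording is illustrative.
STYLE_NOTES = {
    "Message": "short and conversational, as in a chat message",
    "Email": "polite, with a greeting and a sign-off",
    "Report": "formal and structured, in full sentences",
}


def build_formality_prompt(translation: str, formality_level: str) -> str:
    """Ask for a rewrite of an already-translated text at the chosen formality."""
    if formality_level not in STYLE_NOTES:
        raise ValueError(f"Unknown formality level: {formality_level!r}")
    return (
        f"Rewrite the text below at the '{formality_level}' formality level "
        f"({STYLE_NOTES[formality_level]}). "
        "Keep the language and the meaning unchanged.\n\n"
        f"{translation.strip()}"
    )
```

Keeping this step separate from translation means the tone can be tuned per level without touching the translation instructions.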

4. Output Display (out-0)

  • Shows the final, formality-adjusted translation to the user in the interface.

5. Send Email (action-0)

  • Emails the finished translation to the address the user provided. The node's properties are described under Email Sending Node below.

Email Sending Node

  • Send Email (action-0)

    • Uses the Gmail API to send the finished translation to the user’s provided email address.

    • The email includes:

      • Recipient: The email address from the user input.

      • Subject: "Your Finished Translation"

      • Content: The translated and formality-adjusted message.

    • This node can also support attachments and different content formats, but in this flow, it sends a simple text email.
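For readers curious what a simple text email looks like at the API level, the sketch below builds a message with the recipient, subject, and content listed above. It relies on the Gmail API's `users.messages.send` method, which accepts a base64url-encoded RFC 2822 message in a `raw` field. How the Send Email node builds its request internally is not documented here, so `build_gmail_payload` is a hypothetical helper. It also stops short of sending, since that would require OAuth credentials.

```python
import base64
from email.message import EmailMessage


def build_gmail_payload(recipient: str, translation: str) -> dict:
    """Build a request body of the shape Gmail's messages.send method accepts."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "Your Finished Translation"
    msg.set_content(translation)  # plain-text body, matching this flow
    # Gmail expects the whole message, base64url-encoded, under the "raw" key.
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw}
```

The returned dictionary would be passed as the request body to an authenticated Gmail client.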

How the Flow Connects

  1. User provides inputs (message, file, language, formality, email).

  2. Translation and Transcription LLM processes the message and/or file, translates, and outputs the result.

  3. Formality LLM takes the translation and adjusts it to the desired formality.

  4. Output node displays the final result.

  5. Send Email action emails the final result to the user.
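The connection order above can be sketched as one function that chains the steps. Every callable passed in (`transcribe`, `translate`, `adjust`, `send_email`) is a hypothetical stand-in for the corresponding node, so the sketch shows only the wiring, not the platform's implementation. It assumes that audio, when present, takes precedence over the typed message and that the email step is skipped when no address is given.

```python
from typing import Callable, Optional


def run_flow(
    message: str,
    audio_path: Optional[str],
    language: str,
    formality: str,
    email: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str], str],
    adjust: Callable[[str, str], str],
    send_email: Callable[[str, str], None],
) -> str:
    """Run the steps in order; every callable is a hypothetical stand-in."""
    # Transcribe only when a file was uploaded, otherwise use the typed message.
    text = transcribe(audio_path) if audio_path else message
    # Translate into the target language (llm-0).
    translated = translate(text, language)
    # Adjust the translation to the chosen formality (llm-1).
    final = adjust(translated, formality)
    # Email the result when an address was supplied (action-0).
    if email:
        send_email(email, final)
    # The returned text is what the output node would display (out-0).
    return final
```

Passing the steps in as callables keeps the sketch testable with simple stubs before any real model or mail service is attached.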

Visual Summary

Node Name                         | Purpose
----------------------------------|--------------------------------------------------
Original Message                  | User types message
Message File                      | User uploads audio file (optional)
Desired Language                  | User specifies target language
Formality Level                   | User specifies formality (Message, Email, Report)
Email Address                     | User provides email for delivery
Translation & Transcription (LLM) | Transcribes (if needed) and translates message
Formality (LLM)                   | Adjusts translation to match formality
Output                            | Displays final translation
Send Email                        | Emails the final translation to the user

In summary:
This workflow streamlines translation and formality adjustment, supports both text and audio input, and ensures the user receives the result both on-screen and via email.

Get started

Let’s Build AI Agents, Together

Book a demo to see how AI agents can help your team process unstructured documents and perform complex analysis faster and more accurately.
