How to build a Translation and Transcription Agent
An AI-powered workflow that translates and/or transcribes user messages or audio files into a chosen language and formality, then emails the polished result to the user.
Challenge
Manual translation and transcription of messages and audio files is slow, error-prone, and often lacks consistency in tone and formality.
Industry
Logistics
Department
Content Creation
Operations
Integrations
Gmail
OpenAI
Workflow Overview
This flow is designed to:
Accept a message (typed or as an audio file),
Translate it into a user-specified language,
Adjust the translation to a desired formality level (e.g., Message, Email, Report),
Display the result,
And optionally email the finished translation to the user.
Step-by-Step Node Breakdown
1. User Inputs
Original Message (in-0): Where the user types the message to be translated.
Message File (doc-0): Lets the user upload an audio file. The file will be transcribed and then translated.
Desired Language (in-1): User specifies the target language for translation (e.g., Danish, Spanish).
Formality Level (in-2): User selects the desired formality for the output (Message, Email, or Report).
Email Address (in-3): User provides their email address to receive the finished translation.
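As an illustration of the five inputs above, the following minimal Python sketch models them as one record and checks them before the flow runs. The class name `FlowInputs`, the function `validate_inputs`, and the specific validation rules are assumptions made for this example; they are not part of the workflow builder itself.

```python
from dataclasses import dataclass
from typing import List, Optional

# Formality options listed for node in-2.
ALLOWED_FORMALITY = ("Message", "Email", "Report")


@dataclass
class FlowInputs:
    """Hypothetical container mirroring the five input nodes."""
    original_message: str = ""          # in-0: typed message
    message_file: Optional[str] = None  # doc-0: path to an uploaded audio file
    desired_language: str = ""          # in-1: e.g. "Danish"
    formality_level: str = "Message"    # in-2: Message, Email, or Report
    email_address: str = ""             # in-3: delivery address (may be empty)


def validate_inputs(inputs: FlowInputs) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not inputs.original_message.strip() and not inputs.message_file:
        problems.append("Provide a typed message or an audio file.")
    if not inputs.desired_language.strip():
        problems.append("Desired language is required.")
    if inputs.formality_level not in ALLOWED_FORMALITY:
        problems.append("Formality must be one of: " + ", ".join(ALLOWED_FORMALITY))
    # Assumption: the email is only checked when the user supplied one.
    if inputs.email_address and "@" not in inputs.email_address:
        problems.append("Email address looks invalid.")
    return problems
```

Checking the inputs up front lets the interface report every missing field at once instead of failing partway through the flow.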
2. Translation and Transcription (llm-0)
This is a Large Language Model (LLM) node that:
Receives the original message, uploaded file (if any), target language, and formality level.
If a file is uploaded, it first transcribes the audio.
Translates the message (or transcription) into the specified language.
If the audio is a conversation, it summarizes the conversation instead of translating word-for-word.
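The behavior described for llm-0 can be sketched in Python as two small helpers: one that picks the text to translate, and one that assembles the prompt. This is a rough illustration, not the node's actual configuration. The function names, the prompt wording, and the rule that an uploaded file takes precedence over a typed message are all assumptions; `transcribe` stands in for whatever speech-to-text call the platform uses.

```python
from typing import Callable, Optional


def source_text(message: str,
                audio_path: Optional[str],
                transcribe: Callable[[str], str]) -> str:
    """Return the text to translate.

    Assumption: when an audio file is uploaded, its transcript is used
    in place of the typed message.
    """
    if audio_path:
        return transcribe(audio_path)
    return message


def build_translation_prompt(text: str, language: str) -> str:
    """Assemble a hypothetical prompt for the translation step."""
    return (
        f"Translate the following text into {language}.\n"
        "If the text is a transcript of a conversation between several "
        f"speakers, summarize the conversation in {language} instead of "
        "translating it word-for-word.\n\n"
        f"Text:\n{text}"
    )
```

Leaving the conversation check to the model, via the prompt, matches the description above, where the same node decides whether to translate or summarize.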
3. Formality Adjustment (llm-1)
Another LLM node that:
Takes the translated text from the previous step.
Adjusts the translation to match the user’s selected formality level (e.g., making it suitable for an email).
Outputs the final, polished translation.
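One way to picture the formality step is as a lookup from the selected level to a style instruction that is then placed in the prompt. The sketch below is illustrative only: the guidance strings and the name `build_formality_prompt` are assumptions for this example, and the real node may phrase its instructions differently.

```python
# Hypothetical style guidance for each formality level offered by in-2.
FORMALITY_GUIDANCE = {
    "Message": "short and conversational, as in a chat message",
    "Email": "polite and structured, with a greeting and a sign-off",
    "Report": "formal and impersonal, written in complete paragraphs",
}


def build_formality_prompt(translation: str, formality_level: str) -> str:
    """Build a prompt asking the model to restyle an existing translation."""
    if formality_level not in FORMALITY_GUIDANCE:
        raise ValueError(f"Unknown formality level: {formality_level!r}")
    style = FORMALITY_GUIDANCE[formality_level]
    return (
        f"Rewrite the text below so that it is {style}.\n"
        "Keep the text in its current language and preserve its meaning.\n\n"
        f"Text:\n{translation}"
    )
```

Keeping this step separate from translation means the tone can be changed without translating the text again.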
4. Output Display (out-0)
Shows the final, formality-adjusted translation to the user in the interface.
5. Send Email (action-0)
The Send Email node (action-0) uses the Gmail API to send the finished translation to the user’s provided email address.
The email includes:
Recipient: The email address from the user input.
Subject: "Your Finished Translation"
Content: The translated and formality-adjusted message.
This node can also support attachments and different content formats, but in this flow, it sends a simple text email.
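For readers curious what a simple text email looks like at the API level, the Gmail API's `users.messages.send` method accepts a message whose `raw` field holds a base64url-encoded RFC 2822 email. The sketch below builds such a payload with Python's standard library. The function name `build_gmail_payload` is an assumption for this example, and authentication and the actual send call are left out; the node handles those internally.

```python
import base64
from email.message import EmailMessage


def build_gmail_payload(recipient: str,
                        body: str,
                        subject: str = "Your Finished Translation") -> dict:
    """Return a dict with the email encoded in the `raw` field."""
    msg = EmailMessage()
    msg["To"] = recipient        # email address from in-3
    msg["Subject"] = subject     # subject line used by this flow
    msg.set_content(body)        # plain-text body: the final translation
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw}
```

The resulting dictionary is what a Gmail client would pass as the request body when sending on behalf of the authenticated account.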
How the Flow Connects
User provides inputs (message, file, language, formality, email).
Translation and Transcription LLM processes the message and/or file, translates, and outputs the result.
Formality LLM takes the translation and adjusts it to the desired formality.
Output node displays the final result.
Send Email action emails the final result to the user.
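The connection order above can be summarized as a short pipeline. In this sketch each node is a plain callable passed in as an argument, so no real service is invoked. The function `run_flow` and its parameter names are assumptions for illustration, as is the choice to skip the email step when no address is given.

```python
from typing import Callable, Optional


def run_flow(message: str,
             audio_path: Optional[str],
             language: str,
             formality: str,
             email: str,
             *,
             transcribe: Callable[[str], str],
             translate: Callable[[str, str], str],
             adjust_formality: Callable[[str, str], str],
             send_email: Callable[[str, str], None]) -> str:
    """Run the nodes in order and return the text shown by the output node."""
    # llm-0: transcribe the upload if there is one, then translate.
    text = transcribe(audio_path) if audio_path else message
    translation = translate(text, language)
    # llm-1: restyle the translation to the selected formality.
    final = adjust_formality(translation, formality)
    # action-0: email the result (assumed to be skipped without an address).
    if email:
        send_email(email, final)
    # out-0: the caller displays this value.
    return final
```

Passing the steps in as callables keeps the wiring visible and makes it easy to try the flow with stand-in functions before connecting real services.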
Visual Summary
| Node Name | Purpose |
|---|---|
| Original Message | User types message |
| Message File | User uploads audio file (optional) |
| Desired Language | User specifies target language |
| Formality Level | User specifies formality (Message, Email, Report) |
| Email Address | User provides email for delivery |
| Translation & Transcription (LLM) | Transcribes (if needed) and translates message |
| Formality (LLM) | Adjusts translation to match formality |
| Output | Displays final translation |
| Send Email | Emails the final translation to the user |
In summary:
This workflow streamlines translation and formality adjustment, supports both text and audio input, and ensures the user receives the result both on-screen and via email.





