Transcribe audio using AssemblyAI with Semantic Kernel plugins.
Add the AssemblyAI.SemanticKernel NuGet package to your project.
dotnet add package AssemblyAI.SemanticKernel
Next, register the AssemblyAI plugin into your kernel:
using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;
// Build your kernel
var kernelBuilder = Kernel.CreateBuilder();
// add services like LLMs etc.
// Get AssemblyAI API key from env variables, or much better, from .NET configuration
string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
    ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
{
    ApiKey = apiKey,
    PluginName = null,
    AllowFileSystemAccess = false
});
var kernel = kernelBuilder.Build();
You can configure three options:
- ApiKey: Configure the AssemblyAI API key
- PluginName: Configure the name of the plugin inside of Semantic Kernel. Defaults to "AssemblyAIPlugin".
- AllowFileSystemAccess: Allow the plugin to read files from the file system to upload audio files for transcriptions. Defaults to false.
kernelBuilder.AddAssemblyAIPlugin has overloads to configure the plugin using configuration and through a lambda.
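The comment in the registration code recommends .NET configuration over reading the environment variable directly. The following is a minimal sketch of that idea, assuming the Microsoft.Extensions.Configuration.Json and Microsoft.Extensions.Configuration.EnvironmentVariables packages are installed. The setting name "AssemblyAI:ApiKey" is a hypothetical choice for illustration, not a name defined by the plugin, and the sketch passes the value through the options object shown earlier rather than through the configuration overloads.

```csharp
using System;
using AssemblyAI.SemanticKernel;
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;

// Hypothetical setting name chosen for this sketch.
// As an environment variable it is written AssemblyAI__ApiKey (double underscore).
const string ApiKeySetting = "AssemblyAI:ApiKey";

// Later sources override earlier ones: appsettings.json first, then env variables.
IConfiguration config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true)
    .AddEnvironmentVariables()
    .Build();

string apiKey = config[ApiKeySetting]
    ?? throw new InvalidOperationException($"{ApiKeySetting} is not configured.");

var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
{
    ApiKey = apiKey,
    AllowFileSystemAccess = false
});
var kernel = kernelBuilder.Build();
```

Keeping the key in configuration makes it easier to switch between local secrets and deployment settings without changing code.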
Get the Transcribe function from the transcript plugin and invoke it, passing the audio file URL in the INPUT argument.
var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin),
    AssemblyAIPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a"
    }
);
Console.WriteLine(result);
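The call above looks up the function by plugin and function name in one step. As an illustrative variant, the sketch below first resolves the function from kernel.Plugins and fails with a clear message if the plugin was never registered. TranscribeAsync is a hypothetical helper name, and the lookup assumes the plugin was registered under its default name; if you set PluginName, use that name instead of nameof(AssemblyAIPlugin).

```csharp
using System;
using System.Threading.Tasks;
using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

// Usage, assuming `kernel` was built with the AssemblyAI plugin registered:
//   string? transcript = await TranscribeAsync(kernel, "https://storage.googleapis.com/aai-docs-samples/espn.m4a");

// Hypothetical helper: resolves the Transcribe function, then invokes it.
static async Task<string?> TranscribeAsync(Kernel kernel, string audio)
{
    // Assumes the default plugin name; adjust if PluginName was customized.
    if (!kernel.Plugins.TryGetFunction(
            nameof(AssemblyAIPlugin),
            AssemblyAIPlugin.TranscribeFunctionName,
            out KernelFunction? transcribe))
    {
        throw new InvalidOperationException(
            "The AssemblyAI plugin is not registered in this kernel.");
    }

    return await kernel.InvokeAsync<string>(
        transcribe,
        new KernelArguments { ["INPUT"] = audio });
}
```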
You can also upload local audio and video files. To do this:
- Set AssemblyAIPluginOptions.AllowFileSystemAccess to true.
- Configure the INPUT variable with a local file path.
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
{
    ApiKey = apiKey,
    AllowFileSystemAccess = true
});
...
var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin),
    AssemblyAIPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "./espn.m4a"
    }
);
Console.WriteLine(result);
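A relative path such as "./espn.m4a" is resolved against the process's current working directory, which may differ from the project folder. The sketch below is one possible guard using only standard .NET APIs: it resolves the path and checks that the file exists before the path is handed to the plugin. ResolveAudioPath is a hypothetical helper, not part of the AssemblyAI package, and it assumes the plugin accepts absolute paths in the same way as relative ones.

```csharp
using System.IO;

// Usage: pass the resolved path as the INPUT argument, for example
//   ["INPUT"] = ResolveAudioPath("./espn.m4a")

// Hypothetical helper: returns the absolute path, or throws if the file is missing.
static string ResolveAudioPath(string path)
{
    // Relative paths are resolved against the current working directory.
    string fullPath = Path.GetFullPath(path);
    if (!File.Exists(fullPath))
    {
        throw new FileNotFoundException($"Audio file not found: {fullPath}", fullPath);
    }
    return fullPath;
}
```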
You can also invoke the function from within a prompt (semantic function) like this:
const string prompt = """
Here is a transcript:
{{AssemblyAIPlugin.Transcribe "https://storage.googleapis.com/aai-docs-samples/espn.m4a"}}
---
Summarize the transcript.
""";
var result = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine(result);
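The prompt above hard-codes the audio URL. As a sketch of a reusable variant, the template below takes the URL from a prompt variable instead, using Semantic Kernel's $variable template syntax. The variable name audioUrl and the helper SummarizeAudioAsync are hypothetical names chosen for this example, and the kernel is assumed to have both an LLM service and the AssemblyAI plugin (under its default name) registered.

```csharp
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

// Same prompt as above, with the URL supplied through the $audioUrl variable.
const string SummaryPrompt = """
    Here is a transcript:
    {{AssemblyAIPlugin.Transcribe $audioUrl}}
    ---
    Summarize the transcript.
    """;

// Usage, assuming `kernel` is already configured:
//   string? summary = await SummarizeAudioAsync(kernel, "https://storage.googleapis.com/aai-docs-samples/espn.m4a");

// Hypothetical helper: renders the prompt with the given URL and returns the summary.
static async Task<string?> SummarizeAudioAsync(Kernel kernel, string audioUrl)
{
    return await kernel.InvokePromptAsync<string>(
        SummaryPrompt,
        new KernelArguments { ["audioUrl"] = audioUrl });
}
```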
All the code above explicitly invokes the transcript plugin, but it can also be invoked as part of a plan. Check out the sample project, which uses a plan to transcribe an audio file in addition to explicit invocation.
- The AssemblyAI integration only supports Semantic Kernel with .NET at this moment. If there's demand, we will extend support to other platforms, so let us know!
- Feel free to file an issue in case of bugs or feature requests.