This post is part of the Voice Assistant on Raspberry Pi series.
Article #2 ended with a hardcoded response, which was enough to confirm the audio pipeline works. Now we swap that line for a real HTTP call to Ollama on the pi-cerveau, and add a system prompt to give the assistant a personality and some knowledge about your home.
The complete code for this article is available on GitHub.
[GPIO Button] → [arecord] → [Whisper] → [Ollama on pi-cerveau] → [Piper TTS] → [aplay]
Step 1: The Ollama service
Create Services/OllamaService.cs:
using System.Net.Http.Json;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface ILlmService
{
    Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken);
}

public class OllamaService : ILlmService
{
    private readonly HttpClient _http;
    private readonly AssistantOptions _options;
    private readonly ILogger<OllamaService> _logger;

    public OllamaService(HttpClient http, IOptions<AssistantOptions> options, ILogger<OllamaService> logger)
    {
        _http = http;
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Sending to LLM: \"{Prompt}\"", prompt);

        var request = new
        {
            model = _options.OllamaModel,
            prompt = prompt,
            stream = false
        };

        var response = await _http.PostAsJsonAsync(
            $"{_options.OllamaBaseUrl}/api/generate",
            request,
            cancellationToken);
        response.EnsureSuccessStatusCode();

        var result = await response.Content.ReadFromJsonAsync<OllamaResponse>(
            cancellationToken: cancellationToken);
        var text = result?.Response?.Trim() ?? "I don't have a response.";

        _logger.LogInformation("LLM response: \"{Text}\"", text);
        return text;
    }
}

internal record OllamaResponse(string Response);
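The record needs no [JsonPropertyName] attribute because ReadFromJsonAsync falls back to the web serializer defaults when no options are passed, and those match property names case-insensitively. A minimal sketch of that mapping, using a hand-written sample payload (the values are made up, not real Ollama output):

```csharp
using System;
using System.Text.Json;

// Sample payload shaped like a non-streaming /api/generate reply (values are made up).
var json = """{"model":"llama3.2:3b","response":" Bonjour! ","done":true}""";

// JsonSerializerDefaults.Web is what ReadFromJsonAsync uses when no options are given,
// so the lowercase "response" key binds to the Response property.
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
var result = JsonSerializer.Deserialize<OllamaResponse>(json, options);

// Same fallback logic as OllamaService.GenerateAsync.
var text = result?.Response?.Trim() ?? "I don't have a response.";
Console.WriteLine(text); // prints "Bonjour!"

internal record OllamaResponse(string Response);
```

Fields the record does not declare (model, done) are simply ignored, so the type can stay this small.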
Step 2: The home context
The system prompt is what gives the assistant its personality and background knowledge. Storing it in appsettings.json means you can tweak it without recompiling.
Create Services/ContextService.cs:
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IContextService
{
    string BuildPrompt(string userInput);
}

public class ContextService : IContextService
{
    private readonly AssistantOptions _options;

    public ContextService(IOptions<AssistantOptions> options)
    {
        _options = options.Value;
    }

    public string BuildPrompt(string userInput)
    {
        return _options.SystemPrompt
            + "\n\nUser question: " + userInput
            + "\n\nAnswer in French, concisely (2-3 sentences max). No markdown formatting.";
    }
}
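To see what the model actually receives, here is a small sketch that reproduces the same concatenation outside the DI container. The systemPrompt value and the free-standing BuildPrompt function are stand-ins for illustration, not part of the project:

```csharp
using System;

// Hypothetical stand-in for AssistantOptions.SystemPrompt.
var systemPrompt = "You are a personal voice assistant named Alex.";

var prompt = BuildPrompt(systemPrompt, "Quel temps fait-il?");
Console.WriteLine(prompt);

// Same concatenation as ContextService.BuildPrompt, written as a free function
// so it runs without the options plumbing.
static string BuildPrompt(string system, string userInput) =>
    system
    + "\n\nUser question: " + userInput
    + "\n\nAnswer in French, concisely (2-3 sentences max). No markdown formatting.";
```

The output is three blocks separated by blank lines: the system prompt, the transcribed question, and the formatting instruction. Printing it once is a quick way to confirm the configuration was picked up.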
Step 3: Update AssistantOptions
Add the new properties to AssistantOptions.cs:
namespace AudioAssistant;

public class AssistantOptions
{
    public int GpioButtonPin { get; set; } = 17;
    public string AudioDevice { get; set; } = "hw:1,0";
    public int RecordingDurationSeconds { get; set; } = 10;
    public string WhisperModel { get; set; } = "ggml-base.bin";
    public string PiperBinary { get; set; } = "/home/gabriel/piper/piper/piper";
    public string PiperVoice { get; set; } = "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx";
    public string AudioOutputDevice { get; set; } = "hw:2,0";

    // Ollama
    public string OllamaBaseUrl { get; set; } = "http://pi-cerveau.local:11434";
    public string OllamaModel { get; set; } = "llama3.2:3b";

    // Context
    public string SystemPrompt { get; set; } = "";
}
Step 4: Update appsettings.json
{
  "Assistant": {
    "GpioButtonPin": 17,
    "AudioDevice": "hw:3,0",
    "RecordingDurationSeconds": 10,
    "WhisperModel": "ggml-base.bin",
    "PiperBinary": "/home/gabriel/piper/piper/piper",
    "PiperVoice": "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx",
    "AudioOutputDevice": "hw:3,0",
    "OllamaBaseUrl": "http://pi-cerveau.local:11434",
    "OllamaModel": "llama3.2:3b",
    "SystemPrompt": "You are a personal voice assistant named Alex. You help the Mongeon family living in Blainville, Quebec, Canada. You always respond in French, naturally and warmly. You know the family members: Gabriel (the father, .NET developer) and his family. Local weather is for Blainville in the Laurentians. Be concise: 1 to 3 sentences max, as your responses will be read aloud."
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information"
    }
  }
}
Customize the SystemPrompt with your family members’ names, preferences, and daily routines. The more specific the context, the smarter the assistant feels without needing a bigger model.
Step 5: NuGet package and DI registration
AddHttpClient requires an extra package:
dotnet add package Microsoft.Extensions.Http
Update Program.cs:
using AudioAssistant;
using AudioAssistant.Services;

var builder = Host.CreateApplicationBuilder(args);

builder.Services.Configure<AssistantOptions>(
    builder.Configuration.GetSection("Assistant"));

builder.Services.AddSingleton<IGpioService, GpioService>();
builder.Services.AddSingleton<IAudioRecorderService, AudioRecorderService>();
builder.Services.AddSingleton<ITranscriptionService, WhisperTranscriptionService>();
builder.Services.AddSingleton<ISpeechService, PiperSpeechService>();
builder.Services.AddSingleton<IContextService, ContextService>();

// AddHttpClient handles connection pooling and avoids stale DNS
builder.Services.AddHttpClient<ILlmService, OllamaService>(client =>
{
    client.Timeout = TimeSpan.FromSeconds(60);
});

builder.Services.AddHostedService<Worker>();

var host = builder.Build();
host.Run();
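With a 60-second timeout configured, a generation that runs too long surfaces as an exception rather than a response. On .NET 5 and later, HttpClient reports its own timeout as a TaskCanceledException whose InnerException is a TimeoutException, which makes it possible to tell a slow model apart from a service shutdown. A minimal sketch of that distinction, where SlowHandler is a hypothetical handler that simulates a stalled server (no network traffic happens):

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Short timeout so the demo finishes quickly; the article configures 60 seconds.
using var client = new HttpClient(new SlowHandler()) { Timeout = TimeSpan.FromMilliseconds(200) };

var outcome = "completed";
try
{
    await client.GetAsync("http://pi-cerveau.local:11434/api/tags");
}
catch (TaskCanceledException ex) when (ex.InnerException is TimeoutException)
{
    // HttpClient.Timeout elapsed, as opposed to the caller cancelling the token.
    outcome = "timeout";
}
Console.WriteLine(outcome);

// Hypothetical handler that never answers in time.
class SlowHandler : HttpMessageHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}
```

In the assistant, a filter like this could let the spoken error message say that the model took too long instead of reporting a generic failure.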
Step 6: Update Worker.cs
The only real change from article #2: swap the hardcoded response for _context.BuildPrompt() + _llm.GenerateAsync(). The temp-file cleanup also moves into a finally block so the recording is deleted even when a step throws.
using AudioAssistant.Services;

namespace AudioAssistant;

public class Worker : BackgroundService
{
    private readonly IGpioService _gpio;
    private readonly IAudioRecorderService _recorder;
    private readonly ITranscriptionService _transcription;
    private readonly ILlmService _llm;
    private readonly IContextService _context;
    private readonly ISpeechService _speech;
    private readonly ILogger<Worker> _logger;

    public Worker(
        IGpioService gpio,
        IAudioRecorderService recorder,
        ITranscriptionService transcription,
        ILlmService llm,
        IContextService context,
        ISpeechService speech,
        ILogger<Worker> logger)
    {
        _gpio = gpio;
        _recorder = recorder;
        _transcription = transcription;
        _llm = llm;
        _context = context;
        _speech = speech;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _logger.LogInformation("Assistant started. Press the button to speak.");

        while (!stoppingToken.IsCancellationRequested)
        {
            _gpio.WaitForButtonPress(stoppingToken);
            if (stoppingToken.IsCancellationRequested) break;

            string? audioFile = null;
            try
            {
                audioFile = await _recorder.RecordAsync(stoppingToken);
                var text = await _transcription.TranscribeAsync(audioFile, stoppingToken);

                if (string.IsNullOrWhiteSpace(text))
                {
                    await _speech.SpeakAsync("Je n'ai pas bien entendu. Peux-tu répéter?", stoppingToken);
                }
                else
                {
                    var prompt = _context.BuildPrompt(text);
                    var response = await _llm.GenerateAsync(prompt, stoppingToken);
                    await _speech.SpeakAsync(response, stoppingToken);
                }
            }
            catch (Exception ex) when (!stoppingToken.IsCancellationRequested)
            {
                _logger.LogError(ex, "Error in pipeline");
                await _speech.SpeakAsync("Une erreur s'est produite.", stoppingToken);
            }
            finally
            {
                // Delete the temp recording whether the pipeline succeeded or threw
                if (audioFile is not null && File.Exists(audioFile))
                    File.Delete(audioFile);
            }
        }
    }
}
Two practical fixes discovered during testing
Issue 1: the mic records at 48 kHz stereo, Whisper expects 16 kHz mono.
Some USB adapters force 48000 Hz stereo regardless of the arecord format flag. Whisper.net needs 16000 Hz mono. The fix is an ffmpeg conversion step after recording.
Replace Services/AudioRecorderService.cs:
using System.Diagnostics;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IAudioRecorderService
{
    Task<string> RecordAsync(CancellationToken cancellationToken);
}

public class AudioRecorderService : IAudioRecorderService
{
    private readonly AssistantOptions _options;
    private readonly ILogger<AudioRecorderService> _logger;

    public AudioRecorderService(IOptions<AssistantOptions> options, ILogger<AudioRecorderService> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> RecordAsync(CancellationToken cancellationToken)
    {
        var rawFile = Path.Combine(Path.GetTempPath(), $"audio_raw_{Guid.NewGuid()}.wav");
        var outputFile = Path.Combine(Path.GetTempPath(), $"audio_{Guid.NewGuid()}.wav");

        _logger.LogInformation("Recording started ({Duration}s)...", _options.RecordingDurationSeconds);

        try
        {
            // Record in the adapter's native format (48000 Hz stereo); -r makes the rate explicit
            await RunAsync("arecord",
                $"-D {_options.AudioDevice} -f S16_LE -r 48000 -c 2 -t wav -d {_options.RecordingDurationSeconds} {rawFile}",
                cancellationToken);

            // Convert to 16000 Hz mono for Whisper; -loglevel error keeps stderr short
            await RunAsync("ffmpeg",
                $"-y -loglevel error -i {rawFile} -ar 16000 -ac 1 {outputFile}",
                cancellationToken);
        }
        finally
        {
            // The raw capture is no longer needed, even if a step failed
            if (File.Exists(rawFile))
                File.Delete(rawFile);
        }

        _logger.LogInformation("Recording done: {File}", outputFile);
        return outputFile;
    }

    // Runs an external tool, drains its stderr so a full pipe cannot stall it,
    // and throws when the exit code is non-zero
    private static async Task RunAsync(string fileName, string arguments, CancellationToken cancellationToken)
    {
        var psi = new ProcessStartInfo
        {
            FileName = fileName,
            Arguments = arguments,
            RedirectStandardError = true,
            UseShellExecute = false
        };

        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException($"Could not start {fileName}");
        var stderr = await process.StandardError.ReadToEndAsync(cancellationToken);
        await process.WaitForExitAsync(cancellationToken);

        if (process.ExitCode != 0)
            throw new InvalidOperationException($"{fileName} exited with code {process.ExitCode}: {stderr}");
    }
}
ffmpeg was already installed in article #1. The raw file is deleted after conversion.
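If Whisper still returns empty text after this change, it can help to confirm that the converted file really is 16 kHz mono. The sketch below reads the channel count and sample rate from a WAV header. It assumes the "fmt " chunk directly follows the RIFF/WAVE header, which is the common layout but not guaranteed for every file, and ReadWavFormat is a hypothetical helper, not part of the project. A synthetic header is used here so the example runs without a recording:

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Text;

// Build a synthetic 44-byte header for 16 kHz mono PCM to exercise the parser.
var header = new byte[44];
Encoding.ASCII.GetBytes("RIFF").CopyTo(header, 0);
Encoding.ASCII.GetBytes("WAVE").CopyTo(header, 8);
Encoding.ASCII.GetBytes("fmt ").CopyTo(header, 12);
BinaryPrimitives.WriteUInt16LittleEndian(header.AsSpan(22), 1);     // channel count
BinaryPrimitives.WriteUInt32LittleEndian(header.AsSpan(24), 16000); // sample rate

var (channels, sampleRate) = ReadWavFormat(header);
Console.WriteLine($"{sampleRate} Hz, {channels} channel(s)"); // prints "16000 Hz, 1 channel(s)"

// Reads channel count (offset 22) and sample rate (offset 24), assuming the
// "fmt " chunk comes immediately after the 12-byte RIFF/WAVE header.
static (int Channels, int SampleRate) ReadWavFormat(ReadOnlySpan<byte> bytes)
{
    if (bytes.Length < 28
        || Encoding.ASCII.GetString(bytes.Slice(0, 4)) != "RIFF"
        || Encoding.ASCII.GetString(bytes.Slice(8, 4)) != "WAVE"
        || Encoding.ASCII.GetString(bytes.Slice(12, 4)) != "fmt ")
    {
        throw new InvalidDataException("Not a WAV header in the expected layout.");
    }

    int channelCount = BinaryPrimitives.ReadUInt16LittleEndian(bytes.Slice(22, 2));
    int rate = (int)BinaryPrimitives.ReadUInt32LittleEndian(bytes.Slice(24, 4));
    return (channelCount, rate);
}
```

Against a real recording, the first 44 bytes of the file returned by RecordAsync would take the place of the synthetic header.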
Issue 2: aplay fails if the USB adapter expects stereo output.
Update the ProcessStartInfo in Services/PiperSpeechService.cs to add -c 2:
var piperPsi = new ProcessStartInfo
{
    FileName = _options.PiperBinary,
    Arguments = $"--model {_options.PiperVoice} --output_raw",
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    UseShellExecute = false
};

var aplayPsi = new ProcessStartInfo
{
    FileName = "aplay",
    Arguments = $"-D {_options.AudioOutputDevice} -r 22050 -f S16_LE -c 2 -t raw -",
    RedirectStandardInput = true,
    UseShellExecute = false
};
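If you’re using the 3.5mm jack instead of a USB adapter, remove -c 2 and keep mono. Keep in mind that Piper's raw output is mono, and -c 2 makes aplay read the stream as interleaved stereo frames. If speech plays back too fast or high-pitched, keep mono and point AudioOutputDevice at a plughw: device (for example plughw:3,0) so ALSA performs the channel conversion.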
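To make the two channel layouts concrete: a mono S16_LE stream carries one 2-byte sample per frame, while a stereo stream carries a left/right pair per frame. The sketch below shows the duplication that turns the first into the second, in case a stereo-only device has to be fed from code. MonoToStereo is a hypothetical helper for illustration, not part of the article's repository:

```csharp
using System;

// Two mono S16_LE samples, 0x0001 and 0x0002, as little-endian byte pairs.
byte[] mono = { 0x01, 0x00, 0x02, 0x00 };
byte[] stereo = MonoToStereo(mono);
Console.WriteLine(BitConverter.ToString(stereo)); // prints "01-00-01-00-02-00-02-00"

// Duplicates every 16-bit mono sample into a left/right pair.
static byte[] MonoToStereo(ReadOnlySpan<byte> monoPcm)
{
    if (monoPcm.Length % 2 != 0)
        throw new ArgumentException("Expected whole 16-bit samples.", nameof(monoPcm));

    var result = new byte[monoPcm.Length * 2];
    for (int i = 0; i < monoPcm.Length; i += 2)
    {
        int o = i * 2;
        result[o] = monoPcm[i];         // left channel, low byte
        result[o + 1] = monoPcm[i + 1]; // left channel, high byte
        result[o + 2] = monoPcm[i];     // right channel, low byte
        result[o + 3] = monoPcm[i + 1]; // right channel, high byte
    }
    return result;
}
```

The stereo buffer is twice the size of the mono one, which is why a mono stream read as stereo is consumed twice as fast.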
Validate Ollama before running the full pipeline
Before dotnet run, confirm the pi-cerveau is responding:
# Check connectivity
curl http://pi-cerveau.local:11434/api/tags
# Test a generation
curl -X POST http://pi-cerveau.local:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2:3b","prompt":"Say hello in one sentence.","stream":false}'
If you see a JSON response with a response field, Ollama is ready. Then:
cd ~/projects/AudioAssistant
dotnet build
dotnet run
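The same connectivity check can also be done from code, for example to log a clear message at startup when the configured model has not been pulled yet. The sketch below parses a hand-written sample shaped like an /api/tags reply (trimmed to the name field) and looks for a model by name. HasModel is a hypothetical helper, and in the assistant the JSON would come from an HTTP GET rather than a string literal:

```csharp
using System;
using System.Text.Json;

// Hand-written sample shaped like an /api/tags reply; real replies carry more fields.
var tagsJson = """{"models":[{"name":"llama3.2:3b"},{"name":"mistral:7b"}]}""";

Console.WriteLine(HasModel(tagsJson, "llama3.2:3b")); // prints "True"
Console.WriteLine(HasModel(tagsJson, "phi3:mini"));   // prints "False"

// Returns true when the "models" array contains an entry with the given name.
static bool HasModel(string json, string model)
{
    using var doc = JsonDocument.Parse(json);
    if (!doc.RootElement.TryGetProperty("models", out var models)
        || models.ValueKind != JsonValueKind.Array)
    {
        return false;
    }

    foreach (var entry in models.EnumerateArray())
    {
        if (entry.ValueKind == JsonValueKind.Object
            && entry.TryGetProperty("name", out var name)
            && name.GetString() == model)
        {
            return true;
        }
    }
    return false;
}
```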
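Press the button and speak. The first response takes a few extra seconds while Ollama loads Llama 3.2 3B into memory. Subsequent ones are faster as long as the model stays loaded (by default, Ollama unloads a model after about five minutes of inactivity).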
No cloud calls, no data leaving the network. The Pi Client handles audio and transcription; the Pi Cerveau runs the LLM. Everything stays local.
Series articles
- Setting Up Both Raspberry Pis
- .NET 10 Worker Service and Audio Pipeline
- Ollama Integration and Home Context (this article)
- Memory, Silence Detection, and systemd
- Real-Time Weather and Swapping to the Claude API
- Function Calling: Teaching Tools to the Assistant
- Retrospective, Lessons Learned, and v2 Roadmap
In article #4, we add conversational memory, automatic silence detection, and a systemd unit so the assistant starts at boot.
This post was written with AI assistance and edited by me.