Memory, Silence Detection, and systemd

This post is part of the Voice Assistant on Raspberry Pi series.

The assistant from article #3 works, but every exchange starts from scratch. We fix that in three steps: conversational memory, automatic silence detection, and auto-start at boot with systemd.

The complete code for this article is available on GitHub.

Part 1: Conversational memory

How it works

Ollama supports a chat format with message history, similar to the OpenAI or Claude API. Instead of a plain prompt, you send a list of messages, each with a role and content: [{role, content}]. The endpoint itself is stateless: the client resends the full history on every turn, and the model uses it to generate a response that is coherent with previous turns.

Turn 1: [system] + [user: "What's your name?"]
Turn 2: [system] + [user: "What's your name?"] + [assistant: "My name is Alex."] + [user: "How old are you?"]
Turn 3: ...
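
To make the turn-by-turn growth concrete, here is a minimal sketch that builds the message list over two turns and prints the JSON body that would be posted to /api/chat. The ChatPayloadDemo class, the short system prompt, and the model name are stand-ins for illustration; only the ConversationMessage record mirrors the one defined in Step 1.1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Same shape as the ConversationMessage record from Step 1.1.
public record ConversationMessage(string Role, string Content);

// Hypothetical demo class, not part of the project.
public static class ChatPayloadDemo
{
    // Serializes the full history into the body of one /api/chat request.
    public static string BuildBody(string model, IEnumerable<ConversationMessage> history) =>
        JsonSerializer.Serialize(new
        {
            model,
            messages = history.Select(m => new { role = m.Role, content = m.Content }),
            stream = false
        });

    public static void Main()
    {
        var history = new List<ConversationMessage>
        {
            new("system", "You are Alex."),
            new("user", "What's your name?")
        };
        // Turn 1: the request carries 2 messages.
        Console.WriteLine(BuildBody("llama3.2:1b", history));

        // The reply and the next question are appended, then everything is sent again.
        history.Add(new("assistant", "My name is Alex."));
        history.Add(new("user", "How old are you?"));
        // Turn 2: the request carries 4 messages.
        Console.WriteLine(BuildBody("llama3.2:1b", history));
    }
}
```

Each request is larger than the previous one, which is why the history eventually needs a cap.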

Step 1.1: Update OllamaService

Switch from /api/generate to /api/chat, the Ollama endpoint that supports message history. Replace Services/OllamaService.cs:

using System.Net.Http.Json;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface ILlmService
{
    Task<string> ChatAsync(List<ConversationMessage> history, CancellationToken cancellationToken);
}

public record ConversationMessage(string Role, string Content);

public class OllamaService : ILlmService
{
    private readonly HttpClient _http;
    private readonly AssistantOptions _options;
    private readonly ILogger<OllamaService> _logger;

    public OllamaService(HttpClient http, IOptions<AssistantOptions> options, ILogger<OllamaService> logger)
    {
        _http = http;
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> ChatAsync(List<ConversationMessage> history, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Sending to LLM ({Count} messages)...", history.Count);

        var request = new
        {
            model = _options.OllamaModel,
            messages = history.Select(m => new { role = m.Role, content = m.Content }),
            stream = false
        };

        var response = await _http.PostAsJsonAsync(
            $"{_options.OllamaBaseUrl}/api/chat",
            request,
            cancellationToken);

        response.EnsureSuccessStatusCode();

        var result = await response.Content.ReadFromJsonAsync<OllamaChatResponse>(
            cancellationToken: cancellationToken);

        var text = result?.Message?.Content?.Trim() ?? "I don't have a response.";
        _logger.LogInformation("LLM response: \"{Text}\"", text);
        return text;
    }
}

internal record OllamaChatMessage(string Role, string Content);
internal record OllamaChatResponse(OllamaChatMessage Message);

Step 1.2: Update ContextService

ContextService becomes a conversation manager that maintains history. Replace Services/ContextService.cs:

using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IContextService
{
    List<ConversationMessage> AddUserMessage(string userInput);
    void AddAssistantMessage(string response);
    void Reset();
}

public class ContextService : IContextService
{
    private readonly AssistantOptions _options;
    private readonly List<ConversationMessage> _history = new();
    private readonly ILogger<ContextService> _logger;

    public ContextService(IOptions<AssistantOptions> options, ILogger<ContextService> logger)
    {
        _options = options.Value;
        _logger = logger;
        _history.Add(new ConversationMessage("system", _options.SystemPrompt));
    }

    public List<ConversationMessage> AddUserMessage(string userInput)
    {
        _history.Add(new ConversationMessage("user", userInput));
        _logger.LogInformation("History: {Count} messages", _history.Count);
        return _history;
    }

    public void AddAssistantMessage(string response)
    {
        _history.Add(new ConversationMessage("assistant", response));
    }

    public void Reset()
    {
        _history.Clear();
        _history.Add(new ConversationMessage("system", _options.SystemPrompt));
        _logger.LogInformation("History reset.");
    }
}

History grows with every exchange. If you notice incoherent responses after several turns, call Reset() manually, or add a check on MaxConversationTurns in AddAssistantMessage.
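
One possible shape for that check, as a sketch rather than the project's actual implementation: keep the system prompt and drop the oldest exchanges once the cap is exceeded. HistoryTrimmer is a hypothetical helper, and it assumes the history strictly alternates user and assistant messages after the system prompt.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the ConversationMessage record from Step 1.1.
public record ConversationMessage(string Role, string Content);

// Hypothetical helper, not part of the project.
public static class HistoryTrimmer
{
    // Keeps the system prompt at index 0 and removes the oldest messages
    // until at most maxTurns user/assistant exchanges remain.
    // Assumption: messages after the system prompt come in (user, assistant) pairs.
    public static void Trim(List<ConversationMessage> history, int maxTurns)
    {
        var maxMessages = 1 + maxTurns * 2; // system prompt + two messages per turn
        var excess = history.Count - maxMessages;
        if (excess > 0)
            history.RemoveRange(1, excess);
    }

    public static void Main()
    {
        var history = new List<ConversationMessage> { new("system", "prompt") };
        for (var i = 1; i <= 3; i++)
        {
            history.Add(new("user", $"q{i}"));
            history.Add(new("assistant", $"a{i}"));
        }

        Trim(history, maxTurns: 2);
        foreach (var message in history)
            Console.WriteLine($"{message.Role}: {message.Content}");
    }
}
```

In ContextService, a call such as HistoryTrimmer.Trim(_history, _options.MaxConversationTurns) at the end of AddAssistantMessage would apply the cap after every exchange.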

Step 1.3: Add MaxConversationTurns to AssistantOptions

namespace AudioAssistant;

public class AssistantOptions
{
    public int GpioButtonPin { get; set; } = 17;
    public string AudioDevice { get; set; } = "hw:3,0";
    public int RecordingDurationSeconds { get; set; } = 10;
    public string WhisperModel { get; set; } = "ggml-base.bin";
    public string PiperBinary { get; set; } = "/home/gabriel/piper/piper/piper";
    public string PiperVoice { get; set; } = "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx";
    public string AudioOutputDevice { get; set; } = "hw:3,0";
    public string OllamaBaseUrl { get; set; } = "http://pi-cerveau.local:11434";
    public string OllamaModel { get; set; } = "llama3.2:3b";
    public string SystemPrompt { get; set; } = "";
    public int MaxConversationTurns { get; set; } = 10;
}

Step 1.4: Update Worker.cs

using AudioAssistant.Services;

namespace AudioAssistant;

public class Worker : BackgroundService
{
    private readonly IGpioService _gpio;
    private readonly IAudioRecorderService _recorder;
    private readonly ITranscriptionService _transcription;
    private readonly ILlmService _llm;
    private readonly IContextService _context;
    private readonly ISpeechService _speech;
    private readonly ILogger<Worker> _logger;

    public Worker(
        IGpioService gpio,
        IAudioRecorderService recorder,
        ITranscriptionService transcription,
        ILlmService llm,
        IContextService context,
        ISpeechService speech,
        ILogger<Worker> logger)
    {
        _gpio = gpio;
        _recorder = recorder;
        _transcription = transcription;
        _llm = llm;
        _context = context;
        _speech = speech;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _logger.LogInformation("Assistant started. Press the button to speak.");

        while (!stoppingToken.IsCancellationRequested)
        {
            _gpio.WaitForButtonPress(stoppingToken);
            if (stoppingToken.IsCancellationRequested) break;

            try
            {
                var audioFile = await _recorder.RecordAsync(stoppingToken);
                var text = await _transcription.TranscribeAsync(audioFile, stoppingToken);

                if (string.IsNullOrWhiteSpace(text))
                {
                    await _speech.SpeakAsync("Je n'ai pas bien entendu. Peux-tu répéter?", stoppingToken);
                }
                else
                {
                    var history = _context.AddUserMessage(text);
                    var response = await _llm.ChatAsync(history, stoppingToken);
                    _context.AddAssistantMessage(response);
                    await _speech.SpeakAsync(response, stoppingToken);
                }

                if (File.Exists(audioFile))
                    File.Delete(audioFile);
            }
            catch (Exception ex) when (!stoppingToken.IsCancellationRequested)
            {
                _logger.LogError(ex, "Error in pipeline");
                await _speech.SpeakAsync("Une erreur s'est produite.", stoppingToken);
            }
        }
    }
}

Step 1.5: Update PiperSpeechService

The direct pipe from Piper to aplay caused a rate mismatch with the USB adapter. The fix is to use intermediate files: Piper generates a 22050 Hz WAV, ffmpeg resamples to 48000 Hz stereo, and aplay plays the result. The finally block ensures temp files are always cleaned up.

Replace Services/PiperSpeechService.cs:

using System.Diagnostics;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface ISpeechService
{
    Task SpeakAsync(string text, CancellationToken cancellationToken);
}

public class PiperSpeechService : ISpeechService
{
    private readonly AssistantOptions _options;
    private readonly ILogger<PiperSpeechService> _logger;

    public PiperSpeechService(IOptions<AssistantOptions> options, ILogger<PiperSpeechService> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task SpeakAsync(string text, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Speaking: \"{Text}\"", text);

        var piperFile = Path.Combine(Path.GetTempPath(), $"tts_{Guid.NewGuid()}.wav");
        var resampledFile = Path.Combine(Path.GetTempPath(), $"tts_resampled_{Guid.NewGuid()}.wav");

        try
        {
            // 1. Piper generates a 22050 Hz mono WAV
            var piperPsi = new ProcessStartInfo
            {
                FileName = _options.PiperBinary,
                Arguments = $"--model {_options.PiperVoice} --output_file {piperFile}",
                RedirectStandardInput = true,
                UseShellExecute = false
            };

            using var piper = Process.Start(piperPsi)!;
            await piper.StandardInput.WriteLineAsync(text);
            piper.StandardInput.Close();
            await piper.WaitForExitAsync(cancellationToken);

            // 2. ffmpeg resamples to 48000 Hz stereo for the USB adapter
            var ffmpegPsi = new ProcessStartInfo
            {
                FileName = "ffmpeg",
                Arguments = $"-y -i {piperFile} -ar 48000 -ac 2 {resampledFile}",
                RedirectStandardError = true,
                UseShellExecute = false
            };

            using var ffmpeg = Process.Start(ffmpegPsi)!;
            await ffmpeg.WaitForExitAsync(cancellationToken);

            // 3. aplay plays the resampled file
            var aplayPsi = new ProcessStartInfo
            {
                FileName = "aplay",
                Arguments = $"-D {_options.AudioOutputDevice} {resampledFile}",
                UseShellExecute = false
            };

            using var aplay = Process.Start(aplayPsi)!;
            await aplay.WaitForExitAsync(cancellationToken);
        }
        finally
        {
            if (File.Exists(piperFile)) File.Delete(piperFile);
            if (File.Exists(resampledFile)) File.Delete(resampledFile);
        }
    }
}
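
The three steps above each start a process and wait for it, but none of them checks the exit code, so a failed ffmpeg run only shows up later when aplay cannot find its input. A small helper along these lines could surface the failure at the step that caused it. ProcessRunner is a hypothetical name, and this is a sketch under that assumption, not code from the repository.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the project.
public static class ProcessRunner
{
    // Starts a process, waits for it, and throws if it exits with a non-zero code.
    // ArgumentList passes each argument separately, so paths containing spaces
    // do not need manual quoting.
    public static async Task RunAsync(
        string fileName, CancellationToken cancellationToken, params string[] args)
    {
        var psi = new ProcessStartInfo { FileName = fileName, UseShellExecute = false };
        foreach (var arg in args)
            psi.ArgumentList.Add(arg);

        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException($"Could not start {fileName}.");
        await process.WaitForExitAsync(cancellationToken);

        if (process.ExitCode != 0)
            throw new InvalidOperationException(
                $"{fileName} exited with code {process.ExitCode}.");
    }

    public static async Task Main()
    {
        // Any installed command works here; dotnet is used because the project needs it anyway.
        await RunAsync("dotnet", CancellationToken.None, "--version");
        Console.WriteLine("Process finished with exit code 0.");
    }
}
```

The resampling step would then read: await ProcessRunner.RunAsync("ffmpeg", cancellationToken, "-y", "-i", piperFile, "-ar", "48000", "-ac", "2", resampledFile).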

Testing memory

Run dotnet run and try this sequence:

You  : "What's your name?"
Alex : "My name is Alex."

You  : "Say your name again."
Alex : "My name is Alex." ← it remembers!

Part 2: Automatic silence detection

The fixed 10-second timer means you have to wait it out even for short questions. Silence detection cuts the recording as soon as you stop speaking.

Step 2.1: New parameters in AssistantOptions

public int SilenceDurationMs { get; set; } = 1500;    // 1.5s of silence to stop
public string SilenceThreshold { get; set; } = "-40dB"; // Detection threshold

Step 2.2: Update AudioRecorderService

Replace the fixed timer with ffmpeg’s silencedetect filter. Replace Services/AudioRecorderService.cs:

using System.Diagnostics;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IAudioRecorderService
{
    Task<string> RecordAsync(CancellationToken cancellationToken);
}

public class AudioRecorderService : IAudioRecorderService
{
    private readonly AssistantOptions _options;
    private readonly ILogger<AudioRecorderService> _logger;

    public AudioRecorderService(IOptions<AssistantOptions> options, ILogger<AudioRecorderService> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> RecordAsync(CancellationToken cancellationToken)
    {
        var rawFile = Path.Combine(Path.GetTempPath(), $"audio_raw_{Guid.NewGuid()}.wav");
        var outputFile = Path.Combine(Path.GetTempPath(), $"audio_{Guid.NewGuid()}.wav");

        _logger.LogInformation("Recording started (auto-silence)...");

        // ffmpeg records from ALSA and stops after SilenceDurationMs of silence
        // RecordingDurationSeconds is the safety max duration
        var ffmpegPsi = new ProcessStartInfo
        {
            FileName = "ffmpeg",
            Arguments = string.Join(" ",
                "-f alsa",
                $"-i {_options.AudioDevice}",
                "-af",
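                // Invariant culture: a French locale would format 1.5 as "1,5" and break the filter string
                $"silencedetect=noise={_options.SilenceThreshold}:d={(_options.SilenceDurationMs / 1000.0).ToString(System.Globalization.CultureInfo.InvariantCulture)}",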
                $"-t {_options.RecordingDurationSeconds}",
                "-ar 16000 -ac 1",
                $"-y {rawFile}"),
            RedirectStandardError = true,
            RedirectStandardInput = true,
            UseShellExecute = false
        };

        using var ffmpegProcess = Process.Start(ffmpegPsi)!;

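        // Read stderr to detect silence. silencedetect logs "silence_start" once the input
        // has stayed below the threshold for d seconds; "silence_end" only appears when
        // sound resumes, so "silence_start" is the marker for "the speaker stopped".
        // Note: a pause of that length before the first word triggers it as well.
        _ = Task.Run(async () =>
        {
            string? line;
            while ((line = await ffmpegProcess.StandardError.ReadLineAsync()) != null)
            {
                if (line.Contains("silence_start"))
                {
                    _logger.LogInformation("Silence detected, stopping recording.");
                    // Write 'q' rather than Kill() so ffmpeg finishes writing the file
                    ffmpegProcess.StandardInput.Write("q");
                    break;
                }
            }
        }, cancellationToken);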

        await ffmpegProcess.WaitForExitAsync(cancellationToken);

        var convertPsi = new ProcessStartInfo
        {
            FileName = "ffmpeg",
            Arguments = $"-y -i {rawFile} -ar 16000 -ac 1 {outputFile}",
            RedirectStandardError = true,
            UseShellExecute = false
        };

        using var convertProcess = Process.Start(convertPsi)!;
        await convertProcess.WaitForExitAsync(cancellationToken);

        if (File.Exists(rawFile))
            File.Delete(rawFile);

        _logger.LogInformation("Recording done: {File}", outputFile);
        return outputFile;
    }
}
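
The silencedetect filter reports each quiet stretch on stderr as a silence_start line carrying a timestamp in seconds, followed by a silence_end line once sound resumes. A possible refinement, sketched here under assumptions, is to parse that timestamp and ignore a silence that begins at the very start of the recording, before the user has said anything. SilenceLog, ShouldStop, and the 0.5-second default are hypothetical, and the sample lines in Main only imitate the shape of ffmpeg's output.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the project.
public static class SilenceLog
{
    // Captures the number after "silence_start:", including a possible sign or exponent.
    private static readonly Regex StartPattern =
        new(@"silence_start:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)", RegexOptions.Compiled);

    // Returns true when the line reports a silence that began later than
    // minSpeechSeconds into the recording, i.e. not the quiet gap before
    // the user starts talking.
    public static bool ShouldStop(string line, double minSpeechSeconds = 0.5)
    {
        var match = StartPattern.Match(line);
        if (!match.Success)
            return false;

        var start = double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
        return start > minSpeechSeconds;
    }

    public static void Main()
    {
        // Illustrative lines shaped like silencedetect output, not captured logs.
        var lines = new[]
        {
            "[silencedetect @ 0x1] silence_start: 0",
            "[silencedetect @ 0x1] silence_end: 1.8 | silence_duration: 1.8",
            "[silencedetect @ 0x1] silence_start: 4.25"
        };
        foreach (var line in lines)
            Console.WriteLine($"{ShouldStop(line)}  {line}");
    }
}
```

Inside the stderr loop, SilenceLog.ShouldStop(line) would replace the plain string check, so a slow start no longer ends the recording early.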

Update appsettings.json

{
  "Assistant": {
    "GpioButtonPin": 17,
    "AudioDevice": "hw:3,0",
    "RecordingDurationSeconds": 15,
    "SilenceDurationMs": 1500,
    "SilenceThreshold": "-40dB",
    "WhisperModel": "ggml-base.bin",
    "PiperBinary": "/home/gabriel/piper/piper/piper",
    "PiperVoice": "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx",
    "AudioOutputDevice": "hw:3,0",
    "OllamaBaseUrl": "http://pi-cerveau.local:11434",
    "OllamaModel": "llama3.2:1b",
    "MaxConversationTurns": 10,
    "SystemPrompt": "You are a personal voice assistant named Alex. You help the Mongeon family in Blainville, Quebec. Always respond in French, naturally and concisely. Keep answers to 1-3 sentences — they will be read aloud."
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information"
    }
  }
}

RecordingDurationSeconds is now a safety cap. If silence is never detected, recording stops after that limit. Set it to 15-20 seconds.

On a Pi 4 (4 GB) with around 800 MB of free memory, llama3.2:3b can exceed the 60-second timeout. llama3.2:1b answers in 10-15 seconds, which is fast enough for short answers, so the appsettings.json above switches to it.

Also bump the HTTP client timeout in Program.cs:

builder.Services.AddHttpClient<ILlmService, OllamaService>(client =>
{
    client.Timeout = TimeSpan.FromSeconds(120);
});
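
When the timeout does elapse, the Worker's generic catch block handles it like any other error. If you want to tell a slow model apart from other failures, one option is to inspect the exception, as in the sketch below. LlmErrors and IsLlmTimeout are hypothetical names; the check relies on HttpClient wrapping a TimeoutException inside a TaskCanceledException, which is its behavior on .NET 5 and later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the project.
public static class LlmErrors
{
    // An elapsed HttpClient.Timeout surfaces as a TaskCanceledException whose
    // InnerException is a TimeoutException. A service shutdown also shows up as
    // a cancellation, so the stopping token is checked first.
    public static bool IsLlmTimeout(Exception ex, CancellationToken stoppingToken) =>
        !stoppingToken.IsCancellationRequested
        && ex is TaskCanceledException { InnerException: TimeoutException };

    public static void Main()
    {
        // Hand-built exception that mimics what HttpClient throws on timeout.
        var simulated = new TaskCanceledException("request timed out", new TimeoutException());
        Console.WriteLine(IsLlmTimeout(simulated, CancellationToken.None));
    }
}
```

In the Worker, a catch filter such as when (LlmErrors.IsLlmTimeout(ex, stoppingToken)) placed before the generic one could log the timeout separately or speak a more specific message.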

Part 3: Auto-start at boot

Step 3.1: Publish the binary

cd ~/projects/AudioAssistant
dotnet publish -c Release -r linux-arm64 --self-contained false -o ~/assistant-publish

Step 3.2: Create the systemd unit

sudo nano /etc/systemd/system/assistant.service

Paste the following unit definition:

[Unit]
Description=Voice Assistant
After=network.target sound.target

[Service]
Type=simple
User=gabriel
WorkingDirectory=/home/gabriel/assistant-publish
ExecStart=/home/gabriel/.dotnet/dotnet /home/gabriel/assistant-publish/AudioAssistant.dll
Restart=always
RestartSec=5
Environment=DOTNET_ROOT=/home/gabriel/.dotnet
Environment=HOME=/home/gabriel

[Install]
WantedBy=multi-user.target

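After=sound.target tells systemd to start the assistant only after the sound target, which it reaches once a sound card is available. It is an ordering rule rather than a hard dependency, but without it Piper can fail on the first boot.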

Step 3.3: Enable and start

sudo systemctl daemon-reload
sudo systemctl enable assistant
sudo systemctl start assistant
sudo systemctl status assistant

# Stream logs
journalctl -u assistant -f

Update workflow

cd ~/projects/AudioAssistant
dotnet publish -c Release -r linux-arm64 --self-contained false -o ~/assistant-publish
sudo systemctl restart assistant

From here, the assistant starts on its own at boot, remembers the conversation, and stops recording when you stop talking. No more manual dotnet run.

Series articles

  1. Setting Up Both Raspberry Pis
  2. .NET 10 Worker Service and Audio Pipeline
  3. Ollama Integration and Home Context
  4. Memory, Silence Detection, and systemd (this article)
  5. Real-Time Weather and Swapping to the Claude API
  6. Function Calling: Teaching Tools to the Assistant
  7. Retrospective, Lessons Learned, and v2 Roadmap

In article #5, we wire up a real-time weather API so the assistant knows the current conditions in Blainville, and explore swapping Ollama for the Claude API when you need more power.


This post was written with AI assistance and edited by me.

