Why don't my agent skills trigger when they should?

Most skill descriptions are too terse. The model decides whether to invoke a skill based on its description field, and it tends to undertrigger — skipping skills it should use. Make descriptions 'pushy': include what the skill does, specific phrases users might say, and edge cases where it applies.

How often should I audit my agent's skills?

After any batch of new skills (5+), or when you notice skills not firing. An audit of 20 skills takes 1-2 hours and typically surfaces 3-5 concrete fixes that improve daily reliability.

Should I use hyphens or underscores in skill names?

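Depends on your platform and on what your framework routes on. Discord's naming rule for slash commands accepts both hyphens and underscores (lowercase, up to 32 characters), so the usual failure is a mismatch: the skill is registered under one spelling and looked up under another. If your agent framework doesn't care, pick one convention and enforce it everywhere; underscores are a safe default. Mixed conventions cause silent routing failures.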

What a Skill Audit Actually Finds

You built 20 skills for your agent. They all work — you tested each one. Months later you notice the agent ignores half of them unless you invoke them by exact name. The skills aren’t broken. They’re just invisible.

The Problem

Skills rot in ways that don’t throw errors. A description that’s too short means the model never reaches for the skill. A missing backup call means work gets done but is never persisted. A naming mismatch means the skill loads fine in text but fails as a slash command. None of these show up in logs. They show up as the agent being slightly worse than it should be, in ways you can’t quite pin down.

Why This Happens

Skills get written one at a time, usually in the middle of solving a real problem. You get the behavior right, ship it, move on. Nobody goes back and reads all 20 side by side. That’s where the drift happens — each skill is locally correct but globally inconsistent.

The six patterns below came from auditing a real fleet of 20 skills across a production agent. Every one of them was “working.”

The Six Things You’ll Find

1. Descriptions That Don’t Trigger

This is the big one. The model decides whether to use a skill based on its description field in the frontmatter. Most descriptions are written like internal documentation: accurate, concise, and completely insufficient for triggering.

# Before — accurate but invisible
description: Add a task to your task list

# After — pushy enough to actually trigger
description: >
  Add a task to your task list. Use when the user says /add,
  "add a task", "new task", "remind me to", "put X on the list",
  or any variation of wanting to track something they need to do.
  Defaults to Backlog unless the user explicitly says urgent or soon.

The model tends to undertrigger — it errs on the side of not using a skill rather than using the wrong one. Your descriptions need to compensate for this by being explicit about when the skill applies, including the casual phrasings a real user would actually type.
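One way to spot terse descriptions at scale is a rough lint pass. The sketch below is a heuristic, not how any model actually scores descriptions: the function name `description_problems`, the 20-word threshold, and the three checks are all assumptions chosen for illustration.

```python
import re

def description_problems(description: str, min_words: int = 20) -> list[str]:
    """Return heuristic reasons a description may undertrigger (assumed checks)."""
    problems = []
    if len(description.split()) < min_words:
        problems.append("too short")
    # Quoted phrases or slash commands suggest explicit trigger examples.
    has_quoted = re.search(r'"[^"]+"', description) is not None
    has_slash = re.search(r"(?<!\w)/\w+", description) is not None
    if not (has_quoted or has_slash):
        problems.append("no example trigger phrases")
    if not re.search(r"\buse (this )?when\b", description, re.IGNORECASE):
        problems.append('no "use when" guidance')
    return problems

before = "Add a task to your task list"
after = (
    "Add a task to your task list. Use when the user says /add, "
    '"add a task", "new task", "remind me to", "put X on the list", '
    "or any variation of wanting to track something they need to do. "
    "Defaults to Backlog unless the user explicitly says urgent or soon."
)

print(description_problems(before))  # all three checks fire
print(description_problems(after))   # []
```

A clean result only means the description has the right shape. Whether it actually triggers still has to be tested by talking to the agent.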

2. Missing Persistence Calls

Some skills modify state (write to files, update task lists) but never commit and push. The work happens, but if the session ends or the VPS restarts, it’s gone. This usually happens because the first few skills were careful about persistence, and later skills copied the structure but forgot the backup step.

Check every skill that writes to disk. If it doesn’t end with a commit/push/backup call, add one.
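That check can be roughed out as a keyword scan over skill bodies. This is a minimal sketch under loud assumptions: the keyword lists are guesses about how skills phrase their steps, the skill bodies are invented, and `add_task` is a hypothetical name. Expect false positives and treat the output as a reading list, not a verdict.

```python
import re

# Assumed vocabulary; adjust to how your own skills describe their steps.
WRITE_HINTS = re.compile(r"\b(write|append|save|update|edit)\b", re.IGNORECASE)
PERSIST_HINTS = re.compile(r"\bgit (commit|push)\b|\bbackup\b", re.IGNORECASE)

def missing_persistence(skills: dict[str, str]) -> list[str]:
    """Names of skills whose body mentions writing but no persistence step."""
    return sorted(
        name
        for name, body in skills.items()
        if WRITE_HINTS.search(body) and not PERSIST_HINTS.search(body)
    )

# Hypothetical skill bodies keyed by skill name.
skills = {
    "add_task": "Append the task to tasks.md, then run git commit and git push.",
    "clear_stash": "Write an empty list to stash.md.",
    "sync": "Pull the latest changes and report status.",
}

print(missing_persistence(skills))  # ['clear_stash']
```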

3. Naming Convention Drift

Skill names accumulate organically. One uses hyphens (save-stash), another uses underscores (clear_stash), a third uses neither (sync). This seems cosmetic until your agent framework does exact-string matching on skill names — then the mismatch becomes a routing failure.

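Pick a convention and enforce it. If your skills register as Discord slash commands, check the platform rules before choosing: Discord’s naming rule for slash commands requires lowercase names of at most 32 characters and accepts both hyphens and underscores, so routing failures come from the registered name not matching the name your framework looks up. Underscores everywhere is a safe default.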
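Drift is easy to surface once the names are in a list. The following sketch groups names by separator style; the helper names `convention` and `group_by_convention` are made up for this example, and the three skill names are the ones mentioned above.

```python
from collections import defaultdict

def convention(name: str) -> str:
    """Classify a skill name by the separator it uses."""
    has_hyphen = "-" in name
    has_underscore = "_" in name
    if has_hyphen and has_underscore:
        return "mixed"
    if has_hyphen:
        return "hyphen"
    if has_underscore:
        return "underscore"
    return "single-word"

def group_by_convention(names: list[str]) -> dict[str, list[str]]:
    """Group skill names by convention, preserving input order."""
    groups = defaultdict(list)
    for name in names:
        groups[convention(name)].append(name)
    return dict(groups)

groups = group_by_convention(["save-stash", "clear_stash", "sync"])
print(groups)
# {'hyphen': ['save-stash'], 'underscore': ['clear_stash'], 'single-word': ['sync']}

# Hyphen and underscore names coexisting is the drift worth fixing.
inconsistent = "hyphen" in groups and "underscore" in groups
print(inconsistent)  # True
```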

4. Duplicated Boilerplate

Seven skills in our audit had identical 8-line model-switch blocks. Copy-paste is fast when you’re building, but it creates a maintenance surface area that scales linearly with your skill count. When the boilerplate becomes obsolete (ours did), you have to touch every file.

If you see the same block in 3+ skills, extract it to a shared reference file or remove it entirely.
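Finding those repeated blocks by eye gets tedious past a handful of files. A sliding-window comparison is one possible approach, sketched here with assumed parameters (a 3-line window, a 3-skill threshold) and invented skill bodies; `repeated_blocks` is not part of any framework.

```python
from collections import defaultdict

def repeated_blocks(
    skills: dict[str, str], window: int = 3, min_skills: int = 3
) -> dict[str, list[str]]:
    """Map each `window`-line block to the skills containing it,
    keeping only blocks shared by at least `min_skills` skills."""
    seen = defaultdict(set)
    for name, body in skills.items():
        # Ignore blank lines and indentation so formatting noise doesn't hide copies.
        lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
        for i in range(len(lines) - window + 1):
            seen["\n".join(lines[i : i + window])].add(name)
    return {
        block: sorted(names)
        for block, names in seen.items()
        if len(names) >= min_skills
    }

# Hypothetical bodies: three skills share the same three-line block.
shared = "Step A: switch model\nStep B: confirm switch\nStep C: continue"
skills = {
    "save_stash": shared + "\nSave the stash.",
    "clear_stash": shared + "\nClear the stash.",
    "sync": "Intro line.\n" + shared,
    "add_task": "Add the task.\nConfirm to the user.\nDone.",
}

result = repeated_blocks(skills)
print(result[shared])  # ['clear_stash', 'save_stash', 'sync']
```

A longer copied block shows up as several overlapping windows, so read the hits as pointers to files rather than as a count of duplicates.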

5. Stale Documentation

The repo’s CLAUDE.md listed 8 skills. The directory had 20. Twelve skills existed but weren’t documented — which means any agent or human reading the docs had an incomplete picture of what the system could do.

Your skill registry (whether it’s CLAUDE.md, a README, or a config file) should list every skill. Treat it like an API surface: if it’s not documented, it might as well not exist.
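Comparing the registry against the directory is a set difference. The sketch below works on an in-memory list of names and the registry text, so how you collect them (listing a skills directory, reading CLAUDE.md) is left to you; the registry content and the `undocumented_skills` helper are illustrative assumptions.

```python
import re

def undocumented_skills(skill_names: list[str], registry_text: str) -> list[str]:
    """Skills that exist but whose exact name never appears in the registry."""
    missing = []
    for name in skill_names:
        # Require non-name characters on both sides so "sync" doesn't match "async".
        pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
        if not re.search(pattern, registry_text):
            missing.append(name)
    return sorted(missing)

# Hypothetical registry excerpt and directory listing.
registry = "Skills\n- save_stash: stash a note\n- sync: pull and push\n"
on_disk = ["save_stash", "clear_stash", "sync", "add_task"]

print(undocumented_skills(on_disk, registry))  # ['add_task', 'clear_stash']
```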

6. Duplicated Skills Across Repos

A skill that started in one agent repo got copied to another for reuse. Both copies drifted independently. Neither was wrong, but they weren’t the same — which means fixes applied to one didn’t reach the other.

Centralize shared skills in one repo. Agent repos reference the canonical copy. When the skill improves, every agent gets the improvement on the next pull.
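Before centralizing, it helps to know which copies have already diverged. One way to check, sketched here with invented repo contents, is to hash a whitespace-normalized version of each skill and compare the names both repos share. The `digest` and `drifted` helpers are assumptions for this example.

```python
import hashlib

def digest(text: str) -> str:
    """Hash skill text, ignoring trailing whitespace and surrounding blank lines."""
    normalized = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def drifted(repo_a: dict[str, str], repo_b: dict[str, str]) -> list[str]:
    """Skill names present in both repos whose contents differ."""
    shared = repo_a.keys() & repo_b.keys()
    return sorted(n for n in shared if digest(repo_a[n]) != digest(repo_b[n]))

# Hypothetical repos: skill name -> skill file contents.
repo_a = {"sync": "Pull, then push.\n", "add_task": "Add the task."}
repo_b = {
    "sync": "Pull, then push.  \n\n",           # cosmetic difference only
    "add_task": "Add the task to Backlog.",     # real drift
    "clear_stash": "Clear the stash.",          # exists in one repo only
}

print(drifted(repo_a, repo_b))  # ['add_task']
```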

The Fix

The audit itself is the fix. Read every skill in your fleet side-by-side. For each one, check:

Description: Would the model know when to use this from the description alone? If you have to read the body to understand when it applies, the description needs work.
Persistence: Does it save its work? Every skill that writes to disk should end with a commit/push.
Naming: Does the name match your convention? Does it match what your platform actually routes on?
Boilerplate: Is the same block repeated in 3+ skills? Extract or eliminate.
Documentation: Is this skill listed in the repo’s skill registry?
Duplication: Does this skill exist in another repo? Should it?

Key Takeaway

A skill audit isn’t about finding bugs. It’s about finding the gap between “works when I invoke it by name” and “triggers reliably when it should.” The model is smart enough to use your skills, but only if the descriptions tell it when to. Everything else (persistence, naming, docs) is hygiene that compounds. An hour or two of auditing now saves a long tail of silent misfires later.