nixcfg/config/opencode/skills/web-search/SKILL.md
Harald Hoyer a63abebda3 feat(home): opencode module — link config/opencode → ~/.config/opencode
Adds metacfg.cli-apps.opencode (default enabled) which mounts the
in-repo opencode config (provider list, web-search skill) via
xdg.configFile, so all hosts pick it up automatically.
2026-05-03 14:30:33 +02:00

---
name: web-search
description: Search the web and fetch page content via the user's private SearXNG instance at search.hoyer.world. Use this whenever current information is needed - library docs, error message lookups, recent releases, API references, or any general research that goes beyond training data. Trigger words include "search", "look up", "find docs for", "what's the current", "latest version of". Always prefer this over guessing from memory.
---
# Web Search via SearXNG
The user runs a private SearXNG instance at `$SEARXNG_URL`
(default: `https://search.hoyer.world`). Use it for all web research.
Run searches via the `bash` tool. Do NOT attempt MCP or built-in web search.
## Search
```bash
# Query the JSON API and render the first 8 results as markdown.
curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
  --data-urlencode "q=QUERY HERE" \
  --data-urlencode 'format=json' \
  --data-urlencode 'language=en' \
  --data-urlencode 'safesearch=0' \
  | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
```
Keep queries short (3 to 6 words). To see more results for the same query,
increment `pageno` instead of re-running it unchanged:
```bash
... --data-urlencode 'pageno=2' ...
```
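As a minimal sketch of how the search command and `pageno` fit together, the call can be wrapped in a small shell function. The name `searx_page` and its argument order are assumptions made for illustration; they are not part of the skill itself.

```bash
# Hypothetical helper (illustrative name): fetch one page of results.
# Usage: searx_page "QUERY" [PAGENO]
searx_page() {
  local query="$1"
  local page="${2:-1}"   # SearXNG result pages start at 1
  curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
    --data-urlencode "q=${query}" \
    --data-urlencode 'format=json' \
    --data-urlencode 'language=en' \
    --data-urlencode 'safesearch=0' \
    --data-urlencode "pageno=${page}" \
    | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}
```

With this in place, `searx_page "pandoc gfm tables"` returns the first page and `searx_page "pandoc gfm tables" 2` the next one.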
## Categories
Bias results to relevant engines via `categories`:
| Category | Use for |
|------------|-----------------------------------------------|
| `general` | default |
| `it` | programming, dev tools (GitHub, SO, MDN, …) |
| `repos` | source-code search |
| `news` | recent events |
| `science` | papers, arXiv, PubMed |
```bash
... --data-urlencode 'categories=it' ...
```
## Time filtering
For "current"/"latest" queries, add `time_range=month` or `time_range=year`
to drop stale results:
```bash
... --data-urlencode 'time_range=year' ...
```
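The `categories` and `time_range` fragments above can be combined into one call. The following is a sketch only: the function name `searx_filtered` and its positional arguments are hypothetical, and each filter is sent only when a value is given.

```bash
# Hypothetical helper (illustrative name): search with optional filters.
# Usage: searx_filtered "QUERY" [CATEGORY] [TIME_RANGE]
searx_filtered() {
  local query="$1"
  local category="${2:-}"
  local range="${3:-}"
  # Rebuild the positional parameters as the curl argument list.
  set -- --data-urlencode "q=${query}" \
         --data-urlencode 'format=json' \
         --data-urlencode 'language=en' \
         --data-urlencode 'safesearch=0'
  # Append each filter only if the caller supplied it.
  if [ -n "$category" ]; then
    set -- "$@" --data-urlencode "categories=${category}"
  fi
  if [ -n "$range" ]; then
    set -- "$@" --data-urlencode "time_range=${range}"
  fi
  curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" "$@" \
    | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}
```

For example, `searx_filtered "pandoc release notes" it year` biases results toward the `it` engines and restricts them to the last year; passing an empty string for the category keeps the default engines.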
## Fetching a page
For full content of a result URL, use pandoc via `nix run` (no install needed):
```bash
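# Fetch the page, convert HTML to GitHub-flavored markdown,
# strip inline images, and cap the output at 12000 bytes.
curl -sfL --max-time 15 \
  -H 'User-Agent: Mozilla/5.0' \
  "$URL" \
  | nix run nixpkgs#pandoc -- -f html -t gfm --wrap=none 2>/dev/null \
  | sed -E 's/!\[[^]]*\]\([^)]*\)//g' \
  | head -c 12000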
```
The first `nix run` invocation may take a few seconds while pandoc is fetched
into the Nix store; subsequent calls reuse the stored copy and start much faster.
For very simple pages where you only want plain text:
```bash
curl -sfL --max-time 15 -H 'User-Agent: Mozilla/5.0' "$URL" \
  | nix run nixpkgs#lynx -- -dump -nolist -stdin \
  | head -c 12000
```
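The two fetch pipelines above differ only in the converter, so they can be sketched as one function with a mode switch. The name `fetch_page` and the `gfm`/`text` mode argument are assumptions for illustration, not part of the skill.

```bash
# Hypothetical wrapper (illustrative name and MODE argument).
# Usage: fetch_page URL [gfm|text]
fetch_page() {
  local url="$1"
  local mode="${2:-gfm}"
  case "$mode" in
    gfm)
      # HTML to GitHub-flavored markdown, with inline images stripped.
      curl -sfL --max-time 15 -H 'User-Agent: Mozilla/5.0' "$url" \
        | nix run nixpkgs#pandoc -- -f html -t gfm --wrap=none 2>/dev/null \
        | sed -E 's/!\[[^]]*\]\([^)]*\)//g' \
        | head -c 12000
      ;;
    text)
      # Plain-text dump for very simple pages.
      curl -sfL --max-time 15 -H 'User-Agent: Mozilla/5.0' "$url" \
        | nix run nixpkgs#lynx -- -dump -nolist -stdin \
        | head -c 12000
      ;;
    *)
      echo "fetch_page: unknown mode '$mode' (expected gfm or text)" >&2
      return 2
      ;;
  esac
}
```

`fetch_page "$URL"` then yields markdown, and `fetch_page "$URL" text` the plain-text dump.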
## Don'ts
- Do not paginate by re-running identical queries — use `pageno`.
- Do not fetch more than 3 URLs per task without checking with the user first.
- Do not ignore `time_range` for version- or release-related questions.
- Do not return raw JSON to the user — always render as the markdown shown above.
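The three-URL limit above could be enforced with a simple counter. This is a sketch under stated assumptions: `FETCH_LIMIT`, `fetch_count`, and `fetch_allowed` are hypothetical names, and the counter only lives as long as the current shell session.

```bash
# Hypothetical fetch budget (names are illustrative, not part of the skill).
FETCH_LIMIT=3
fetch_count=0

# Returns 0 and counts the fetch while the budget lasts;
# returns 1 once the limit has been reached.
fetch_allowed() {
  if [ "$fetch_count" -ge "$FETCH_LIMIT" ]; then
    echo "fetch limit of ${FETCH_LIMIT} URLs reached; check with the user first" >&2
    return 1
  fi
  fetch_count=$((fetch_count + 1))
}
```

Call it in the current shell before each fetch, for example `fetch_allowed && curl -sfL --max-time 15 "$URL"`, so that the counter is not lost in a subshell.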