Adds metacfg.cli-apps.opencode (default enabled) which mounts the in-repo opencode config (provider list, web-search skill) via xdg.configFile, so all hosts pick it up automatically.
---
name: web-search
description: Search the web and fetch page content via the user's private SearXNG instance at search.hoyer.world. Use this whenever current information is needed - library docs, error message lookups, recent releases, API references, or any general research that goes beyond training data. Trigger words include "search", "look up", "find docs for", "what's the current", "latest version of". Always prefer this over guessing from memory.
---

# Web Search via SearXNG

The user runs a private SearXNG instance at `$SEARXNG_URL`
(default: `https://search.hoyer.world`). Use it for all web research.

Run searches via the `bash` tool. Do NOT attempt MCP or built-in web search.

## Search

```bash
curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
  --data-urlencode "q=QUERY HERE" \
  --data-urlencode 'format=json' \
  --data-urlencode 'language=en' \
  --data-urlencode 'safesearch=0' \
  | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
```

Keep queries short (3–6 words). For follow-ups, increment `pageno` instead of
re-running the same query:

```bash
... --data-urlencode 'pageno=2' ...
```

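The search call and the `pageno` follow-up above can be wrapped in one small shell function. This is a minimal sketch, not part of the skill itself: the name `searx_search` and its positional arguments are assumptions made for illustration, while the URL, query parameters, and `jq` filter are the ones shown above.

```bash
# Hypothetical helper: run one SearXNG search and render the top results.
# Usage: searx_search QUERY [PAGENO]
searx_search() {
  local query="$1"
  local page="${2:-1}"   # first page unless the caller asks for another

  curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
    --data-urlencode "q=${query}" \
    --data-urlencode 'format=json' \
    --data-urlencode 'language=en' \
    --data-urlencode 'safesearch=0' \
    --data-urlencode "pageno=${page}" \
    | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}
```

With this in place, `searx_search 'nix flake inputs'` would fetch the first page and `searx_search 'nix flake inputs' 2` the second, without editing the query string.
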
## Categories

Bias results to relevant engines via `categories`:

| Category  | Use for                                     |
|-----------|---------------------------------------------|
| `general` | default                                     |
| `it`      | programming, dev tools (GitHub, SO, MDN, …) |
| `repos`   | source-code search                          |
| `news`    | recent events                               |
| `science` | papers, arXiv, PubMed                       |

```bash
... --data-urlencode 'categories=it' ...
```

## Time filtering

For "current"/"latest" queries, add `time_range=month` or `time_range=year` to
drop stale results:

```bash
... --data-urlencode 'time_range=year' ...
```

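Categories and time filters are ordinary query parameters, so they can be combined in a single request. The sketch below assumes a hypothetical helper named `searx_filtered` with optional category and time-range arguments (neither the name nor the argument order comes from the skill); it sends `time_range` only when one is given.

```bash
# Hypothetical helper: search with an optional category and time filter.
# Usage: searx_filtered QUERY [CATEGORY] [TIME_RANGE]
searx_filtered() {
  local query="$1"
  local category="${2:-general}"
  local range="${3:-}"

  # Rebuild the positional parameters as the curl argument list.
  set -- \
    --data-urlencode "q=${query}" \
    --data-urlencode 'format=json' \
    --data-urlencode 'language=en' \
    --data-urlencode "categories=${category}"

  # Add time_range only when the caller asked for one.
  if [ -n "$range" ]; then
    set -- "$@" --data-urlencode "time_range=${range}"
  fi

  curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" "$@" \
    | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}
```

For a release question, `searx_filtered 'pandoc latest release' it year` would bias results toward developer sources and drop anything older than a year.
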
## Fetching a page

For full content of a result URL, use pandoc via `nix run` (no install needed):

```bash
curl -sfL --max-time 15 \
  -H 'User-Agent: Mozilla/5.0' \
  "$URL" \
  | nix run nixpkgs#pandoc -- -f html -t gfm --wrap=none 2>/dev/null \
  | sed -E 's/!\[[^]]*\]\([^)]*\)//g' \
  | head -c 12000
```

The first `nix run` invocation may take a few seconds while pandoc is fetched
into the Nix store; subsequent calls reuse the cached copy and start quickly.

For very simple pages where you only want plain text:

```bash
curl -sfL --max-time 15 -H 'User-Agent: Mozilla/5.0' "$URL" \
  | nix run nixpkgs#lynx -- -dump -nolist -stdin \
  | head -c 12000
```

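The `sed` stage in the pandoc pipeline above removes inline Markdown images (`![alt](src)`) while leaving ordinary links intact, which keeps the 12000-byte budget for text. A small sketch of that one step in isolation, using a made-up input line and a hypothetical function name `strip_md_images`:

```bash
# Strip inline Markdown images (![alt](src)) from text on stdin.
# Ordinary links ([text](url)) have no leading "!" and are left alone.
strip_md_images() {
  sed -E 's/!\[[^]]*\]\([^)]*\)//g'
}

# Made-up sample line with one image and one link.
printf '%s\n' 'Intro ![logo](https://example.com/logo.png) text [docs](https://example.com/docs)' \
  | strip_md_images
# prints: Intro  text [docs](https://example.com/docs)
```

The pattern only handles the inline image form; reference-style images and raw HTML `<img>` tags would pass through unchanged.
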
## Don'ts

- Do not paginate by re-running identical queries; use `pageno`.
- Do not fetch more than 3 URLs per task without checking with the user first.
- Do not ignore `time_range` for version- or release-related questions.
- Do not return raw JSON to the user; always render as the markdown shown above.
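
To illustrate the last rule, the `jq` filter from the Search section can be tried offline against a canned response. The JSON below is made-up sample data that only mimics the `results[].title`, `url`, and `content` fields the filter reads, and `render_results` is a hypothetical name; the sketch assumes `jq` is on `PATH`.

```bash
# Hypothetical helper: render SearXNG JSON on stdin as Markdown.
render_results() {
  jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}

# Made-up sample response; the second result has no "content" field.
render_results <<'EOF'
{
  "results": [
    {"title": "Example Domain", "url": "https://example.com/", "content": "Illustrative snippet."},
    {"title": "No snippet", "url": "https://example.org/"}
  ]
}
EOF
```

Each result becomes a `##` heading, the URL in angle brackets, and the snippet; a missing `content` falls back to an empty line through the `// ""` alternative.
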