---
name: web-search
description: Search the web and fetch page content via the user's private SearXNG instance at search.hoyer.world. Use this whenever current information is needed - library docs, error message lookups, recent releases, API references, or any general research that goes beyond training data. Trigger words include "search", "look up", "find docs for", "what's the current", "latest version of". Always prefer this over guessing from memory.
---

# Web Search via SearXNG

The user runs a private SearXNG instance at `$SEARXNG_URL` (default: `https://search.hoyer.world`). Use it for all web research. Run searches via the `bash` tool. Do NOT attempt MCP or built-in web search.

## Search

```bash
curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
  --data-urlencode "q=QUERY HERE" \
  --data-urlencode 'format=json' \
  --data-urlencode 'language=en' \
  --data-urlencode 'safesearch=0' \
  | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
```

Keep queries short (3–6 words). For follow-ups, increment `pageno` instead of re-running the same query:

```bash
... --data-urlencode 'pageno=2' ...
```

## Categories

Bias results to relevant engines via `categories`:

| Category  | Use for                                     |
|-----------|---------------------------------------------|
| `general` | default                                     |
| `it`      | programming, dev tools (GitHub, SO, MDN, …) |
| `repos`   | source-code search                          |
| `news`    | recent events                               |
| `science` | papers, arXiv, PubMed                       |

```bash
... --data-urlencode 'categories=it' ...
```

## Time filtering

For "current"/"latest" queries add `time_range=month` or `time_range=year` to drop stale results:

```bash
... --data-urlencode 'time_range=year' ...
```

## Fetching a page

For full content of a result URL, use pandoc via `nix run` (no install needed):

```bash
curl -sfL --max-time 15 \
  -H 'User-Agent: Mozilla/5.0' \
  "$URL" \
  | nix run nixpkgs#pandoc -- -f html -t gfm --wrap=none 2>/dev/null \
  | sed -E 's/!\[[^]]*\]\([^)]*\)//g' \
  | head -c 12000
```

The first `nix run` invocation may take a few seconds while pandoc is fetched into the Nix store; subsequent calls are instant.

For very simple pages where you only want plain text:

```bash
curl -sfL --max-time 15 -H 'User-Agent: Mozilla/5.0' "$URL" \
  | nix run nixpkgs#lynx -- -dump -nolist -stdin \
  | head -c 12000
```

## Don'ts

- Do not paginate by re-running identical queries — use `pageno`.
- Do not fetch more than 3 URLs per task without checking with the user first.
- Do not ignore `time_range` for version- or release-related questions.
- Do not return raw JSON to the user — always render as the markdown shown above.
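The search, category, time-range, and pagination flags above all share one curl invocation, so for repeated searches in a session they can be wrapped in a small helper. A minimal sketch — the function name `sx` and its defaults are illustrative, not part of this skill:

```bash
# sx QUERY [extra --data-urlencode args...]
# Search the private SearXNG instance and print the top results as markdown.
# Extra flags (categories, time_range, pageno) are passed through verbatim.
sx() {
  local query="$1"; shift
  curl -sfG "${SEARXNG_URL:-https://search.hoyer.world}/search" \
    --data-urlencode "q=${query}" \
    --data-urlencode 'format=json' \
    --data-urlencode 'language=en' \
    --data-urlencode 'safesearch=0' \
    "$@" \
    | jq -r '.results[0:8][] | "## \(.title)\n<\(.url)>\n\(.content // "")\n"'
}
```

Usage follows the same patterns as above, e.g. `sx "tokio select macro" --data-urlencode 'categories=it'` or `sx "python release notes" --data-urlencode 'time_range=year'`.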