Parallelize Prowlarr search across indexers using aiohttp #63
Conversation
Review Summary by Qodo

Parallelize Prowlarr search across indexers using aiohttp
Walkthrough

Description

• Parallelize Prowlarr search across indexers using async/await
• Replace synchronous requests with aiohttp for concurrent API calls
• Add per-indexer search URLs instead of a single aggregated search
• Implement fetch_json and fetch_all async helpers for parallel requests
• Add get_indexer_ids function to retrieve enabled indexers with caching

Diagram

```mermaid
flowchart LR
    A["search function"] --> B["get_indexer_ids"]
    B --> C["Build per-indexer URLs"]
    C --> D["fetch_all async"]
    D --> E["fetch_json tasks"]
    E --> F["aiohttp.ClientSession"]
    F --> G["Parallel API calls"]
    G --> H["Aggregate results"]
```
File Changes

1. app/prowlarr.py
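The walkthrough mentions a cached `get_indexer_ids` helper but the PR diff for it is not shown here. A stdlib-only sketch of what a time-based cache for it could look like (the TTL, the injected `fetch_indexers` callable, and the `id`/`enable` field names are assumptions, not the PR's actual code):

```python
import time
from typing import Callable, Dict, List

_CACHE: Dict[str, object] = {"ids": None, "expires": 0.0}
CACHE_TTL = 300.0  # seconds; assumed value, not from the PR

def get_indexer_ids(fetch_indexers: Callable[[], List[dict]]) -> List[int]:
    # Serve cached IDs while the TTL has not elapsed; otherwise
    # re-fetch the indexer list and keep only enabled entries.
    now = time.monotonic()
    if _CACHE["ids"] is None or now >= _CACHE["expires"]:
        indexers = fetch_indexers()
        _CACHE["ids"] = [i["id"] for i in indexers if i.get("enable")]
        _CACHE["expires"] = now + CACHE_TTL
    return _CACHE["ids"]

calls = []
def fake_fetch() -> List[dict]:
    # Stand-in for the real Prowlarr API call; records each invocation.
    calls.append(1)
    return [{"id": 1, "enable": True}, {"id": 2, "enable": False}]

print(get_indexer_ids(fake_fetch))  # [1]
print(get_indexer_ids(fake_fetch))  # [1] (served from cache)
print(len(calls))  # 1
```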
Code Review by Qodo
1. Event loop not closed
```python
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
responses = loop.run_until_complete(fetch_all(urls))
```
1. Event loop not closed 🐞 Bug ☼ Reliability
prowlarr.search() creates and installs a new asyncio event loop per call but never closes it or restores the prior loop, which can leak loop resources and pollute the thread’s default loop in FastAPI’s long-lived worker threads. Under sustained /api/search traffic this can degrade the process over time (FD/memory growth and hard-to-explain async behavior in the same thread).
Agent Prompt
### Issue description
`app/prowlarr.py::search()` creates a new asyncio event loop and sets it as the current loop, but never closes it and never restores the previous loop. In FastAPI’s sync request handler path, this can leak loop resources and leave a “foreign” loop registered in a long-lived worker thread.
### Issue Context
This code runs per `/api/search` request (`app/app.py` calls `prowlarr.search()`), so even small leaks accumulate.
### Fix Focus Areas
- app/prowlarr.py[91-94]
### Implementation guidance
- Prefer `asyncio.run(fetch_all(urls))` if you’re sure this function is never called from within an already-running event loop.
- Otherwise, explicitly manage the loop:
- Save the previous loop (if any) for the thread.
- `try/finally` to always `loop.close()`.
- Restore the previous loop (or set to `None`) after completion.
- Avoid `asyncio.set_event_loop(loop)` if you don’t need it (e.g., use `loop.run_until_complete(asyncio.gather(*(fetch_json(...))))` or `loop.create_task(...)`).
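A runnable sketch of the preferred `asyncio.run()` option above. `fetch_all_stub` and this `search` wrapper are stand-ins for the PR's real `fetch_all`/`search`, not the actual code:

```python
import asyncio
from typing import Dict, List

async def fetch_all_stub(urls: List[str]) -> List[List[Dict[str, str]]]:
    # Stand-in for the real fetch_all(); just echoes its input.
    return [[{"url": u}] for u in urls]

def search(urls: List[str]) -> List[List[Dict[str, str]]]:
    # asyncio.run() creates a fresh loop, runs the coroutine, and always
    # closes the loop -- no manual set_event_loop bookkeeping to leak.
    # Safe only because this is never called from inside a running loop
    # (FastAPI runs sync handlers in a thread pool).
    return asyncio.run(fetch_all_stub(urls))

print(search(["a", "b"]))  # [[{'url': 'a'}], [{'url': 'b'}]]
```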
```python
async def fetch_json(session: aiohttp.ClientSession, url: str) -> List[Dict[str, Any]]:
    try:
        async with session.get(url) as response:
            data: Any = await response.json()
            if not isinstance(data, list):
                return []
            return [r for r in data if isinstance(r, dict)]  # type: ignore[reportUnknownVariableType]
    except (TimeoutError, aiohttp.ClientError):
        return []
```
```python
async def fetch_all(urls: List[str]) -> List[List[Dict[str, Any]]]:
    async with aiohttp.ClientSession(
        timeout=aiohttp.ClientTimeout(total=5),
        headers={"X-Api-Key": API_KEY},
    ) as session:
        tasks = [asyncio.ensure_future(fetch_json(session, url)) for url in urls]
        return await asyncio.gather(*tasks)
```
2. One bad json kills search 🐞 Bug ☼ Reliability
fetch_all() uses asyncio.gather() without return_exceptions=True, and fetch_json() doesn’t catch JSON decode errors (ValueError), so a single indexer returning malformed JSON can cause gather() to raise and search() to return no results at all. This defeats the goal of parallelization because partial successes are discarded.
Agent Prompt
### Issue description
A single malformed JSON response from any indexer can make the entire parallel search fail and return empty results.
### Issue Context
- `fetch_json()` does `await response.json()` but does not catch `ValueError`/decode errors.
- `fetch_all()` uses `asyncio.gather()` without `return_exceptions=True`, so one uncaught exception aborts the entire gather.
### Fix Focus Areas
- app/prowlarr.py[62-70]
- app/prowlarr.py[73-80]
### Implementation guidance
Choose one (or combine):
1) Make `fetch_json()` resilient:
- Catch `ValueError` (and optionally `json.JSONDecodeError`) in addition to existing exceptions and return `[]`.
- Consider calling `response.raise_for_status()` before parsing so HTTP errors are handled consistently.
2) Make `fetch_all()` resilient:
- Use `await asyncio.gather(*coros, return_exceptions=True)`.
- Filter exceptions and keep successful lists so partial results still return.
Goal: one indexer failing must not drop other indexers’ results.
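A runnable sketch of option 2 above, using stand-in coroutines instead of real aiohttp calls (`fetch_json_stub` simulates one indexer returning malformed JSON by raising `ValueError`):

```python
import asyncio
from typing import Dict, List, Optional

async def fetch_json_stub(payload: Optional[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    # Stand-in for fetch_json(): None simulates a malformed JSON body.
    if payload is None:
        raise ValueError("malformed JSON")
    return payload

async def fetch_all(payloads: list) -> List[List[Dict[str, str]]]:
    coros = [fetch_json_stub(p) for p in payloads]
    # return_exceptions=True keeps partial successes: exceptions come
    # back as result values instead of aborting the whole gather().
    results = await asyncio.gather(*coros, return_exceptions=True)
    # Drop the exception objects, keep the successful lists.
    return [r for r in results if isinstance(r, list)]

good = [{"title": "ok"}]
print(asyncio.run(fetch_all([good, None])))  # [[{'title': 'ok'}]]
```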