
Parallelize Prowlarr search across indexers using aiohttp #63

Merged

hauxir merged 1 commit into master from prowlarr-async on May 8, 2026
Conversation

@hauxir (Owner) commented May 8, 2026

No description provided.

@qodo-free-for-open-source-projects

Review Summary by Qodo

Parallelize Prowlarr search across indexers using aiohttp

✨ Enhancement


Walkthroughs

Description
• Parallelize Prowlarr search across indexers using async/await
• Replace synchronous requests with aiohttp for concurrent API calls
• Add per-indexer search URLs instead of single aggregated search
• Implement fetch_json and fetch_all async helpers for parallel requests
• Add get_indexer_ids function to retrieve enabled indexers with caching
Diagram
flowchart LR
  A["search function"] --> B["get_indexer_ids"]
  B --> C["Build per-indexer URLs"]
  C --> D["fetch_all async"]
  D --> E["fetch_json tasks"]
  E --> F["aiohttp.ClientSession"]
  F --> G["Parallel API calls"]
  G --> H["Aggregate results"]


File Changes

1. app/prowlarr.py ✨ Enhancement +48/-16

Implement async parallel indexer search with aiohttp

• Added asyncio, aiohttp, and urlencode imports for async HTTP operations
• Introduced get_indexer_ids function to fetch and cache enabled indexers from Prowlarr API
• Added fetch_json async helper to safely fetch and parse JSON from individual indexer URLs
• Added fetch_all async function to parallelize requests across multiple indexers using aiohttp.ClientSession
• Refactored search function to build per-indexer URLs and use an asyncio event loop for concurrent requests
• Updated exception handling to catch aiohttp.ClientError in addition to requests.RequestException
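
To make these changes concrete, here is a minimal sketch of what a cached indexer lookup and per-indexer URL builder could look like. The endpoint paths, config constants, and the `build_search_urls` name are illustrative assumptions, not the exact code from the diff:

```python
# Hypothetical sketch; constants and helper names are assumptions.
from typing import List, Optional
from urllib.parse import urlencode

import requests

PROWLARR_URL = "http://prowlarr:9696"  # assumed config value
API_KEY = "..."                        # assumed config value

_indexer_ids: Optional[List[int]] = None  # module-level cache


def get_indexer_ids() -> List[int]:
    """Fetch enabled indexer IDs from Prowlarr once and cache them."""
    global _indexer_ids
    if _indexer_ids is None:
        response = requests.get(
            f"{PROWLARR_URL}/api/v1/indexer",
            headers={"X-Api-Key": API_KEY},
            timeout=5,
        )
        response.raise_for_status()
        _indexer_ids = [i["id"] for i in response.json() if i.get("enable")]
    return _indexer_ids


def build_search_urls(query: str) -> List[str]:
    """Build one search URL per enabled indexer so requests can run in parallel."""
    return [
        f"{PROWLARR_URL}/api/v1/search?"
        + urlencode({"query": query, "indexerIds": indexer_id})
        for indexer_id in get_indexer_ids()
    ]
```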





@qodo-free-for-open-source-projects (Bot) commented May 8, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0)



Action required

1. Event loop not closed 🐞 Bug ☼ Reliability
Description
prowlarr.search() creates and installs a new asyncio event loop per call but never closes it or
restores the prior loop, which can leak loop resources and pollute the thread’s default loop in
FastAPI’s long-lived worker threads. Under sustained /api/search traffic this can degrade the
process over time (FD/memory growth and hard-to-explain async behavior in the same thread).
Code

app/prowlarr.py[R91-94]

+        loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(loop)
+        responses = loop.run_until_complete(fetch_all(urls))
+
Evidence
The PR creates a new loop and sets it as the thread’s current loop, then runs the async work, but
never calls loop.close() or resets the thread’s event loop. This code is executed on every search
request because the FastAPI /api/search handler is synchronous and calls _indexer_search() which
calls prowlarr.search().

app/prowlarr.py[91-94]
app/app.py[348-352]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`app/prowlarr.py::search()` creates a new asyncio event loop and sets it as the current loop, but never closes it and never restores the previous loop. In FastAPI’s sync request handler path, this can leak loop resources and leave a “foreign” loop registered in a long-lived worker thread.

### Issue Context
This code runs per `/api/search` request (`app/app.py` calls `prowlarr.search()`), so even small leaks accumulate.

### Fix Focus Areas
- app/prowlarr.py[91-94]

### Implementation guidance
- Prefer `asyncio.run(fetch_all(urls))` if you’re sure this function is never called from within an already-running event loop.
- Otherwise, explicitly manage the loop:
  - Save the previous loop (if any) for the thread.
  - `try/finally` to always `loop.close()`.
  - Restore the previous loop (or set to `None`) after completion.
- Avoid `asyncio.set_event_loop(loop)` if you don’t need it (e.g., use `loop.run_until_complete(asyncio.gather(*(fetch_json(...))))` or `loop.create_task(...)`).
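
For instance, a minimal sketch of the explicit-loop variant, assuming search() is only ever called from synchronous code (fetch_all is the coroutine added in this PR):

```python
import asyncio


def search(urls):
    # asyncio.run(fetch_all(urls)) is the simplest fix if this function is
    # never called from inside an already-running event loop. The explicit
    # variant below avoids asyncio.set_event_loop() entirely:
    loop = asyncio.new_event_loop()
    try:
        # run_until_complete drives this loop directly; coroutines see it
        # via asyncio.get_running_loop(), so no global registration needed.
        responses = loop.run_until_complete(fetch_all(urls))
    finally:
        loop.close()  # always release the loop's resources
    return responses
```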



2. One bad JSON kills search 🐞 Bug ☼ Reliability
Description
fetch_all() uses asyncio.gather() without return_exceptions=True, and fetch_json() doesn’t
catch JSON decode errors (ValueError), so a single indexer returning malformed JSON can cause
gather() to raise and search() to return no results at all. This defeats the goal of
parallelization because partial successes are discarded.
Code

app/prowlarr.py[R62-80]

+async def fetch_json(session: aiohttp.ClientSession, url: str) -> List[Dict[str, Any]]:
+    try:
+        async with session.get(url) as response:
+            data: Any = await response.json()
+            if not isinstance(data, list):
+                return []
+            return [r for r in data if isinstance(r, dict)]  # type: ignore[reportUnknownVariableType]
+    except (TimeoutError, aiohttp.ClientError):
+        return []
+
+
+async def fetch_all(urls: List[str]) -> List[List[Dict[str, Any]]]:
+    async with aiohttp.ClientSession(
+        timeout=aiohttp.ClientTimeout(total=5),
+        headers={"X-Api-Key": API_KEY},
+    ) as session:
+        tasks = [asyncio.ensure_future(fetch_json(session, url)) for url in urls]
+        return await asyncio.gather(*tasks)
+
Evidence
fetch_json() only catches (TimeoutError, aiohttp.ClientError); JSON decode failures are
ValueError subclasses and can escape the coroutine. Because fetch_all() uses
asyncio.gather(*tasks) without return_exceptions=True, any single escaping exception causes
gather() to raise and abort the whole batch. The outer search() catches ValueError and returns
the (still empty) magnet_links, dropping all other indexer results.

app/prowlarr.py[62-70]
app/prowlarr.py[73-80]
app/prowlarr.py[159-162]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
A single malformed JSON response from any indexer can make the entire parallel search fail and return empty results.

### Issue Context
- `fetch_json()` does `await response.json()` but does not catch `ValueError`/decode errors.
- `fetch_all()` uses `asyncio.gather()` without `return_exceptions=True`, so one uncaught exception aborts the entire gather.

### Fix Focus Areas
- app/prowlarr.py[62-70]
- app/prowlarr.py[73-80]

### Implementation guidance
Choose one (or combine):
1) Make `fetch_json()` resilient:
  - Catch `ValueError` (and optionally `json.JSONDecodeError`) in addition to existing exceptions and return `[]`.
  - Consider calling `response.raise_for_status()` before parsing so HTTP errors are handled consistently.
2) Make `fetch_all()` resilient:
  - Use `await asyncio.gather(*coros, return_exceptions=True)`.
  - Filter exceptions and keep successful lists so partial results still return.

Goal: one indexer failing must not drop other indexers’ results.
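
A sketch combining both options, keeping the shape of the PR's helpers:

```python
import asyncio
from typing import Any, Dict, List

import aiohttp


async def fetch_json(session: aiohttp.ClientSession, url: str) -> List[Dict[str, Any]]:
    try:
        async with session.get(url) as response:
            response.raise_for_status()  # surface HTTP errors as ClientError
            data: Any = await response.json()
    except (TimeoutError, aiohttp.ClientError, ValueError):
        # ValueError also covers malformed JSON (json.JSONDecodeError subclasses it).
        return []
    if not isinstance(data, list):
        return []
    return [r for r in data if isinstance(r, dict)]


async def fetch_all(urls: List[str]) -> List[List[Dict[str, Any]]]:
    async with aiohttp.ClientSession(
        timeout=aiohttp.ClientTimeout(total=5),
        headers={"X-Api-Key": API_KEY},  # API_KEY as defined in app/prowlarr.py
    ) as session:
        results = await asyncio.gather(
            *(fetch_json(session, url) for url in urls),
            return_exceptions=True,  # one failing task can no longer abort the batch
        )
        # Keep only successful per-indexer lists; drop anything that still raised.
        return [r for r in results if isinstance(r, list)]
```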




Remediation recommended

3. Silent per-indexer failures 🐞 Bug ◔ Observability
Description
fetch_json() swallows timeouts/network errors and returns an empty list without logging which
URL/indexer failed, so missing results become un-debuggable in production. The only log occurs when
an exception escapes the async layer, which won’t happen for the common handled failure cases.
Code

app/prowlarr.py[R62-71]

+async def fetch_json(session: aiohttp.ClientSession, url: str) -> List[Dict[str, Any]]:
+    try:
+        async with session.get(url) as response:
+            data: Any = await response.json()
+            if not isinstance(data, list):
+                return []
+            return [r for r in data if isinstance(r, dict)]  # type: ignore[reportUnknownVariableType]
+    except (TimeoutError, aiohttp.ClientError):
+        return []
+
Evidence
The async request path returns [] on handled failures with no log line, and the outer search()
logger only triggers for exceptions that propagate out of the async tasks. This creates a blind spot
where searches appear to “work” but silently omit indexers.

app/prowlarr.py[62-71]
app/prowlarr.py[159-162]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Per-indexer failures (timeouts, connection errors) are swallowed without any logging, making it hard to tell why results are missing.

### Issue Context
`fetch_json()` catches exceptions and returns `[]` without logging URL/indexer context. The outer `search()` log only runs if an exception escapes.

### Fix Focus Areas
- app/prowlarr.py[62-71]

### Implementation guidance
- Add a `log.debug(...)` (or appropriate level) in the `except` block including:
  - the URL (or extracted indexer ID)
  - exception type/message
- If you add `response.raise_for_status()`, consider logging status codes for non-2xx responses as well.
- Ensure logging volume is bounded (e.g., debug level only).
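
A sketch of that except block, assuming the module already has a standard logger (the `log` name mirrors the existing search() logging mentioned above):

```python
import logging
from typing import Any, Dict, List

import aiohttp

log = logging.getLogger(__name__)  # assumed; reuse the module's existing logger


async def fetch_json(session: aiohttp.ClientSession, url: str) -> List[Dict[str, Any]]:
    try:
        async with session.get(url) as response:
            data: Any = await response.json()
    except (TimeoutError, aiohttp.ClientError, ValueError) as exc:
        # Debug level keeps volume bounded while recording which URL failed and why.
        log.debug("Indexer request failed for %s: %s: %s", url, type(exc).__name__, exc)
        return []
    if not isinstance(data, list):
        return []
    return [r for r in data if isinstance(r, dict)]
```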





hauxir merged commit 17568cb into master on May 8, 2026 (2 checks passed)

hauxir deleted the prowlarr-async branch on May 8, 2026 at 15:44

