You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+62Lines changed: 62 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,6 +4,7 @@
4
4
5
5
- a one-shot CLI for simple embedding requests
6
6
- a long-lived local HTTP server for repeated requests
7
+
- a long-lived stdio daemon for process-managed IPC
7
8
- release packaging for major desktop and server operating systems
8
9
9
10
The first release is intentionally operational rather than retrieval-quality-complete. It focuses on a stable interface, model bootstrapping, hello-world inference, and releasable artefacts.
The daemon exits cleanly on `shutdown` or when `stdin` reaches EOF.
208
+
149
209
## Cache directory resolution
150
210
151
211
Model cache resolution order:
@@ -198,5 +258,7 @@ The repository includes two workflows:
198
258
## Troubleshooting
199
259
200
260
- The first `embed` or `serve` invocation downloads model files into the local cache. This can take a while on a cold machine.
261
+
- The first `daemon` invocation also downloads model files into the local cache if they are not already present.
201
262
- If model loading fails, check network access to Hugging Face and confirm the cache directory is writable.
263
+
- Long-lived modes support `--log-level` and `--log-file`. Without `--log-file`, `serve` and `daemon` use a best-effort OS log sink and fall back to `stderr` if the native sink is unavailable.
202
264
- The runtime does not log input texts by default.
0 commit comments