Optimize the VLM Inference Server for your workload.
See the Performance section of the README for current benchmarks.
A full performance optimization guide is coming soon.
Topics to be covered, each with an illustrative sketch after this list:

- GPU vs CPU tradeoffs (see the device-selection sketch below)
- Batch size optimization (see the dynamic-batching sketch below)
- Memory management (see the memory-tracking sketch below)
- Caching strategies (see the embedding-cache sketch below)
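
Until the full guide lands, here is a minimal device-selection sketch for the GPU vs CPU tradeoff, assuming a PyTorch-based server. The function name `pick_device` and the `min_free_gib` threshold are illustrative assumptions, not part of this project's API: the idea is to prefer the GPU only when it has enough free memory, and fall back to CPU otherwise.

```python
import torch

def pick_device(min_free_gib: float = 4.0) -> torch.device:
    """Prefer the GPU when it has enough free memory; otherwise use CPU.

    Small models and tiny batches can run acceptably on CPU, while the
    GPU pays off once batch sizes and image resolutions grow.
    """
    if torch.cuda.is_available():
        # mem_get_info() returns (free, total) in bytes for the current device
        free_bytes, _total_bytes = torch.cuda.mem_get_info()
        if free_bytes / 2**30 >= min_free_gib:
            return torch.device("cuda")
    return torch.device("cpu")
```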
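
A common approach to batch size optimization is dynamic batching: hold the first request briefly so later arrivals can join the same batch. The sketch below assumes an asyncio-based request queue; `collect_batch`, `max_batch_size`, and `max_wait_ms` are hypothetical names, not this server's interface.

```python
import asyncio

async def collect_batch(queue: asyncio.Queue,
                        max_batch_size: int = 8,
                        max_wait_ms: float = 10.0) -> list:
    """Gather up to max_batch_size requests, waiting at most
    max_wait_ms after the first request arrives."""
    batch = [await queue.get()]  # block until the first request
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_wait_ms / 1000.0
    while len(batch) < max_batch_size:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), remaining))
        except asyncio.TimeoutError:
            break  # timed out waiting for more requests; run what we have
    return batch
```

Larger batches raise throughput but add latency for the first request in the batch, so the timeout is the knob to tune against your latency budget.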
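
For memory management, a simple starting point is measuring peak GPU memory per request and releasing PyTorch's cached allocations between large jobs. A minimal sketch, assuming PyTorch and a single CUDA device; the `track_gpu_memory` name is hypothetical.

```python
import contextlib
import torch

@contextlib.contextmanager
def track_gpu_memory(tag: str):
    """Report peak GPU memory used inside the block, then release
    PyTorch's cached allocations back to the driver."""
    if not torch.cuda.is_available():
        yield
        return
    torch.cuda.reset_peak_memory_stats()
    try:
        yield
    finally:
        peak_gib = torch.cuda.max_memory_allocated() / 2**30
        print(f"[{tag}] peak GPU memory: {peak_gib:.2f} GiB")
        torch.cuda.empty_cache()  # free cached blocks between large jobs
```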
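
For caching, repeated images are a natural target in a VLM server: the vision encoder's output can be reused when the same image arrives again. Below is a minimal LRU sketch keyed by the hash of the raw image bytes; `EmbeddingCache` and its parameters are illustrative, not this project's API.

```python
import hashlib
from collections import OrderedDict

class EmbeddingCache:
    """A tiny LRU cache keyed by the SHA-256 of the raw image bytes,
    so repeated uploads of the same image skip the vision encoder."""

    def __init__(self, max_entries: int = 1024):
        self.max_entries = max_entries
        self._store = OrderedDict()  # maps hex digest -> embedding

    def get(self, image_bytes: bytes):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self._store:
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        return None

    def put(self, image_bytes: bytes, embedding) -> None:
        key = hashlib.sha256(image_bytes).hexdigest()
        self._store[key] = embedding
        self._store.move_to_end(key)
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
```

In practice the server would check the cache before running the vision encoder and store the result on a miss; whether this pays off depends on how often clients resend the same image.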