
Performance Tuning Guide

Optimize the VLM Inference Server for your workload.

Benchmarks

See the Performance section of the README for current benchmarks.

Tuning

A full performance optimization guide is coming soon.

Topics to be covered:

  • GPU vs CPU tradeoffs
  • Batch size optimization
  • Memory management
  • Caching strategies
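Until the full guide lands, the batch size topic above can be illustrated with a minimal sketch: sweep candidate batch sizes and compare measured throughput. The `run_batch` function below is a hypothetical stand-in (a timed sleep that mimics fixed per-batch overhead plus per-item cost), not part of the VLM Inference Server's actual API; swap in a real inference call for your deployment.

```python
# Hedged sketch: compare throughput across batch sizes.
# `run_batch` is a placeholder workload, NOT the server's real API.
import time


def run_batch(batch_size: int) -> None:
    # Simulate fixed per-batch overhead (5 ms) plus per-item cost (1 ms),
    # which is roughly how batching amortizes setup time in practice.
    time.sleep(0.005 + 0.001 * batch_size)


def measure_throughput(batch_size: int, batches: int = 5) -> float:
    # Return items processed per second for the given batch size.
    start = time.perf_counter()
    for _ in range(batches):
        run_batch(batch_size)
    elapsed = time.perf_counter() - start
    return (batch_size * batches) / elapsed


if __name__ == "__main__":
    results = {bs: measure_throughput(bs) for bs in (1, 4, 16, 64)}
    for bs, tput in results.items():
        print(f"batch={bs:>3}  throughput={tput:,.0f} items/s")
```

Under this cost model larger batches always win; on real hardware throughput eventually plateaus or drops once memory or latency limits are hit, which is where the GPU/CPU and memory-management topics above come in.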