Refactor DNN/transformer stack: BuildContext, chunked prefill
- Major refactor: explicit BuildContext, TrainingMode, buffer management
- Enable efficient chunked prefill/inference in transformer/attention
- Replace std::vector shapes with fixed-capacity TensorShape types
- Add modular C++/CUDA attention, positional, and tensor op interfaces
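To illustrate the std::vector to fixed-capacity shape change, here is a minimal sketch of what such a type could look like. It is not the code from this commit: the name `FixedShape`, the default capacity of 8, and every member function are assumptions made for illustration. The idea is that the dimensions live in a `std::array` inside the object, so building or copying a shape never touches the heap.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>

// Hypothetical sketch: the name, the default capacity of 8, and the
// interface are illustrative assumptions, not the commit's actual types.
template <std::size_t MaxRank = 8>
class FixedShape {
public:
    FixedShape() = default;

    FixedShape(std::initializer_list<std::int64_t> dims) {
        for (std::int64_t d : dims) push_back(d);
    }

    // Append a dimension; throws rather than allocating when full.
    void push_back(std::int64_t dim) {
        if (rank_ == MaxRank) {
            throw std::length_error("FixedShape: rank exceeds capacity");
        }
        dims_[rank_++] = dim;
    }

    std::size_t rank() const { return rank_; }

    // Bounds are checked only in debug builds (assert).
    std::int64_t operator[](std::size_t i) const {
        assert(i < rank_);
        return dims_[i];
    }

    // Total element count; a rank-0 shape is treated as a scalar (1).
    std::int64_t numel() const {
        std::int64_t n = 1;
        for (std::size_t i = 0; i < rank_; ++i) n *= dims_[i];
        return n;
    }

    friend bool operator==(const FixedShape& a, const FixedShape& b) {
        if (a.rank_ != b.rank_) return false;
        for (std::size_t i = 0; i < a.rank_; ++i) {
            if (a.dims_[i] != b.dims_[i]) return false;
        }
        return true;
    }

private:
    std::array<std::int64_t, MaxRank> dims_{};  // inline storage, no heap
    std::size_t rank_ = 0;
};
```

With this sketch, `FixedShape<4> s{2, 3, 5};` gives `s.rank() == 3` and `s.numel() == 30`, and a shape with more dimensions than the capacity throws `std::length_error` instead of growing. The trade-off is a hard upper bound on rank in exchange for allocation-free, trivially copyable shapes.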
BREAKING CHANGE:
Updates all component, tensor, and operator APIs to use BuildContext, TrainingMode, and shape_t;
removes legacy interfaces and changes the model/component build and training lifecycle. Existing code must be updated for the new APIs.