Thanks, Sarthak! With request batching, I got almost the same response times as in the single-instance case. I went up to batches of 8. Probably the tensors are very small in size and batching doesn't… - Vasu Sharma - Medium

Great article, Vasu. I found your ideas and analysis very insightful.
Sarthakmadaan
Vasu Sharma
Oct 22, 2024
Thanks, Sarthak!

With request batching, I got almost the same response times as in the single-instance case. I went up to batches of 8. The tensors are probably very small, so batching doesn't really increase any GPU load in this case.
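
One way to picture this reasoning is a toy latency model. The sketch below is only an illustration: the two constants and the `modeled_latency_ms` helper are hypothetical, the numbers are made up rather than measured, and a linear model is an assumption about how such a setup might behave, not a description of the actual service.

```python
# Toy latency model. Every number here is a made-up assumption, not a measurement.
FIXED_OVERHEAD_MS = 5.0      # hypothetical per-call cost (launch, transfer, framework overhead)
PER_ITEM_COMPUTE_MS = 0.01   # hypothetical extra GPU time per item for a very small tensor


def modeled_latency_ms(batch_size: int) -> float:
    """Latency of one call under the model: fixed overhead plus per-item compute."""
    return FIXED_OVERHEAD_MS + PER_ITEM_COMPUTE_MS * batch_size


# Compare the modeled latency for the batch sizes mentioned above (up to 8).
for batch_size in (1, 2, 4, 8):
    print(f"batch={batch_size}: {modeled_latency_ms(batch_size):.2f} ms")
```

Under these assumed numbers, a batch of 8 takes under 2% longer than a single request, which would look like "almost the same response time" in practice. With larger tensors the per-item term would grow and the gap would widen.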


Breaking the GIL would be fun!