can karakus

linkedin github scholar strava	can at contextual dot ai
	bio
	2018 Ph.D., Electrical Engineering, UCLA
	2013 M.S., Electrical Engineering, UCLA
	2011 B.S., Electrical Engineering, Bilkent University

	I am a Member of Technical Staff at Contextual AI, working on building retrieval-augmented AI systems for enterprise use cases. Previously I was a Senior Applied Scientist at AWS AI, where I worked extensively on ML systems and distributed LLM training and inference, including designing and building the SageMaker model parallelism library and helping build the Neuron distributed training capabilities. I obtained my PhD working with Suhas Diggavi on information theory and distributed optimization. I was a visitor at EPFL for the summers of 2010 and 2012, and previously held internships at Qualcomm Corporate R&D, San Diego, CA (2015), and Technicolor Research, Los Altos, CA (2016). I am a recipient of the 2011 UCLA Graduate Division Fellowship, the 2013 UCLA EE Department Fellowship, and the 2015 Qualcomm Roberto Padovani Scholarship.

	I have broad technical interests spanning both theory and systems research, including machine learning, optimization, distributed systems, applied probability, information theory, and wireless networks.

	In the past I have served on the technical program committees of NeurIPS and ICML, and on the technical steering committee of the open-source distributed machine learning framework Horovod. I have also served as a technical consultant for the Derspresso project.
	conference
	Marconi: Prefix Caching for the Era of Hybrid LLMs. R. Pan, Z. Wang, Z. Jia, C. Karakus, L. Zancato, T. Dao, Y. Wang, R. Netravali. Preprint '24
	MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent. K. Ozkara, C. Karakus, P. Raman, M. Hong, S. Sabach, B. Kveton, V. Cevher. International Conference on Machine Learning (ICML) '24
	Amazon SageMaker Model Parallelism: A General and Flexible Framework for Large Model Training. C. Karakus, R. Huilgol, F. Wu, A. Subramanian, C. Daniel, D. Cavdar, T. Xu, H. Chen, A. Rahnama, L. Quintela. Preprint '21
	Herring: Rethinking the Parameter Server at Scale for the Cloud. I. Thangakrishnan, D. Cavdar, C. Karakus, P. Ghai, Y. Selivonchyk, C. Pruce. International Conference for High Performance Computing, Networking, Storage and Analysis '20
	Qsparse-local SGD: Distributed SGD with Quantization, Sparsification, and Local Computations. D. Basu, D. Data, C. Karakus, S. Diggavi. Neural Information Processing Systems '19
	Densifying Assumed-sparse Tensors: Improving Memory Efficiency and MPI Collective Performance for Parallelized Training. D. Cavdar, V. Codreanu, C. Karakus, J.A. Lockman III, D. Podareanu, V. Saletore, A. Sergeev, D.D. Smith II, V. Suthichai, Q. Ta, S. Varadharajan, L. A. Wilson, R. Xu, P. Yang. International Conference on High Performance Computing '19
	Privacy-utility trade-off of linear regression under random projections and additive noise. M. Showkatbakhsh, C. Karakus, S. Diggavi. IEEE International Symposium on Information Theory '18
	Straggler mitigation in distributed optimization through data encoding. C. Karakus, Y. Sun, S. Diggavi, W. Yin. Neural Information Processing Systems '17 (Spotlight)
	Encoded distributed optimization. C. Karakus, Y. Sun, S. Diggavi. IEEE International Symposium on Information Theory '17
	Approximately achieving feedback interference channel capacity with point-to-point codes. J. Sebastian, C. Karakus, S. Diggavi. IEEE International Symposium on Information Theory '16
	Rate splitting is approximately optimal for fading Gaussian interference channels. J. Sebastian, C. Karakus, S. Diggavi, I.-H. Wang. Allerton Conference on Computing, Communication and Control '15
	Opportunistic scheduling for full-duplex uplink-downlink networks. C. Karakus, S. Diggavi. IEEE International Symposium on Information Theory '15
	An achievable rate region for Gaussian interference channel with intermittent feedback. C. Karakus, I.-H. Wang, S. Diggavi. Allerton Conference on Computing, Communication and Control '13
	Interference channel with intermittent feedback. C. Karakus, I.-H. Wang, S. Diggavi. IEEE International Symposium on Information Theory '13
	Shifting network tomography toward a practical goal. D. Ghita, C. Karakus, K. Argyraki, P. Thiran. ACM International Conference on Emerging Network Experiments and Technologies '11
	journal
	Qsparse-local SGD: Distributed SGD with Quantization, Sparsification, and Local Computations. D. Basu, D. Data, C. Karakus, S. Diggavi. IEEE Journal on Selected Areas in Information Theory '20
	Differentially private consensus-based distributed optimization. M. Showkatbakhsh, C. Karakus, S. Diggavi. Preprint
	Redundancy techniques for straggler mitigation in distributed optimization and learning. C. Karakus, Y. Sun, S. Diggavi, W. Yin. Journal of Machine Learning Research '19
	Approximate capacity of fast fading interference channels with no CSIT. J. Sebastian, C. Karakus, S. Diggavi. IEEE Transactions on Communications '18
	Enhancing multiuser MIMO through opportunistic D2D cooperation. C. Karakus, S. Diggavi. IEEE Transactions on Wireless Communications '17
	Gaussian interference channel with intermittent feedback. C. Karakus, I.-H. Wang, S. Diggavi. IEEE Transactions on Information Theory '15