Krishna Kanth Nakka

I’m currently working in the Privacy Team at the Trustworthy Technology Lab, Huawei Munich Research Center, where I focus on the privacy and safety of large language models (LLMs). My current research includes studying privacy leakage in LLMs, unlearning of sensitive information, text anonymization, and understanding LLMs through mechanistic interpretability.

I graduated with a PhD in Computer Science in August 2022 from the Computer Vision Lab at EPFL. I was supervised by Dr. Mathieu Salzmann and Prof. Pascal Fua. My thesis focused on the robustness and interpretability of ML models.

Following the completion of my PhD, I worked as a postdoctoral scientist at the Visual Intelligence for Transportation Lab (VITA) at EPFL, under the supervision of Prof. Alexandre Alahi, for eight months, until April 2023.

Before joining EPFL in 2017, I spent two years at Samsung Research Bangalore working on mobile camera algorithms. Prior to that, I graduated from the Department of Electrical Engineering at IIT Kharagpur in 2015 with a dual degree (Master’s and Bachelor’s). During my undergraduate years, I interned at the University of Alberta, the University of Queensland, and Philips Research.

Email  /  CV  /  Google Scholar  /  Github /  LinkedIn /  Thesis  /  Thesis Slides

Research

My research interests lie in developing models that are robust and interpretable, particularly for safety- and security-critical applications. Currently, my work focuses on enhancing the privacy of Large Language Models (LLMs). I am particularly interested in understanding the causes of memorization and privacy leakage in LLMs and exploring interpretable methods to mitigate these issues.

During my PhD, I investigated the vulnerabilities of deep neural networks, especially their performance in unexpected or adversarial scenarios, to improve their robustness. My research spanned topics such as explainable models, transfer-based black-box attacks, attack detection, adversarial defenses, anomaly detection, and testing disentangled representations. At VITA, I worked on human pose estimation, tracking, and re-identification, primarily in the context of team sports analytics.

NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability
Krishna Kanth Nakka, Alexandre Alahi
WACV 2025

We introduce Neuron Attack for Transferability (NAT), a method designed to target specific neurons within the feature embedding. Our approach is motivated by the observation that previous layer-level optimizations often disproportionately focus on a few neurons representing similar concepts, leaving other neurons within the attacked layer minimally affected. NAT shifts the focus from embedding-level separation to a more fundamental, neuron-specific approach.

PII-Scope: A Benchmark for Training Data PII Leakage Assessment in LLMs
Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendis, Xue Jiang, Xuebing Zhou
arXiv 2024
Paper

We introduce PII-Scope, a comprehensive benchmark designed to evaluate state-of-the-art methodologies for PII extraction attacks targeting LLMs across diverse threat settings. Our study provides a deeper understanding of these attacks by uncovering several hyperparameters (e.g., demonstration selection) crucial to their effectiveness. We show that with sophisticated adversarial capabilities and a limited query budget, PII extraction rates can increase by up to fivefold when targeting the pretrained model.

ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets
Ahmed Frikha, Nassim Walha, Ricardo Mendis, Krishna Kanth Nakka, Xue Jiang, Xuebing Zhou
arXiv 2024
Paper

We propose ObfuscaTune, a novel, efficient, and fully utility-preserving approach that combines a simple yet effective obfuscation technique with an efficient use of confidential computing (only 5% of the model parameters are placed on a TEE) to protect LLM ownership and client data privacy.

IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization
Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendis, Xue Jiang, Xuebing Zhou
Safe Generative AI Workshop, NeurIPS 2024
Paper

We propose IncogniText, an LLM-based anonymization technique that rewrites text to mislead a potential adversary into predicting a wrong private attribute value.

PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding
Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendis, Xue Jiang, Xuebing Zhou
Privacy in NLP Workshop, ACL 2024
Paper

We empirically demonstrate that the extractability of PII can be improved by over tenfold by grounding the prefix of the manually constructed extraction prompt with in-domain data.

Federated Hyperparameter Optimization Through Reward-Based Strategies: Challenges and Insights
Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendis, Xue Jiang, Xuebing Zhou
FedVision Workshop, CVPR 2024
Paper

In this paper, we take a deeper look at reward-based strategies and systematically analyze them, uncovering several issues and challenges associated with their adoption in practice. Furthermore, motivated by the insights from our analysis, we propose an in-depth evaluation of the policy distribution with metrics that capture rankings of standalone configurations.

Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation
Krishna Kanth Nakka, Mathieu Salzmann
Preprint, 2022
Paper

Our analyses show that disentanglement in three state-of-the-art disentangled representation learning frameworks is far from complete, and that their pose codes contain significant appearance information.

Universal, Transferable Adversarial Attacks for Visual Object Trackers
Krishna Kanth Nakka, Mathieu Salzmann
Adversarial Robustness Workshop, European Conference on Computer Vision (ECCV), 2022
Paper

We propose to learn to generate a single perturbation from the object template only, that can be added to every search image and still successfully fool the tracker for the entire video. As a consequence, the resulting generator outputs perturbations that are quasi-independent of the template, thereby making them universal perturbations.

Learning Transferable Adversarial Perturbations
Krishna Kanth Nakka, Mathieu Salzmann
Neural Information Processing Systems (NeurIPS), 2021
arXiv / code

We show that generators trained with a mid-level feature separation loss transfer significantly better in cross-model, cross-domain, and cross-task settings.

Towards Robust Fine-grained Recognition by Maximal Separation of Discriminative Features
Krishna Kanth Nakka, Mathieu Salzmann
Asian Conference on Computer Vision (ACCV), 2020
arXiv / code / Slides

We improve the robustness by introducing an attention-based regularization mechanism that maximally separates the latent features of discriminative regions of different classes while minimizing the contribution of the non-discriminative regions to the final class prediction.

Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Krishna Kanth Nakka, Mathieu Salzmann
European Conference on Computer Vision (ECCV), 2020 [Spotlight]
arXiv / code / Slides

We show that context-aware segmentation networks are sensitive not only to global attacks, where perturbations affect the entire input image, but also to indirect local attacks, where perturbations are confined to a small image region that does not overlap with the area that we aim to fool.

Detecting the Unexpected via Image Resynthesis
Krzysztof Lis, Krishna Kanth Nakka, Pascal Fua and Mathieu Salzmann
International Conference on Computer Vision (ICCV), 2019
arXiv / code / Poster

We rely on the intuition that the network will produce spurious labels in regions depicting unexpected, anomalous objects. Therefore, resynthesizing the image from the resulting semantic map will yield significant appearance differences with respect to the input image, which we detect through an auxiliary network.

Interpretable BoW Networks for Adversarial Example Detection
Krishna Kanth Nakka and Mathieu Salzmann
Explainable and Interpretable AI workshop, ICCV, 2018 [Oral]
arXiv / Slides

We build upon the intuition that, while adversarial samples look very similar to real images, to produce incorrect predictions, they should activate codewords with a significantly different visual representation. We therefore cast the adversarial example detection problem as that of comparing the input image with the most highly activated visual codeword.

Deep Attentional Structured Representation Learning for Visual Recognition
Krishna Kanth Nakka and Mathieu Salzmann
British Machine Vision Conference (BMVC), 2018
arXiv / Poster

We introduce an attentional structured representation learning framework that incorporates an image-specific attention mechanism within the aggregation process.

Deep learning based fence segmentation and removal from an image using a video sequence
SankarGanesh Jonna, Krishna Kanth Nakka and Rajiv Ranjan Sahay
International Workshop on Video Segmentation, ECCV, 2016 [Oral]
arXiv / Slides

We use knowledge of the spatial locations of fences to subsequently estimate occlusion-aware optical flow. We then fuse the occluded information from neighbouring frames by solving an inverse problem of denoising.

Detection and removal of fence occlusions in an image using a video of the static/dynamic scene
SankarGanesh Jonna, Krishna Kanth Nakka and Rajiv Ranjan Sahay
Journal of the Optical Society of America A (JOSA A), 2016
arXiv / PDF

Our approach to de-fencing is as follows: (i) detection of the spatial locations of fences/occlusions in the frames of the video, (ii) estimation of relative motion between the observations, and (iii) data fusion to fill in occluded pixels in the reference image. We model the de-fenced image as a Markov random field and obtain its maximum a posteriori estimate by solving the corresponding inverse problem.

My camera can see through fences: A deep learning approach for image de-fencing
SankarGanesh Jonna, Krishna Kanth Nakka and Rajiv Ranjan Sahay
Asian Conference on Pattern Recognition (ACPR), 2015
arXiv / PDF / Poster

We propose a semi-automated de-fencing algorithm using a video of the dynamic scene. The inverse problem of fence removal is solved using the split Bregman technique, assuming total variation of the de-fenced image as the regularization constraint.


3D-to-2D mapping for user interactive segmentation of human leg muscles from MRI data
Nilanjan Ray, Satarupa Mukherjee, Krishna Kanth Nakka, Scott T. Acton, Silvia S. Blanker
IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2014
arXiv / PDF

We propose a framework for user interactive segmentation of MRI of human leg muscles built upon the strategy of bootstrapping with minimal supervision.

Non-uniform sampling in EPR: optimizing data acquisition for Hyscore spectroscopy
Krishna Kanth Nakka, Y. A. Tesiram, I. M. Brereton, M. Mobli and J. R. Harmer
Physical Chemistry Chemical Physics (PCCP), 2014
Paper / PDF / Supp

We show that, by combining a non-linear sampling scheme with maximum entropy reconstruction in HYSCORE, experimental times can be shortened by approximately an order of magnitude compared to conventional linear sampling, with negligible loss of information.

Scholarships

I'm deeply grateful for the generous scholarships I received throughout my academic journey. Some of these scholarships include:

Reviewer

I have peer-reviewed over 100 articles, including:

  • Reviewer for Transactions on Pattern Analysis and Machine Intelligence, 2019, 2023, 2024
  • Reviewer for Neural Information Processing Systems (NeurIPS), 2021-2024
  • Reviewer for Computer Vision and Pattern Recognition (CVPR), 2023, 2024
  • Reviewer for International Conference on Computer Vision (ICCV), 2023
  • Reviewer for European Conference on Computer Vision (ECCV), 2024
  • Reviewer for AAAI, 2025
  • Reviewer for International Conference on Machine Learning (ICML), 2023, 2024
  • Reviewer for International Conference on Learning Representations (ICLR), 2024
  • Reviewer for Asian Conference on Computer Vision (ACCV), 2024
  • Reviewer for British Machine Vision Conference (BMVC), 2023, 2024
  • Reviewer for Winter Conference on Applications of Computer Vision (WACV), 2019, 2024, 2025
  • Reviewer for Asian Conference on Machine Learning (ACML), 2024
  • Reviewer for International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
  • Reviewer for LREC-COLING, 2024
  • Reviewer for COLING, 2025
  • Reviewer for New Frontiers in Machine Learning, ICML 2023
  • Reviewer for SafeGenAI workshop, NeurIPS 2024
  • Reviewer for MINT workshop, NeurIPS 2024
  • Reviewer for FedKDD workshop, KDD 2024
  • Reviewer for AutoRL workshop, ICML 2024
  • Reviewer for AI4CC workshop, CVPR 2024
  • Reviewer for PML workshop, ICLR 2024
Outside research

Credits: Webpage template from Jon Barron.