LLM Collusion and Identifiability in Code Generation

May 2025 – Mar 2026 (discontinued)

Investigates whether large language models can recognize or attribute code authored by themselves or other LLMs.
Implemented three attribution tasks: Self-Recognition, Target Identification, and Full Attribution.
Evaluated across multiple benchmarks (MBPP, HumanEval, DS-1000) with code generated by GPT-5, Claude, Gemini, Grok, and DeepSeek.
Built a scalable experimentation pipeline for multi-model code generation, attribution testing, and statistical analysis.
Fine-tuned lightweight judge models to identify model-specific code signatures, highlighting risks of collusion and leakage in code evaluation.