Taein Kim
2 indexed papers
Research Timeline
The paper presents a large-scale study demonstrating that tool cloning is a pervasive and severe source of hidden duplication in agent-tool ecosystems, necessitating changes in how tool diversity is measured.
The paper proposes a novel, scalable technique using unique canary tokens to automatically and accurately identify which web scrapers are feeding data to specific Large Language Models (LLMs).
Papers
Identifying AI Web Scrapers Using Canary Tokens
Steven Seiden, Triss Ren, Caroline Zhang, Taein Kim +2 more