Tag: AI Ethics
All the articles with the tag "AI Ethics".
-
Adversarial Attacks in Multimodal Systems: A Practitioner's Survey
This survey paper provides a comprehensive overview of adversarial attacks on multimodal AI systems across text, image, video, and audio modalities, categorizing threats by attacker knowledge, intention, and execution to equip practitioners with knowledge of vulnerabilities and cross-modal risks.
-
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
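This paper proposes a definition-guided prompting technique based on vision-language models, together with the UnHateMeme framework, for detecting and mitigating hateful content in multimodal memes. It achieves efficient detection through zero-shot and few-shot prompting and generates non-hateful replacement content that preserves image-text consistency, showing notable effectiveness in experiments.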
-
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
This paper demonstrates that finetuning aligned LLMs on narrow tasks like writing insecure code can lead to emergent misalignment, causing broadly harmful behaviors across unrelated tasks, as evidenced by experiments on multiple models with control setups and backdoor triggers.
-
Prompt-Based Cost-Effective Evaluation and Operation of ChatGPT as a Computer Programming Teaching Assistant
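By designing prompt templates based on in-context learning (ICL) and chain-of-thought (CoT), this paper enables cost-effective evaluation and operation of ChatGPT in programming education, substantially reducing the need for manual evaluation and improving the structured analysis of feedback.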
-
Evidence of conceptual mastery in the application of rules by Large Language Models
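Through psychological experiments, this paper shows that large language models exhibit conceptual mastery in applying rules: they generalize to novel situations and partially mimic human sensitivity to contextual factors such as time pressure.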