공정, 지속 가능, 신뢰할 수 있는 인공지능을 위한 안내서

✨ Goal

<aside> ▶️ 1. Fairness, Sustainability, Trustworthy AI에 대한 기본적인 개념 학습

공정한, 지속 가능한, 신뢰할 수 있는 AI에 대한 설명서(글, 영상, 포스트) 작성하기
프로젝트를 통해 기존 생성형 AI에 공정성, 지속 가능성, 신뢰성을 적용하는 능력 키우기

</aside>

📢 스터디 공지

<aside> 📢 Prompt Injection & Jailbreaking에 대한 모임은 4**/09, 화요일 20:30** 에 예정되어 있습니다.

계획표 : Hacking LLM with Prompts (방법론)
장소 : 디스코드 채널 GH - 청강 가능

다음 모임을 위해 Red Teaming LLM Application을 준비해주세요!

</aside>

지난 공지

💫

안내서 열어보기

출석 체크

팀 멤버

⚽ Ground Rule

<aside> 📢 스터디 규칙

시간을 지켜주세요.
스터디원을 존중해주세요.
튜토리얼/안내서(글,영상) - (후기가 아닙니다!!) 를 📚**가짜 도서관에 3회 미만 제출 시 수료 불가합니다**
적극적으로 소통해주세요 </aside>

⚽ Study Rule

<aside> 📢 스터디 모임 규칙

스터디의 원할한 의견 교류를 위해 실명 + 캠이 활성화 된 상태에서 참여해주세요
발표는 20분 이상, 발표에 대한 의견 교류는 5분 이상 10분 이하로 진행합니다.
발표 내용은 녹화되어 유튜브에, 발표 자료는 블로그, PPT, SNS 포스트를 ****지향합니다
- 노션 페이지를 그대로 발표하는 것은 🚫지양⛔합니다

</aside>

👟 학습 활동

스터디 (시작 전)

주제에 해당되는 내용을 공부합니다.
- 학습에 참고한 자료 중 추천하고 싶은 좋은 논문은 디스코드 논문-연구회에 추천 이유와 주요 인사이트와 함께 기록해주세요
- 해당 주제를 공부하는데, 좋은 자료가 있다면, 에 남겨주세요

스터디 (모임)

발표자는 학습한 내용을 정리해 만든 안내서를 개인 블로그/유튜브/SNS에 업로드 후 가짜 도서관에 공유합니다.
발표자의 안내서 소개가 끝난 후에는,
- 청강/스터디 참여자는 안내서에 대한 질문이나 같이 알아두면 좋은 내용을 토의, 공유합니다.

스터디(모임 이후)

발표자의 안내서에 대한 댓글을 달아주세요!
- 새롭게 배운 부분이나, 질문한 부분, 토의한 부분 등을 댓글로 달아주세요
- 권장 ) 공개된 안내서(블로그, 유튜브, SNS)에 댓글을 달고 같은 내용을 가짜 도서관 에도 달아줍니다.
- 참고한 사이트 : 고려대학교 DBSA 연구실(링크)

프로젝트

프로젝트 수행 과정 및 결과를 튜토리얼로 만들어 Github에 업로드합니다.

📜 모임 계획표

Untitled

Fairness, Sustainability, Trustworthy에 대한 자료 & 링크

주차	자료 설명 및 링크
OT	OECD AI Principles overview
Trustworthy Language Model
Jailbroken : How does llm safety training fail? (NeurlPS ‘23 Oral Paper)
2024 신뢰할 수 있는 인공지능 개발 안내서
1주차 - Fariness AI	영상 : Building fair, ethical, and responsible AI with the Responsible AI Toolkit
논문 : Preventing Discriminatory Decision-making in Evolving Data Streams
블로그 : 사람과 공존하는 AI의 필요조건, AI 공정성
2주차 - Sustainability AI	블로그 : How to Make Generative AI Greener
논문 : The role of artificial intelligence in achieving the Sustainable Development Goals
3주차 - Trustworthy AI	논문 : Trustworthy AI : From Principles to Practices
협약 : OECD AI Principles overview
영상 : MIT 6.S191 : Robus and Trustworthy Deep Learning
Trustworthy Language Model
저작권협회 생성형 AI 저작권 안내
4주차 - 프로젝트 팀 빌딩 및 토의	영상 : Generative AI meets Responsible AI : Practical Challenges and Opportunities
detection model : Huggingface Prompt injection dataset
프롬프트 Gandalf Lakera
5주차 - Jailbreaking w/ Prompts	AntiGPT
블로그 : Jailbreaking Large Language Models
Github : ChatGPT_DAN 해당 논문 : Do Anything Now
논문 : FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts
6주차 - Fairness in Gen AI	Measuring Fairness in Generative Models
On The Impact of Machine Learning Randomness on Group Fairness
[Data quality and artificial
intelligence – mitigating bias
and error to protect
fundamental rights](https://fra.europa.eu/sites/default/files/fra_uploads/fra-2019-data-quality-and-ai_en.pdf)

7주차 - Sustainability AI	Generative AI in energy, natural resources, and chemical
8주차 - Trustworthy Gen AI	[On Evaluating Adversarial Robustness of
Large Vision-Language Model](https://yunqing-me.github.io/AttackVLM/)
OpenReview : Jailbreak in pieces
9주차 - 중간 리뷰/프로젝트 회의
10,11주차 - Gen AI project for avoiding toxicity	영상 : Generative AI meets Responsible AI : Practical Challenges and Opportunities
논문 : Can LLM Recognize Toxicity? Structured Toxicity Investigation Framework and Semantic-Based Metric
Tutorial : Building a Dataset to Measure Toxicity and Social Bias within Language
12,13주차 - LLM trustworthy project	Constitutional AI: Harmlessness from AI Feedback
Jailbreak in pieces : Compositional Adversarial Attacks on Multi-Modal Language Models
Langchain-Safety
7 methods to secure LLM apps from prompt injections and jailbreaks
Label Errors in ML Test Sets

📚 자료 아카이빙

Untitled

Build a real-time RAG chatbot using Google Drive and Sharepoint