OpenAI Docs

2023. 1. 29. 19:02

https://platform.openai.com/docs
GET STARTED
- Models : ...
CAPABILITIES (TEXT)
- Text generation
  - Chat Completions : how to structure the 'system', 'user', and 'assistant' roles and their prompt messages.
  - ~~JSON mode : gpt-4-turbo or earlier models~~
    - To force JSON output, state "JSON" explicitly in a system or user message. (Leave it out and you get an exception, heh.)
    - e.g. {"role": "system", "content": "You are a helpful assistant designed to output JSON."}
    - JSON output can still be cut off (by token limits, etc.), so always check finish_reason.
    - JSON output does not guarantee any particular schema either, so a final parse/validity check still seems necessary.
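  - Reproducible outputs :
    - Outputs are non-deterministic by default, so responses inevitably differ a little from request to request.
    - To get (mostly) deterministic output, pass the seed parameter and compare the system_fingerprint returned with each response.
  - Managing tokens : token counts can be computed in advance with something like OpenAI's tiktoken Python library...
  - Parameter details : tuning frequency_penalty and presence_penalty & tuning log probabilities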
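The JSON mode notes above (check finish_reason, then verify that the output really parses) can be sketched as a small helper. This is only a sketch under assumptions: it takes a response that has already been converted to a plain dict shaped like a Chat Completions response, and `parse_json_reply` and `required_keys` are hypothetical names introduced here, not part of any OpenAI library.

```python
import json


def parse_json_reply(response: dict, required_keys: tuple = ()) -> dict:
    """Return the parsed JSON object from the first choice, or raise ValueError."""
    choice = response["choices"][0]
    reason = choice["finish_reason"]

    # Anything other than "stop" (for example "length") means the text may be cut off.
    if reason != "stop":
        raise ValueError(f"incomplete output: finish_reason={reason!r}")

    try:
        data = json.loads(choice["message"]["content"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc

    # JSON mode does not guarantee a schema, so the expected keys are checked by hand.
    if not isinstance(data, dict):
        raise ValueError("reply is valid JSON but not an object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Hand-written sample responses (assumed shapes, not real API output).
ok = {"choices": [{"finish_reason": "stop", "message": {"content": '{"winner": "A"}'}}]}
truncated = {"choices": [{"finish_reason": "length", "message": {"content": '{"winner": "'}}]}

print(parse_json_reply(ok, required_keys=("winner",)))
```

Calling `parse_json_reply(truncated)` raises ValueError before any parsing is attempted, which is the point of checking finish_reason first.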
- Structured Outputs
  - response_format supports 'json_schema', Pydantic classes, and more!
  - Structured vs JSON mode : only Structured Outputs ensures schema adherence.
    - e.g. CoT, Extraction, UI, Moderation
  - ...
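A response_format of type 'json_schema' can be sketched as a plain dict. Assumptions to note: the helper `build_response_format`, the schema name `extraction_result`, and the fields are made up for illustration, and the strict-mode layout used here (every property listed as required, additionalProperties set to false) is my reading of the Structured Outputs guide, so it is worth checking against the current docs before relying on it.

```python
import json


def build_response_format(name: str, properties: dict) -> dict:
    """Build a response_format dict of type 'json_schema' (hypothetical helper)."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": properties,
                # Assumed strict-mode rules: all keys required, no extra keys.
                "required": list(properties),
                "additionalProperties": False,
            },
        },
    }


# Illustrative schema for an "Extraction" style task.
response_format = build_response_format(
    "extraction_result",
    {"title": {"type": "string"}, "year": {"type": "integer"}},
)
print(json.dumps(response_format, indent=2))
```

The resulting dict would be passed as the response_format argument of a chat completion request; unlike JSON mode, the schema itself is what constrains the output.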
- Predicted Outputs
- Vision
  - Base64 encoded images
  - Multiple images
  - ...
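The two Vision items above (Base64 encoded images, multiple images) can be sketched as one user message with several content parts. A rough sketch, assuming the Chat Completions content-part layout with `image_url` data URLs; `image_part` and `vision_message` are hypothetical helper names, and the byte strings are placeholders rather than real images.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as a base64 data-URL content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt: str, images: list) -> dict:
    """Build one user message: a text part followed by one part per image."""
    content = [{"type": "text", "text": prompt}]
    content.extend(image_part(data) for data in images)
    return {"role": "user", "content": content}


# Placeholder bytes stand in for the contents of real image files.
message = vision_message("What differs between these images?", [b"fake-1", b"fake-2"])
print(len(message["content"]))  # 3
```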
- Moderation
  - text | image -> {omni-moderation-latest} -> flag & categories
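Reading the "flag & categories" part of a moderation result might look like the sketch below. It assumes the response has been converted to a plain dict with a `results` list whose items carry `flagged` and `categories`; the sample data is hand-written for illustration, and `flagged_categories` is a hypothetical helper name.

```python
def flagged_categories(moderation_response: dict) -> list:
    """Return the sorted names of categories marked True in the first result."""
    result = moderation_response["results"][0]
    if not result["flagged"]:
        return []
    return sorted(name for name, hit in result["categories"].items() if hit)


# Hand-written sample (assumed shape, not real API output).
sample = {
    "results": [
        {
            "flagged": True,
            "categories": {"harassment": False, "violence": True, "self-harm": False},
        }
    ]
}
print(flagged_categories(sample))  # ['violence']
```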
- Reasoning
CAPABILITIES (IMAGE)
- Image generation
CAPABILITIES (AUDIO)
- Audio generation
  - input audio -> {Completion Model} -> audio output
- TTS
- STT
CAPABILITIES (DATA)
- Vector Embeddings
GUIDES

Fine-tuning
- Upload data -> train beyond "few-shot" prompting -> higher-quality results from many more examples, prompt token savings, faster responses !!
  - (1) Prepare and upload training data
  - (2) Train a new fine-tuned model
  - (3) Evaluate results and go back to step 1 if needed
  - (4) Use your fine-tuned model
- When to use fine-tuning
  - Rather than jumping straight into fine-tuning, try prompt engineering & chaining first and then decide.
- Preparing your dataset
  - Prepare a jsonl file of messages lines made up of role, content, name (optional), and weight (0 or 1, marking bad/good assistant replies). (Target the desired behavior directly.)
  - ```
  {"messages": [
     {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, 
     {"role": "user", "content": "What's the capital of France?"}, 
     {"role": "assistant", "content": "Paris", "weight": 0}, 
     {"role": "user", "content": "Can you be more sarcastic?"}, 
     {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
  {"messages": [
     {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, 
     {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, 
     {"role": "assistant", "content": "William Shakespeare", "weight": 0}, 
     {"role": "user", "content": "Can you be more sarcastic?"}, 
     {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?", "weight": 1}]}
  {"messages": [
     {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, 
     {"role": "user", "content": "How far is the Moon from Earth?"}, 
     {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, 
     {"role": "user", "content": "Can you be more sarcastic?"}, 
     {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
```
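- Include in every example all the instructions and prompts that worked best with the model before fine-tuning.
- Use at least 10 examples; around 50 to 100 is generally recommended.
- Once the dataset is ready, split it into training / test portions and use them for training and evaluation.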
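The dataset preparation notes above (chat-format lines, at least 10 examples, a training / test split) can be sketched with the standard library alone. Assumptions: `validate_example` and `split_and_dump` are hypothetical helpers, the role check only covers the three roles shown in these notes, content is assumed to always be a string, and the 20% test ratio is an arbitrary choice.

```python
import json
import random


def validate_example(example: dict) -> None:
    """Raise ValueError if an example does not follow the chat format noted above."""
    messages = example.get("messages")
    if not messages:
        raise ValueError("example needs a non-empty 'messages' list")
    for message in messages:
        role = message.get("role")
        if role not in {"system", "user", "assistant"}:
            raise ValueError(f"unexpected role: {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("content must be a string")
        # weight is optional; 0 or 1 as in the Marv samples.
        if "weight" in message and message["weight"] not in (0, 1):
            raise ValueError("weight must be 0 or 1")
    if not any(m["role"] == "assistant" for m in messages):
        raise ValueError("example needs at least one assistant message")


def split_and_dump(examples, test_ratio=0.2, seed=0, min_examples=10):
    """Validate, shuffle, and split examples; return (train_jsonl, test_jsonl)."""
    if len(examples) < min_examples:
        raise ValueError(f"need at least {min_examples} examples, got {len(examples)}")
    for example in examples:
        validate_example(example)
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    test, train = shuffled[:n_test], shuffled[n_test:]

    def to_jsonl(rows):
        # JSONL: exactly one JSON object per line.
        return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)

    return to_jsonl(train), to_jsonl(test)


# Twelve toy examples in the same shape as the Marv samples.
examples = [
    {"messages": [
        {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."},
        {"role": "user", "content": f"Question {i}?"},
        {"role": "assistant", "content": f"Answer {i}, obviously."},
    ]}
    for i in range(12)
]
train_jsonl, test_jsonl = split_and_dump(examples)
print(len(train_jsonl.splitlines()), len(test_jsonl.splitlines()))  # 10 2
```

The two strings can then be written to separate .jsonl files, one for training and one held back for evaluation.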
- Vision fine-tuning
- Create a fine-tuned model
  - After jobs.create(), jobs can be managed with list(), retrieve(), cancel(), list_events(), and so on.
- Use a fine-tuned model
- Use a checkpointed model
  - Each training epoch -> checkpoint -> check for overfitting, etc. -> ...
- Analyzing your fine-tuned model
- Fine-tuning examples
Fine-tuning 3rd-party integrations (Weights & Biases, etc.)
- ...
Evaluating
- ...
Distillation
- ...

-End-
