Posts in the 'AI/Week 4' category

5. Ensemble & Hyperparameter Optimization & Experiment

AI/Week 4 2021. 8. 27. 18:05

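Model Averaging: soft voting (averaging the predicted probabilities) tends to work better than hard voting.
TTA (Test Time Augmentation): augment each test image, run inference on every augmented copy, and ensemble the resulting outputs.
Ensemble techniques carry a trade-off between performance and efficiency.
Hyperparameter Tuning: Grid Search, Random Search, Bayesian Optimization (performs well)
https://brunch.co.kr/@tristanmhhd/19 Bayesian Optimization Hyperparameter tuning | Optimization
Optimization means finding the solution that makes the value of some arbitrary function f(x) as large (or as small) as possible. In machine learning, this f(x)..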
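The difference between soft voting, hard voting, and TTA can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the helper names (soft_vote, hard_vote, tta_predict), the horizontal flip as the single augmentation, and the toy probabilities are all made up here and do not come from the original post.

```python
import numpy as np

def soft_vote(prob_list):
    """Average class probabilities across models, then take the argmax."""
    return np.mean(prob_list, axis=0).argmax(axis=1)

def hard_vote(prob_list):
    """Each model votes for its argmax class; the majority wins."""
    votes = np.stack([p.argmax(axis=1) for p in prob_list])  # (models, samples)
    n_classes = prob_list[0].shape[1]
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)  # ties go to the lowest class index

def tta_predict(predict_fn, images):
    """Average predictions over the original batch and its horizontal flip."""
    flipped = images[:, :, ::-1]  # flip the width axis of (N, H, W) images
    return (predict_fn(images) + predict_fn(flipped)) / 2.0

# Three hypothetical models scoring one sample over two classes.
probs = [
    np.array([[0.9, 0.1]]),
    np.array([[0.4, 0.6]]),
    np.array([[0.4, 0.6]]),
]
soft = soft_vote(probs)  # uses how confident each model is
hard = hard_vote(probs)  # only counts each model's top choice
```

In this toy case the two schemes disagree: one very confident model outweighs two lukewarm ones under soft voting, while hard voting follows the majority. Every extra model or augmented copy costs another forward pass, which is where the performance-efficiency trade-off shows up.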

4. Training & Inference

AI/Week 4 2021. 8. 27. 18:05

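Loss: a loss is also an nn.Module. Calling loss.backward() computes the gradients; the optimizer then uses them to update the parameters.
Focal Loss: for class-imbalanced data, assigns a small loss to examples whose true class is predicted with high probability and a large loss to those predicted with low probability.
Label Smoothing Loss: represents the class target label softly rather than as a one-hot vector, which improves generalization. ex) [0, 1, 0, 0] -> [0.2, 0.7, 0.05, 0.05]
Metric: a measure for evaluating the model. It needs to be chosen to suit the data: F1-Score when class imbalance is large, Accuracy otherwise.
Pytorch Light..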
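To make the two losses concrete, here is a minimal NumPy sketch rather than a PyTorch nn.Module. It assumes the common focal loss form -(1 - p_t)^gamma * log(p_t) with gamma = 2 and uniform label smoothing with eps = 0.1; the function names and toy numbers are illustrative and are not taken from the original post.

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0):
    """Mean focal loss: -(1 - p_t)^gamma * log(p_t), p_t = prob. of the true class."""
    p_t = probs[np.arange(len(targets)), targets]
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

def smooth_labels(targets, n_classes, eps=0.1):
    """Uniform label smoothing: (1 - eps) * one_hot + eps / n_classes."""
    one_hot = np.eye(n_classes)[targets]
    return (1.0 - eps) * one_hot + eps / n_classes

def cross_entropy(probs, soft_targets):
    """Mean cross-entropy against (possibly soft) target distributions."""
    return float(np.mean(-np.sum(soft_targets * np.log(probs), axis=1)))

probs = np.array([[0.9, 0.1],   # true class 0 predicted confidently
                  [0.3, 0.7]])  # true class 0 only gets probability 0.3
targets = np.array([0, 0])

easy = focal_loss(probs[:1], targets[:1])  # well-classified example
hard = focal_loss(probs[1:], targets[1:])  # poorly classified example
soft = smooth_labels(np.array([1]), n_classes=4, eps=0.1)
```

The well-classified example contributes almost nothing while the hard one dominates, and with gamma = 0 the focal loss reduces to ordinary cross-entropy. Uniform smoothing spreads eps evenly over all classes, so the result differs from the hand-picked soft vector in the note above.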

3. Model

AI/Week 4 2021. 8. 26. 16:44

Pytorch: low level, pythonic, flexible
Modules: every PyTorch layer follows the nn.Module class.
nn.Module family: what all classes that inherit from nn.Module have in common -> a forward() function and parameters
https://pytorch.org/docs/stable/generated/torch.nn.Module.html
Pretrained Model: training a model from scratch is generally inefficient. So the approach is to take a model already trained on good-quality data and adapt it to your own purpose, i.e. to use a pretrained model..
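A small sketch of what "inherits from nn.Module, has forward() and parameters" looks like in practice. TinyClassifier, its layer sizes, and the choice to freeze the backbone are hypothetical; the backbone here is randomly initialized and only stands in for a pretrained one, since no real weights are loaded.

```python
import torch
from torch import nn

class TinyClassifier(nn.Module):
    """Hypothetical model: a 'backbone' plus a task-specific head."""

    def __init__(self, in_features=8, hidden=16, n_classes=3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # Called when the module is invoked as model(x).
        return self.head(self.backbone(x))

model = TinyClassifier()

# Freeze the backbone, as one might with pretrained weights, so only the head trains.
for param in model.backbone.parameters():
    param.requires_grad = False

# Submodules register their parameters automatically, so they can be listed by name.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
logits = model(torch.randn(4, 8))  # batch of 4 samples -> (4, 3) logits
```

Freezing everything except the head is only one way to adapt a pretrained model; fine-tuning all layers with a small learning rate is another, and which works better depends on how much data there is and how close it is to the pretraining data.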

2. Data Feeding

AI/Week 4 2021. 8. 26. 15:57

Preprocessing is an important step in data science. Competition datasets are generally of good quality. Preprocessing methods vary widely depending on the domain and the data format. There are many data augmentation techniques, and the albumentations library provides them.
https://github.com/albumentations-team/albumentations GitHub - albumentations-team/albumentations: Fast image augmentation library and an easy-to-use wrapper around other libraries..

1. Competition & Seaborn

AI/Week 4 2021. 8. 26. 15:39

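1. Get familiar with the Overview. Problem Definition: the problem I need to solve, identifying the input and output, etc.
2. Data Description: understand the form and meaning of the data.
EDA (Exploratory Data Analysis)
Seaborn is a statistical visualization library built on Matplotlib. It can be customized through Matplotlib and has a clean, easy syntax.
pip install seaborn==0.11
import seaborn as sns
It provides five basic groups of APIs: Categorical, Distribution, Relational, Regression, Matrix
Categorical: countplot is the representative one; it counts each category discretely and draws a bar chart. x, y, data, hue, pa..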
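A minimal countplot sketch using the x, hue, and data parameters mentioned above. The DataFrame and its column names are hypothetical toy data, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "cat", "dog", "bird"],
    "sex": ["f", "m", "f", "f", "m", "m"],
})

# One bar per (species, sex) pair; bar height is the number of matching rows.
ax = sns.countplot(x="species", hue="sex", data=df)
ax.set_title("Rows per species, split by sex")  # plain Matplotlib customization

# The same counts as a table, handy for checking the plot.
counts = df.groupby(["species", "sex"]).size()
```

countplot returns a Matplotlib Axes, which is why the title can be set with the usual Matplotlib call. For class-imbalance checks during EDA, plotting the label column this way is usually the first thing to look at.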
