I didn't previously know what shuffle actually does. Suppose the data is a, b, c, d and batch_size=2 with shuffling enabled; which of the following happens (a sketch of both cases follows the list)?
1. Batches are taken in order first, and shuffling happens only inside each batch, i.e., a, b is taken as a batch and then shuffled internally;
2. The whole dataset is shuffled first, and then batches are taken.
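To make the difference concrete, here is a minimal sketch in plain Python (the data and variable names are hypothetical, not from the DataLoader source) of what the two candidate behaviors would produce:

import random

data = ['a', 'b', 'c', 'd']
batch_size = 2

# Case 1: take batches in order, then shuffle only inside each batch
batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
case1 = [random.sample(b, len(b)) for b in batches]
print(case1)  # e.g. [['b', 'a'], ['d', 'c']]

# Case 2: shuffle the whole dataset first, then split into batches
shuffled = random.sample(data, len(data))
case2 = [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]
print(case2)  # e.g. [['c', 'a'], ['d', 'b']]

Under case 1, a batch can only ever contain a and b (or c and d) together; under case 2, any pair of samples can land in the same batch.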
From the PyTorch source, the docstring and the relevant branch:

shuffle (bool, optional): set to ``True`` to have the data reshuffled at every epoch (default: ``False``).

if shuffle:
    sampler = RandomSampler(dataset)  # what we get here are indices
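That RandomSampler yields a random permutation of dataset indices, which the loader then groups into batches. A quick sketch to confirm this (any small dataset works; the printed orderings are illustrative, not fixed):

import torch
from torch.utils.data import TensorDataset, RandomSampler, BatchSampler

dataset = TensorDataset(torch.arange(6))
sampler = RandomSampler(dataset)  # iterating it yields shuffled indices
print(list(sampler))              # e.g. [3, 0, 5, 1, 4, 2]

# The DataLoader then slices these indices into batches, as BatchSampler does:
print(list(BatchSampler(RandomSampler(dataset), batch_size=2, drop_last=False)))
# e.g. [[4, 1], [0, 5], [2, 3]]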
Addendum: a quick test of how shuffle=True in the PyTorch DataLoader actually works.
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class DealDataset(Dataset):
    def __init__(self):
        # Load iris.csv: every column except the last is a feature, the last is the label
        xy = np.loadtxt('./iris.csv', delimiter=',', dtype=np.float32)
        self.x_data = torch.from_numpy(xy[:, 0:-1])
        self.y_data = torch.from_numpy(xy[:, [-1]])
        self.len = xy.shape[0]

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len

dealDataset = DealDataset()
train_loader2 = DataLoader(dataset=dealDataset, batch_size=2, shuffle=True)

for i, data in enumerate(train_loader2):
    inputs, labels = data
    print(inputs)  # print each mini-batch of features to see the shuffled order
The simple dataset (iris.csv; shown in the original post as a screenshot).
The output after shuffling shows that on every epoch the whole dataset is randomly shuffled first and then split into mini-batches of size batch_size; in other words, case 2 above is what actually happens.
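Since the original screenshots are not reproduced here, here is a minimal, self-contained check (a hypothetical stand-in for the iris data) that shows the whole-dataset shuffle:

import torch
from torch.utils.data import TensorDataset, DataLoader

# Ten distinguishable samples 0..9, so batch composition is easy to read
dataset = TensorDataset(torch.arange(10))
loader = DataLoader(dataset, batch_size=2, shuffle=True)

for epoch in range(2):
    print([batch[0].tolist() for batch in loader])
# e.g. [[7, 2], [9, 4], [0, 3], [8, 5], [1, 6]]
#      [[5, 0], [3, 8], [6, 2], [9, 1], [4, 7]]
# Each epoch is a fresh permutation of the whole dataset; samples from
# different fixed-order batches (e.g. 0 and 3) can appear together.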
The above is based on my personal experience; I hope it gives you a useful reference.