
news 2025/7/18 1:42:16


Data Conversion and Loading

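Project list
Preface
Label conversion
- Mapping RGB labels to class labels
- Converting RGB label data to class labels
Data loading
- Random cropping
- Data loading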

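Project list

Semantic Segmentation Project (1): Dataset Overview and Preprocessing

Semantic Segmentation Project (2): Label Conversion and Data Loading

Semantic Segmentation Project (3): Semantic Segmentation Models (U-Net and DeepLabv3+)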

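Preface

The previous article covered the dataset and its preprocessing. Before training, the labels need processing as well: they are stored as RGB images, so we have to convert them into ordinary class-index labels. In addition, semantic segmentation classifies every pixel, so holding a large dataset in memory at once can easily cause an out-of-memory (OOM) error. For that reason we usually write our own dataset class that loads samples on demand.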

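Label conversion

Mapping RGB labels to class labels

Each pixel of an RGB image has three channels, and each channel takes values from $0$ to $255$, i.e. $(0 - 255,\ 0 - 255,\ 0 - 255)$. This suggests a simple approach: allocate a vector of length $256^3$, which has one slot for every possible pixel value. In the previous article we defined VOC_COLORMAP and VOC_CLASSES, which list the classes as RGB colors and as text names respectively.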

```python
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]
VOC_CLASSES = ['Water', 'Land (unpaved area)', 'Road', 'Building',
               'Vegetation', 'Unlabeled']
```

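Next we write a function voc_colormap2label. It iterates over VOC_COLORMAP with enumerate to get each class index together with its color, and stores the index in colormap2label at the position that encodes that color.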

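```python
import torch

def voc_colormap2label():
    # Lookup table with one slot per possible RGB value (256 ** 3 entries).
    colormap2label = torch.zeros(256 ** 3, dtype=torch.long)
    for i, colormap in enumerate(VOC_COLORMAP):
        # Encode (R, G, B) as a base-256 integer and store the class index there.
        colormap2label[(colormap[0] * 256 + colormap[1]) * 256 + colormap[2]] = i
    return colormap2label
```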
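To make the indexing concrete, here is a minimal pure-Python sketch of the base-256 encoding used above. The helper names `rgb_to_index` and `index_to_rgb` are illustrative only and are not part of the original code.

```python
def rgb_to_index(r, g, b):
    # Treat (R, G, B) as the three digits of a base-256 number.
    return (r * 256 + g) * 256 + b


def index_to_rgb(index):
    # Inverse of rgb_to_index: split the integer back into three bytes.
    return index >> 16, (index >> 8) & 0xFF, index & 0xFF


water_index = rgb_to_index(226, 169, 41)     # 'Water' color from VOC_COLORMAP
largest_index = rgb_to_index(255, 255, 255)  # last slot of the lookup table

print(water_index)                 # 14854441
print(largest_index == 256 ** 3 - 1)  # True
print(index_to_rgb(water_index))   # (226, 169, 41)
```

The largest possible index is $256^3 - 1$, which is why the lookup table needs $256^3$ entries rather than $255^3$.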

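Converting RGB label data to class labels

The function above gives us the mapping from RGB labels to class labels. We now write a second function that takes the RGB label data colormap and the mapping colormap2label, and returns the class labels.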

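```python
def voc_label_indices(colormap, colormap2label):
    # (C, H, W) tensor -> (H, W, C) int32 array, so the arithmetic below
    # does not overflow the 8-bit range of the original pixel values.
    colormap = colormap.permute(1, 2, 0).numpy().astype('int32')
    idx = ((colormap[:, :, 0] * 256 + colormap[:, :, 1]) * 256
           + colormap[:, :, 2])
    return colormap2label[idx]
```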
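The conversion can be traced by hand on a tiny label image. The sketch below is a pure-Python stand-in that uses a dict in place of the dense tensor; `COLOR_TO_CLASS`, `label_indices`, and the `default` parameter are hypothetical names introduced for illustration.

```python
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]

# Dict version of the lookup table: RGB tuple -> class index.
COLOR_TO_CLASS = {tuple(color): i for i, color in enumerate(VOC_COLORMAP)}


def label_indices(rgb_rows, default=0):
    # rgb_rows is an H x W grid of [R, G, B] pixels.
    # Colors missing from the colormap fall back to `default`.
    return [[COLOR_TO_CLASS.get(tuple(px), default) for px in row]
            for row in rgb_rows]


tiny_label = [
    [[226, 169, 41], [110, 193, 228]],  # Water, Road
    [[60, 16, 152], [0, 0, 0]],         # Building, a color not in the colormap
]
class_map = label_indices(tiny_label)
print(class_map)  # [[0, 2], [3, 0]]
```

With `default=0` the sketch mirrors the zero-initialized tensor: any color that is not listed in VOC_COLORMAP, including black, ends up as class 0 ('Water'). If that is not desired, a separate ignore value could be passed as `default` instead.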

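Data loading

Random cropping

The input images do not have a fixed shape, and very large images slow down training or use too much memory, so we crop both the image and its label. torchvision.transforms.RandomCrop.get_params returns a random crop region; we sample it once so that the image and the label are cropped to the same area. torchvision.transforms.functional.crop then applies that region to both.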

```python
import torchvision

def voc_rand_crop(feature, label, height, width):
    # Sample one crop rectangle (top, left, height, width) and reuse it for both tensors.
    rect = torchvision.transforms.RandomCrop.get_params(feature, (height, width))
    feature = torchvision.transforms.functional.crop(feature, *rect)
    label = torchvision.transforms.functional.crop(label, *rect)
    return feature, label
```
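The key point is that the rectangle is sampled once and applied twice. A minimal pure-Python sketch of the same idea on nested lists follows; `rand_crop_params` and `crop_grid` are illustrative stand-ins, not torchvision functions.

```python
import random


def rand_crop_params(img_h, img_w, crop_h, crop_w, rng=random):
    # Pick the top-left corner so the crop stays inside the image.
    top = rng.randint(0, img_h - crop_h)
    left = rng.randint(0, img_w - crop_w)
    return top, left, crop_h, crop_w


def crop_grid(grid, top, left, height, width):
    # Slice the same window of rows and columns out of a 2D grid.
    return [row[left:left + width] for row in grid[top:top + height]]


# Feature and label hold the pixel coordinates, so alignment is easy to check.
feature = [[(y, x) for x in range(6)] for y in range(5)]
label = [[(y, x) for x in range(6)] for y in range(5)]

rect = rand_crop_params(5, 6, 3, 4)
feature_crop = crop_grid(feature, *rect)
label_crop = crop_grid(label, *rect)

print(feature_crop == label_crop)  # True: both crops cover the same pixels
```

Sampling a separate rectangle for the label would pair each image pixel with the class of a different location, which silently corrupts the training targets.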

Data loading

Here is a brief overview of the data-loading class SemanticDataset.

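| Function | Purpose |
| --- | --- |
| `__init__` | Sets the initial parameters |
| `normalize_image` | Scales the image to the 0-1 range and normalizes it |
| `pad_params` | Computes the padding parameters for an image |
| `pad_image` | Pads the image according to the padding parameters |
| `__getitem__` | Returns one sample by index |
| `__len__` | Returns the number of samples |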

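The class loads an image and its label, pads both if the image is smaller than the crop size, normalizes the image (divide by 255, then apply Normalize), and crops both to the crop size. The label is then converted from RGB to class indices with the functions defined earlier.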

```python
import os

import torch
import torchvision

class SemanticDataset(torch.utils.data.Dataset):
    def __init__(self, is_train, crop_size, data_dir):
        self.transform = torchvision.transforms.Normalize(
            mean=[0.4813, 0.4844, 0.4919], std=[0.2467, 0.2478, 0.2542])
        self.crop_size = crop_size
        self.data_dir = data_dir
        self.is_train = is_train
        self.colormap2label = voc_colormap2label()
        # train.txt / test.txt list the sample ids, separated by whitespace.
        txt_fname = os.path.join(data_dir, 'train.txt' if self.is_train else 'test.txt')
        with open(txt_fname, 'r') as f:
            self.images = f.read().split()

    def normalize_image(self, img):
        # Scale to [0, 1], then standardize each channel.
        return self.transform(img.float() / 255)

    def pad_params(self, crop_h, crop_w, img_h, img_w):
        # Target size is at least the crop size; offsets center the image.
        height = max(crop_h, img_h)
        width = max(crop_w, img_w)
        y_s = (height - img_h) // 2
        x_s = (width - img_w) // 2
        return height, width, y_s, x_s

    def pad_image(self, height, width, y_s, x_s, feature):
        # Paste the image into a zero-filled canvas of the target size.
        zeros = torch.zeros((feature.shape[0], height, width))
        zeros[:, y_s:y_s + feature.shape[1], x_s:x_s + feature.shape[2]] = feature
        return zeros

    def __getitem__(self, idx):
        mode = torchvision.io.image.ImageReadMode.RGB
        name = '{:03d}'.format(int(self.images[idx]))
        feature = torchvision.io.read_image(
            os.path.join(self.data_dir, 'images', name + '.jpg'))
        label = torchvision.io.read_image(
            os.path.join(self.data_dir, 'labels', name + '.png'), mode)
        c_h, c_w = self.crop_size[0], self.crop_size[1]
        f_h, f_w = feature.shape[1], feature.shape[2]
        if f_h < c_h or f_w < c_w:
            # Image smaller than the crop: pad image and label identically.
            height, width, y_s, x_s = self.pad_params(c_h, c_w, f_h, f_w)
            feature = self.pad_image(height, width, y_s, x_s, feature)
            label = self.pad_image(height, width, y_s, x_s, label)
        feature = self.normalize_image(feature)
        feature, label = voc_rand_crop(feature, label, *self.crop_size)
        label = voc_label_indices(label, self.colormap2label)
        return (feature, label)

    def __len__(self):
        return len(self.images)
```
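The padding arithmetic is easiest to check with concrete numbers. The following is a standalone pure-Python sketch of the same computation; `pad_grid` is a hypothetical list-based counterpart of `pad_image`, and the sizes are made up for illustration.

```python
def pad_params(crop_h, crop_w, img_h, img_w):
    # Same arithmetic as SemanticDataset.pad_params, outside the class.
    height = max(crop_h, img_h)
    width = max(crop_w, img_w)
    y_s = (height - img_h) // 2
    x_s = (width - img_w) // 2
    return height, width, y_s, x_s


def pad_grid(grid, height, width, y_s, x_s, fill=0):
    # Paste a 2D grid into a larger canvas filled with `fill`.
    canvas = [[fill] * width for _ in range(height)]
    for y, row in enumerate(grid):
        canvas[y_s + y][x_s:x_s + len(row)] = row
    return canvas


# A 300 x 600 image with a 512 x 512 crop: only the height needs padding.
print(pad_params(512, 512, 300, 600))  # (512, 600, 106, 0)

small = [[1, 2],
         [3, 4]]
padded = pad_grid(small, *pad_params(4, 4, 2, 2))
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]
```

After padding, the canvas is at least as large as the crop size in both dimensions, so the random crop that follows always succeeds. Note that the padded border of the label is black, which the lookup table above maps to class 0.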

Finally, we use torch.utils.data.DataLoader to load the data in batches.

```python
def load_data_voc(batch_size, crop_size, data_dir='./dataset'):
    train_iter = torch.utils.data.DataLoader(
        SemanticDataset(True, crop_size, data_dir), batch_size,
        shuffle=True, drop_last=True)
    test_iter = torch.utils.data.DataLoader(
        SemanticDataset(False, crop_size, data_dir), batch_size,
        shuffle=False, drop_last=True)
    return train_iter, test_iter
```
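Both loaders set drop_last=True, which discards the final incomplete batch. The small sketch below shows the effect on the number of batches; `num_batches` is an illustrative helper and the sample counts are made-up numbers, not figures from this dataset.

```python
def num_batches(num_samples, batch_size, drop_last):
    # Number of batches one pass over the data yields.
    full, remainder = divmod(num_samples, batch_size)
    if drop_last or remainder == 0:
        return full
    return full + 1


# Hypothetical split: 18 test images, batch size 4.
print(num_batches(18, 4, drop_last=True))   # 4 batches, 2 images never evaluated
print(num_batches(18, 4, drop_last=False))  # 5 batches, the last one holds 2 images
```

Dropping the last batch keeps every training batch the same size. For the test loader it also means a few images may be skipped, so it may be worth setting drop_last=False there if every test image should be evaluated.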