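PyTorch Mask R-CNN: Implementation Details
DataLoader
When the built-in Dataset classes do not meet your needs, define your own by subclassing torch.utils.data.Dataset and overriding __init__, __getitem__, and __len__. If they are missing, the problem surfaces as soon as DataLoader reads from the custom Dataset; a missing __getitem__, for example, falls through to the base class and raises NotImplementedError.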
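As a minimal sketch of the point above, the toy dataset below implements the three methods and is consumed by a DataLoader. SquaresDataset and IncompleteDataset are hypothetical names chosen for illustration, not part of PyTorch. The last lines show what happens when __getitem__ is left out.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i squared)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # The default sampler uses this to know how many indices exist.
        return self.size

    def __getitem__(self, index):
        # Return one sample; the default collate function stacks samples into a batch.
        x = torch.tensor(float(index))
        return x, x ** 2


dataset = SquaresDataset(size=8)
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batches = [(xs.tolist(), ys.tolist()) for xs, ys in loader]


class IncompleteDataset(Dataset):
    """Hypothetical subclass that forgets to override __getitem__."""


try:
    IncompleteDataset()[0]
    missing_getitem_raises = False
except NotImplementedError:
    missing_getitem_raises = True
```

With shuffle=False the loader walks the indices in order, so each batch contains four consecutive samples.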
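NumPy broadcasting rules:
All input arrays are aligned to the one with the longest shape; a shape that is too short is padded with 1s at the front.
The shape of the output array is the maximum along each axis of the input shapes.
An input array can take part in the computation if, on every axis, its length either equals the output length or is 1; otherwise an error is raised.
When an input array has length 1 along an axis, the single set of values on that axis is reused for every position along it.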
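The four rules can be checked on small arrays. The sketch below is only an illustration; the array names and values are made up for this example.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])       # shape (3,), padded at the front to (1, 3)
c = np.array([[100], [200]])     # shape (2, 1)

ab = a + b   # the single row of b is reused for both rows of a
ac = a + c   # the single column of c is reused for all three columns of a

# Rule 2: the output shape is the per-axis maximum, here (4, 1, 3) and (2, 1) give (4, 2, 3).
big = np.zeros((4, 1, 3)) + np.zeros((2, 1))

# Rule 3: an axis of length 2 cannot be matched against an axis of length 3.
try:
    a + np.array([1, 2])
    incompatible = False
except ValueError:
    incompatible = True
```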
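CUDA extensions in PyTorch:
Extensions are built with create_extension from torch.utils.ffi. Note that this module only exists in PyTorch releases before 1.0; later releases use torch.utils.cpp_extension instead.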
def create_extension(name, headers, sources, verbose=True, with_cuda=False,
                     package=False, relative_to='.', **kwargs):
    """Creates and configures a cffi.FFI object, that builds PyTorch extension.

    Arguments:
        name (str): package name. Can be a nested module e.g. ``.ext.my_lib``.
        headers (str or List[str]): list of headers, that contain only exported
            functions
        sources (List[str]): list of sources to compile.
        verbose (bool, optional): if set to ``False``, no output will be printed
            (default: True).
        with_cuda (bool, optional): set to ``True`` to compile with CUDA headers
            (default: False)
        package (bool, optional): set to ``True`` to build in package mode (for modules
            meant to be installed as pip packages) (default: False).
        relative_to (str, optional): path of the build file. Required when
            ``package is True``. It's best to use ``__file__`` for this argument.
        kwargs: additional arguments that are passed to ffi to declare the
            extension. See `Extension API reference`_ for details.

    .. _`Extension API reference`: https://docs.python.org/3/distutils/apiref.html#distutils.core.Extension
    """
    base_path = os.path.abspath(os.path.dirname(relative_to))
    name_suffix, target_dir = _create_module_dir(base_path, name)
    if not package:
        cffi_wrapper_name = '_' + name_suffix
    else:
        cffi_wrapper_name = (name.rpartition('.')[0] +
                             '.{0}._{0}'.format(name_suffix))

    wrapper_source, include_dirs = _setup_wrapper(with_cuda)
    include_dirs.extend(kwargs.pop('include_dirs', []))

    if os.sys.platform == 'win32':
        library_dirs = glob.glob(os.getenv('CUDA_PATH', '') + '/lib/x64')
        library_dirs += glob.glob(os.getenv('NVTOOLSEXT_PATH', '') + '/lib/x64')
        here = os.path.abspath(os.path.dirname(__file__))
        lib_dir = os.path.join(here, '..', '..', 'lib')
        library_dirs.append(os.path.join(lib_dir))
    else:
        library_dirs = []
    library_dirs.extend(kwargs.pop('library_dirs', []))

    if isinstance(headers, str):
        headers = [headers]
    all_headers_source = ''
    for header in headers:
        with open(os.path.join(base_path, header), 'r') as f:
            all_headers_source += f.read() + '\n\n'

    ffi = cffi.FFI()
    sources = [os.path.join(base_path, src) for src in sources]
    # NB: TH headers are C99 now
    kwargs['extra_compile_args'] = ['-std=c99'] + kwargs.get('extra_compile_args', [])
    ffi.set_source(cffi_wrapper_name, wrapper_source + all_headers_source,
                   sources=sources,
                   include_dirs=include_dirs,
                   library_dirs=library_dirs, **kwargs)
    ffi.cdef(_typedefs + all_headers_source)

    _make_python_wrapper(name_suffix, '_' + name_suffix, target_dir)

    def build():
        _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
    ffi.build = build
    return ffi
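Supplementary notes: a walkthrough of resnet.py in maskrcnn-benchmark
1 ResNet structure
A ResNet is generally divided into 5 convolutional (conv) stages. Each stage consists of a number of identical blocks, and that number is the stage's block_count. The first stage is structured completely differently from the others and can be regarded as a single standalone block. The stages that are built by stacking blocks, the second through the fifth (stage2-stage5, or conv2-conv5), are therefore given index 1 to index 4. Like building with toy bricks, these four stages can be assembled from one basic block. The basic structure of ResNet is shown below:
[Figure: basic structure of ResNet]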

The following code builds different ResNets (ResNet-50 and others) by controlling the number of blocks per stage:
# -----------------------------------------------------------------------------
# Standard ResNet models
# -----------------------------------------------------------------------------
# ResNet-50 (including all stages)
# ResNet has 5 stages, but the first stage is the same in every variant and the
# differences start at the second stage, so the index below is numbered from
# the second stage. block_count is the number of blocks in that stage.
ResNet50StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, False), (4, 3, True))
)
# ResNet-50 up to stage 4 (excludes stage 5)
ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)
# ResNet-101 (including all stages)
ResNet101StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, False), (4, 3, True))
)
# ResNet-101 up to stage 4 (excludes stage 5)
ResNet101StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, True))
)
# ResNet-50-FPN (including all stages)
ResNet50FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 6, True), (4, 3, True))
)
# ResNet-101-FPN (including all stages)
ResNet101FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 23, True), (4, 3, True))
)
# ResNet-152-FPN (including all stages)
ResNet152FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 8, True), (3, 36, True), (4, 3, True))
)
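To see how the block counts relate to the network names, the sketch below recomputes the nominal depth from a stage specification. It assumes StageSpec is a plain namedtuple with the three fields used above, and that every Bottleneck block contributes three conv layers; make_specs, nominal_depth and returned_stages are helper names invented for this illustration.

```python
from collections import namedtuple

# Assumption: StageSpec is a namedtuple with exactly these three fields.
StageSpec = namedtuple("StageSpec", ["index", "block_count", "return_features"])


def make_specs(rows):
    """Hypothetical helper: build a tuple of StageSpec from (index, count, flag) rows."""
    return tuple(
        StageSpec(index=i, block_count=c, return_features=r) for (i, c, r) in rows
    )


resnet50 = make_specs(((1, 3, False), (2, 4, False), (3, 6, False), (4, 3, True)))
resnet101_fpn = make_specs(((1, 3, True), (2, 4, True), (3, 23, True), (4, 3, True)))
resnet152_fpn = make_specs(((1, 3, True), (2, 8, True), (3, 36, True), (4, 3, True)))


def nominal_depth(specs, convs_per_block=3):
    # One stem conv, three convs per Bottleneck block, and the final fully
    # connected layer of the original classification network.
    return 1 + convs_per_block * sum(s.block_count for s in specs) + 1


def returned_stages(specs):
    # Indices of the stages whose feature maps the backbone hands back.
    return [s.index for s in specs if s.return_features]
```

The C4 and C5 variants return only their last stage, while the FPN variants return all four, which is what a feature pyramid needs as input.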
Based on these combinations, maskrcnn-benchmark can build different backbones. Each stage is assembled by _make_stage:
def _make_stage(
    transformation_module,
    in_channels,
    bottleneck_channels,
    out_channels,
    block_count,
    num_groups,
    stride_in_1x1,
    first_stride,
    dilation=1,
    dcn_config={}
):
    blocks = []
    stride = first_stride
    # Build the blocks of this stage according to the configuration
    for _ in range(block_count):
        blocks.append(
            transformation_module(
                in_channels,
                bottleneck_channels,
                out_channels,
                num_groups,
                stride_in_1x1,
                stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
        )
        stride = 1
        in_channels = out_channels
    return nn.Sequential(*blocks)
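The loop in _make_stage has two side effects that are easy to miss: only the first block receives first_stride, and only the first block changes the channel count. The sketch below replays that bookkeeping without building any layers; plan_stage is a hypothetical helper written for this illustration.

```python
def plan_stage(in_channels, out_channels, block_count, first_stride):
    """Return the (in_channels, out_channels, stride) each block would receive."""
    plan = []
    stride = first_stride
    for _ in range(block_count):
        plan.append((in_channels, out_channels, stride))
        # After the first block the spatial size is kept and the input
        # already has out_channels channels.
        stride = 1
        in_channels = out_channels
    return plan


# Example values: a stage with 4 blocks that goes from 256 to 512 channels.
plan = plan_stage(in_channels=256, out_channels=512, block_count=4, first_stride=2)
```

Here the first entry is (256, 512, 2) and the remaining three are (512, 512, 1), so downsampling and channel expansion both happen once per stage.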
These backbone variants are then collected into a single registry object so that they can be looked up by name:
_STAGE_SPECS = Registry({
    "R-50-C4": ResNet50StagesTo4,
    "R-50-C5": ResNet50StagesTo5,
    "R-101-C4": ResNet101StagesTo4,
    "R-101-C5": ResNet101StagesTo5,
    "R-50-FPN": ResNet50FPNStagesTo5,
    "R-50-FPN-RETINANET": ResNet50FPNStagesTo5,
    "R-101-FPN": ResNet101FPNStagesTo5,
    "R-101-FPN-RETINANET": ResNet101FPNStagesTo5,
    "R-152-FPN": ResNet152FPNStagesTo5,
})
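The lookup pattern can be sketched with a plain dictionary. The Registry class below is a stand-in that assumes the real one behaves like a dict keyed by config strings; STAGE_BLOCK_COUNTS and block_counts_for are hypothetical names, and the values are reduced to bare block counts to keep the example short.

```python
class Registry(dict):
    """Stand-in for illustration: a dict whose keys are config names."""


STAGE_BLOCK_COUNTS = Registry({
    "R-50-C4": (3, 4, 6),
    "R-50-FPN": (3, 4, 6, 3),
    "R-101-FPN": (3, 4, 23, 3),
})


def block_counts_for(conv_body):
    # Fail with a readable message instead of a bare KeyError.
    if conv_body not in STAGE_BLOCK_COUNTS:
        raise ValueError(
            "unknown backbone %r, choose from %s" % (conv_body, sorted(STAGE_BLOCK_COUNTS))
        )
    return STAGE_BLOCK_COUNTS[conv_body]
```

Keeping the mapping in one place means a new backbone only needs one new entry, and the config file can select it by its string name.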
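2 Block structure
2.1 Bottleneck structure
As mentioned above, the first conv layer of a ResNet can be viewed as one kind of block, the stem, while the second through fifth stages are built by stacking blocks called Bottleneck. The Bottleneck structure is shown below:
[Figure: structure of the Bottleneck block]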

In maskrcnn-benchmark, this structure is built by the following code:
class Bottleneck(nn.Module):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups,
        stride_in_1x1,
        stride,
        dilation,
        norm_func,
        dcn_config
    ):
        super(Bottleneck, self).__init__()

        # Shortcut (side) branch of the block
        self.downsample = None
        if in_channels != out_channels:
            # Choose the stride, then apply a 1x1 convolution to the input so that
            # its channel count matches the output of the main branch
            down_stride = stride if dilation == 1 else 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels),
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)

        if dilation > 1:
            stride = 1  # reset to be 1

        # The original MSRA ResNet models have stride in the first 1x1 conv
        # The subsequent fb.torch.resnet and Caffe2 ResNe[X]t implementations have
        # stride in the 3x3 conv
        # Strides of the 1x1 and the 3x3 convolution
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)

        # Main branch of the block; its structure is fixed
        # Pass the features through a 1x1 convolution
        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels)
        # TODO: specify init for the above

        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            # Use a deformable convolution (DCN)
            deformable_groups = dcn_config.get("deformable_groups", 1)
            with_modulated_dcn = dcn_config.get("with_modulated_dcn", False)
            self.conv2 = DFConv2d(
                bottleneck_channels,
                bottleneck_channels,
                with_modulated_dcn=with_modulated_dcn,
                kernel_size=3,
                stride=stride_3x3,
                groups=num_groups,
                dilation=dilation,
                deformable_groups=deformable_groups,
                bias=False
            )
        else:
            # Pass the features through a 3x3 convolution
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)

        self.bn2 = norm_func(bottleneck_channels)

        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)

        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu_(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = F.relu_(out)

        out0 = self.conv3(out)
        out = self.bn3(out0)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = F.relu_(out)

        return out
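The stride handling in the constructor is compact, so the sketch below isolates it. It mirrors the three decisions made above (shortcut stride, reset under dilation, 1x1 versus 3x3 placement) as a pure function; bottleneck_strides is a hypothetical helper, and the shortcut entry only matters when in_channels differs from out_channels.

```python
def bottleneck_strides(stride, stride_in_1x1, dilation=1):
    """Return the stride used by the shortcut conv, conv1 (1x1) and conv2 (3x3)."""
    # The shortcut stride is decided first, from the requested stride.
    down_stride = stride if dilation == 1 else 1
    # With dilation the block no longer downsamples at all.
    if dilation > 1:
        stride = 1
    # MSRA-style models stride in the first 1x1 conv, later ones in the 3x3 conv.
    stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
    return {"downsample": down_stride, "conv1": stride_1x1, "conv2": stride_3x3}


msra_style = bottleneck_strides(2, stride_in_1x1=True)
caffe2_style = bottleneck_strides(2, stride_in_1x1=False)
dilated = bottleneck_strides(2, stride_in_1x1=True, dilation=2)
```

In every case the main branch and the shortcut end up with the same overall stride, which is required for the element-wise addition in forward to line up.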
2.2 Stem structure
As mentioned above, the first layer of a ResNet can be viewed as a stem. Its code is:
class BaseStem(nn.Module):
    def __init__(self, cfg, norm_func):
        super(BaseStem, self).__init__()

        # Number of output channels of the stem, set by the user in the config
        out_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS

        # The input has 3 channels (the RGB image) and the output has out_channels
        # channels; this 7x7, stride-2 layout is fixed and defined by the ResNet paper
        self.conv1 = Conv2d(
            3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
        )
        self.bn1 = norm_func(out_channels)

        for l in [self.conv1,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu_(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2, padding=1)
        return x
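The stem reduces the spatial resolution twice, once in the 7x7 convolution and once in the max pooling. A short worked calculation, using the standard output-size formula for convolution and pooling with floor rounding, makes the overall factor of 4 visible. The function names and the input sizes 224 and 800 are example choices, not values taken from the code above.

```python
def out_size(n, kernel, stride, padding, dilation=1):
    """Output length along one axis for a conv or pooling layer (floor rounding)."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def stem_out_size(n):
    n = out_size(n, kernel=7, stride=2, padding=3)   # conv1
    n = out_size(n, kernel=3, stride=2, padding=1)   # max_pool2d
    return n


small = stem_out_size(224)   # 224 -> 112 -> 56
large = stem_out_size(800)   # 800 -> 400 -> 200
```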
2.3 Derived classes and wrappers for the two structures
maskrcnn-benchmark derives wrapper classes from the two block structures above. Bottleneck and the stem each get one variant with fixed (frozen) Batch Normalization and one with Group Normalization: BottleneckWithFixedBatchNorm, StemWithFixedBatchNorm, BottleneckWithGN, and StemWithGN. The code is simple enough that it is left uncommented:
class BottleneckWithFixedBatchNorm(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithFixedBatchNorm, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=FrozenBatchNorm2d,
            dcn_config=dcn_config
        )


class StemWithFixedBatchNorm(BaseStem):
    def __init__(self, cfg):
        super(StemWithFixedBatchNorm, self).__init__(
            cfg, norm_func=FrozenBatchNorm2d
        )


class BottleneckWithGN(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithGN, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=group_norm,
            dcn_config=dcn_config
        )


class StemWithGN(BaseStem):
    def __init__(self, cfg):
        super(StemWithGN, self).__init__(cfg, norm_func=group_norm)
Next, the four BN and GN variants derived from the two structures are collected into registries so that they can be looked up by name:
_TRANSFORMATION_MODULES = Registry({
    "BottleneckWithFixedBatchNorm": BottleneckWithFixedBatchNorm,
    "BottleneckWithGN": BottleneckWithGN,
})
_STEM_MODULES = Registry({
    "StemWithFixedBatchNorm": StemWithFixedBatchNorm,
    "StemWithGN": StemWithGN,
})
3 Overall ResNet structure
3.1 ResNet structure
Building on the pieces above, we can assemble the actual ResNet. It consists of the first conv layer (the stem) and the four remaining stages:
class ResNet(nn.Module):
    def __init__(self, cfg):
        super(ResNet, self).__init__()

        # If we want to use the cfg in forward(), then we should make a copy
        # of it and store it for later use:
        # self.cfg = cfg.clone()

        # Translate string names to implementations
        # The first conv layer, which is also the first stage, takes the form of a stem
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        # Look up the stage specification of the requested backbone
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        # Look up the concrete Bottleneck class, i.e. the type of the basic
        # building block of the backbone
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]

        # Construct the stem module
        self.stem = stem_module(cfg)

        # Construct the specified ResNet stages
        # Number of groups of the grouped 3x3 convolution (passed as groups= to conv2)
        num_groups = cfg.MODEL.RESNETS.NUM_GROUPS
        # Number of channels in each group
        width_per_group = cfg.MODEL.RESNETS.WIDTH_PER_GROUP
        # The stem is the first layer, so its output channel count is the input
        # channel count of the stacked stages from the second layer on; the
        # internal channel counts can be configured freely
        in_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS
        # The number of groups times the channels per group gives the internal
        # (bottleneck) channel count of the basic block
        stage2_bottleneck_channels = num_groups * width_per_group
        # Number of output channels of the second stage
        stage2_out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
        self.stages = []
        self.return_features = {}
        for stage_spec in stage_specs:
            name = "layer" + str(stage_spec.index)
            # The channel counts of every stage are derived from those of stage 2,
            # doubling with each stage
            stage2_relative_factor = 2 ** (stage_spec.index - 1)
            bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor
            out_channels = stage2_out_channels * stage2_relative_factor
            stage_with_dcn = cfg.MODEL.RESNETS.STAGE_WITH_DCN[stage_spec.index - 1]
            # Build the conv structure of this stage
            module = _make_stage(
                transformation_module,
                in_channels,
                bottleneck_channels,
                out_channels,
                stage_spec.block_count,
                num_groups,
                cfg.MODEL.RESNETS.STRIDE_IN_1X1,
                first_stride=int(stage_spec.index > 1) + 1,
                dcn_config={
                    "stage_with_dcn": stage_with_dcn,
                    "with_modulated_dcn": cfg.MODEL.RESNETS.WITH_MODULATED_DCN,
                    "deformable_groups": cfg.MODEL.RESNETS.DEFORMABLE_GROUPS,
                }
            )
            in_channels = out_channels
            self.add_module(name, module)
            self.stages.append(name)
            self.return_features[name] = stage_spec.return_features

        # Optionally freeze (requires_grad=False) parts of the backbone
        self._freeze_backbone(cfg.MODEL.BACKBONE.FREEZE_CONV_BODY_AT)

    # Freeze the parameters of the first freeze_at stages so they are no longer updated
    def _freeze_backbone(self, freeze_at):
        if freeze_at < 0:
            return
        for stage_index in range(freeze_at):
            if stage_index == 0:
                m = self.stem  # stage 0 is the stem
            else:
                m = getattr(self, "layer" + str(stage_index))
            for p in m.parameters():
                p.requires_grad = False

    def forward(self, x):
        outputs = []
        x = self.stem(x)
        for stage_name in self.stages:
            x = getattr(self, stage_name)(x)
            if self.return_features[stage_name]:
                outputs.append(x)
        return outputs
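The channel arithmetic inside the constructor loop can be replayed on its own. The sketch below assumes typical config values (one group, 64 channels per group, 256 output channels for stage 2, 64 stem channels); these defaults and the helper name stage_channels are assumptions for illustration and should be checked against the actual config.

```python
def stage_channels(stage_indices, num_groups=1, width_per_group=64,
                   res2_out_channels=256, stem_out_channels=64):
    """Replay the per-stage channel and stride bookkeeping of the loop above."""
    in_channels = stem_out_channels
    stage2_bottleneck_channels = num_groups * width_per_group
    plan = []
    for index in stage_indices:
        factor = 2 ** (index - 1)            # 1, 2, 4, 8 for index 1..4
        out_channels = res2_out_channels * factor
        plan.append({
            "name": "layer" + str(index),
            "in": in_channels,
            "bottleneck": stage2_bottleneck_channels * factor,
            "out": out_channels,
            "first_stride": int(index > 1) + 1,  # 1 for the first stage, else 2
        })
        in_channels = out_channels
    return plan


plan = stage_channels([1, 2, 3, 4])
outs = [stage["out"] for stage in plan]
strides = [stage["first_stride"] for stage in plan]
```

Under these assumptions the four stages output 256, 512, 1024 and 2048 channels, and only the first of them keeps the resolution produced by the stem.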
3.2 ResNet head structure
A head, as I understand it, is a network structure that carries out one particular function. ResNetHead stacks Bottleneck blocks into one or more ResNet stages that can be used as a standalone functional sub-network, for example to run the last ResNet stage on pooled region features. Internally it is built the same way as the stages above, so it is not covered in more detail here:
class ResNetHead(nn.Module):
    def __init__(
        self,
        block_module,
        stages,
        num_groups=1,
        width_per_group=64,
        stride_in_1x1=True,
        stride_init=None,
        res2_out_channels=256,
        dilation=1,
        dcn_config={}
    ):
        super(ResNetHead, self).__init__()

        stage2_relative_factor = 2 ** (stages[0].index - 1)
        stage2_bottleneck_channels = num_groups * width_per_group
        out_channels = res2_out_channels * stage2_relative_factor
        in_channels = out_channels // 2
        bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor

        block_module = _TRANSFORMATION_MODULES[block_module]

        self.stages = []
        stride = stride_init
        for stage in stages:
            name = "layer" + str(stage.index)
            if not stride:
                stride = int(stage.index > 1) + 1
            module = _make_stage(
                block_module,
                in_channels,
                bottleneck_channels,
                out_channels,
                stage.block_count,
                num_groups,
                stride_in_1x1,
                first_stride=stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
            stride = None
            self.add_module(name, module)
            self.stages.append(name)
        self.out_channels = out_channels

    def forward(self, x):
        for stage in self.stages:
            x = getattr(self, stage)(x)
        return x
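Unlike ResNet, the head has no preceding stage to read its input width from, so it infers the channel counts from the index of its first stage. The sketch below reproduces that arithmetic with the constructor defaults shown above; head_channels is a hypothetical helper, and stage index 4 is used only as an example input.

```python
def head_channels(first_stage_index, num_groups=1, width_per_group=64,
                  res2_out_channels=256):
    """Return (in_channels, bottleneck_channels, out_channels) as the head computes them."""
    factor = 2 ** (first_stage_index - 1)
    out_channels = res2_out_channels * factor
    # The input is taken to be the output of the previous stage, i.e. half as wide.
    in_channels = out_channels // 2
    bottleneck_channels = num_groups * width_per_group * factor
    return in_channels, bottleneck_channels, out_channels


conv5_head = head_channels(4)
```

For index 4 this gives 1024 input channels, 512 bottleneck channels and 2048 output channels, matching the last stage of the full network under the same defaults.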

