keypointrcnn_resnet50_fpn
- torchvision.models.detection.keypointrcnn_resnet50_fpn(*, weights: Optional[KeypointRCNN_ResNet50_FPN_Weights] = None, progress: bool = True, num_classes: Optional[int] = None, num_keypoints: Optional[int] = None, weights_backbone: Optional[ResNet50_Weights] = ResNet50_Weights.IMAGENET1K_V1, trainable_backbone_layers: Optional[int] = None, **kwargs: Any) → KeypointRCNN
- Constructs a Keypoint R-CNN model with a ResNet-50-FPN backbone.

  Warning: The detection module is in Beta stage, and backward compatibility is not guaranteed.

  Reference: Mask R-CNN.

  The input to the model is expected to be a list of tensors, each of shape [C, H, W], one for each image, with values in the 0-1 range. Different images can have different sizes.

  The behavior of the model depends on whether it is in training or evaluation mode.

  During training, the model expects both the input tensors and targets (a list of dictionaries, one per image), each containing:

  - boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
  - labels (Int64Tensor[N]): the class label for each ground-truth box.
  - keypoints (FloatTensor[N, K, 3]): the locations of the K keypoints for each of the N instances, in the format [x, y, visibility], where visibility=0 means that the keypoint is not visible.
  The model returns a Dict[Tensor] during training, containing the classification and regression losses for both the RPN and the R-CNN, and the keypoint loss.

  During inference, the model requires only the input tensors, and returns the post-processed predictions as a List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows, where N is the number of detected instances:

  - boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
  - labels (Int64Tensor[N]): the predicted labels for each instance.
  - scores (Tensor[N]): the scores of each instance.
  - keypoints (FloatTensor[N, K, 3]): the locations of the predicted keypoints, in [x, y, v] format.
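Because every field of a prediction dictionary has N as its leading dimension, one boolean mask can filter all fields together. The sketch below shows this on a hand-written prediction; `filter_predictions`, the 0.5 threshold, and the sample values are hypothetical and not part of torchvision.

```python
import torch


def filter_predictions(prediction, score_threshold=0.5):
    """Keep only the instances whose score is at least score_threshold.

    Assumes every value in the dict is a tensor with N as its first dimension.
    """
    keep = prediction["scores"] >= score_threshold
    return {name: value[keep] for name, value in prediction.items()}


# Hand-written stand-in for one element of the model's output list
# (N = 2 instances, K = 3 keypoints each).
prediction = {
    "boxes": torch.tensor([[10.0, 20.0, 110.0, 220.0],
                           [5.0, 5.0, 50.0, 60.0]]),
    "labels": torch.tensor([1, 1]),
    "scores": torch.tensor([0.92, 0.30]),
    "keypoints": torch.tensor([
        [[30.0, 40.0, 1.0], [60.0, 45.0, 1.0], [80.0, 90.0, 1.0]],
        [[10.0, 10.0, 1.0], [20.0, 15.0, 1.0], [30.0, 30.0, 1.0]],
    ]),
}

kept = filter_predictions(prediction, score_threshold=0.5)

# Drop the third column to get plain [x, y] coordinates per keypoint.
xy = kept["keypoints"][..., :2]
print(kept["boxes"].shape, xy.shape)
```

The same function can be applied to each dictionary in the list the model returns in evaluation mode.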
  For more details on the output, you may refer to Instance segmentation models.

  Keypoint R-CNN is exportable to ONNX for a fixed batch size with input images of fixed size.

  Example:

      >>> import torch
      >>> import torchvision
      >>> from torchvision.models.detection import KeypointRCNN_ResNet50_FPN_Weights
      >>> model = torchvision.models.detection.keypointrcnn_resnet50_fpn(weights=KeypointRCNN_ResNet50_FPN_Weights.DEFAULT)
      >>> model.eval()
      >>> x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
      >>> predictions = model(x)
      >>>
      >>> # optionally, if you want to export the model to ONNX:
      >>> torch.onnx.export(model, x, "keypoint_rcnn.onnx", opset_version=11)

  Parameters:
  - weights (KeypointRCNN_ResNet50_FPN_Weights, optional) – The pretrained weights to use. See KeypointRCNN_ResNet50_FPN_Weights below for more details and possible values. By default, no pre-trained weights are used.
  - progress (bool) – If True, displays a progress bar of the download to stderr.
  - num_classes (int, optional) – number of output classes of the model (including the background).
  - num_keypoints (int, optional) – number of keypoints.
  - weights_backbone (ResNet50_Weights, optional) – The pretrained weights for the backbone.
  - trainable_backbone_layers (int, optional) – number of trainable (not frozen) layers, counted from the final block. Valid values are between 0 and 5, with 5 meaning all backbone layers are trainable. If None is passed (the default), this value is set to 3.
 
- class torchvision.models.detection.KeypointRCNN_ResNet50_FPN_Weights(value)
- The model builder above accepts the following values as the weights parameter. KeypointRCNN_ResNet50_FPN_Weights.DEFAULT is equivalent to KeypointRCNN_ResNet50_FPN_Weights.COCO_V1. You can also use strings, e.g. weights='DEFAULT' or weights='COCO_LEGACY'.

  KeypointRCNN_ResNet50_FPN_Weights.COCO_LEGACY:

  These weights were produced by following a training recipe similar to the one in the paper, but use a checkpoint from an early epoch.

  - box_map (on COCO-val2017): 50.6
  - kp_map (on COCO-val2017): 61.1
  - categories: no person, person
  - keypoint_names: nose, left_eye, right_eye, … (14 omitted)
  - min_size: height=1, width=1
  - num_params: 59137258
  - recipe

  The inference transforms are available at KeypointRCNN_ResNet50_FPN_Weights.COCO_LEGACY.transforms and perform the following preprocessing operations: Accepts PIL.Image, batched (B, C, H, W) and single (C, H, W) image torch.Tensor objects. The images are rescaled to [0.0, 1.0].

  KeypointRCNN_ResNet50_FPN_Weights.COCO_V1:

  These weights were produced by following a training recipe similar to the one in the paper. Also available as KeypointRCNN_ResNet50_FPN_Weights.DEFAULT.

  - box_map (on COCO-val2017): 54.6
  - kp_map (on COCO-val2017): 65.0
  - categories: no person, person
  - keypoint_names: nose, left_eye, right_eye, … (14 omitted)
  - min_size: height=1, width=1
  - num_params: 59137258
  - recipe

  The inference transforms are available at KeypointRCNN_ResNet50_FPN_Weights.COCO_V1.transforms and perform the following preprocessing operations: Accepts PIL.Image, batched (B, C, H, W) and single (C, H, W) image torch.Tensor objects. The images are rescaled to [0.0, 1.0].
