
Techniques and Challenges of Image Segmentation: A Review

1. Introduction

2. Classic Segmentation Methods
2.1. Edge Detection
2.2. Region Division
2.3. Graph Theory
2.4. Clustering Method
2.5. Random Walks
3. Co-Segmentation Methods
3.1. MRF-Based Co-Segmentation
3.2. Co-Segmentation Based on Random Walks
3.3. Co-Segmentation Based on Active Contours
3.4. Clustering-Based Co-Segmentation
3.5. Co-Segmentation Based on Graph Theory
3.6. Co-Segmentation Based on Thermal Diffusion
3.7. Object-Based Co-Segmentation
4. Semantic Segmentation Based on Deep Learning

4.1. Encoder–Decoder Architecture

  • The interpolation method uses a specified interpolation strategy to insert new elements between the pixels of the original image, thereby enlarging the image and achieving up-sampling. Interpolation has no trainable parameters and was often used in early up-sampling tasks;
  • The FCN adopts deconvolution for up-sampling. Deconvolution, also known as transposed convolution, flips the original convolution kernel vertically and horizontally and pads zeros between and around the elements of the input before convolving;
  • SegNet [ 61 ] adopts unpooling for up-sampling. Unpooling is the inverse operation of max-pooling in the CNN. During max-pooling, both the maximum value of each pooling window and the coordinates of that maximum are recorded; during unpooling, the maximum value is restored at its recorded position, and all other positions are set to 0;
  • Wang et al. [ 62 ] proposed dense up-sampling convolution (DUC), the core idea of which is to convert the full-resolution label map into a smaller label map with multiple channels. This transformation can be achieved by applying convolutions directly between the input feature map and the output label map, without interpolating extra values during up-sampling.
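Three of the up-sampling operations in the list above can be illustrated on tiny 2-D grids. The following is a minimal pure-Python sketch, not the implementation used by SegNet or DUC: the function names are hypothetical, nearest-neighbour repetition stands in for "a specified interpolation strategy", and the channel ordering in `depth_to_space` is an assumption made for illustration.

```python
def nearest_upsample(img, factor):
    """Interpolation (nearest neighbour): repeat each pixel factor x factor times."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def max_pool_with_indices(img, k=2):
    """k x k max-pooling with stride k that also records where each maximum was."""
    pooled, indices = [], []
    for r in range(0, len(img), k):
        p_row, i_row = [], []
        for c in range(0, len(img[0]), k):
            window = [(img[r + dr][c + dc], (r + dr, c + dc))
                      for dr in range(k) for dc in range(k)]
            value, pos = max(window, key=lambda item: item[0])
            p_row.append(value)
            i_row.append(pos)
        pooled.append(p_row)
        indices.append(i_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Unpooling: restore each pooled value at its recorded position, zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for p_row, i_row in zip(pooled, indices):
        for value, (r, c) in zip(p_row, i_row):
            out[r][c] = value
    return out


def depth_to_space(channels, d):
    """DUC-style rearrangement: d*d maps of size h x w become one (h*d) x (w*d) map."""
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * d) for _ in range(h * d)]
    for idx, ch in enumerate(channels):
        dr, dc = divmod(idx, d)  # assumed offset of this channel inside each d x d cell
        for r in range(h):
            for c in range(w):
                out[r * d + dr][c * d + dc] = ch[r][c]
    return out


image = [[1, 3, 2, 0],
         [4, 2, 1, 5],
         [0, 1, 7, 2],
         [3, 6, 2, 1]]

pooled, idx = max_pool_with_indices(image)   # 2 x 2 map of window maxima
restored = max_unpool(pooled, idx, 4, 4)     # sparse 4 x 4 map
upsampled = nearest_upsample([[1, 2], [3, 4]], 2)
merged = depth_to_space([[[1]], [[2]], [[3]], [[4]]], 2)
```

Unpooling yields a sparse map in which only the recorded maxima are non-zero, whereas nearest-neighbour interpolation yields a dense but blocky map. The DUC-style rearrangement neither inserts zeros nor interpolates: every output value comes directly from one of the predicted channels.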

4.2. Skip Connections

4.3. Dilated Convolution
4.4. Multiscale Feature Extraction
4.5. Attention Mechanisms

| Algorithm | Pub. Year | Backbone | Dataset | mIoU (%) | Major Contributions |
|---|---|---|---|---|---|
| FCN [ ] | 2015 | VGG-16 | PASCAL VOC 2011 | 62.7 | The forerunner for end-to-end semantic segmentation |
| | | | NYUDv2 | 34.0 | |
| U-Net [ ] | 2015 | VGG-16 | PhC-U373 | 92.03 | Encoder–decoder structure, skip connections |
| | | | DIC-HeLa | 77.56 | |
| SegNet [ ] | 2016 | VGG-16 | CamVid | 60.4 | Transferred the max-pooling indices to the decoder |
| | | | SUN RGBD | 28.27 | |
| DeepLab V1 [ ] | 2016 | VGG-16 | PASCAL VOC 2012 | 71.6 | Atrous convolution, fully connected CRFs |
| MSCA [ ] | 2016 | VGG-16 | PASCAL VOC 2012 | 75.3 | Dilated convolutions, multi-scale context aggregation, front-end context module |
| LRR [ ] | 2016 | ResNet/VGG-16 | PASCAL VOC 2011 | 77.5 | Reconstruction up-sampling module, Laplacian pyramid refinement |
| | | | Cityscapes | 69.7 | |
| ReSeg [ ] | 2016 | VGG-16 & ReNet | CamVid | 91.6 | Extension of ReNet to semantic segmentation |
| | | | Oxford Flowers | 93.7 | |
| | | | CamVid | 58.8 | |
| DRN [ ] | 2017 | ResNet-101 | Cityscapes | 70.9 | Modified Conv4/5 of ResNet, dilated convolution |
| PSPNet [ ] | 2017 | ResNet-50 | PASCAL VOC 2012 | 85.4 | Spatial pyramid pooling (SPP) |
| | | | Cityscapes | 80.2 | |
| DeepLab V2 [ ] | 2017 | VGG-16/ResNet-101 | PASCAL VOC 2012 | 79.7 | Atrous spatial pyramid pooling (ASPP), fully connected CRFs |
| | | | Cityscapes | 70.4 | |
| DeepLab V3 [ ] | 2017 | ResNet-101 | PASCAL VOC 2012 | 86.9 | Cascaded or parallel ASPP modules |
| | | | Cityscapes | 81.3 | |
| DeepLab V3+ [ ] | 2018 | Xception | PASCAL VOC 2012 | 89.0 | A new encoder–decoder structure with DeepLab V3 as the encoder |
| | | | Cityscapes | 82.1 | |
| DUC-HDC [ ] | 2018 | ResNet-101/ResNet-152 | PASCAL VOC 2012 | 83.1 | Hybrid dilated convolution (HDC), proposed to solve the gridding caused by dilated convolutions |
| | | | Cityscapes | 80.1 | |
| Attention U-Net [ ] | 2018 | VGG-16 with AGs | Multi-class abdominal CT-150 | -- | A novel self-attention gating (AGs) filter, skip connections |
| | | | TCIA Pancreas CT-82 | -- | |
| PSANet [ ] | 2018 | ResNet-101 | ADE20K | 81.51 | Point-wise spatial attention maps from two parallel branches, bi-directional information propagation model |
| | | | PASCAL VOC 2012 | 85.7 | |
| | | | Cityscapes | 81.4 | |
| APCNet [ ] | 2019 | ResNet-101 | PASCAL VOC 2012 | 84.2 | Multi-scale, global-guided local affinity (GLA), adaptive context modules (ACMs) |
| | | | PASCAL Context | 54.7 | |
| | | | ADE20K | 45.38 | |
| DANet [ ] | 2019 | ResNet-101 | Cityscapes | 81.5 | Dual attention: position attention module and channel attention module |
| | | | PASCAL VOC 2012 | 82.6 | |
| | | | PASCAL Context | 52.6 | |
| | | | COCO Stuff | 39.7 | |
| CARAFE [ ] | 2019 | ResNet-50 | ADE20K | 42.23 | Pyramid pooling module (PPM), feature pyramid network (FPN), multi-level feature fusion (FUSE) |
| EFPN [ ] | 2021 | VGG-16 | PASCAL VOC 2012 | 86.4 | PPM, multi-scale feature fusion module with a parallel branch |
| | | | Cityscapes | 82.3 | |
| | | | PASCAL Context | 53.9 | |
| CARAFE++ [ ] | 2021 | ResNet-101 | ADE20K | 43.94 | PPM, FPN, FUSE, adaptive kernels on-the-fly |
| Swin Transformer [ ] | 2021 | Swin-L | ADE20K | 53.5 | A novel shifted windowing scheme, a general backbone network for computer vision |
| Attention UW-Net [ ] | 2022 | ResNet-50 | NIH Chest X-ray | -- | Skip connections, an intermediate layer that combines the feature maps of the fourth encoder layer with those of the last encoder layer, attention mechanism |
| FPANet [ ] | 2022 | ResNet-18 | Cityscapes | 75.9 | Bilateral directional FPN, lightweight ASPP, feature pyramid fusion module (FPFM), border refinement module (BRM) |
| | | | CamVid | 74.7 | |
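The mIoU (%) column in the table above is the mean intersection over union: for each class, the number of pixels that are both predicted and labeled as that class is divided by the number of pixels that are either, and the per-class ratios are averaged. The sketch below is a minimal illustration on a single toy image; the function name `mean_iou` is hypothetical, and the choice to skip classes absent from both maps is an assumption. Benchmark protocols typically accumulate the counts over a whole dataset and may differ in such details.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    ious = []
    for cls in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_pred, in_target = p == cls, t == cls
                inter += in_pred and in_target   # pixel counted in both maps
                union += in_pred or in_target    # pixel counted in at least one map
        if union:                                # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2 x 3 label maps with two classes (0 and 1).
target = [[0, 0, 1],
          [0, 1, 1]]
pred = [[0, 1, 1],
        [0, 1, 1]]

miou = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {100 * miou:.2f}%")
```

Here class 0 has IoU 2/3 and class 1 has IoU 3/4, so a single misclassified pixel in a six-pixel image already lowers the mean to about 70.8%.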

5. Conclusions

  • Semantic segmentation, instance segmentation, and panoptic segmentation remain research hotspots in image segmentation. Instance segmentation predicts the pixel region covered by each instance; panoptic segmentation integrates semantic segmentation and instance segmentation by assigning both a category label and an instance ID to each pixel of the image. Panoptic segmentation is especially demanding because countable and uncountable instances are difficult to recognize in a single workflow, so building an effective network that simultaneously identifies large inter-category differences and small intra-category differences is a challenging task;
  • With the popularization of image acquisition equipment (e.g., LiDAR cameras), the segmentation of RGB-depth images, 3D point clouds, voxels, and meshes has gradually become a research hotspot, with wide demand in face recognition [ 95 ], autonomous vehicles, VR, AR, architectural modeling, etc. Although 3D image segmentation has made some progress, e.g., through region growing, random walks, and clustering among classic algorithms and SVM, random forest, and AdaBoost among machine learning algorithms, the representation and processing of 3D data, which are unstructured, redundant, unordered, and unevenly distributed, remain a major challenge;
  • In some fields, it is difficult to train a network with supervised learning because datasets or fine-grained annotations are lacking. Semi-supervised and unsupervised semantic segmentation can be used in these cases. For example, the network can first be trained on a benchmark dataset; its lower-level parameters are then fixed, and only the fully connected layer or some high-level parameters are trained on the small-sample dataset. This is transfer learning, which does not require abundant labeled samples. Reinforcement learning is another possible solution, but it has rarely been studied in the field of image segmentation. Few-shot semantic segmentation is also an active research direction;
  • Deep learning networks require substantial computing resources during training, which reflects the computational complexity of deep neural networks. Real-time (or near real-time) segmentation is required in many fields, e.g., video processing, where at least 25 fps is needed to match human visual perception, yet most current networks run far below this frame rate. Some lightweight networks have improved segmentation speed to a certain extent, but there is still considerable room for improvement in balancing model accuracy against real-time performance.
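The first point above describes assigning both a category label and an instance ID to every pixel. One simple way to store such a result is to pack the two numbers into a single integer per pixel. The sketch below is an illustration only: the function names, the `label_divisor` of 1000, and the rule that uncountable ("stuff") classes always receive instance ID 0 are all assumptions, not a format prescribed by the article.

```python
def combine_labels(semantic, instance, thing_classes, label_divisor=1000):
    """Pack a class map and an instance map into one combined ID per pixel.

    Countable ("thing") classes keep their instance ID; all other classes
    are treated as uncountable and get instance ID 0.
    """
    combined = []
    for s_row, i_row in zip(semantic, instance):
        combined.append([
            s * label_divisor + (i if s in thing_classes else 0)
            for s, i in zip(s_row, i_row)
        ])
    return combined


def split_label(combined_id, label_divisor=1000):
    """Recover (category label, instance ID) from a combined ID."""
    return divmod(combined_id, label_divisor)


# Toy 2 x 3 image: class 0 = sky (uncountable), class 1 = car (countable).
semantic = [[0, 0, 1],
            [1, 1, 1]]
instance = [[5, 5, 1],   # instance IDs on sky pixels are ignored
            [2, 2, 1]]

combined = combine_labels(semantic, instance, thing_classes={1})
category, inst = split_label(combined[1][0])
```

The two cars remain distinguishable (combined IDs 1001 and 1002), while every sky pixel collapses to the same ID, which mirrors the countable versus uncountable distinction drawn in the text.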

Author Contributions

Data Availability Statement

Acknowledgments

Conflicts of Interest

  • Anwesh, K.; Pal, D.; Ganguly, D.; Chatterjee, K.; Roy, S. Number plate recognition from enhanced super-resolution using generative adversarial network. Multimed. Tools Appl. 2022, 1–17.
  • Jin, B.; Cruz, L.; Gonçalves, N. Deep Facial Diagnosis: Deep Transfer Learning from Face Recognition to Facial Diagnosis. IEEE Access 2020, 8, 123649–123661.
  • Zhao, M.; Liu, Q.; Jha, R.; Deng, R.; Yao, T.; Mahadevan-Jansen, A.; Tyska, M.J.; Millis, B.A.; Huo, Y. VoxelEmbed: 3D Instance Segmentation and Tracking with Voxel Embedding based Deep Learning. In Proceedings of the International Workshop on Machine Learning in Medical Imaging, Strasbourg, France, 27 September 2021; Volume 12966, pp. 437–446.
  • Yao, T.; Qu, C.; Liu, Q.; Deng, R.; Tian, Y.; Xu, J.; Jha, A.; Bao, S.; Zhao, M.; Fogo, A.B.; et al. Compound Figure Separation of Biomedical Images with Side Loss. In Proceedings of the Deep Generative Models, and Data Augmentation, Labelling, and Imperfections, Strasbourg, France, 1 October 2021; Volume 13003, pp. 173–183.
  • Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image Segmentation Using Deep Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 3523–3542.
  • Zhang, X.; Yao, Q.A.; Zhao, J.; Jin, Z.J.; Feng, Y.C. Image Semantic Segmentation Based on Fully Convolutional Neural Network. Comput. Eng. Appl. 2022, 44, 45–57.
  • Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Martinez-Gonzalez, P.; Garcia-Rodriguez, J. A survey on deep learning techniques for image and video semantic segmentation. Appl. Soft Comput. 2018, 70, 41–65.
  • Yu, Y.; Wang, C.; Fu, Q.; Kou, R.; Wu, W.; Liu, T. A Survey of Evaluation Metrics and Methods for Semantic Segmentation. Comput. Eng. Appl. 2023; online preprint.
  • Lankton, S.; Tannenbaum, A. Localizing Region-Based Active Contours. IEEE Trans. Image Process. 2008, 17, 2029–2039.
  • Freedman, D.; Tao, Z. Interactive Graph Cut based Segmentation with Shape Priors. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 755–762.
  • Felzenszwalb, P.F.; Huttenlocher, D.P. Efficient Graph-Based Image Segmentation. Int. J. Comput. Vis. 2004, 59, 167–181.
  • Leordeanu, M.; Hebert, M. A Spectral Technique for Correspondence Problems using Pairwise Constraints. In Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV’05), Beijing, China, 17–21 October 2005; Volume 2, pp. 1482–1489.
  • Comaniciu, D.; Meer, P. Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
  • Chuang, K.S.; Tzeng, H.L.; Chen, S.; Wu, J.; Chen, T.J. Fuzzy C-means Clustering with Spatial Information for Image Segmentation. Comput. Med. Imaging Graph. Off. J. Comput. Med. Imaging Soc. 2006, 30, 9–15.
  • Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282.
  • Li, Z.; Chen, J. Superpixel Segmentation using Linear Spectral Clustering. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1356–1363.
  • Pan, W.; Lu, X.Q.; Gong, Y.H.; Tang, W.M.; Liu, J.; He, Y.; Qiu, G.P. HLO: Half-kernel Laplacian Operator for Surface Smoothing. Comput. Aided Des. 2020, 121, 102807.
  • Chen, H.B.; Zhen, X.; Gu, X.J.; Yan, H.; Cervino, L.; Xiao, Y.; Zhou, L.H. SPARSE: Seed Point Auto-Generation for Random Walks Segmentation Enhancement in medical inhomogeneous targets delineation of morphological MR and CT images. J. Appl. Clin. Med. Phys. 2015, 16, 387–402.
  • Drouyer, S.; Beucher, S.; Bilodeau, M.; Moreaud, M.; Sorbier, L. Sparse Stereo Disparity Map Densification using Hierarchical Image Segmentation. In Mathematical Morphology and Its Applications to Signal and Image Processing; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2017; Volume 1022.
  • Grady, L. Random Walks for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1768–1783.
  • Yang, W.; Cai, J.; Zheng, J.; Luo, J. User-Friendly Interactive Image Segmentation Through Unified Combinatorial User Inputs. IEEE Trans. Image Process. 2010, 19, 2470–2479.
  • Lai, Y.K.; Hu, S.M.; Martin, R.R.; Rosin, P.L. Fast Mesh Segmentation using Random Walks. In Proceedings of the 2008 ACM Symposium on Solid and Physical Modeling, New York, NY, USA, 2 June 2008; pp. 183–191.
  • Zhang, J.; Wu, C.; Cai, J.; Zheng, J.; Tai, X. Mesh Snapping: Robust Interactive Mesh Cutting using Fast Geodesic Curvature Flow. Comput. Graph. Forum 2010, 29, 517–526.
  • Rother, C.; Minka, T.P.; Blake, A.; Kolmogorov, V. Cosegmentation of Image Pairs by Histogram Matching—Incorporating a Global Constraint into MRFs. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), New York, NY, USA, 17–22 June 2006; pp. 993–1000.
  • Vicente, S.; Kolmogorov, V.; Rother, C. Cosegmentation Revisited: Models and Optimization. Lecture Notes in Computer Science. In Proceedings of the Computer Vision (ECCV), Crete, Greece, 5–11 September 2010; pp. 465–479.
  • Mukherjee, L.; Singh, V.; Dyer, C.R. Half-integrality-based Algorithms for Cosegmentation of Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 2028–2035.
  • Hochbaum, D.S.; Singh, V. An Efficient Algorithm for Co-segmentation. In Proceedings of the 12th IEEE International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 269–276.
  • Rubio, J.C.; Serrat, J.; López, A.; Paragios, N. Unsupervised Co-segmentation through Region Matching. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 749–756.
  • Chang, K.; Liu, T.; Lai, S. From Co-saliency to Co-segmentation: An Efficient and Fully Unsupervised Energy Minimization Model. In Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011; pp. 2129–2136.
  • Yu, H.; Xian, M.; Qi, X. Unsupervised Co-segmentation based on a New Global GMM Constraint in MRF. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 4412–4416.
  • Wang, C.; Guo, Y.; Zhu, J.; Wang, L.; Wang, L. Video Object Co-Segmentation via Subspace Clustering and Quadratic Pseudo-Boolean Optimization in an MRF Framework. IEEE Trans. Multimed. 2014, 16, 903–916.
  • Zhu, J.; Wang, L.; Gao, J.; Yang, R. Spatial-Temporal Fusion for High Accuracy Depth Maps using Dynamic MRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 899–909.
  • Collins, M.D.; Xu, J.; Grady, L.; Singh, V. Random Walks based Multi-image Segmentation: Quasiconvexity Results and GPU-based Solutions. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 1656–1663.
  • Fabijanska, A.; Goclawski, J. The Segmentation of 3D Images using the Random Walking Technique on a Randomly Created Image Adjacency Graph. IEEE Trans. Image Process. 2015, 24, 524–537.
  • Dong, X.P.; Shen, J.B.; Shao, L.; Gool, L.V. Sub-Markov Random Walk for Image Segmentation. IEEE Trans. Image Process. 2016, 25, 516–527.
  • Zhou, J.; Wang, W.M.; Zhang, J.; Yin, B.C.; Liu, X.P. 3D shape segmentation using multiple random walkers. J. Comput. Appl. Math. 2018, 329, 353–363.
  • Dong, C.; Zeng, X.; Lin, L.; Hu, H.; Han, X.; Naghedolfeizi, M.; Aberra, D.; Chen, Y.W. An Improved Random Walker with Bayes Model for Volumetric Medical Image Segmentation. J. Healthc. Eng. 2017, 2017, 6506049.
  • Meng, F.; Li, H.; Liu, G. Image Co-segmentation via Active Contours. In Proceedings of the 2012 IEEE International Symposium on Circuits and Systems (ISCAS), Seoul, Republic of Korea, 20–23 May 2012; pp. 2773–2776.
  • Zhang, T.; Xia, Y.; Feng, D.D. A Deformable Cosegmentation Algorithm for Brain MR Images. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012; pp. 3215–3218.
  • Zhang, Z.; Liu, X.; Soomro, N.Q.; Abou-El-Hossein, K. An Efficient Image Co-segmentation Algorithm based on Active Contour and Image Saliency. In Proceedings of the 2016 7th International Conference on Mechanical, Industrial, and Manufacturing Technologies (MIMT 2016), Cape Town, South Africa, 1–3 February 2016; Volume 54, p. 08004.
  • Joulin, A.; Bach, F.; Ponce, J. Discriminative Clustering for Image Co-segmentation. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010; pp. 1943–1950.
  • Kim, E.; Li, H.; Huang, X. A Hierarchical Image Clustering Cosegmentation Framework. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 686–693.
  • Joulin, A.; Bach, F.; Ponce, J. Multi-class Cosegmentation. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 542–549.
  • Meng, F.; Li, H.; Liu, G.; Ngan, K.N. Object Co-Segmentation Based on Shortest Path Algorithm and Saliency Model. IEEE Trans. Multimed. 2012, 14, 1429–1441.
  • Meng, F.M.; Li, H.; Liu, G.H. A New Co-saliency Model via Pairwise Constraint Graph Matching. In Proceedings of the International Symposium on Intelligent Signal Processing and Communications Systems, Tamsui, Taiwan, 4–7 November 2012; IEEE Computer Society Press: Los Alamitos, CA, USA, 2012; pp. 781–786.
  • Kim, G.; Xing, E.P.; Li, F.F.; Kanade, T. Distributed Cosegmentation via Submodular Optimization on Anisotropic Diffusion. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 169–176.
  • Kim, G.; Xing, E.P. On Multiple Foreground Cosegmentation. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 837–844.
  • Alexe, B.; Deselaers, T.; Ferrari, V. What Is an Object? In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 73–80.
  • Vicente, S.; Rother, C.; Kolmogorov, V. Object cosegmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011.
  • Meng, F.; Cai, J.; Li, H. Cosegmentation of Multiple Image Groups. Comput. Vis. Image Underst. 2016, 146, 67–76.
  • Johnson, M.; Shotton, J.; Cipolla, R. Semantic Texton Forests for Image Categorization and Segmentation. In Decision Forests for Computer Vision and Medical Image Analysis, Advances in Computer Vision and Pattern Recognition; Criminisi, A., Shotton, J., Eds.; Springer: London, UK, 2013.
  • Lindner, C.; Thiagarajah, S.; Wilkinson, J.M.; The arcOGEN Consortium; Wallis, G.A.; Cootes, T.F. Fully Automatic Segmentation of the Proximal Femur using Random Forest Regression Voting. IEEE Trans. Med. Imaging 2013, 32, 1462–1472.
  • Li, H.S.; Zhao, R.; Wang, X.G. Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification. arXiv 2014, arXiv:1412.4526.
  • Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651.
  • Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
  • Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
  • Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
  • Szegedy, C.; Liu, W.; Jia, Y.Q.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
  • Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
  • He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  • Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
  • Wang, P.; Chen, P.; Yuan, Y.; Liu, D.; Huang, Z.H.; Hou, X.D.; Cottrell, G. Understanding Convolution for Semantic Segmentation. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1451–1460.
  • Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
  • Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv 2015, arXiv:1505.04597.
  • Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. arXiv 2014, arXiv:1412.7062.
  • Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv 2017, arXiv:1606.00915.
  • Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
  • Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV). arXiv 2018, arXiv:1802.02611.
  • Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv 2015, arXiv:1511.07122.
  • Yu, F.; Koltun, V.; Funkhouser, T. Dilated Residual Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 636–644.
  • He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
  • Zhao, H.S.; Shi, J.P.; Qi, X.J.; Jia, J.Y. Pyramid Scene Parsing Network. arXiv 2017, arXiv:1612.01105v2.
  • Ghiasi, G.; Fowlkes, C. Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016.
  • Lin, T.Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. arXiv 2017, arXiv:1612.03144.
  • He, J.; Deng, Z.; Zhou, L.; Wang, Y.; Qiao, Y. Adaptive Pyramid Context Network for Semantic Segmentation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7511–7520.
  • Ye, M.; Ouyang, J.; Chen, G.; Zhang, J.; Yu, X. Enhanced Feature Pyramid Network for Semantic Segmentation. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 3209–3216.
  • Wu, Y.; Jiang, J.; Huang, Z.; Tian, Y. FPANet: Feature pyramid aggregation network for real-time semantic segmentation. Appl. Intell. 2022, 52, 3319–3336.
  • Mnih, V.; Heess, N.; Graves, A.; Kavukcuoglu, K. Recurrent Models of Visual Attention. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS’14), Montreal, QC, Canada, 8–13 December 2014; Volume 2, pp. 2204–2212.
  • Visin, F.; Romero, A.; Cho, K.; Matteucci, M.; Ciccone, M.; Kastner, K.; Bengio, Y.; Courville, A. ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation. arXiv 2015, arXiv:1511.07053.
  • Visin, F.; Kastner, K.; Cho, K.; Matteucci, M.; Courville, A.; Bengio, Y. ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks. arXiv 2015, arXiv:1505.00393.
  • Byeon, W.; Breuel, T.M.; Raue, F.; Liwicki, M. Scene labeling with LSTM recurrent neural networks. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3547–3555.
  • Liang, X.; Shen, X.; Feng, J.; Lin, L.; Yan, S. Semantic Object Parsing with Graph LSTM. In Computer Vision—ECCV 2016, Lecture Notes in Computer Science; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Cham, Switzerland, 2016; Volume 9905.
  • Oktay, O.; Schlemper, J.; Folgoc, L.; Lee, M.; Heinrich, M.P.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.; Kainz, B.; et al. Attention U-Net: Learning Where to Look for the Pancreas. arXiv 2018, arXiv:1804.03999.
  • Pal, D.; Reddy, P.B.; Roy, S. Attention UW-Net: A fully connected model for automatic segmentation and annotation of chest X-ray. Comput. Biol. Med. 2022, 150, 106083.
  • Zhao, H.; Zhang, Y.; Liu, S.; Shi, J.; Loy, C.C.; Lin, D.; Jia, J. PSANet: Point-wise Spatial Attention Network for Scene Parsing. In Proceedings of the 15th European Conference, Munich, Germany, 8–14 September 2018.
  • Fu, J.; Liu, J.; Tian, H.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3141–3149.
  • Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE: Content-Aware ReAssembly of FEatures. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3007–3016.
  • Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE++: Unified Content-Aware ReAssembly of FEatures. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4674–4687.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 4–9 December 2017; pp. 6000–6010.
  • Weissenborn, D.; Täckström, O.; Uszkoreit, J. Scaling Autoregressive Video Models. arXiv 2020, arXiv:1906.02634.
  • Cordonnier, J.B.; Loukas, A.; Jaggi, M. On the Relationship between Self-Attention and Convolutional Layers. arXiv 2020, arXiv:1911.03584.
  • Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual, 3–7 May 2021.
  • Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv 2021, arXiv:2103.14030.
  • Zheng, Q.; Yang, M.; Yang, J.; Zhang, Q.; Zhang, X. Improvement of Generalization Ability of Deep CNN via Implicit Regularization in Two-Stage Training Process. IEEE Access 2018, 6, 15844–15869.
  • Jin, B.; Cruz, L.; Gonçalves, N. Pseudo RGB-D Face Recognition. IEEE Sens. J. 2022, 22, 21780–21794.
| Methods | Ref. | Foreground Feature | Co-Information | Optimization |
|---|---|---|---|---|
| MRF-Based Co-Segmentation | [ ] | color histogram | norm | graph cuts |
| | [ ] | color histogram | norm | quadratic pseudo-Boolean |
| | [ ] | color and texture histograms | reward model | maximum flow |
| | [ ] | color histogram | Boykov–Jolly model | dual decomposition |
| | [ ] | color and SIFT features | region matching | graph cuts |
| | [ ] | SIFT feature | | K-means + graph cuts |
| | [ ] | SIFT feature | Gaussian mixture model (GMM) constraint | graph cuts |
| Co-Segmentation Based on Random Walks | [ ] | color and texture histograms | improved random walk global term | gradient projection and conjugate gradient (GPCG) |
| | [ ] | intensity and gray difference | improved random walk global term | graph size reduction |
| | [ ] | label prior from user scribbles | GMMs | minimize the average reaching probability |
| Co-Segmentation Based on Active Contours | [ ] | color histogram | reward model | level set function |
| | [ ] | co-registered atlas and statistical features | k-means | level set function |
| | [ ] | saliency information | improved Chan–Vese (C-V) model | level set function |
| Clustering-Based Co-Segmentation | [ ] | SIFT, Gabor filter, color histogram | Chi-square distance | low-rank |
| | [ ] | color and location information | discriminant clustering | expectation maximization (EM) |
| | [ ] | pyramid of LAB colors, HOG textures, SURF features histogram | hierarchical clustering | normalized cut criterion |
| Co-Segmentation Based on Graph Theory | [ ] | color histogram | built digraphs according to region similarity and saliency | shortest path |
| | [ ] | color and shape information | build global items based on digraphs and saliency | shortest path |
| Co-Segmentation Based on Thermal Diffusion | [ ] | Lab space color and texture information | Gaussian consistency | sub-modularity optimization |
| | [ ] | color and texture histograms | GMM & SPM (spatial pyramid matching) | dynamic programming |
| Object-Based Co-Segmentation | [ ] | multi-scale saliency, color contrast, edge density and superpixels straddling | Bayesian framework | maximizing the posterior probability |
| | [ ] | 33 types of features | random forest classifier | A-star search algorithm |
Yu, Y.; Wang, C.; Fu, Q.; Kou, R.; Huang, F.; Yang, B.; Yang, T.; Gao, M. Techniques and Challenges of Image Segmentation: A Review. Electronics 2023 , 12 , 1199. https://doi.org/10.3390/electronics12051199


Semantic Segmentation

5666 papers with code • 129 benchmarks • 320 datasets

Semantic Segmentation is a computer vision task in which the goal is to categorize each pixel in an image into a class or object. The goal is to produce a dense pixel-wise segmentation map of an image, where each pixel is assigned to a specific class or object. Some example benchmarks for this task are Cityscapes, PASCAL VOC and ADE20K. Models are usually evaluated with the Mean Intersection-Over-Union (Mean IoU) and Pixel Accuracy metrics.
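
The two metrics named above can be illustrated with a small sketch. This is a minimal example, assuming integer class labels per pixel and a NumPy environment; the function names (`confusion_matrix`, `pixel_accuracy`, `mean_iou`) and the toy label maps are hypothetical and are not taken from any benchmark's official evaluation code, which may differ in details such as how ignored labels or absent classes are handled.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, pred) pair as a single index, then count occurrences.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Mean over classes of intersection / union, skipping classes that never occur."""
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # assumption: classes absent from both maps are excluded
    return (intersection[valid] / union[valid]).mean()

# Toy 2x3 label maps with three classes (illustrative values only).
y_true = [[0, 0, 1],
          [1, 2, 2]]
y_pred = [[0, 1, 1],
          [1, 2, 0]]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(round(float(pixel_accuracy(cm)), 3))  # 0.667
print(round(float(mean_iou(cm)), 3))        # 0.5
```

In this toy case four of six pixels are correct, and the per-class IoU values are 1/3, 2/3 and 1/2, which average to 0.5. Mean IoU penalises errors on small classes more than pixel accuracy does, which is one reason both are usually reported together.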


Most implemented papers

U-Net: Convolutional Networks for Biomedical Image Segmentation

There is large consent that successful training of deep networks requires many thousand annotated training samples.

Deep Residual Learning for Image Recognition

Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

Mask R-CNN

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

MobileNetV2: Inverted Residuals and Linear Bottlenecks

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

MMDetection: Open MMLab Detection Toolbox and Benchmark

In this paper, we introduce the various features of this toolbox.

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Point cloud is an important type of geometric data structure.

FCOS: Fully Convolutional One-Stage Object Detection

By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training.

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.

Rethinking Atrous Convolution for Semantic Image Segmentation

To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates.


The impact of market segmentation and social marketing on uptake of preventive programmes: the example of voluntary medical male circumcision. A literature review

Anabel Gomez

1 AVAC, New York City, NY, 10027, USA

Rebecca Loar

2 Independent Consultant, Austin, Texas, USA

Andrea England Kramer

3 Independent Consultant, St Petersburg, Florida, USA

Associated Data

All data underlying the results are available as part of the article and no additional source data are required

Peer Review Summary

| Review date | Reviewer name(s) | Version reviewed | Review status |
|---|---|---|---|
| | Jason B. Reed, Manya Dotson, and Sema Sgaier | | Approved |
| | Steve Kretschmer | | Approved |
| | Beth Skorochod | | Approved with Reservations |

Background: The business world has long recognized the power of defining discrete audiences within a target population. However, market segmentation’s full potential has not been applied to the public health context. While some broad elements of market segmentation (e.g., age, geography) are considered, a nuanced look at behavioural and psychographic segmentation, which could greatly enhance the possibility of lasting behaviour change, is often missing.

Segmentation, and the associated mindset which acknowledges the multi-dimensional differences between people, allows service providers, implementers, policymakers, and government officials to target initiatives, and leads to a greater likelihood of lasting behavioural change.

This paper investigates what segmentation is, how it has been applied to voluntary medical male circumcision (VMMC), how it can be applied in development, and the challenges in both measuring and adopting segmentation as part of program design.

Methods: We performed a detailed search of peer-reviewed literature published between January 2015 and September 2018, using PubMed, ProQuest, ScienceDirect, Google Scholar, and the abstract directories of the International AIDS Society (IAS). We also accessed articles from business databases such as the Harvard Business Review.

Results: Results from a VMMC-focused intervention that successfully designed and delivered segmentation-based programs in two countries demonstrated that it is possible to adapt private sector approaches. However, within the sector of global development that is most familiar with segmentation, these efforts rarely go beyond basic demographic segments.

Conclusions: Existing published material tends not to measure the impact of segmentation itself, but the impact of the intervention to which segmentation was applied, which makes it challenging for the development sector to invest in the approach without evidence that it works. Nonetheless, the experiences of segmentation and demand creation for VMMC do highlight the opportunity for better integrating this approach in HIV prevention and in global development and measurement initiatives.

List of abbreviations

AGYW: Adolescent girls and young women
EAMC: Early adolescent male circumcision
EIMC: Early infant male circumcision
HIV: Human immunodeficiency virus
IAS: International AIDS Society
PrEP: Pre-exposure prophylaxis
VMMC: Voluntary medical male circumcision

Introduction

Defining discrete audiences within a target population is a marketing approach used widely in the commercial world, where strong understanding of a consumer segment is directly tied to profits. Even private sector giants have had massive failures due to poor consumer understanding: Coca-Cola’s C2 drink is an example. To make targeted marketing decisions, companies rely on segmentation, the process of dividing “a market into smaller groups of buyers with distinct needs, characteristics, or behaviors who might require separate products or marketing mixes” 1. Most commonly, markets are segmented by geographic, demographic, psychographic (psychological attributes such as values, attitudes, and beliefs), and behavioural factors 2. The resulting breakout can then be used to make strategic decisions about whom to reach and how to connect meaningfully with them through product and service experiences. A specific market segment includes individuals with similar preferences and characteristics, and different market segments are clearly differentiated (Table 1 presents and discusses characteristics associated with useful market segments) so that the campaigns, products, and marketing tools applied to them can be implemented without overlap. Moreover, a set of criteria is typically used to define a segment – to identify individuals who share those characteristics – and those who do not fit that segment’s criteria fall into a different segment. The value of these segments is to have clear characteristics associated with a set of marketing approaches and, in turn, to drive quantifiable outcomes 3.

Table 1. Characteristics associated with useful market segments.

1) Identifiable/Differentiable. Customers in each segment should possess measurable key attributes – such as usage and consumption behaviours and purchasing preferences – that clearly distinguish them from customers in other segments.
2) Substantial. According to Harvard Business Review’s Gavett, “It’s usually not cost-effective to target small segments – a segment, therefore, must be large enough to be potentially profitable.” In Gichuru’s words, “Marketing segments must be large enough to meet the financial needs of the company and the product.”
3) Accessible. Are communication and distribution channels in place to reach each segment? To be useful, segments must be accessible through promotional tools.
4) Sustainable. Gavett states that “a segment should be stable enough for a long enough period of time to be marketed to strategically.” This points away from basing segmentation on attributes that tend to fluctuate, such as lifestyle.

Sources: https://hbr.org/2014/07/what-you-need-to-know-about-segmentation

http://ijecm.co.uk/ ISSN 2348 0386

https://www.iiste.org/Journals/index.php/EJBM/article/viewFile/647/540

While some broad elements of segmentation, such as age and geography, have been applied in the development sector, the power of behavioural and psychographic segmentation has been largely overlooked. According to Samuel, “psychographics, which measure customers’ attitudes and interests rather than ‘objective’ demographic criteria, can provide deep insight that complements what we learn from demographics” (see Harvard Business Review article on psychographics ). This type of segmentation provides a deeper understanding of the desires, needs, and decision-making considerations of a potential user of a product or service. Applied correctly, it could enhance efficacy of public health initiatives, ensure new products reach the people most likely to need and use them, and increase the likelihood of lasting behavioural change. Using the example of voluntary medical male circumcision (VMMC), this paper shows how segmentation has been applied in development and discusses the challenges in both measuring and adopting segmentation as part of program design.

In an era of constrained funding for HIV primary prevention basics like demand creation for male and female condoms, there is a need to be more targeted with resources in order to reach the right people with the right intervention – a higher priority than reaching all people. Groups sharing common attributes are more likely to respond similarly to a given demand creation strategy, but addressing all men aged 15–19 as if they are identical is unlikely to result in cost-efficient, relevant, or relatable communication, whether it takes place at an interpersonal (e.g., peer educator) or mass level.

To better understand what the literature reveals about the barriers to and benefits of segmentation, both for VMMC in particular and for demand creation for HIV preventive interventions more broadly, we queried scholarly databases, health and innovation journals, and other publications for studies, analyses, and peer-reviewed articles on segmentation published between January 2015 and September 2018. The types of literature considered included case studies, systematic reviews, meta-analyses, and journal articles. We performed a detailed search of the existing peer-reviewed literature using PubMed, ProQuest, ScienceDirect, Google Scholar, and the abstract directories of the International AIDS Society (IAS). We searched these sources for keywords including HIV prevention, segmentation, demand creation, demand generation, innovation, behaviours, human-centred design, segmentation evaluation, campaign qualitative evaluation, campaign success, campaign failure, target audience, social marketing, campaign lessons learned, and VMMC. Our primary application of interest was VMMC, with a secondary emphasis on HIV prevention programs related to adolescent girls and young women (AGYW). Sub-Saharan Africa was the priority region of interest, with examples elsewhere in low- and middle-income countries a subordinate focus. The hundreds of results generated by these criteria were then individually examined and collated by subject-matter experts for relevance to the main aspects of this work; full citations were developed for 61 select articles, along with complete summaries of their significance.

A second round of keyword searching was conducted, this time expanding the sources queried beyond the listed databases to include premier marketing journals and publications, in order to provide more complete information regarding existing applications of segmentation (an approach predominantly employed and evaluated within the context of private sector work). The resulting, updated bibliography included 40 sources and was reviewed a second time to produce the comprehensive list of sources (44) which informed initial drafting of the literature review text. After subsequent rounds of writing and editing, including removing information deemed unnecessary and, as a result, eliminating some citations, this literature review’s current bibliography has been updated to represent its use of 37 carefully curated sources. It should be noted that this literature review did not need to address consent issues.

Our findings indicate that market segmentation has, to date, most often been applied to global health fields that function for profit and, as such, tend to view their clients as customers, with preferences to discover and cater to, rather than as patients with needs that are often assumed to be homogenous within a broadly defined population, such as by diagnosis or perceived need. The latter view is associated with interventions designed from the top down. However, areas of global health with a development focus or non-profit structure and culture have been slower to adopt or apply market segmentation. Moreover, existing published materials on its applications tend not to measure the impact of segmentation itself, but the impact of the intervention to which segmentation was applied, generally through uptake measures.

On top of the traditional difficulties of measuring impact in the global health and development space (e.g., long observation times, difficult data gathering, complex influences on decision-making), segmentation requires additional thought to isolate the impact of the strategies themselves. While we are able to observe that using segmentation strategies offers a framework to design more nuanced and resonant interventions, it is difficult to isolate the impact of segmentation from the many other steps, methodologies, and strategies that are part of a robust human-centred design process. As multi-disciplinary consortiums take on multi-year projects, it becomes increasingly difficult to isolate specific contributions when evaluating overall efficacy.

It is also difficult to measure impact because a segmentation strategy is not a product or a message or an experience on its own; it is a vehicle for developing them. In this way, the segmentation step is not tested directly, but indirectly through the things it produces – making it challenging to draw an objective cause and effect relationship.

Because the value and impact are difficult to isolate, segmentation is often overlooked or attempted unsystematically. Sgaier et al. determined that approaches to demand generation were inconsistent, not evidence-based, and poorly coordinated. This work goes on to say that “political and social factors, including ignorance of the need for strategic demand generation, may contribute to inadequate funding and focus.” The authors found that there was scant evidence on approaches to demand generation for VMMC, both in terms of understanding drivers of demand and in terms of evaluating existing interventions 4 .

In a later article for the Stanford Social Innovation Review, Sgaier et al. expanded on these issues, citing particular cases. For example, in Niger, a limited understanding of the value of segmentation at the highest levels caused discussions with governments and partners to drag on for more than three years before segmentation could begin to be implemented. Even once segmentation is underway, Sgaier et al. note myriad issues, including restricted ability to design the research, limited number of people with experience in the segmentation process, and difficulty transferring the findings into large-scale programs 2. In the case of HIV prevention, the psychographic measures of risk perception and belief about the efficacy of a particular intervention proved to be a more effective approach to the segmentation of men into target audiences than basic demographic distinctions. Designing a program of social promotion that is tailored for groups based on these values, attitudes, and decision-making requirements creates the greatest likelihood for change in HIV-related behaviours among each segment 5.

Meanwhile, Terris-Prestholt and Windmeijer looked at interventions that promote behaviour change. They determined that the impact of interventions aimed at ongoing behaviours relevant to prevention is slow to take hold and should therefore be evaluated over a longer timeframe 6. However, current funding mechanisms generally do not allow sufficient time for such change to take place and be observed, making interventions based on segmentation hard to evaluate.

Though exact measurement is difficult, the most compelling evidence for the efficacy of segmentation comes from looking at when and how it is used. Specifically, it is often brought in when nothing else is working or when there is a drop-off in the efficacy of an intervention. This typically occurs when generalized, homogenous efforts have reached the most easily persuaded among the target audience and nuanced interventions are needed for more resistant members of the target audience. This was the case with VMMC, where early adopters of the procedure had been reached and demand was plateauing (Figure 1).

Figure 1.

Source: https://www.avac.org/sites/default/files/resource-files/AVACreport2018.pdf . This graphic has been reproduced with permission from AIDS Vaccine Advocacy Coalition (AVAC).

Segmentation can also be used to investigate why efficacy has been uneven. The knowledge gained from segmentation can be used to design specific interventions (e.g., messaging, experiences, campaigns) in situations where demand is lagging, or to prioritize outreach to specific groups if resources are constrained. This is especially true when used in conjunction with human-centred design, another methodology increasingly applied in global health programs when traditional strategies start to produce declining returns on investment.

Despite challenges, segmentation also offers a large range of opportunities for the sector. The Reproductive Health Supplies Coalition is one of a growing number of organisations in development that acknowledge that segmentation provides empirical evidence which can help guide the most efficient and effective use of resources. The Coalition states that “market segmentation data can shape family planning…(and) it can increase market efficiency for the government stewards of public resources” 7 . It goes on to say that policymakers can use market segmentation research to “draft evidence-based policy initiatives, giving government officials a more useful context to decide which policies are worth enacting.”

The identification of segments can also guide decisions about how to meet performance and delivery objectives within a health system. For example, segmentation studies can identify whether one group of end users will likely access a prevention service in a private clinic rather than a public one. Most obviously, segmentation can be used to guide communication at all consumer touchpoints from the community to the clinic, and on a mass scale.

The application in VMMC programs

The example of VMMC lends itself well to a segmentation approach. The audience of those who may undergo the procedure already encompasses distinct subdivisions by age (e.g., early infant male circumcision (EIMC), early adolescent male circumcision (EAMC), and “catch-up” population segments comprising older men). The literature is replete with examples of such demographic segmentation and its applicability to developing more effective promotions, including via social marketing 8–13. Notable examples include the Kingdom of Eswatini’s successful VMMC program, which prioritised EIMC as the “sustainment” component of a comprehensive set of VMMC interventions for multiple age brackets 14; Lane et al.’s supplement of nine studies across South Africa, Zimbabwe, and Tanzania, which targeted 10–14-year-old adolescent males through messaging around key incentives or barriers to VMMC uptake (motivation, counselling, wound healing, parental involvement, female peer support, quality of in-service communication, and providers’ perceptions) 15; and an observational prospective intervention study in the Orange Farm township of South Africa, which achieved a male circumcision prevalence of 80% among adult men within just three months 16. Modelling investigations likewise find age to be a beneficial, and in some cases particularly cost-effective 17, 18, basis for market segmentation 19, 20. Geographic segmentation is similarly common 13, 17–19, 21, and through modern mapping technology can offer novel applications 21, 22.

Importantly, however, recent work also emphasizes the need for segmentation that goes beyond age to distinguish among individuals’ perceived motivations or disincentives for VMMC, as well as beyond the candidate population for VMMC to highlight the role of decision-making “influencers” (i.e., female partners, family members like parents, grandparents, and parents-in-law, trusted community leaders like sports team coaches), who may be effectively targeted through social marketing promotions of VMMC to encourage its uptake among the men in their lives 23 . Figure 2 depicts a representation of segments applied to VMMC in Zambia.

Figure 2. Segments applied to VMMC in Zambia.

Source: https://healthcommcapacity.org/wp-content/uploads/2017/06/Albert-Machinda-Society-for-Family-Health.pdf This graphic has been reproduced with permission from The Bill and Melinda Gates Foundation.

The work to segment Zambian and Zimbabwean men along behavioural and psychographic lines provides the most straightforward example of a non-age-based population dissection with findings readily applicable to social marketing interventions. In this case, men were segmented in alignment with factors that motivated and/or supported them on a personal level to undergo VMMC. Men were also segmented according to influences at the community or structural level that discouraged or encouraged uptake 24. Another important finding from the “influencer” cluster of studies relates to how campaigns are conducted after segmentation. VMMC candidates, perhaps due to the intimate nature of the decision and procedure, exhibit a strong preference for individual over mass communication on this issue 9, 25. This preference is borne out both by the insignificant results of a study which leveraged the mass communication platform of SMS to deploy VMMC-related information and counselling 26 and by two unsuccessful VMMC promotion case studies which cited a lack of consideration for sociocultural context (such as including the perspectives and gaining the support of “traditional leaders, healers and circumcisers”) as a reason for failure 27. In demand creation campaigns, personal counselling or one-to-one approaches favourably impacted willingness to undergo VMMC or to consider it for one’s dependents. Quality market segmentation conducted before beginning such an individualised intervention has the potential to make this otherwise costly and labour-intensive – yet highly effective – approach more feasible to implement 9, 16, 28.

Market segmentation can also identify whom not to target. A 2015 analysis “explored correlates of male circumcision status among men and their social, economic, health and sexual behaviour factors.” This analysis provided characteristics for better targeting and intervention design. In this case, limited resources for uptake campaigns could be directed toward populations of greatest need and minimize the use of resources directed at segments that were unlikely to choose VMMC under any circumstance 29 .

There is also some evidence for market segmentation’s value in creating demand for HIV preventive services outside of VMMC. Cremin et al . use a mathematical model to isolate certain subsets of Nairobi’s population in which HIV incidence is on the rise, in contrast to its trend of decline at the city level, and to suggest optimal interventions for reducing HIV infection among these high-risk groups 30 . Reed et al . extrapolate lessons learned from VMMC scale-up in the region to support oral pre-exposure prophylaxis (PrEP) expansion among another targeted market segment 31 . Also addressing the AGYW audience, Celum et al . explore how social marketing and innovative market segmentation can increase demand for and optimise uptake and effective use of PrEP 32 ; Eakle et al . point to this potent combination as a particularly useful means of enhancing PrEP demand generation among more sceptical communities 33 ; and Luecke et al . examine the demographic and behavioural correlates of preferred PrEP formulations, arguing that a deeper understanding of women’s product preferences can guide not only product development but also drive demand creation through social marketing 34 . Sgaier, too, opines from family planning work in India that psychographic-behavioural segmentation can better forecast demand and, “as in the private sector, a staged market launch that actively stimulates uptake can be used to match appropriate products to suitable customers” (see devex article on contraception in Uttar Pradesh women ). Ayikwa and Jager advocate for social marketing as the “ultimate weapon” in combating HIV/AIDS transmission and overcoming related stigmas 35 .

All these papers corroborate Rao and McCoy, who state that “behaviour change isn’t just about crafting the perfect message; creating better programs requires really listening to and understanding the patient experience” [ https://ssir.org/articles/entry/fostering_behavior_change_for_better_health#bio-footer ]. They stress that “borrowing tools from the private sector…to understand, track, and influence customers can greatly enhance global health programs that require changes in attitudes or behaviour.”

Discussion / Recommendations

The literature collectively points toward the idea that what the public health sphere needs is a new mindset. Instead of viewing its target audience as patients with diagnoses, the field should see that audience as multidimensional consumers with preferences and needs. Current public health segmentation has been almost solely demographic; though valuable at a basic level, this is rudimentary 36. Market research pioneer Daniel Yankelovich states in the Harvard Business Review that demographic segmentation “implies that differences in reasons for buying, in brand choice influences, in frequency of use, or in susceptibility will be reflected in differences in age, sex, income, and geographical location. But this is usually not true” (see Harvard Business Review article on market segmentation). Age segmentation can only serve as a very general indication of patterns of behaviour, as not everyone in the same age band will behave the same way or respond the same way to experiences. Even “geodemographic classifications such as ACORN (a classification of regional neighborhoods), while useful for indicating likely very general patterns of spending power, do not reveal the absurd assumption that everyone…drives the same car, reads the same newspapers, eats the same food and so on” (see The Marketing Journal article on market segmentation). Demographic and age segmentation are some of the easiest to develop, but they provide limited guidance. In order to successfully effect change, implementers need to connect deeply with their audience by looking past its superficial characteristics to the attitudinal, behavioural, and contextual factors that guide its members’ decision-making. This does not mean that public health solutions should be fragmented, but rather that precise messaging, still framed in the broader context of health goals, can be developed for and directed to audiences with whom it is most likely to resonate strongly 4.

Conclusions

At this stage, it is not yet possible to definitively conclude that market segmentation leads to measurably better HIV prevention results, but it can be asserted that market segmentation leads to interventions with measurably better HIV prevention results. The literature presents ample evidence for the value of market segmentation as a component of demand creation for HIV prevention interventions, including VMMC and, more recently, oral PrEP. While traditional applications of market segmentation in healthcare, such as age and geography, remain useful as components of more nuanced population stratifications, behavioural-psychographic segmentation presents the greatest potential for efficacy in uptake of HIV prevention measures, both broadly and in the case of VMMC specifically. In a later article, also for the Harvard Business Review, Yankelovich and Meer write that “non-demographic segmentation began more than 40 years ago as a way to focus on the differences amongst customers that matter most strategically” 37. Ultimately, what the public health sphere needs is a shift: from considering possible users of its products as a single group with the same desires and behaviours based on age or location to seeing them as multidimensional consumers with individual preferences and needs.

Data availability

Acknowledgements

Thanks are extended to Jeanne Baron, Mitchell Warren, and Emily Bass from AVAC for their assistance with manuscript preparation and helpful comments throughout the process.

[version 1; peer review: 2 approved, 1 approved with reservations]

Funding Statement

This work is supported by the Bill and Melinda Gates Foundation [OPP1161329]. The AIDS Vaccine Advocacy Coalition’s (AVAC) product introduction and access efforts are supported through grants from the Bill and Melinda Gates Foundation, via the HIV Prevention Market Manager grant agreement [OPP1135316], and by the generous support of the American people through the U.S. President’s Emergency Plan for AIDS Relief (PEPFAR) and the U.S. Agency for International Development (USAID), via the OPTIONS Consortium Cooperative Agreement No. AID-OAA-A-15-00035. The content of this article is solely the responsibility of the authors and does not necessarily represent the views of the Gates Foundation, PEPFAR, USAID, or the United States Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Reviewer response for version 1

Jason B. Reed

1 Jhpiego, Baltimore, MD, USA

Manya Dotson

Sema Sgaier

2 Surgo Foundation, Washington, D.C., USA

3 Harvard T.H. Chan School of Public Health, Boston, MA, USA

4 University of Washington, Seattle, WA, USA

The referee team finds the article to be an important contribution to the field because it reinforces the potential value of an underappreciated approach to global development/non-profit public health programming, specifically related to program uptake. A case is made that the private sector is better at market segmentation. Development/public health donors often prescribe demographic and behavioral sub-populations—sex, age, geographic area, wealth quintile—for program services based upon epidemiology and return on investment expectations. Implementers may simply resort to the same or similar sub-populations for structuring demand creation campaigns. Going beyond these conventional public health categories to include psychographic and behavioral segments may represent an important opportunity for greater cost-effectiveness in program execution, by more efficiently influencing uptake. However, the authors stop short of describing how the private sector (more) successfully determines psycho-behavioral market segments and acts upon them in selection of messages and channels. This may be because little is published on methodologies from the private sector, which has little incentive to share proprietary approaches. This may well be beyond the scope of this paper, but some discussion of bases of segmentation could be helpful to the reader to understand the concept. As it is, the reader is left with the vague feeling that “everyone is different” and therefore all messaging must be micro-tailored, which would be even less efficient/cost-effective than existing approaches.

The article could be improved by defining terminology upfront, e.g., demographic, geographic, behavioral vs. psychographic market segmentation. A strong definition of segmentation, including the four qualities of viable segments, is provided here for further consideration:

“Heterogeneous markets are divided into homogenous markets by identifying what are known as the appropriate bases of segmentation. An appropriate segmentation base might be age in a market where young people prefer a different product or service than old people. Sex, socioeconomic status, and residence are common segmentation bases due to assumptions about differing preferences between males and females, the rich and the poor, and the urban and rural. Many other segmentation bases exist, including psycho-graphics and lifestyle. In marketing, the appropriateness of one or more segmentation bases is evaluated in terms of whether each basis is identifiable (able to be measured), actionable (theoretically able to be changed through an intervention), accessible (defines populations able to be reached), responsive (empirically are changed by an intervention), substantial (large in size), and stable (do not change for a period long enough to design, monitor and evaluate an intervention)” Frank et al. (1972 1 ).

The details of parameters for the literature review are clear. The number of professional disciplines of subject matter experts reviewing and summarizing the findings is less clear, though presumably comprised of some or all of the authorship team. Given the disconnect between private and public sector approaches to segmentation, it would be valuable to know whether interpretations of relevance and importance (to impact and as specifically relates to voluntary medical male circumcision) varied by background of the experts.

The main conclusions appear to be framed in relation to public health intervention uptake (or adherence). However, other a priori interests stemming from market segmentation may be of equal (or arguably even greater) importance, e.g., using segmentation to identify services/products more aligned with potential users' preferences, demand forecasting, and preferences about where to access products and services (public vs. private sector, formal vs. informal outlets). If market segmentation were more thoughtfully used earlier in processes, segmentation might not be so frequently the approach of last resort. In turn, the assessment of segmentation's impact might extend beyond improvements in uptake alone.

We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Steve Kretschmer

1 DesireLine LLC, Istanbul, Turkey

- Clarify the forms of market segmentation throughout the paper to be specific about which forms of segmentation are being discussed in each point made on its use

- Clarify segmentation as a means to dividing the market and identifying the differences to then target differential interventions/services/products vs. as a method itself to understand and intervene.

  • The study design is appropriate and the work is technically sound
  • Some additional details of methods and analysis need clarification to allow replication by others
  • Statistical analyses not applicable
  • All source data are available
  • The conclusions drawn are adequately supported by the results

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Beth Skorochod

1 CollaborateUp, Washington, DC, USA

The title includes both segmentation and social marketing but social marketing is not defined in the article and is not discussed at any length. I would remove social marketing from the title, or have it play a bigger role in the article (and define it upfront or in a table as has been done with segmentation).

While the authors state that segmentation was not measured extensively, there are too few examples for any reader to try to replicate any work done with segmentation. The work on VMMC, which is used as the main case study for the piece, includes insufficient detail for replication and does not include specific data for Zimbabwe or Zambia. How did behavioral segmentation impact uptake of VMMC in these two countries? What were the challenges of executing against the segmentation results? What was the impact of activities across each of the 7 segments in Zambia? Can additional data on impact and/or details on execution be included for these two countries? Even the VMMC studies mentioned don't seem to point specifically to the examples in Zambia and Zimbabwe.

The definitions of quality market segmentation in this paper are too vague, making the prospect of applying it to a program very difficult. Table 1 provides characteristics of useful market segments but additional detail or examples are needed. Table 1 states that segments should be substantial enough to be profitable or to meet the financial needs of a company. While this works for the commercial sector, it is unhelpful for development work, which is typically not about profitability or even cost effectiveness. Can the authors include examples of a "substantial" size that would be applicable and relevant to HIV prevention? In addition, "sustainable" is also used to describe a useful market segment: the article states that the segment should be stable for a long-enough period of time to be marketed strategically and warns against using something like "lifestyle", which too easily fluctuates. But what is a long-enough period of time to be marketed strategically? Do attitudes and behaviors shift less frequently than "lifestyle"? If not using age or demographics, what type of attributes should be used that will not fluctuate easily, and how long do those attributes need to be stable? Better definitions or concrete examples from public health segmentation (or the commercial sector) will strengthen the piece and make it more relevant to implementers.

The article seems at times to conflate demand generation and demand creation with segmentation. Paragraph 5 of the Results section concludes that segmentation is often overlooked or unsystematic, using the Sgaier article's criticism of poor demand creation approaches as evidence, but these seem to be two different things. Demand creation does not necessarily include behavioral segmentation. While it may include demographic segmentation, that is not the focus of the piece. The Terris-Prestholt and Windmeijer article on behavior change interventions is also cited as a reason why segmentation would be hard to evaluate and measure, but the conclusion seems forced unless segmentation is part of the reason behavior change interventions take a longer time to take hold.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

  • Data Descriptor
  • Open access
  • Published: 14 August 2024

ChineseMPD: A Semantic Segmentation Dataset of Chinese Martial Arts Classic Movie Props

  • Suiyu Zhang 1 , 2   na1 ,
  • Rong Wang 1 , 2   na1 ,
  • Yaqi Wang   ORCID: orcid.org/0000-0002-4627-3392 1 ,
  • Xiaoyu Ma 1 ,
  • Chengyu Wu 3 ,
  • Hongyuan Zhang   ORCID: orcid.org/0000-0002-0657-1862 4 ,
  • Zhi Li 5 &
  • Dingguo Yu 1  

Scientific Data volume 11, Article number: 882 (2024)

Recent advances in computer vision and deep learning techniques have facilitated significant progress in video scene understanding, thus helping film and television practitioners achieve accurate video editing. However, publicly available semantic segmentation datasets have so far been mostly limited to indoor scenes, city streets, and natural images, and often ignore objects in action movies, a research gap that urgently needs to be filled. In this paper, we introduce a large-scale, high-precision semantic segmentation dataset of props in Chinese martial arts movie clips, named ChineseMPD. Specifically, this dataset first establishes segmentation rules and general review criteria for audiovisual data, and then provides semantic segmentation annotations for six weapon props (Gun, Sword, Stick, Knife, Hook, and Arrow) with a total of 32,992 objects. To the best of our knowledge, this dataset is the largest semantic segmentation dataset for movie props to date. The ChineseMPD dataset not only significantly expands the application of traditional computer vision tasks such as object detection and scene understanding, but also opens up new avenues for interdisciplinary research.

Background & Summary

Semantic segmentation and scene understanding have become important research directions in the field of computer vision 1 , 2 , 3 , 4 , 5 , 6 , with applications covering a wide range of areas including autonomous driving 6 and surface crack detection 1 . Among these tasks, semantic segmentation is crucial, as recognizing and classifying different instance objects at the pixel level is one of the most important methods for scene understanding. Chinese martial arts movies provide a unique and fascinating research area for semantic segmentation due to their rich visual and cultural heritage. These films are characterized by complex fight scenes and iconic props, elements that play a crucial role in narrative and aesthetics. However, although several seminal datasets such as Cityscapes 7 , PASCAL VOC 2012 8 and COCO Stuff 9 are available, they often focus on semantic segmentation of urban scenes or natural scenes, neglecting prop segmentation in movie scenes. Currently, semantic segmentation of props in Chinese martial arts movies is still challenging due to the lack of benchmark datasets.

Existing movie datasets can be divided into two categories. One category is movie description datasets. Heilbron et al. 10 introduced ActivityNet, a movie video dataset for understanding human activities. The dataset contains 203 videos of different categories that can be used for human activity understanding, e.g., video categorization or activity detection. Tapaswi et al. 11 proposed the MovieQA dataset to evaluate the ability of algorithms to automatically understand video and text stories. It contains 14,944 multiple-choice questions and the corresponding 5 multiple-choice answers. Huang et al. 12 introduced MovieNet, a multimodal dataset for movie comprehension, which contains annotations corresponding to different aspects of descriptive text, location and action labels, and detection frames. Although the video scenes in these datasets are rich, they lack corresponding segmentation labels and are mainly used for classification tasks such as video classification, scene recognition and sentiment analysis.

The other category is video object segmentation datasets, of which very few are available. Pont-Tuset et al. 13 proposed DAVIS, a public dataset and benchmark designed specifically for the task of video object segmentation. DAVIS contains dense semantic annotations for different objects in different life scenarios. Wei et al. 14 proposed YouMVOS, an actor-centered dataset for video object segmentation. This dataset is labeled only for the segmentation of the actors themselves across multiple shots. Similarly, Ding et al. 15 proposed MOSE, a semantic segmentation dataset containing 5,200 video objects in 36 categories. This dataset aims to explore the ability of artificial intelligence (AI) algorithms for video object segmentation of common objects in complex scenarios. In summary, there is no publicly available dataset for martial arts props, and the existing publicly available datasets for semantic segmentation of video objects cover objects whose shapes and sizes differ from those of martial arts props, so they are not applicable to semantic segmentation of martial arts props.

To address these gaps, this paper introduces ChineseMPD, a semantic segmentation dataset of props from classic Chinese martial arts movies. ChineseMPD provides pixel-level annotations for six categories: Gun, Sword, Stick, Knife, Hook, and Arrow. The props in the images are annotated at a fine-grained level and pass through a strict review process to ensure the high quality and authenticity of the dataset. The pipeline of our proposed Chinese martial arts film props dataset is shown in Fig. 1. Based on the video data of 8 action movie segments, the dataset provides a total of 32,992 objects with fine annotations covering different scenes (e.g., fight scenes, training scenes, ritual scenes, rest scenes, and market scenes). The selected clips from Chinese martial arts films feature unique action sequences. Through continuous narrative blocks and a series of individual shots, these clips ultimately present the “chivalrous” plot, which combines aesthetics and storytelling. However, it is challenging to annotate film clips using automated models, particularly for continuous frame images that contain narrative elements. Firstly, due to the limitations of composition, shooting angles, and lighting, a model cannot distinguish between deliberate blurriness and occlusions. Secondly, the differences between the labels of consecutive frames are significant and rely heavily on contextual semantic understanding, making it difficult to correlate the rich semantic information in films. Thirdly, the dynamic object labels within the film clips are often neglected. Our dataset employs fine-grained semantic segmentation techniques to deeply annotate props in Chinese martial arts films. In addition, we have established relevant rules and comments for the extraction and annotation of film clips. To visualize the content of our annotations more clearly, Fig. 2 provides the distribution of the various props in movie clips A–H.

Figure 1

Pipeline of Chinese martial arts film props dataset. ( A ) Data selection. Movie clips with plot and props are selected, and the categories for semantic segmentation are determined. ( B ) Rule establishment. Different colors are used to distinguish the semantic segmentation categories, and the rules for labeling props are established. ( C ) Data labeling. The props in the film clips are labeled. ( D ) Data review. The annotations undergo three rigorous rounds of review.

Figure 2

Number of props annotations per clip. A–H represent 8 different Chinese martial arts classic movie clips.

Our dataset provides a new perspective to explore and analyze the complex interactions and dynamic changes in videos by complementing the existing semantic segmentation of movie objects. In addition, the establishment of the dataset promotes research in the field of computer vision on cutting-edge technologies such as motion recognition, scene reconstruction, and virtual reality, which offers the possibility of realizing a more intelligent and automated film and television post-production process. Meanwhile, it also provides rich materials for interdisciplinary research, promotes the integration of AI with cultural analysis, historical research and other fields, and opens up new ways for the digital protection and innovative inheritance of traditional cultural heritage.

Methods

This section reviews the process of our collection and implementation of the dataset of Chinese martial arts film props. The labeling of props was performed manually with AI assistance, which will be elaborated on in this section. At the same time, we have also established a special annotation and review method for the film props dataset.

Participants

A total of 21 people participated in data labeling and review. The labeling team consisted of 11 undergraduates, and the review team consisted of 3 junior auditors, 2 senior auditors and 5 acceptance personnel. The labeling team received 7 days of theoretical training and labeled 2000 images. The review team studied and discussed international standards and segmentation requirements in the early stage, and gained a comprehensive understanding of video labeling. Moreover, the participants jointly formulated the specifications and criteria for the annotation and review of the dataset, such as the review rules for object contours and fuzzy images in the annotation of props.

Data collection

We selected film clips from the China Film Archive ( https://www.cfa.org.cn ) and Zhejiang Communication Television Art Archive ( http://ysys.cuz.edu.cn ) for labeling of film props. The resolution of the selected films is 2560 × 1440 or 1920 × 1080. The selection of film clips ran from October 2021 to March 2022, and the annotation ran from October 2021 to August 2022. These clips were carefully selected following copyright reviews and are used solely for academic research purposes, adhering to academic standards. Specifically, according to Article 22 of the Copyright Law of the People’s Republic of China, the limited use of published works for teaching and research purposes is permitted under specific conditions. There are no copyright issues involved. Finally, we selected eligible clips from more than 700 movie clips to build our dataset 16 ; each clip is about 2 minutes long. It is important to note that the total number of labeled props varies depending on the specific requirements of each film clip and the relevant scene. Swords and knives are more numerous than the other props.

Data extraction

The division of each film clip into plot shots was completed by a graduate student with rich editing experience. The film clip was given a rough cut and then a fine cut in Adobe Premiere, the film image and audio were aligned, and subtitles that blocked the images were removed. Finally, frames were extracted from the movie clips at a rate of 4 frames per second. Each extracted frame is saved as a JPG image with a resolution of 1920 × 1080 pixels.
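The 4 frames-per-second extraction rate can be illustrated with a minimal sketch. The text does not state which tool performed the extraction or the source frame rate of the clips, so the function name `sampled_frame_indices` and the 25 fps example value are assumptions made purely for illustration.

```python
import math


def sampled_frame_indices(total_frames: int, source_fps: float,
                          target_fps: float = 4.0) -> list[int]:
    """Return the indices of source frames kept when sampling at target_fps.

    Hypothetical helper: the source frame rate is not given in the text.
    """
    if total_frames < 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates must be > 0")
    duration_s = total_frames / source_fps
    # Number of frames to keep over the whole clip (small epsilon guards
    # against floating-point round-off just below an integer).
    n_samples = math.floor(duration_s * target_fps + 1e-9)
    step = source_fps / target_fps  # source frames between two kept frames
    return [min(int(i * step), total_frames - 1) for i in range(n_samples)]


# A clip of about 2 minutes at an assumed 25 fps has 3000 source frames.
indices = sampled_frame_indices(total_frames=3000, source_fps=25.0)
print(len(indices))
```

Under these assumptions, a clip of about 2 minutes yields roughly 120 s × 4 = 480 extracted frames.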

Interactive annotation

To keep annotation quality high and annotation time within an affordable range, we used the interactive annotation tool EISeg 17 to label frame images with high accuracy. In addition, EISeg embeds segmentation algorithms at both the coarse and fine granularity levels, which facilitates the annotation procedure. As shown in Fig. 3, this method generates an annotation mask, and the polygon vertices of the mask can then be adjusted conveniently to further improve the accuracy.

Figure 3

Illustration of annotated images. The annotated items are distinguished by different colors, and the edge annotation points are connected into a semantic segmentation outline.

Data generation

As shown in Fig. 4, to make the marked props clearer and more distinct, we have made corresponding annotation examples for the content interacting with the props. The dataset we provide also has relevant labels for obvious characters and scenes, which can provide research references. Specifically, using the previously selected label, the props are manually marked and the edges are corrected. Each segmented shot is annotated with contours following the format used in the COCO dataset. Once the contours are finalized, the annotations and corresponding labels are stored in JSON format. The default save path is the new label folder in the dataset folder, where the JSON files of all marked points are stored as well. Because a mask can be produced by clicking any part of a prop rather than by tracing edge points, the method proposed by Benenson et al. 18 was used to generate the mask. To ensure the quality of annotation, we have also established a set of annotation checking specifications for the segmentation of film and television elements and props, making our dataset 16 more reliable.

Figure 4

Semantic segmentation dataset annotation of props in film clips. The props are Sword, Stick, Hook, Arrow, Knife and Gun. From left to right, they are Original, Colour mask and Foreground. The colour mask unifies the contents of the removed labels into a blue background; Foreground realizes the visualization of annotation types.
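As a rough illustration of contour annotations stored in the COCO format, the sketch below builds one COCO-style annotation record from a polygon and serializes it to JSON. The field names follow the public COCO object-detection format; the category-to-ID mapping, the IDs, and the polygon coordinates are hypothetical, and the exact schema of the released JSON files may differ.

```python
import json

# Hypothetical mapping: the actual label IDs are defined in the dataset's
# own label files and are not given in the text.
CATEGORY_IDS = {"Gun": 1, "Sword": 2, "Stick": 3, "Knife": 4, "Hook": 5, "Arrow": 6}


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_bbox(points):
    """Bounding box as [x, y, width, height], the COCO convention."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


def make_annotation(ann_id, image_id, category, points):
    """Build one COCO-style annotation dict from a list of (x, y) vertices."""
    flat = [coord for point in points for coord in point]  # [x1, y1, x2, y2, ...]
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": CATEGORY_IDS[category],
        "segmentation": [flat],
        "area": polygon_area(points),
        "bbox": polygon_bbox(points),
        "iscrowd": 0,
    }


# Made-up rectangular contour standing in for a traced prop outline.
record = make_annotation(1, 1, "Sword", [(10, 10), (30, 10), (30, 20), (10, 20)])
print(json.dumps(record, indent=2))
```

Storing the polygon vertices rather than a raster mask keeps the files small and lets the contour be edited later, which matches the vertex-adjustment workflow described for the annotation tool.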

Annotation checking specifications

The semantic segmentation types of martial arts film props include Knife, Sword, Gun, Stick, Hook, and Arrow, as shown in Table 1. For the actual annotation process, we have established a set of standards for checking the annotations of the props semantic segmentation dataset, as shown in Fig. 5. The audit team consists of primary auditors (3 persons), senior auditors (3 persons), and senior managers (2 persons), who audit and correct the annotations. The advanced audit returns unqualified data and has them corrected according to the established rules, which can be stated as:

Figure 5

Pipeline of props annotation and corresponding checking specifications. ( a ) Procedure of props annotation. ( b ) Procedure of annotation inspection.

For the area of the image to be annotated, we define an indicator called Pixel Boundary Error (PBE), which can be formulated as \(\mathrm{PBE}=\frac{a\cap b}{a\cup b}\), where \(a\cap b\) is the area of overlap between the actual props to be labeled and the presumed (labeled) props, and \(a\cup b\) is the combined area covered by the currently labeled props and the actual props. Specifically, we require a PBE of not more than 0.75 for stationary objects and less than 0.85 for moving objects. Examples of unqualified annotations can be seen in Table 2.
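The ratio in the PBE definition can be computed directly from two binary masks. The following is a minimal sketch, assuming masks are given as equally sized nested lists of 0/1 values; the function name `overlap_ratio` and the toy masks are illustrative and not part of the dataset tooling. It computes only the ratio of overlap area to combined area and does not apply the acceptance thresholds.

```python
def overlap_ratio(mask_a, mask_b):
    """Ratio of overlapping area to combined area of two binary masks.

    mask_a: reference (actual) prop mask, mask_b: labeled prop mask.
    Both are nested lists of 0/1 values with identical shape.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        if len(row_a) != len(row_b):
            raise ValueError("masks must have the same number of columns")
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                intersection += 1
            if pa or pb:
                union += 1
    if union == 0:
        raise ValueError("both masks are empty, so the ratio is undefined")
    return intersection / union


# Toy 2 x 3 masks: 2 overlapping pixels out of 6 covered pixels.
actual = [[1, 1, 0],
          [1, 1, 0]]
labeled = [[0, 1, 1],
           [0, 1, 1]]
print(overlap_ratio(actual, labeled))
```

For production-sized frames (1920 × 1080), the same computation would normally be vectorized with an array library; the pure-Python loop is used here only to keep the sketch dependency-free.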

Data Records

This section summarizes the entire processing flow of our dataset. The dataset 16 is open to the public and provides the necessary means to use and organize the data. Researchers can register with ScienceDB to access the FTP download link to our dataset. For ScienceDB account authentication and registration procedures, see https://www.scidb.cn/en . The link to access our dataset is https://www.scidb.cn/en/anonymous/SlpaelFy .

Data selection

Film selection affects every subsequent step, so we chose Chinese martial arts film clips to build the dataset. These clips contain many fighting scenes in which the props interact extensively with people, and the chosen martial arts props cover China’s classic weapons. We first selected shots lasting about 2 minutes each in which no shading or blurring appears, which ensures the authenticity of the picture. From each selected shot, frame images were then extracted at a rate of 4 frames per second.

Data annotation

Film props occupy only a small proportion of the frame. Although semantic segmentation models have achieved high accuracy with the rapid development of deep learning, we invited experts with specialized technical backgrounds to take part in an early assessment of the project so that the quality of the dataset stays under control.

Data organization

The organized dataset is composed of several folders, each containing a specific sequence of data. For each image, three labeled images are generated: the pseudo-color image, the grayscale image, and the cutout image. The purpose of generating three labeled images is to make the segmented image easier to understand intuitively. The annotation points of the image are stored in a JSON file. As shown in Fig. 6, we give similar movie clips the same ID, such as “m00x”. “m00x_fen” is the folder of frame images obtained by sampling each shot at four frames per second. The same level also contains the segmentation specifications, described by “m00x_label.txt” (the mapping between label semantics and numerical values) and “m00x_details.xlsx” (details of the labels and related descriptions). The next level holds the dataset of each shot, named “m00x_fen00x_dataSet”, which contains the original frame used for annotation (for example, m00x_fen00038_00000001.jpg) and a label folder that stores the annotated data. For each original frame image there are three labeled images, as well as a JSON file named “annotations.json” with the annotation points.

Figure 6. Folder levels. The entire dataset is divided into four levels. The upper part is the description of the dataset, and the lower part is the actual name. The sections marked black are folders, and the others are files in the corresponding formats.
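The folder layout described above can be traversed without hard-coding its depth. The sketch below is an assumption-laden illustration: it relies only on the file name “annotations.json” and the “_dataSet” folder suffix mentioned in the text, the helper names are hypothetical, and the internal schema of the JSON file is not assumed.

```python
import json
from pathlib import Path


def find_annotation_files(root):
    """Yield (shot_folder_name, path) for each annotations.json under root."""
    for path in sorted(Path(root).rglob("annotations.json")):
        # The text places annotations.json in a label folder inside a
        # "*_dataSet" shot folder; walk up the parents to find that folder.
        shot = next(
            (p.name for p in path.parents if p.name.endswith("_dataSet")),
            None,
        )
        yield shot, path


def load_annotations(path):
    """Parse one annotations.json file; its schema is not assumed here."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A caller would iterate `find_annotation_files("path/to/dataset")`, where the path is a placeholder for wherever the download was unpacked, and pass each returned path to `load_annotations`.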

Technical Validation

For the technical verification of the dataset, a technical team of 5 experts conducted manual checks, sampling the labeled props at an interval of three images. Each expert conducted an independent visual inspection of all labels to ensure accuracy. Because the props in this study were annotated manually, a strict inspection procedure was also established. It takes into account the continuity and consistency between frames and sets a judgment standard for props that may appear blurred. Images with heavy motion blur are not considered, because they are not useful for the content likely to be studied.

During the technical inspection, we use the annotation tool EISeg to overlay the labeled image on the original image, which helps us avoid missing information during the inspection. In addition, we fixed the relevant parameter settings of the annotation tool before labeling, which reduces the variance of manual operation. The sampling standard is as follows: for motion-blurred or fast-moving annotation objects, the pixel error must be within 3 pixels; for ordinary annotation objects, such as props interacting with the scene or stationary props, the pixel error must be within 5 pixels. The EISeg parameters are set as follows: the segmentation threshold is 0.5, the label transparency is 0.75, and the visualization radius is 3.

For quality assurance, the five experts conduct two rounds of visual inspection of the labeled masks according to the above standards, and labeled images that do not meet the standards are re-labeled. Table 2 shows images that do not meet the standard together with the corresponding problems, the number of error pixels, and the items. Both rounds of expert inspection must meet the above criteria to ensure high-quality, usable final labels. Fig. 7(a),(b) shows the annotation errors in the first and second inspection rounds. It can be concluded that after two rounds of expert inspection, the annotation errors for all classes of props in our dataset fell from 2-4 mm to 1.5-3.5 mm, which supports the validity of the proposed criteria and the effectiveness of expert inspection in securing the labeling quality of our dataset.

Figure 7. Boxplots of annotation errors in all classes of props. (a) The first round of inspection by the expert group; (b) the second round of inspection by the expert group.

To demonstrate the validity of our proposed dataset, four classical semantic segmentation models, DeepLabv3+ 19, FCN 20, PSPNet 21, and SegFormer 22, were evaluated with four semantic segmentation metrics: aAcc, mIoU, mAcc and mDice. The definition of these metrics can be formulated as:

where TP, TN, FP, and FN stand for the true positive, true negative, false positive, and false negative, respectively. N stands for the total number of classes, and \(\omega_c\) stands for the total number of pixels in class c. It is worth noting that both aAcc and mAcc describe the average pixel classification performance of the model. However, the former does not take into account the difference in the number of pixels in different classes, while the latter uses the number of pixels in different classes to perform a weighted calculation. Moreover, both mIoU and mDice measure the average overlap between the model’s predictions and the ground-truth labels in each category, which reflects the classification accuracy of the model at the pixel level. The difference is that mDice is less sensitive to noise and boundaries because it focuses more on the overall overlap, whereas mIoU is more sensitive to the boundary region because the union in its denominator includes the wrongly predicted region.
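To make the four metrics concrete, the sketch below computes them from a confusion matrix using the unweighted per-class definitions commonly found in segmentation toolkits. This is an assumption: the paper's own formulas are not reproduced in this text and may weight classes by \(\omega_c\), so the numbers here illustrate the idea rather than the authors' exact computation. The function name and the example matrix are hypothetical.

```python
def segmentation_metrics(conf):
    """Compute aAcc, mAcc, mIoU and mDice from a confusion matrix.

    conf[i][j] is the number of pixels of true class i predicted as class j.
    Classes with an empty denominator are skipped when averaging.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(conf[c][c] for c in range(n))
    accs, ious, dices = [], [], []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # true c, predicted other
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true other
        if tp + fn > 0:
            accs.append(tp / (tp + fn))
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
            dices.append(2 * tp / (2 * tp + fp + fn))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return {
        "aAcc": correct / total,
        "mAcc": mean(accs),
        "mIoU": mean(ious),
        "mDice": mean(dices),
    }


# Hypothetical two-class case: a large background and a small prop class.
example = [[90, 10],
           [5, 5]]
print(segmentation_metrics(example))
```

In this example aAcc stays high because the background dominates the pixel count, while mIoU is pulled down by the small class, which is the pattern a dataset of small foreground props tends to produce.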

For the baseline models, Table 3 and Fig. 8 show that performance differs across models and metrics. It is worth noting that none of the baseline models was fine-tuned on our dataset; each was evaluated directly with model weights pre-trained on other datasets. For aAcc, the performance of every model exceeds 94%, showing strong foreground-background pixel classification ability. For mIoU and mDice, however, the performance of every model drops significantly, indicating that our dataset poses challenges in segmenting complex, diverse, and culturally specific foreground items. These challenges can stem from items blocking each other, small size, high visual similarity, and a lack of clear boundaries, especially when an item has low contrast with the background.

Figure 8. Semantic segmentation evaluation performance for four popular baseline models. The original images and corresponding ground truth are also given. Quantitative results show that SegFormer performs best.

In terms of advantages, our dataset provides detailed labeling of traditional Chinese martial arts props, which poses new challenges for developing refined high-precision small-target semantic segmentation models. In addition, our dataset shows a high degree of cultural relevance, which creates a basis for the development of future style-specific generation models and models that require the recognition of specific cultural objects.

Usage Notes

Our open data are released under the CC BY 4.0 license, and users of the dataset 16 should cite this article when using or referencing it. The license allows readers to distribute, remix, tweak and build upon the work, but not to use the dataset for commercial purposes. Researchers using this dataset are required to provide a link to the license and to indicate whether modifications have been made to the original work. We hope that the ChineseMPD dataset will reach more researchers and encourage more authors to publish their optimized code and models, which will contribute to the development of semantic segmentation research in the film and television industry.

Code availability

Our dataset 16 is publicly available in ScienceDB ( http://www.scidb.cn/en ) via the link https://www.scidb.cn/anonymous/SlpaelFy or https://doi.org/10.57760/sciencedb.07008 . The JSON file is in the label folder of each shot dataset, which also holds the visualized annotations produced after segmentation. To use the annotations, load the JSON file named “annotations.json” in each dataset directly. The code used for technical validation is publicly available, the dataset is accessible via its ScienceDB DOI, and the annotation software EISeg is open source.

Siriborvornratanakul, T. Downstream semantic segmentation model for low-level surface crack detection. Advances in Multimedia 2022 , 3712289 (2022).

Nilsson, D. Data-efficient learning of semantic segmentation. Lund University (2022).

Bressan, P. O. et al . Semantic segmentation with labeling uncertainty and class imbalance applied to vegetation mapping. International Journal of Applied Earth Observation and Geoinformation 108 , 102690 (2022).

Kittipongdaja, P. & Siriborvornratanakul, T. Automatic kidney segmentation using 2.5 d resunet and 2.5 d denseunet for malignant potential analysis in complex renal cyst based on ct images. EURASIP Journal on Image and Video Processing 2022 , 5 (2022).

Monasterio-Exposito, L., Pizarro, D. & Macias-Guarasa, J. Label augmentation to improve generalization of deep learning semantic segmentation of laparoscopic images. IEEE Access 10 , 37345–37359 (2022).

Abdigapporov, S., Miraliev, S., Kakani, V. & Kim, H. Joint multiclass object detection and semantic segmentation for autonomous driving. IEEE Access 11 , 37637–37649 (2023).

Dataset, C. Semantic understanding of urban street scenes. Germany: City Shapes (2016).

Everingham, M., Gool, L. V., Williams, C. K. I., Winn, J. M. & Zisserman, A. The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88 , 303–338, https://doi.org/10.1007/s11263-009-0275-4 (2010).

Lin, T. et al . Microsoft COCO: common objects in context. In Fleet, D. J., Pajdla, T., Schiele, B. & Tuytelaars, T. (eds.) Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V , vol. 8693 of Lecture Notes in Computer Science , 740–755, https://doi.org/10.1007/978-3-319-10602-1_48 (Springer, 2014).

Caba Heilbron, F., Escorcia, V., Ghanem, B. & Carlos Niebles, J. Activitynet: A large-scale video benchmark for human activity understanding. In Proceedings of the ieee conference on computer vision and pattern recognition , 961–970 (2015).

Tapaswi, M. et al . Movieqa: Understanding stories in movies through question-answering. In Proceedings of the IEEE conference on computer vision and pattern recognition , 4631–4640 (2016).

Huang, Q., Xiong, Y., Rao, A., Wang, J. & Lin, D. Movienet: A holistic dataset for movie understanding. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV 16 , 709–727 (Springer, 2020).

Pont-Tuset, J. et al . The 2017 davis challenge on video object segmentation. arXiv preprint arXiv:1704.00675 (2017).

Wei, D. et al . Youmvos: an actor-centric multi-shot video object segmentation dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 21044–21053 (2022).

Ding, H. et al . Mose: A new dataset for video object segmentation in complex scenes. In Proceedings of the IEEE/CVF International Conference on Computer Vision , 20224–20234 (2023).

Wang, Y. et al . Semantic segmentation dataset of Chinese martial arts classic movie props. ScienceDB https://doi.org/10.57760/sciencedb.07008 (2023).

Liu, Y. et al . Paddleseg: A high-efficient development toolkit for image segmentation 2101.06175 (2021).

Benenson, R., Popov, S. & Ferrari, V. Large-scale interactive object segmentation with human annotators. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , 11700–11709 (2019).

Chen, L., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Ferrari, V., Hebert, M., Sminchisescu, C. & Weiss, Y. (eds.) Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part VII , vol. 11211 of Lecture Notes in Computer Science , 833–851 https://doi.org/10.1007/978-3-030-01234-2_49 (Springer, 2018).

Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015 , 3431–3440 https://doi.org/10.1109/CVPR.2015.7298965 (IEEE Computer Society, 2015).

Zhao, H., Shi, J., Qi, X., Wang, X. & Jia, J. Pyramid scene parsing network. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017 , 6230–6239, https://doi.org/10.1109/CVPR.2017.660 (IEEE Computer Society, 2017).

Xie, E. et al . Segformer: Simple and efficient design for semantic segmentation with transformers. In Ranzato, M., Beygelzimer, A., Dauphin, Y. N., Liang, P. & Vaughan, J. W. (eds.) Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual , 12077–12090, https://proceedings.neurips.cc/paper/2021/hash/64f1f27bf1b4ec22924fd0acb550c235-Abstract.html (2021).

Acknowledgements

This research is supported in part by the Zhejiang Provincial Natural Science Foundation of China (No.LTGG24F030002); the National Natural Science Foundation of China (No. 62206242); the Public Welfare Technology Application Research Project of Zhejiang Province, China (No. LGF21F010001).

Author information

These authors contributed equally: Suiyu Zhang, Rong Wang.

Authors and Affiliations

College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China

Suiyu Zhang, Rong Wang, Yaqi Wang, Xiaoyu Ma & Dingguo Yu

Key Lab of Film and TV Media Technology of Zhejiang Province, Hangzhou, 310018, China

Suiyu Zhang & Rong Wang

Department of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, 264209, China

School of Biomedical Engineering, Shenzhen University, Shenzhen, Guangdong, China

Hongyuan Zhang

School of Automation, Hangzhou Dianzi University, Hangzhou, 310018, China

Contributions

Suiyu Zhang was responsible for collecting film and television data, organizing related materials, dataset building, and funding acquisition. Rong Wang was responsible for writing the original draft and drawing the graphics. Yaqi Wang was responsible for funding acquisition, supervision, and data curation. Xiaoyu Ma was responsible for the technical verification of the dataset. Chengyu Wu was responsible for manuscript revision and data visualization. Hongyuan Zhang was responsible for manuscript revision and language rewriting. Zhi Li was responsible for checking the grammar of the content. Dingguo Yu was responsible for supervision, project administration and funding acquisition.

Corresponding authors

Correspondence to Yaqi Wang or Dingguo Yu .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Zhang, S., Wang, R., Wang, Y. et al. ChineseMPD: A Semantic Segmentation Dataset of Chinese Martial Arts Classic Movie Props. Sci Data 11 , 882 (2024). https://doi.org/10.1038/s41597-024-03701-6

Received : 05 January 2023

Accepted : 30 July 2024

Published : 14 August 2024

DOI : https://doi.org/10.1038/s41597-024-03701-6

bioRxiv

Statistical estimation of sparsity and efficiency for molecular codes

Jun Hou Fung, Mathieu Carrière, Andrew J. Blumberg

A fundamental biological question is to understand how cell types and functions are determined by genomic and proteomic coding. A basic form of this question is to ask if small families of genes or proteins code for cell types. For example, it has been shown that the collection of homeodomain proteins can uniquely delineate all 118 neuron classes in the nematode C. elegans. However, unique characterization is neither robust nor rare. Our goal in this paper is to develop a rigorous methodology to characterize molecular codes. We show that in fact for information-theoretic reasons almost any sufficiently large collection of genes is able to disambiguate cell types, and that this property is not robust to noise. To quantify the discriminative properties of a molecular codebook in a more refined way, we develop new statistics - partition cardinality and partition entropy - borrowing ideas from coding theory. We prove these are robust to data perturbations, and then apply these in the C. elegans example and in cancer. In the worm, we show that the homeodomain transcription factor family is distinguished by coding for cell types sparsely and efficiently compared to a control of randomly selected family of genes. Furthermore, the resolution of cell type identities defined using molecular features increases as the worm embryo develops. In cancer, we perform a pan-cancer study where we use our statistics to quantify interpatient tumor heterogeneity and we identify the chromosome containing the HLA family as sparsely and efficiently coding for melanoma.

Competing Interest Statement

The authors have declared no competing interest.

Subject Area

  • Bioinformatics

Market Segmentation: Understanding It, Doing It, and Making It Useful

  • In book: Market Segmentation Analysis (pp.3-9)

Sara Dolnicar, The University of Queensland

Bettina Grün, Wirtschaftsuniversität Wien

Friedrich Leisch, University of Natural Resources and Life Sciences Vienna

Research on Market Segmentation of Tourism Highway Travelers Based on Travel Behavior

  • Conference paper
  • First Online: 14 August 2024

  • Wentao Liu 38 ,
  • Leqi Cui 39 ,
  • Xiqiao Zhang 39 &
  • Bing Tian 38  

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 1209))

Included in the following conference series:

  • International Conference on SmartRail, Traffic and Transportation Engineering

With the arrival of the era of mass tourism, tourism highways have become an important carrier for promoting the integrated development of transportation and tourism. Taking the Jilin section of the G331 highway as an example, this paper reports a questionnaire survey of 222 tourists. Using exploratory factor analysis, five common factors of tourist motivations were extracted: leisure and relaxation, return to nature, exploration and trial, socializing, and knowledge-seeking. Based on these factors, cluster analysis was conducted, yielding four main groups: self-improvement type, ecological nature type, novelty-seeking type, and dual-goal-seeking type. Different types of tourists were found to vary in the number of companions, residence, and transportation characteristics. These findings contribute to a deeper analysis of the tourist market for tourism highways and provide a theoretical reference for their practical management.

Author information

Authors and affiliations.

Jilin Provincial Institute of Transportation Science, Changchun, 130012, China

Wentao Liu & Bing Tian

Harbin Institute of Technology, Harbin, 150090, China

Leqi Cui & Xiqiao Zhang

Corresponding author

Correspondence to Xiqiao Zhang .

Editor information

Editors and affiliations.

Beijing Jiaotong University, Beijing, China

Department of Civil Engineering, Toronto Metropolitan University, Toronto, ON, Canada

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper.

Liu, W., Cui, L., Zhang, X., Tian, B. (2024). Research on Market Segmentation of Tourism Highway Travelers Based on Travel Behavior. In: Jia, L., Easa, S., Qin, Y. (eds) Developments and Applications in SmartRail, Traffic, and Transportation Engineering. ICSTTE 2023. Lecture Notes in Electrical Engineering, vol 1209. Springer, Singapore. https://doi.org/10.1007/978-981-97-3682-9_32

DOI : https://doi.org/10.1007/978-981-97-3682-9_32

Published : 14 August 2024

Publisher Name : Springer, Singapore

Print ISBN : 978-981-97-3681-2

Online ISBN : 978-981-97-3682-9

eBook Packages : Engineering Engineering (R0)

What Is Network Segmentation?

Leeron Hoory

Published: Aug 12, 2024, 7:15am

Table of Contents

  • How Network Segmentation Works
  • The Main Benefits of Network Segmentation
  • Network Segmentation vs. Microsegmentation
  • Network Segmentation vs. Internal Segmentation
  • Network Segmentation and Zero Trust
  • Bottom Line
  • Frequently Asked Questions (FAQs)

Network segmentation refers to the process of partitioning a network into subnetworks. There are several reasons a company or organization may want to segment its network, including to improve security and performance. Read on to learn about the main benefits of network segmentation and why it’s important.

These days, networks are both physical and in the cloud. With so many possible ways to connect to a network, such as BYOD (bring your own device) and IoT (Internet of Things), network segmentation is an important way to establish who can access which part of a network and to organize traffic flows.

Today, most segmentation is done through software-defined networking (SDN), an approach that allows flexible network configuration and meets the needs of cloud computing better than traditional physical networks do.

The main benefits of network segmentation are to improve network security and performance. Improving network security will protect a company in the long term, while improving performance can both increase business value and improve user experience.

Network Segmentation Improves Security

Network segmentation is an important part of network security. “Companies typically segment their networks based on the sensitivity and security requirements of their systems,” says Josh Amishav-Zlatin, founder and CEO of the data breach monitoring company Breachsense.

Segmentation gives control over who can access which segments of the network. It is also important because it means that if a network breach occurs, the hacker would not have access to the whole network but only the network segment. “Networks without segmentation have a larger attack surface, enabling attackers to exploit vulnerabilities anywhere on the network to escalate privileges and gain unauthorized access to critical systems or sensitive data,” Amishav-Zlatin says.

“For example, systems that touch PII, customer records or sensitive financial or personal details should all be on their own subnet. In the event of a security incident, this will help contain the attack,” says Amishav-Zlatin.

A flat network also makes it difficult to recover from the damage of an attack. “In the event of an attack, the lack of segmentation will hinder incident response efforts to isolate infected systems and prevent malicious users from pivoting to other machines on the network,” Amishav-Zlatin says.
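The idea of placing sensitive systems on their own subnet can be sketched with Python's standard `ipaddress` module. This is an illustrative sketch only: the address range, the segment names and the helper functions `carve_subnets` and `segment_of` are assumptions made for the example, and a real deployment would enforce the boundaries with firewall or SDN policy rather than with a lookup table.

```python
import ipaddress


def carve_subnets(network, names, new_prefix):
    """Assign one equally sized subnet of `network` to each named segment."""
    net = ipaddress.ip_network(network)
    subnets = net.subnets(new_prefix=new_prefix)
    plan = {}
    for name in names:
        try:
            plan[name] = next(subnets)
        except StopIteration:
            raise ValueError("not enough subnets for all segments") from None
    return plan


def segment_of(plan, address):
    """Return the name of the segment that contains `address`, if any."""
    addr = ipaddress.ip_address(address)
    for name, subnet in plan.items():
        if addr in subnet:
            return name
    return None


# Hypothetical plan: split a private /16 into /24 segments by sensitivity.
plan = carve_subnets("10.0.0.0/16", ["pii", "finance", "hr", "guest"], 24)
for name, subnet in plan.items():
    print(name, subnet)
print(segment_of(plan, "10.0.1.37"))
```

Keeping the plan as data makes it easy to check which segment a host belongs to before deciding what it may talk to, which is the containment property described above.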

Network Segmentation Improves Network Performance and Bandwidth

“Segmenting a network can help a company improve its bandwidth and performance,” says Dr. Chris Mattmann, Chief Technology and Innovation Officer (CTIO) at NASA Jet Propulsion Laboratory. Different employees of a company require different bandwidth and internet speeds.

For example, the HR team may not need the latest and fastest internet connection, while the data science team may require more bandwidth to run tests. Additionally, a company might want to ensure that the essential services of a network receive the fastest speeds and highest bandwidth. The process of partitioning a network by requirement and priority is a form of segmentation.

“You can’t do segmentation and partitioning without network analytics, which is understanding how people are using them, what the business requirements are for access and more,” Mattmann says. Insight into how networks are being used now and how they need to be used in the future for business purposes will help an organization make informed choices about network segmentation.

For example, a particular project may be “such a critical portion of our value delivery stream in our business that it needs a dedicated superfast network. So we might partition it for business reasons,” Mattmann says.

Because segmentation affects speed and performance, it matters not only for controlling access but also for delivering business value.

“Microsegmentation” takes the concept of network segmentation one step further. While network segmentation focuses on dividing the network into multiple smaller parts, microsegmentation puts each application in its own zone.

“The advantage of the latter approach is that it gives admins the ability to enforce access controls on the application layer, monitor communication patterns and detect and respond to security threats more effectively,” says Amishav-Zlatin.

Internal segmentation is a subset of network segmentation. “Network segmentation refers to dividing the entire network into subnets, while internal segmentation refers to dividing an organization’s internal network into smaller segments,” Amishav-Zlatin says.

Zero trust is an approach to cybersecurity based on the idea that no user or asset should be trusted by default. One of the concepts of zero trust is the principle of least privilege, which means users should only have access to parts of the network they need and nothing more. This is important from a cybersecurity perspective because granting all users access to all parts of the network means that an internal threat actor would be able to harm the entire network, and an external hacker would be able to attack a larger surface area once inside the network.

Network segmentation can help establish a zero-trust network, as segmentation is a necessary part of limiting user access. To limit the reach of an attack, for example, important information such as credit card numbers and personally identifiable information (PII) should be kept on its own subnetworks. Amishav-Zlatin also recommends using “microsegmentation to enforce zero trust and the least-privilege policy for even finer control on how applications are allowed to interact with each other.”
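Least privilege between application zones is often modeled as a default-deny allow list: a flow is permitted only if it has been explicitly approved. The sketch below illustrates that logic; the zone names, ports and the `is_allowed` helper are hypothetical and do not describe any particular firewall or microsegmentation product.

```python
# Hypothetical zones and rules: the application names and ports are
# illustrative assumptions, not a policy from the article.
ALLOWED_FLOWS = {
    ("web-frontend", "orders-api", 443),
    ("orders-api", "payments-db", 5432),
}


def is_allowed(source, destination, port, rules=ALLOWED_FLOWS):
    """Default deny: permit a flow only if it is explicitly listed."""
    return (source, destination, port) in rules


# The front end may call the API, but it may not reach the database directly.
print(is_allowed("web-frontend", "orders-api", 443))
print(is_allowed("web-frontend", "payments-db", 5432))
```

Under this model, a compromised front end cannot reach the payments database directly, because no rule grants it that path.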

Network segmentation divides a computer network into subnetworks. The main reasons for network segmentation are enhanced performance and improved security. Every business should implement network segmentation, as it’s an important way to establish network security, limit the risk and potential damage of a cyberattack and improve the overall network performance.

What is an example of network segmentation?

A financial company, for example, has a lot of sensitive information that it doesn’t want everyone in the company to have access to. Network segmentation can ensure that only the people who have privileged access can see sensitive documents, while other employees cannot.

What is the purpose of network segmentation?

The purpose of network segmentation is to increase security and improve performance. When a network has segments, the attack surface is broken into parts, which limits the access an internal or external threat actor can have. Network performance can also improve through network segmentation by controlling which traffic gets directed to which parts of the network.

Who needs network segmentation?

Every company or organization can benefit from network segmentation. Implementing some form of network segmentation will help prevent both internal and external security breaches.


Leeron is a New York-based writer with experience covering technology and politics. Her work has appeared in publications such as Quartz, the Village Voice, Gothamist, and Slate.


Title: Ensemble architecture in polyp segmentation

Abstract: In this research, we revisit the architecture of semantic segmentation and evaluate the models excelling in polyp segmentation. We introduce an integrated framework that harnesses the advantages of different models to attain an optimal outcome. More specifically, we fuse the learned features from convolutional and transformer models for prediction, and we view this approach as an ensemble technique to enhance model performance. Our experiments on polyp segmentation reveal that the proposed architecture surpasses other top models, exhibiting improved learning capacity and resilience. The code is available at this https URL .
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

COMMENTS

  1. A Comprehensive Review of Image Segmentation Techniques

    The majority of image segmentation algorithms may be divided into three techniques: boundary-based segmentation, region-based segmentation, and hybrid-based segmentation [1 3]. The first ...

  2. Image Segmentation and its Different Techniques: An In-Depth Analysis

    Image segmentation is the most important part of image processing, separating an image into multiple meaningful parts. In this area of the image segmentation, new technologies emerge day after day. In this paper, an in-depth analysis is carried out on some frequently adopted image segmentation techniques such as thresholding based techniques, edge detection based or boundary based techniques ...

  3. B2B market segmentation: A systematic review and research agenda

    In addition, a non-parametric analysis was conducted to assess the relationship between article focus and CABS journal ranking. Based on Spearman's correlation, research on pure B2B market segmentation is significantly less likely (r: −0.270, n = 88, p < 0.05) to appear in top-tier publication outlets (i.e., JM, JMR, MS, and JAMS) than was research focusing on both B2B and B2C settings.

  4. Techniques and Challenges of Image Segmentation: A Review

    Image segmentation, which has become a research hotspot in the field of image processing and computer vision, refers to the process of dividing an image into meaningful and non-overlapping regions, and it is an essential step in natural scene understanding. Despite decades of effort and many achievements, there are still challenges in feature extraction and model design. In this paper, we ...

  5. Image segmentation Techniques and its application

    There are several techniques of image segmentation like thresholding method, region based method, edge based method, clustering methods and the watershed method etc. In this paper we will see ...

  6. (PDF) Market Segmentation Analysis: Understanding It, Doing It, and

    There are two types of segmentation commonly used in tourism: a priori geographical segmentation and a posteriori behavioral segmentation (Dolnicar and Leisch, 2005). ... This paper explores the ...

  7. Image Segmentation Techniques: Statistical, Comprehensive, Semi

    Segmentation has been a rooted area of research having diverse dimensions. The roots of image segmentation and its associated techniques have supported computer vision, pattern recognition, image processing, and it holds variegated applications in crucial domains. To compile the vast literature on machine learning and deep learning-based segmentation techniques and proffer statistical ...

  8. Image Segmentation Techniques Overview

    The technology of image segmentation is widely used in medical image processing, face recognition pedestrian detection, etc. The current image segmentation techniques include region-based segmentation, edge detection segmentation, segmentation based on clustering, segmentation based on weakly-supervised learning in CNN, etc. This paper analyzes and summarizes these algorithms of image ...

  9. [2001.05566] Image Segmentation Using Deep Learning: A Survey

    Image segmentation is a key topic in image processing and computer vision with applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among many others. Various algorithms for image segmentation have been developed in the literature. Recently, due to the success of deep learning models in a wide range ...

  10. Semantic Image Segmentation: Two Decades of Research

    Semantic image segmentation (SiS) plays a fundamental role in a broad variety of computer vision applications, providing key information for the global understanding of an image. This survey is an effort to summarize two decades of research in the field of SiS, where we propose a literature review of solutions starting from early historical methods followed by an overview of more recent deep ...

  11. Survey on Image Segmentation Techniques

    Layer-Based Segmentation Methods. Layered model: for object detection and image segmentation that composites the output of a bank of object detectors in order to define shape masks and explain the appearance, depth ordering, and that evaluates both class and instance segmentation [10, 21]. This type is not discussed in this paper.

  12. [2301.07499] A Comprehensive Review of Modern Object Segmentation

    Image segmentation is the task of associating pixels in an image with their respective object class labels. It has a wide range of applications in many industries including healthcare, transportation, robotics, fashion, home improvement, and tourism. Many deep learning-based approaches have been developed for image-level object recognition and pixel-level scene understanding-with the latter ...

  13. Semantic Segmentation

    Semantic Segmentation is a computer vision task in which the goal is to categorize each pixel in an image into a class or object. The goal is to produce a dense pixel-wise segmentation map of an image, where each pixel is assigned to a specific class or object. Some example benchmarks for this task are Cityscapes, PASCAL ...

  14. Image segmentation evaluation: a survey of methods

    Image segmentation is a prerequisite for image processing. There are many methods for image segmentation, and as a result, a great number of methods for evaluating segmentation results have also been proposed. How to effectively evaluate the quality of image segmentation is very important. In this paper, the existing image segmentation quality evaluation methods are summarized, mainly ...

  15. The basis of market segmentation: a critical review of literature

    4. A segmentation base is identified as the distinguished base and a model is developed for predicting this base from other (possibly external) variables. The present paper highlights the definition and major basis of market segmentation. This research paper is broadly divided in to four parts.

  16. A Survey of Brain Tumor Segmentation and Classification Algorithms

    In addition, as can be shown in Figure 1 and Figure 2, deep learning-based brain tumor segmentation and classification techniques are becoming the most active research area. In this paper, a comprehensive survey on region growing, shallow machine learning, and deep learning-based brain tumor segmentation and classification methods are presented.

  17. A review on customer segmentation methods for personalized ...

    Based on this paper corpus, we identified a four-phase process consisting of information (data) collection, customer representation, customer analysis via segmentation and customer targeting. With respect to customer representation and customer analysis by segmentation, we provide a comprehensive overview of the methods used in these process steps.

  18. (PDF) Market Segmentation, Targeting and Positioning

    In sum, this chapter explains the three stages of target marketing, including (i) market segmentation, (ii) market targeting and (iii) market positioning.

  19. The impact of market segmentation and social marketing on uptake of

    This type of segmentation provides a deeper understanding of the desires, needs, and decision-making considerations of a potential user of a product or service. Applied correctly, it could enhance efficacy of public health initiatives, ensure new products reach the people most likely to need and use them, and increase the likelihood of lasting ...

  20. Image Segmentation Using Deep Learning: A Survey

    Image Segmentation Using Deep Learning: A Survey. Plaza, Nasser Kehtarnavaz, and Demetri Terzopoulos. Abstract: Image segmentation is a key topic in image processing and computer vision with applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among ...

  21. ChineseMPD: A Semantic Segmentation Dataset of Chinese Martial ...

    Semantic segmentation and scene understanding have become important research directions in the field of computer vision 1,2,3,4,5,6, with applications covering a wide range of areas including ...

  22. (PDF) Approaches to Customer Segmentation

    07054 ; Phone: (973) 658 1719; Fax: (973) 658 1701; Email: [email protected]. Bruce Cooil acknowledges support from the Dean's Fund for Faculty Research, Owen Graduate School, Vanderbilt ...

  23. Recent progress in semantic image segmentation

    Semantic image segmentation, which has become one of the key applications in the image processing and computer vision domain, has been used in multiple domains such as medical area and intelligent transportation. Lots of benchmark datasets are released for researchers to verify their algorithms. Semantic segmentation has been studied for many years.

  24. Terrain‐aware path planning via semantic segmentation and uncertainty

    The Journal of Field Robotics is an applied robotics journal publishing theoretical and practical papers on robotics used in real-world applications. Abstract In ground mobile robots, effective path planning relies on their ability to assess the types and conditions of the surrounding terrains. ... This research was supported by the National ...

  25. MFFAE-Net: semantic segmentation of point clouds using multi ...

    In the field of point cloud segmentation research, deep learning methods have been highly favored by researchers. There are mainly three methods: projection-based, voxel-based, and point-based methods, each with its own advantages and disadvantages. The following is a detailed introduction to these methods. 2.1 Projection-based method

  26. Statistical estimation of sparsity and efficiency for ...

    A fundamental biological question is to understand how cell types and functions are determined by genomic and proteomic coding. A basic form of this question is to ask if small families of genes or proteins code for cell types. For example, it has been shown that the collection of homeodomain proteins can uniquely delineate all 118 neuron classes in the nematode C. elegans. However, unique ...

  27. (PDF) Market Segmentation: Understanding It, Doing It ...

    In 2005, Christensen, Cook and Hall, in the Harvard Business Review, found that of 30,000 new products launched in the US, 85% failed because of poor market segmentation. Yankelovich's paper in ...

  28. Research on Market Segmentation of Tourism Highway Travelers ...

    3.1 Basic Sample Characteristics Analysis. A descriptive statistical analysis of the survey sample was carried out. The results show that male tourists accounted for the majority of the total number of people, accounting for 61.50%; the majority of tourists were middle-aged and young people, with 49.50% of the tourists aged 18-35, 32.50% of the tourists aged 36-60, and a lower percentage ...

  29. What Is Network Segmentation?

    Segmentation gives control over who can access which segments of the network. It is also important because it means that if a network breach occurs, the hacker would not have access to the whole ...

  30. [2408.07262] Ensemble architecture in polyp segmentation

    In this research, we revisit the architecture of semantic segmentation and evaluate the models excelling in polyp segmentation. We introduce an integrated framework that harnesses the advantages of different models to attain an optimal outcome. More specifically, we fuse the learned features from convolutional and transformer models for prediction, and we view this approach as an ensemble ...