{"id":4594,"date":"2025-12-15T20:51:32","date_gmt":"2025-12-15T12:51:32","guid":{"rendered":"https:\/\/www.15zhi.net\/blog\/?p=4594"},"modified":"2025-12-15T20:53:24","modified_gmt":"2025-12-15T12:53:24","slug":"202512-%e8%ae%ba%e6%96%87%e9%98%85%e8%af%bb-zone-yolo-vision-language-object-detection-using-zone-prompt","status":"publish","type":"post","link":"https:\/\/www.15zhi.net\/blog\/202512-%e8%ae%ba%e6%96%87%e9%98%85%e8%af%bb-zone-yolo-vision-language-object-detection-using-zone-prompt\/","title":{"rendered":"202512 Paper Reading - Zone-YOLO: Vision-Language Object Detection Using Zone Prompt"},"content":{"rendered":"\n<p>Source: IEEE Transactions on Intelligent Transportation Systems 2025<\/p>\n\n\n\n<p>Affiliation: Tongji University<\/p>\n\n\n\n<p>Authors: Jiaxiong Yang, Ning Jia, Xianhui Liu, Rui Fan, Yougang Sun, Weidong Zhao<\/p>\n\n\n\n<p><strong>I. Background<\/strong><\/p>\n\n\n\n<p>Existing real-time object detectors rely mainly on purely visual features and lack high-level semantic support. Vision-language object detection (VLOD) methods improve classification performance by bringing in vision-language models such as CLIP, but two major shortcomings remain:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Underused text features:<\/strong> Most VLOD methods use text features mainly for contrastive learning in the classification task and have not fully explored their effect on the regression process (that is, object localization).<\/li>\n\n\n\n<li><strong>Insufficient multimodal fusion:<\/strong> 
Existing multimodal fusion methods do not fuse text features with multi-scale image features at the corresponding scale, which weakens the model's representational capacity and can lead to concept confusion.<\/li>\n<\/ol>\n\n\n\n<p>To address these problems, the paper proposes Zone-YOLO, a YOLO detector built on a vision-language model.<\/p>\n\n\n\n<p><strong>II. Contributions<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Scale-Aware Modality Fusion (SAMF):<\/strong> A dual-stream multimodal fusion method that uses a Scale-Aware Query to align image and text features at the corresponding scale, giving coarse-to-fine feature enhancement and addressing concept confusion in multi-scale fusion.<\/li>\n\n\n\n<li><strong>Zone Prompt Learning:<\/strong> In a pioneering step, text features are introduced into the regression head. With class-agnostic zone prompts, an adapter, and a zone head, the method captures \u201czone-class-entity\u201d triplet co-occurrence information and markedly improves localization accuracy.<\/li>\n<\/ol>\n\n\n\n<p><strong>III. Method<\/strong><\/p>\n\n\n\n<p>Zone-YOLO is built on the YOLOv8 image encoder and the CLIP text encoder, and uses a scale-aware VL 
neck (VL-Neck) and a zone head (Zone Head) as its two key components for feature enhancement and localization guidance.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"409\" src=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-9-1024x409.png\" alt=\"\" class=\"wp-image-4595\" srcset=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-9-1024x409.png 1024w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-9-300x120.png 300w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-9-768x307.png 768w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-9.png 1292w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Scale-Aware Modality Fusion (SAMF)<\/li>\n<\/ol>\n\n\n\n<p>The SAMF module sits in the network neck, where it precisely aligns and fuses multi-scale image features with text features:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Core mechanism: A scale-aware query (<strong><em>SQ<\/em><\/strong>) is introduced as a mask. It guides the modality mixing matrix (<strong><em>MI<sub>SAMF<\/sub><\/em><\/strong>) to extract the text semantics most relevant to the image features at the current scale, which suppresses concept confusion across scales.<\/li>\n\n\n\n<li>Feature enhancement: <strong><em>MI<sub>SAMF<\/sub><\/em><\/strong> 
is used for bidirectional enhancement (channel enhancement and modality enhancement), refining the multimodal features from coarse-grained to fine-grained.<\/li>\n<\/ul>\n\n\n\n<p>2. Zone Prompt Learning<\/p>\n\n\n\n<p>This is the main contribution of Zone-YOLO: it brings positional information from text into the bounding-box regression task of object detection to improve localization accuracy.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Zone prompts: Nine fixed, class-agnostic position nouns (such as \u201ccenter\u201d and \u201ctop left\u201d) serve as zone prompts, which avoids referential ambiguity.<\/li>\n\n\n\n<li>Adapter: Learns the \u201czone-class co-occurrence\u201d relation between the zone prompts and the class word embeddings, and generates class-specific zone embeddings (<strong><em>Z<\/em><\/strong>).<\/li>\n\n\n\n<li>Zone head: Fuses <strong><em>Z<\/em><\/strong> with the image features (entity information) to capture \u201czone-class-entity triplet co-occurrence\u201d knowledge, refines the result with self-attention, and then guides the regression branch in bounding-box prediction.<\/li>\n\n\n\n<li>Auxiliary branch: Uses a self-supervised MSE loss to keep the zone embeddings stable during feature transformation.<\/li>\n<\/ul>\n\n\n\n<p>With Zone 
Prompt, the model uses text not only for classification contrast but also as a source of positional prior knowledge, which addresses the limited localization guidance in existing VLOD methods.<\/p>\n\n\n\n<p><strong>IV. Experiments<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Datasets<\/li>\n<\/ol>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>COCO:<\/strong> A general-purpose object detection benchmark.<\/li>\n\n\n\n<li><strong>BDD100K:<\/strong> Autonomous-driving scenes with complex weather and lighting.<\/li>\n\n\n\n<li><strong>VisDrone2019:<\/strong> Drone-view imagery with many dense, small objects.<\/li>\n\n\n\n<li><strong>LVIS:<\/strong> A long-tailed dataset used to test generalization.<\/li>\n<\/ul>\n\n\n\n<p>2. 
Performance<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>COCO benchmark:\n<ul class=\"wp-block-list\">\n<li>Zone-YOLO-L reaches 55.1 AP.<\/li>\n\n\n\n<li>Compared with YOLO-World and YOLOv9\/v10, it improves markedly on <strong><em>AP<sub>75<\/sub><\/em><\/strong> (high-precision localization) and <strong><em>AP<sub>L<\/sub><\/em><\/strong> (large objects), which supports the effectiveness of zone prompts for the regression task.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"696\" src=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-10-1024x696.png\" alt=\"\" class=\"wp-image-4597\" srcset=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-10-1024x696.png 1024w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-10-300x204.png 300w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-10-768x522.png 768w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-10.png 1100w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Traffic scenes (BDD100K &amp; VisDrone):\n<ul class=\"wp-block-list\">\n<li>On VisDrone, <strong><em>AP<sub>50<\/sub><\/em><\/strong> is 2.0 higher than the runner-up, a substantial improvement in detecting dense, small objects.<\/li>\n\n\n\n<li>On BDD100K, <strong><em>AP<sub>L<\/sub><\/em><\/strong> is strong, and the model is more robust in low light and against complex backgrounds.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"264\" 
src=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-11-1024x264.png\" alt=\"\" class=\"wp-image-4598\" srcset=\"https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-11-1024x264.png 1024w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-11-300x77.png 300w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-11-768x198.png 768w, https:\/\/www.15zhi.net\/blog\/wp-content\/uploads\/2025\/12\/image-11.png 1070w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><strong>V. Conclusion<\/strong><\/p>\n\n\n\n<p>Zone-YOLO extends the strengths of vision-language models to the regression task of object detection. SAMF provides fine-grained multimodal feature alignment, and the Zone Prompt mechanism uses \u201czone-class-entity\u201d co-occurrence information to markedly strengthen the model's localization ability. The model shows strong performance and robustness in complex traffic scenes and offers a solid base model for the perception module of intelligent transportation systems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Source: IEEE Transactions on Intelligent Transportation Syst 
[&hellip;]<\/p>\n","protected":false},"author":64,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4594","post","type-post","status-publish","format-standard","hentry","category-events"],"_links":{"self":[{"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/posts\/4594","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/users\/64"}],"replies":[{"embeddable":true,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/comments?post=4594"}],"version-history":[{"count":4,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/posts\/4594\/revisions"}],"predecessor-version":[{"id":4601,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/posts\/4594\/revisions\/4601"}],"wp:attachment":[{"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/media?parent=4594"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/categories?post=4594"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.15zhi.net\/blog\/wp-json\/wp\/v2\/tags?post=4594"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}