Sunday, February 28, 2021

Object Detection (#1/5): The MS COCO Dataset Format




When working on AI image tasks, you will often come across the MS COCO dataset.

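The COCO image dataset provides three kinds of annotation files: object instances (for object detection), person keypoints (for pose estimation), and image captions. Each annotation type has its own JSON file, and the annotation files are already split into a training set and a validation set.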


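import json

# Path to a COCO "object instances" annotation file.
# Assumed location; adjust it to wherever you extracted the annotations.
annotation_file = "annotations/instances_val2017.json"

with open(annotation_file, "r") as f:
    data = json.load(f)

# The three top-level lists used in this post
annotations = data["annotations"]
images = data["images"]
categories = data["categories"]

print(f"Number of annotations: {len(annotations)}")
print(f"Number of images: {len(images)}")
print(f"Number of categories: {len(categories)}")

The COCO dataset has been downloaded and extracted successfully.
Number of annotations: 36781
Number of images: 5000
Number of categories: 80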

images[60] => the metadata of the image at index 60

{'license': 1,
 'file_name': '000000360661.jpg',
 'coco_url': 'http://images.cocodataset.org/val2017/000000360661.jpg',
 'height': 480,
 'width': 640,
 'date_captured': '2013-11-18 21:33:43',
 'flickr_url': 'http://farm4.staticflickr.com/3670/9709793032_f9ee4f0aa2_z.jpg',
 'id': 360661}

annotations[60] => the annotation at index 60 (note that it belongs to a different image: its image_id is 525083, not 360661)

{'segmentation': [[267.51,
   222.31,
   292.15,
   222.31,
   291.05,
   237.02,
   265.67,
   237.02]],
 'area': 367.89710000000014,
 'iscrowd': 0,
 'image_id': 525083,
 'bbox': [265.67, 222.31, 26.48, 14.71],
 'category_id': 72,
 'id': 34096}


annotation{
    "id": int,    
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}


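Each image has an image_id, and an image may contain more than one single object or crowd of objects. Every object, whether a single object or a crowd, is described by one annotation. An image can therefore have multiple annotations, one per object.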
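Because the link between an image and its objects is just the image_id field, a common first step is to group the annotations by image. The following is a minimal sketch of that idea; the function name group_annotations_by_image and the tiny hand-made sample list are assumptions for illustration, not part of the COCO tools.

```python
from collections import defaultdict


def group_annotations_by_image(annotations):
    """Map each image_id to the list of annotations that reference it."""
    by_image = defaultdict(list)
    for ann in annotations:
        by_image[ann["image_id"]].append(ann)
    return dict(by_image)


# Hypothetical, hand-made sample: two objects on image 10, one crowd on image 11
sample_annotations = [
    {"id": 1, "image_id": 10, "category_id": 1, "iscrowd": 0},
    {"id": 2, "image_id": 10, "category_id": 72, "iscrowd": 0},
    {"id": 3, "image_id": 11, "category_id": 1, "iscrowd": 1},
]

grouped = group_annotations_by_image(sample_annotations)
for image_id, anns in grouped.items():
    print(image_id, [a["id"] for a in anns])
```

With the real file, passing data["annotations"] to the same function should give the per-image object lists.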







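annotation{
 "id": int, ==> id of this annotation (object)
 "image_id": int, ==> id of the image it belongs to
 "category_id": int, ==> category id of this object
 "segmentation": RLE or [polygon], ==> region of the single object or of the crowd
 "area": float, ==> number of pixels in the object's region
 "bbox": [x,y,width,height], ==> bounding box: top-left corner (x, y) plus width and height, in pixels
 "iscrowd": 0 or 1, ==> 0: a single object, 1: a crowd of objects (e.g. an audience)
}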
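Many detection libraries expect corner coordinates rather than COCO's [x, y, width, height], where x and y are the top-left corner. The sketch below converts the bbox of the annotations[60] sample shown earlier; the helper name xywh_to_xyxy is an assumption made for this example.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# bbox taken from the annotations[60] sample in this post
sample_bbox = [265.67, 222.31, 26.48, 14.71]
x_min, y_min, x_max, y_max = xywh_to_xyxy(sample_bbox)
print(f"top-left=({x_min:.2f}, {y_min:.2f}) bottom-right=({x_max:.2f}, {y_max:.2f})")
```

The resulting corners line up with the smallest and largest x and y values in that annotation's segmentation polygon, which is a quick sanity check on the convention.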


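As for "segmentation": for a single object, it is a list of polygons, each given as a flat list of vertex coordinates [x1, y1, x2, y2, ...] that outlines the object's region.
For a crowd of objects, such as a pile of apples, the region is described with a mask (stored as RLE) instead, as shown in the figure below.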
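To make the flat polygon layout concrete, the sketch below pairs up the coordinates of the annotations[60] polygon shown earlier and computes its area with the shoelace formula. The function name polygon_area is an assumption for this example; the result can be compared with the 'area' field of that annotation (about 367.897).

```python
def polygon_area(flat_coords):
    """Area of a simple polygon given as [x1, y1, x2, y2, ...] (shoelace formula)."""
    # Pair up the flat list into (x, y) vertices
    points = list(zip(flat_coords[0::2], flat_coords[1::2]))
    twice_area = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Polygon taken from the annotations[60] sample in this post
sample_polygon = [267.51, 222.31, 292.15, 222.31, 291.05, 237.02, 265.67, 237.02]
sample_area = polygon_area(sample_polygon)
print(f"polygon area: {sample_area:.4f}")
```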






For a crowd region, i.e. iscrowd=1, the segmentation content looks like this:

{'counts': [671, 10, 2, 2, 4, 22, 6, 31, 1, 11, 1, 10, 379, 16, 1, 25, 5, 55, 378, 43, 4, 55, 378, 44, 3, 55, 378, 44, 3, 55, 378, 44, 3, 55, 378, 44, 3, 55, 378, 44, 4, 54, 379, 29, 1, 16, 1, 54, 380, 28, 2, 15, 2, 53, 382, 23, 6, 15, 1, 8, 21, 1, 3, 3, 5, 12, 384, 20, 8, 16, 40, 12, 384, 16, 14, 15, 40, 10, 386, 10, 21, 14, 40, 8, 388, 8, 22, 15, 41, 3, 393, 3, 25, 15, 465, 15, 465, 15, 466, 14, 467, 13, 468, 12, 469, 10, 471, 8, 474, 3, 983, 7, 472, 9, 470, 11, 454, 6, 1, 20, 452, 28, 451, 30, 449, 31, 448, 33, 447, 34, 446, 35, 445, 35, 445, 35, 445, 35, 445, 36, 445, 36, 445, 35, 447, 33, 450, 30, 450, 30, 450, 12, 1, 17, 451, 10, 3, 16, 452, 8, 6, 13, 455, 4, 12, 8, 474, 3, 50865, 6, 459, 6, 8, 11, 454, 27, 452, 29, 450, 31, 448, 32, 448, 32, 448, 32, 448, 32, 448, 32, 448, 32, 448, 31, 450, 29, 452, 20, 2, 4, 456, 7, 1, 3, 1, 4, 7174, 6, 2, 6, 4, 14, 447, 34, 445, 36, 443, 38, 442, 38, 442, 38, 442, 38, 442, 38, 442, 38, 442, 38, 443, 36, 445, 34, 446, 6, 19, 6, 450, 3, 478, 1, 42714, 6, 473, 8, 471, 10, 469, 15, 465, 18, 462, 19, 461, 21, 459, 22, 458, 24, 456, 26, 455, 25, 456, 24, 458, 22, 461, 18, 463, 16, 466, 3, 5, 3, 3840, 7, 463, 20, 459, 22, 457, 27, 448, 33, 446, 35, 437, 44, 435, 46, 433, 47, 432, 48, 432, 48, 432, 48, 432, 48, 432, 30, 4, 14, 432, 29, 7, 12, 432, 29, 8, 10, 433, 29, 9, 8, 435, 13, 1, 1, 2, 10, 12, 3, 439, 12, 6, 7, 456, 10, 471, 3, 3, 2, 474, 1, 478, 2, 477, 3, 476, 4, 476, 9, 470, 11, 469, 12, 468, 13, 467, 14, 7, 1, 458, 23, 457, 23, 457, 23, 458, 22, 459, 21, 461, 19, 463, 18, 462, 19, 461, 20, 1, 9, 450, 33, 447, 34, 446, 36, 444, 37, 443, 38, 443, 38, 443, 37, 445, 35, 450, 7, 2, 21, 459, 21, 460, 20, 461, 19, 463, 3, 3, 10, 471, 8, 474, 3, 18209, 1, 479, 2, 478, 3, 477, 4, 476, 5, 475, 6, 474, 7, 474, 10, 471, 8, 474, 4, 5302, 7, 447, 4, 26, 4, 445, 6, 1, 15, 11, 3, 443, 29, 6, 3, 442, 32, 4, 2, 442, 38, 442, 38, 442, 38, 442, 38, 442, 38, 442, 37, 444, 16, 1, 18, 446, 8, 1, 3, 6, 3, 4, 8, 449, 3, 22, 4, 46076, 6, 468, 13, 
466, 15, 461, 20, 459, 21, 458, 22, 457, 23, 457, 23, 457, 23, 457, 23, 457, 22, 458, 21, 459, 20, 460, 20, 461, 19, 462, 18, 463, 17, 462, 18, 461, 19, 460, 19, 461, 19, 461, 19, 461, 19, 461, 19, 461, 19, 461, 19, 461, 18, 463, 16, 465, 8, 1, 3, 470, 4, 31194, 12, 9, 6, 452, 29, 445, 36, 443, 38, 441, 39, 435, 45, 434, 46, 427, 53, 426, 54, 425, 55, 424, 55, 425, 54, 426, 53, 427, 51, 429, 25, 1, 24, 430, 22, 5, 22, 210], 'size': [480, 640]}
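The 'counts' list above is an uncompressed run-length encoding (RLE) of a binary mask, and 'size' is [height, width]. The sketch below decodes such a dictionary into a 2-D mask in plain Python. It assumes the convention used by the COCO mask API: the runs alternate between 0 and 1 starting with 0, and pixels are stored column by column. The function name decode_uncompressed_rle and the tiny 3 x 4 example are made up for illustration; in practice the pycocotools mask utilities are the usual way to do this.

```python
def decode_uncompressed_rle(rle):
    """Decode {'counts': [...], 'size': [height, width]} into a list of rows of 0/1."""
    height, width = rle["size"]
    flat = []
    value = 0  # assumed convention: the first run counts background (0) pixels
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # Assumed convention: pixels are stored column by column (column-major order)
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


# Hypothetical tiny example: a 3 x 4 mask with 4 foreground pixels
tiny_rle = {"counts": [2, 3, 4, 1, 2], "size": [3, 4]}
tiny_mask = decode_uncompressed_rle(tiny_rle)
for row in tiny_mask:
    print(row)
print("foreground pixels:", sum(map(sum, tiny_mask)))
```

Summing the decoded mask gives the pixel count of the crowd region, which is what the 'area' field describes.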










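Other annotation formats for training images also exist: PASCAL VOC stores annotations as XML, TensorFlow Object Detection workflows commonly use .csv (typically converted to TFRecord for training), and Darknet (YOLO) uses .txt.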
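As a rough illustration of how these formats differ, the sketch below turns a COCO bbox into one line of a Darknet (YOLO) .txt label, which stores "class x_center y_center width height" with the four box values normalized to the image size. The function name coco_bbox_to_yolo_line, the 640 x 480 image size, and the class index 0 are assumptions for this example; the size of the image behind annotations[60] is not shown in this post, and mapping COCO category ids to contiguous class indices is left out.

```python
def coco_bbox_to_yolo_line(bbox, image_width, image_height, class_index):
    """Format a COCO [x, y, width, height] box as a Darknet/YOLO label line."""
    x, y, w, h = bbox
    x_center = (x + w / 2.0) / image_width
    y_center = (y + h / 2.0) / image_height
    return (
        f"{class_index} {x_center:.6f} {y_center:.6f} "
        f"{w / image_width:.6f} {h / image_height:.6f}"
    )


# bbox from the annotations[60] sample; image size and class index are assumed
yolo_line = coco_bbox_to_yolo_line([265.67, 222.31, 26.48, 14.71], 640, 480, 0)
print(yolo_line)
```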

Sunday, February 7, 2021

RP2040 and Raspberry Pi Pico


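The Raspberry Pi Foundation has also moved into the MCU space and has even developed its own chip, the RP2040. It is a dual-core ARM Cortex-M0+ (133MHz) with 264KB of built-in SRAM and support for up to 16MB of external Flash. GPIO output is 3.3V. The chip is not yet sold on its own, though; it has only been supplied to a few vendors, such as SparkFun, to develop boards. The key point is that it supports the TensorFlow Lite framework, so it can be used to develop lightweight deep learning applications.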


 
7 × 7 mm QFN-56 package 


RP2040 chip architecture diagram



RP2040 Chip features:

  • Dual ARM Cortex-M0+ @ 133MHz
  • 264kB on-chip SRAM in six independent banks
  • Support for up to 16MB of off-chip Flash memory via dedicated QSPI bus
  • DMA controller
  • Fully-connected AHB crossbar
  • Interpolator and integer divider peripherals
  • On-chip programmable LDO to generate core voltage
  • 2 on-chip PLLs to generate USB and core clocks
  • 30 GPIO pins, 4 of which can be used as analog inputs
  • Peripherals
    • 2 UARTs
    • 2 SPI controllers
    • 2 I2C controllers
    • 16 PWM channels
    • USB 1.1 controller and PHY, with host and device support
    • 8 PIO state machines
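
The Foundation has also built a board around the RP2040, called the Raspberry Pi Pico. The Pico has a built-in boot loader but no operating system (unlike the Pi or Pi Zero, which run Linux). It does, however, come with a C/C++ SDK and a MicroPython SDK so that users can develop quickly.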



Pico dimensions (unassembled):  51.3mm x 21mm x 3.9mm