Thursday, March 28, 2019

TensorFlow Object Detection API - TFRecord creation

Understanding how to create TFRecords is an important step when you use the TensorFlow Object Detection API. My GitHub repo (https://github.com/koheikawata/objectdetectiontest) shows what the directory structure looks like.


1. DEFINE parameters
num_shards: the number of files (shards) the TFRecord output is divided into. This code does not use it, so each TFRecord is written as a single file.
ratio_train: the fraction of the xml/jpg examples used for training. 0.8 means 80% for training and 20% for evaluation.
label_map_path: path to the .pbtxt file, which defines the object classes.
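As a rough illustration of what such a label map contains, the sketch below embeds a small example in the usual item/id/name layout and reads it into a dictionary. The class names 'cat' and 'dog' are placeholders, not the classes from the repo, and parse_label_map is a hypothetical stand-in written with regular expressions. The Object Detection API ships its own helper for this, label_map_util.get_label_map_dict. Ids start at 1 because 0 is reserved for the background class.

```python
import re

# Hypothetical label map contents; the class names and ids are placeholders.
LABEL_MAP_TEXT = """
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
"""


def parse_label_map(text):
    """Return {class name: id} from label map text in the item/id/name layout."""
    label_map = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        item_id = int(re.search(r"id:\s*(\d+)", block).group(1))
        name = re.search(r"name:\s*['\"](.+?)['\"]", block).group(1)
        label_map[name] = item_id
    return label_map


label_map_dict = parse_label_map(LABEL_MAP_TEXT)
print(label_map_dict)
```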

image_dir: path to the directory of jpg image files. The folder name is JPEGImages. The image files must use the .jpg extension.
annotation_dir: path to the directory of xml files, which contain the annotation information for the jpg images. The file names of the xml files and the jpg image files should correspond one-to-one. The folder name is Annotations. The xml files should be in Pascal VOC format.
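For readers who have not seen a Pascal VOC annotation, here is a minimal sketch of one, embedded as a string and read with the standard library. The file name, image size, class name and box coordinates are all made-up placeholders. Real annotation files may carry extra fields that are omitted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation; every value below is a placeholder.
SAMPLE_XML = """<annotation>
  <folder>JPEGImages</folder>
  <filename>image001.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <object>
    <name>cat</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>100</xmin>
      <ymin>120</ymin>
      <xmax>300</xmax>
      <ymax>360</ymax>
    </bndbox>
  </object>
</annotation>"""

root = ET.fromstring(SAMPLE_XML)
filename = root.findtext("filename")
width = int(root.findtext("size/width"))
height = int(root.findtext("size/height"))

# One (class name, xmin, ymin, xmax, ymax) tuple per annotated object.
boxes = []
for obj in root.findall("object"):
    box = obj.find("bndbox")
    boxes.append((obj.findtext("name"),
                  int(box.findtext("xmin")), int(box.findtext("ymin")),
                  int(box.findtext("xmax")), int(box.findtext("ymax"))))

print(filename, width, height, boxes)
```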

train_output_path: path to the TFRecord file for training. The file name is train.record.
val_output_path: path to the TFRecord file for evaluation. The file name is val.record.
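The parameters above could be declared roughly as follows. The section heading suggests the original script declares them with TensorFlow's flags module. This sketch swaps in argparse from the standard library so that it runs without TensorFlow. The data/ root in the default paths is an assumption for illustration only. The folder and file names (JPEGImages, Annotations, train.record, val.record) and the 0.8 ratio come from the descriptions above.

```python
import argparse


def build_parser():
    """Declare the script parameters; default paths are hypothetical."""
    parser = argparse.ArgumentParser(
        description="Create TFRecord files from Pascal VOC style data.")
    parser.add_argument("--num_shards", type=int, default=1,
                        help="Number of output shards (unused: single file).")
    parser.add_argument("--ratio_train", type=float, default=0.8,
                        help="Fraction of examples used for training.")
    parser.add_argument("--label_map_path", default="data/label_map.pbtxt",
                        help="Path to the .pbtxt label map.")
    parser.add_argument("--image_dir", default="data/JPEGImages",
                        help="Directory containing the .jpg images.")
    parser.add_argument("--annotation_dir", default="data/Annotations",
                        help="Directory containing the .xml annotations.")
    parser.add_argument("--train_output_path", default="data/train.record",
                        help="Output TFRecord file for training.")
    parser.add_argument("--val_output_path", default="data/val.record",
                        help="Output TFRecord file for evaluation.")
    return parser


# An empty argument list means every parameter takes its default value.
args = build_parser().parse_args([])
print(args.ratio_train, args.train_output_path, args.val_output_path)
```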

2. Main function
tf.train.Example is the key component for creating tf_examples. The main function searches for *.xml files in the "Annotations" folder and builds examples_list. The files are then shuffled randomly and split into training examples and evaluation examples according to ratio_train.
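The search, shuffle and split steps described above can be sketched as follows. This is an approximation under stated assumptions, not the code from the repo: the helper names list_examples and split_examples are hypothetical, the fixed seed is added only to make the split repeatable, and a generated list of names stands in for a real Annotations folder.

```python
import glob
import os
import random


def list_examples(annotation_dir):
    """Return the base names (without .xml) of the annotation files, sorted."""
    paths = glob.glob(os.path.join(annotation_dir, "*.xml"))
    return sorted(os.path.splitext(os.path.basename(p))[0] for p in paths)


def split_examples(examples_list, ratio_train, seed=42):
    """Shuffle a copy of the list and split it into (train, evaluation)."""
    shuffled = list(examples_list)
    random.Random(seed).shuffle(shuffled)
    num_train = int(ratio_train * len(shuffled))
    return shuffled[:num_train], shuffled[num_train:]


# Stand-in for list_examples("Annotations"), so the sketch runs without files.
examples_list = ["image%03d" % i for i in range(10)]
train_examples, val_examples = split_examples(examples_list, ratio_train=0.8)
print(len(train_examples), len(val_examples))
```

Sorting before shuffling matters because glob does not guarantee an order. With a fixed seed, the same files then end up in the same split on every run.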



3. create_tf_record function

The create_tf_record function mainly generates the "data" dictionary from the *.xml files. Below are examples that show how it divides the examples into training and evaluation sets.
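As a rough stand-in for that output, the sketch below builds a "data" dictionary for each annotation in a small training list and a small evaluation list. The function xml_to_dict is modeled on the API's dataset_util.recursive_parse_xml_to_dict, and it is an assumption that the repo uses that helper. The annotations are generated placeholders, and create_data_dicts and make_annotation are hypothetical names. The real function would go on to turn each dictionary into a tf.train.Example, which is left out here.

```python
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Recursively convert an element to nested dicts; <object> tags form a list."""
    if len(element) == 0:
        return {element.tag: element.text}
    result = {}
    for child in element:
        child_result = xml_to_dict(child)
        if child.tag == "object":
            result.setdefault("object", []).append(child_result[child.tag])
        else:
            result[child.tag] = child_result[child.tag]
    return {element.tag: result}


def make_annotation(filename, name):
    """Build a hypothetical minimal Pascal VOC annotation as a string."""
    return ("<annotation><filename>%s</filename>"
            "<size><width>640</width><height>480</height></size>"
            "<object><name>%s</name><bndbox><xmin>64</xmin><ymin>48</ymin>"
            "<xmax>320</xmax><ymax>240</ymax></bndbox></object>"
            "</annotation>" % (filename, name))


def create_data_dicts(xml_texts):
    """Parse each annotation text into its "data" dictionary."""
    return [xml_to_dict(ET.fromstring(text))["annotation"] for text in xml_texts]


train_data = create_data_dicts([make_annotation("a.jpg", "cat"),
                                make_annotation("b.jpg", "dog")])
val_data = create_data_dicts([make_annotation("c.jpg", "cat")])

print("train:", [d["filename"] for d in train_data])
print("val:", [d["filename"] for d in val_data])
print(train_data[0]["object"][0]["bndbox"])
```

All values in the dictionary are strings, so the sizes and box coordinates still have to be converted to numbers before they are used.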