Thursday, March 28, 2019

TensorFlow Object Detection API - TFRecord creation

Understanding how to create TFRecords is important when you use the TensorFlow Object Detection API. In my GitHub repo, you can find what the directory structure looks like.

1. Define parameters
num_shards: the number of files the TFRecord output is divided into. This code does not use it, so each TFRecord is written as a single file.
ratio_train: the fraction of the xml/jpg files used for training. 0.8 means 80% for training and 20% for evaluation.
label_map_path: path to the .pbtxt file, which defines the object classes.
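
As a rough illustration of what a label map contains, the sketch below parses a small sample in the usual item/id/name text layout. The class names "cat" and "dog" and the helper read_label_map are hypothetical, and the regular-expression parser is a simplification for illustration, not the API's own label map utility.

```python
import re

# Hypothetical label map content; real class names depend on your dataset.
# Ids start at 1 because 0 is reserved for the background class.
SAMPLE_LABEL_MAP = """
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
"""


def read_label_map(text):
    """Return a {class name: id} dict from label map text (simplified parser)."""
    label_map = {}
    # Each item block is expected to hold one id and one name, in either order.
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        id_match = re.search(r"\bid\s*:\s*(\d+)", block)
        name_match = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", block)
        if id_match and name_match:
            label_map[name_match.group(1)] = int(id_match.group(1))
    return label_map


label_map_dict = read_label_map(SAMPLE_LABEL_MAP)
print(label_map_dict)
```

The resulting dictionary is what lets each object name found in an xml file be converted to its integer class id.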

image_dir: path to the directory of jpg images. The folder name is JPEGImages, and the image file extension should be .jpg.
annotation_dir: path to the directory of xml files, which contain the annotation information for the jpg images. The xml files and jpg files should correspond 1:1 by file name. The folder name is Annotations, and the xml files should follow the Pascal VOC format.
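
Because the xml and jpg files must match 1:1, it can help to check the two folders before building any records. The following is a minimal sketch under that assumption; the helper find_unpaired and the temporary demo files are hypothetical and not part of the original code.

```python
import tempfile
from pathlib import Path


def find_unpaired(annotation_dir, image_dir):
    """Return (xml names with no jpg, jpg names with no xml), without extensions."""
    xml_stems = {p.stem for p in Path(annotation_dir).glob("*.xml")}
    jpg_stems = {p.stem for p in Path(image_dir).glob("*.jpg")}
    return sorted(xml_stems - jpg_stems), sorted(jpg_stems - xml_stems)


# Demo on a throwaway directory tree that mimics the expected folder names.
with tempfile.TemporaryDirectory() as root:
    annotation_dir = Path(root) / "Annotations"
    image_dir = Path(root) / "JPEGImages"
    annotation_dir.mkdir()
    image_dir.mkdir()
    for name in ("a.xml", "b.xml"):
        (annotation_dir / name).touch()
    for name in ("a.jpg", "c.jpg"):
        (image_dir / name).touch()
    missing_jpg, missing_xml = find_unpaired(annotation_dir, image_dir)

print(missing_jpg, missing_xml)
```

Both lists being empty means every annotation has an image and vice versa.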

train_output_path: path to the TFRecord file for training. The file name is train.record.
val_output_path: path to the TFRecord file for evaluation. The file name is val.record.
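
One way to declare these parameters is sketched below with argparse from the standard library. This is an assumption for illustration: the actual script may define them differently (for example with TensorFlow flags), and the default for label_map_path is a hypothetical placeholder.

```python
import argparse


def build_parser():
    """Command-line parameters mirroring the list above (illustrative defaults)."""
    parser = argparse.ArgumentParser(description="Create TFRecord files.")
    # Listed for completeness; the post writes a single file and ignores it.
    parser.add_argument("--num_shards", type=int, default=1)
    parser.add_argument("--ratio_train", type=float, default=0.8)
    # Hypothetical file name; point this at your own .pbtxt file.
    parser.add_argument("--label_map_path", default="label_map.pbtxt")
    parser.add_argument("--image_dir", default="JPEGImages")
    parser.add_argument("--annotation_dir", default="Annotations")
    parser.add_argument("--train_output_path", default="train.record")
    parser.add_argument("--val_output_path", default="val.record")
    return parser


# Passing an empty list means "use the defaults" instead of reading sys.argv.
args = build_parser().parse_args([])
print(args.ratio_train, args.train_output_path, args.val_output_path)
```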

2. Main function
tf.train.Example is the key component for creating tf_examples. The main function searches for *.xml files in the "Annotations" folder and builds examples_list. The files are then shuffled with a random function and separated into training examples and evaluation examples based on ratio_train.
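
The shuffle-and-split step can be sketched as follows. The helper split_examples, the fixed seed, and the sample file names are assumptions made for illustration; in the real script examples_list would come from the *.xml files found in the "Annotations" folder.

```python
import random


def split_examples(examples_list, ratio_train=0.8, seed=42):
    """Shuffle a copy of the examples and split it by ratio_train."""
    examples = list(examples_list)
    # A fixed seed (an assumption here) keeps the split reproducible.
    random.Random(seed).shuffle(examples)
    num_train = int(ratio_train * len(examples))
    return examples[:num_train], examples[num_train:]


# Hypothetical example names standing in for the xml file names.
examples_list = [f"img_{i:03d}" for i in range(10)]
train_examples, val_examples = split_examples(examples_list, ratio_train=0.8)
print(len(train_examples), len(val_examples))
```

With 10 examples and ratio_train set to 0.8, this gives 8 training examples and 2 evaluation examples, and no example appears in both sets.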

3. create_tf_record function

The create_tf_record function mainly generates the "data" dictionary from each *.xml file. It works on the training and evaluation examples that were separated in the main function.
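
To make the "data" dictionary more concrete, here is a minimal sketch that reads one Pascal VOC style annotation with the standard library. The sample xml, the helper names xml_to_data and normalized_boxes, and the exact dictionary layout are assumptions; the real code may structure the dictionary differently, and wrapping the values in a tf.train.Example is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <folder>JPEGImages</folder>
  <filename>sample.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>64</xmin><ymin>48</ymin><xmax>320</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>"""


def xml_to_data(xml_text):
    """Parse one annotation into a plain dictionary."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    data = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "object": [],
    }
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        data["object"].append({
            "name": obj.findtext("name"),
            "bndbox": {k: int(box.findtext(k))
                       for k in ("xmin", "ymin", "xmax", "ymax")},
        })
    return data


def normalized_boxes(data):
    """Scale pixel coordinates to the 0-1 range by image width and height."""
    boxes = {"xmin": [], "ymin": [], "xmax": [], "ymax": []}
    for obj in data["object"]:
        box = obj["bndbox"]
        boxes["xmin"].append(box["xmin"] / data["width"])
        boxes["xmax"].append(box["xmax"] / data["width"])
        boxes["ymin"].append(box["ymin"] / data["height"])
        boxes["ymax"].append(box["ymax"] / data["height"])
    return boxes


data = xml_to_data(SAMPLE_XML)
boxes = normalized_boxes(data)
print(data["filename"], data["object"][0]["name"], boxes)
```

The Object Detection API expects box coordinates normalized by the image size, which is why the second helper divides by width and height.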