How to get labels from file names or folder names and feed them into a queue in TensorFlow

The CIFAR-10 tutorial on the official TensorFlow website is a good introduction to convolutional neural networks. Sometimes we want to run our own data through this model directly, only to find that the tutorial's data format is not a common one. The data (pictures) we get are usually stored in one folder per category, or carry the category in the file name, so the label has to be derived from the folder name or the file name. Clearly, the CIFAR-10 input code cannot be used as it is.

One option, of course, is to convert the data into the CIFAR-10 format, but I don't like that approach.
The other is the old-fashioned way: process the file names to get the labels.
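
When the category is encoded in the file name rather than in the folder, a few lines of plain Python are enough to recover it. The sketch below assumes a hypothetical naming scheme of `<label>_<anything>.jpg` (for example `1_0001.jpg`); the function name `label_from_filename` and the underscore separator are illustrative assumptions, not something the data set prescribes.

```python
import os

def label_from_filename(path, sep="_"):
    """Return the integer label stored before the first `sep` in the file name.

    Assumes a hypothetical naming scheme such as "1_0001.jpg".
    """
    name = os.path.basename(path)      # drop the directory part
    stem = os.path.splitext(name)[0]   # drop the extension
    return int(stem.split(sep)[0])     # text before the first separator

paths = ["./data/1_0001.jpg", "./data/2_0001.jpg", "./data/12_0007.jpg"]
labels = [label_from_filename(p) for p in paths]
print(labels)  # [1, 2, 12]
```

Taking everything before the separator, rather than a single character, keeps the function correct for labels with more than one digit.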

Processing file names is very simple in Python. But if you (like me) have only just started with convolutional neural networks, you may not be familiar with TensorFlow yet: how should the code be modified so that data obtained this way is read into the queue? Look at the code:

import tensorflow as tf
import os

# Root directory of the data set. Each subdirectory holds the images of one
# category, and the subdirectories are named 1, 2, 3, 4, ...
path = "./data"

classes = 2  # number of categories
imagesList = []  # image file paths (only the paths, not the pixel data)
labelsList = []  # labels, one per entry in imagesList
# Paths of the per-category subdirectories: ./data/1, ./data/2, ...
filepaths = [os.path.join(path, "%d" % i) for i in range(1, classes + 1)]
for p in filepaths:
    for filename in os.listdir(p):  # every file name in this category folder
        imagesList.append(os.path.join(p, filename))  # full path of the image
        # The folder name is the category. Taking the whole folder name
        # instead of p[-1] also works for names with more than one digit.
        labelsList.append(int(os.path.basename(p)))

# Note: the tf.train queue functions used below are TensorFlow 1.x APIs.
# Convert the Python lists to tensors.
image = tf.convert_to_tensor(imagesList, dtype=tf.string)
label = tf.convert_to_tensor(labelsList, dtype=tf.int64)
# The key step: slice_input_producer builds a queue that yields one
# (path, label) pair at a time.
queue = tf.train.slice_input_producer([image, label])
label = queue[1]
image_c = tf.read_file(queue[0])  # raw bytes of the image file
image = tf.image.decode_jpeg(image_c, channels=3)
# decode_jpeg leaves the height and width unknown, but shuffle_batch needs
# fully defined shapes. Resizing to a fixed size provides them; without this
# step, building the batch can fail with a shape error.
training_image_data = tf.image.resize_images(image, [2, 2])
# Build the batch.
example_batch, label_batch = tf.train.shuffle_batch(
    [training_image_data, label], batch_size=1, capacity=2, min_after_dequeue=1)

# Run the graph.
with tf.Session() as sess:
    coord = tf.train.Coordinator()  # coordinates the queue-runner threads
    # Start the queue runners so that the input queues begin to fill.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for i in range(6):
        e_val, l_val = sess.run([example_batch, label_batch])
        print(e_val, l_val)  # print a few batches to check the result

    coord.request_stop()
    coord.join(threads)
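
The listing step above relies on the folders being named 1, 2, 3, ... If the folders carry arbitrary names instead, the same idea can be sketched in plain Python as below. This is only an illustration under stated assumptions: the helper name `list_images_with_labels`, the folder names `cat` and `dog`, and the choice to keep only JPEG files are all hypothetical. Labels are assigned as the index of each folder name in sorted order, so they start at 0.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg")  # assumption: only JPEG files are wanted

def list_images_with_labels(root):
    """Return (paths, labels, class_names) for a root whose subfolders are classes."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    paths, labels = [], []
    for index, name in enumerate(class_names):
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            # Skip anything that is not an image, e.g. stray text files.
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(folder, filename))
                labels.append(index)
    return paths, labels, class_names

# Small self-contained demo on a throwaway directory tree (hypothetical names).
with tempfile.TemporaryDirectory() as root:
    demo = {"cat": ["a.jpg", "b.jpg"], "dog": ["c.jpg", "notes.txt"]}
    for name, files in demo.items():
        os.makedirs(os.path.join(root, name))
        for f in files:
            open(os.path.join(root, name, f), "w").close()
    paths, labels, class_names = list_images_with_labels(root)
    print(class_names, labels)  # ['cat', 'dog'] [0, 0, 1]
```

The returned `paths` and `labels` lists could then take the place of `imagesList` and `labelsList` before they are converted to tensors.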

References:
1. https://www.cnblogs.com/wktwj/p/7227544.html
2. https://blog.csdn.net/lujiandong1/article/details/53376802

Thanks!


Posted on Sun, 05 Apr 2020 07:25:21 -0700 by Madatan