Tob*_*tor 1 python list conv-neural-network one-hot-encoding
I have a list of strings which serve as labels for my classification problem (image recognition with a Convolutional Neural Network). Each label consists of 5-8 characters (digits from 0 to 9 and letters from A to Z). To train my neural network I would like to one-hot encode the labels. I wrote code that encodes a single label, but I am having difficulty applying it to a list.
Here is my code for one label which works fine:
from numpy import argmax
# define input string
data = '7C24698'
print(data)
# define universe of possible input values
characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
# define a mapping of chars to integers
char_to_int = dict((c, i) for i, c in enumerate(characters))
int_to_char = dict((i, c) for i, c in enumerate(characters))
# integer encode input data
integer_encoded = [char_to_int[char] for char in data]
print(integer_encoded)
# one hot encode
onehot_encoded = list()
for value in integer_encoded:
    character = [0 for _ in range(len(characters))]
    character[value] = 1
    onehot_encoded.append(character)
print(onehot_encoded)
# invert encoding
inverted = int_to_char[argmax(onehot_encoded[0])]
print(inverted)
I now want to get the same output for a list of labels and store the result in a new list:
list_of_labels = ['7C24698', 'NDK745']
encoded_labels = []
How can I do this?
小智 6
You can use LabelBinarizer from scikit-learn:
>>> from sklearn.preprocessing import LabelBinarizer
>>> labels = ["first", "second", "third"]
>>> lb = LabelBinarizer()
>>> lb.fit(labels)
>>> lb.transform(labels)
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
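Note that, fitted this way, LabelBinarizer treats each whole string as one class. For the per-character encoding described in the question, one possible approach is to fit the binarizer on the individual characters and transform each label as a list of characters. This is only a sketch: the name `char_lb` is made up for the example, and the columns follow the sorted order of the characters (the space sorts first), so the column indices differ from the `char_to_int` mapping in the question.

```python
from sklearn.preprocessing import LabelBinarizer

# Alphabet taken from the question (36 symbols plus a trailing space).
characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
list_of_labels = ['7C24698', 'NDK745']

# Fit on single characters so that each column stands for one character.
char_lb = LabelBinarizer()
char_lb.fit(list(characters))

# One array of shape (len(label), 37) per label.
# Labels of different lengths produce arrays with different row counts.
encoded_labels = [char_lb.transform(list(label)) for label in list_of_labels]

for label, encoded in zip(list_of_labels, encoded_labels):
    print(label, encoded.shape)
```

Because the labels have different lengths, the resulting arrays cannot be stacked directly; they would need padding to a common length first.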
To convert the one-hot encoded labels back to string values:
>>> import numpy as np
>>> encoded_labels = np.array([
...     [1, 0, 0],
...     [0, 1, 0],
...     [0, 0, 1]
... ])
>>> lb.inverse_transform(encoded_labels)
array(['first', 'second', 'third'], dtype='<U6')
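If you would rather stay with the plain-Python encoding from the question, the single-label code can be wrapped in a function and applied to every label with a list comprehension. The following is a minimal sketch under some assumptions of mine: the helper names `one_hot_encode` and `one_hot_decode` are hypothetical, the fixed length of 8 comes from the "5-8 characters" in the question, and using the trailing space in `characters` as a padding symbol is a guess about its purpose.

```python
# Alphabet and mappings as defined in the question.
characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
char_to_int = {c: i for i, c in enumerate(characters)}
int_to_char = {i: c for i, c in enumerate(characters)}


def one_hot_encode(label, length=8, pad=' '):
    """Return one one-hot row per character, right-padding the label to `length`."""
    # Assumption: shorter labels are padded with the space character.
    # Labels longer than `length` are not truncated.
    padded = label.ljust(length, pad)
    rows = []
    for char in padded:
        row = [0] * len(characters)
        row[char_to_int[char]] = 1
        rows.append(row)
    return rows


def one_hot_decode(rows, pad=' '):
    """Invert one_hot_encode and strip the trailing padding."""
    return ''.join(int_to_char[row.index(1)] for row in rows).rstrip(pad)


list_of_labels = ['7C24698', 'NDK745']
encoded_labels = [one_hot_encode(label) for label in list_of_labels]
decoded_labels = [one_hot_decode(rows) for rows in encoded_labels]
print(decoded_labels)  # ['7C24698', 'NDK745']
```

With padding, every label becomes an 8 x 37 nested list, so the whole of `encoded_labels` can be passed to `numpy.array` to get a single array of shape (2, 8, 37).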