TensorFlow
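Model
What tensors and variables have in common: both have three attributes, namely shape, dtype, and value.
How they differ: variables are tracked by TensorFlow's automatic differentiation mechanism by default, so they are commonly used as the parameters of machine learning models.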
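The comparison above can be sketched in a few lines. This is a minimal illustration, assuming TensorFlow 2.x in eager mode; the names t, v, and loss are made up for the example. A constant tensor can still be differentiated if it is registered with tape.watch, which this sketch deliberately does not do.

```python
import tensorflow as tf

# A constant tensor and a variable holding the same values.
t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

# Shared attributes: shape, dtype, and value.
same_shape = t.shape == v.shape
same_dtype = t.dtype == v.dtype

# Only the variable is watched by the tape automatically;
# the constant is not, because tape.watch(t) is never called.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(v * v) + tf.reduce_sum(t * t)
grad_v, grad_t = tape.gradient(loss, [v, t])

print(grad_v)  # d(loss)/dv = 2 * v
print(grad_t)  # None: the constant was not watched
```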
tfrecord
A data format defined by TensorFlow: a binary file format used to store and read data such as images and text. A tfrecord file contains tf.train.Example protobuf records. It is designed for use with TensorFlow and is used throughout the higher-level APIs such as TFX.
Basic structure and data types
The data structure of tf.train.Example is a dictionary called Features; its internal structure can be seen from the proto file:
message Example {
  Features features = 1;
};
message Features {
  map<string, Feature> feature = 1;
};
message Feature {
  oneof kind {
    BytesList bytes_list = 1;
    FloatList float_list = 2;
    Int64List int64_list = 3;
  }
};
Feature has three data types: Int64, Bytes, and Float. Int64 stores bool, enum, int32, uint32, int64, and uint64; Bytes stores strings and binary data; Float stores float (float32) and double (float64).
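As a rough illustration of this mapping, the sketch below picks the Feature field for a plain Python value. The helper name feature_kind is hypothetical and is not part of TensorFlow; it only mirrors the three-way split described above.

```python
def feature_kind(value):
    """Return the name of the Feature field that would hold `value`.

    Hypothetical helper for illustration only; it is not a TensorFlow API.
    """
    if isinstance(value, (bytes, bytearray, str)):
        # A str has to be encoded to bytes before it goes into a BytesList.
        return "bytes_list"
    if isinstance(value, float):
        return "float_list"
    if isinstance(value, (bool, int)):
        # bool is stored as 0 or 1 in an Int64List.
        return "int64_list"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


kinds = [feature_kind(x) for x in (3, True, b"goal", 0.999)]
print(kinds)
```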
The file format is the protobuf serialization of the data, laid out according to this dictionary structure, into a binary string.
def serialize_example(f1, f2, f3, f4):
    # Map each feature name to a tf.train.Example-compatible Feature.
    fts = {
        "feature0": _int64_feature(f1),
        "feature1": _int64_feature(f2),
        "feature2": _bytes_feature(f3),
        "feature3": _float_feature(f4),
    }
    m = tf.train.Example(features=tf.train.Features(feature=fts))
    return m.SerializeToString()
ps = serialize_example(3, True, b"goal", 0.999)
ex_proto = tf.train.Example.FromString(ps)
tf.train.Feature is the value type compatible with tf.train.Example; each raw value is wrapped in one before it goes into the dictionary.
import tensorflow as tf

def _bytes_feature(x):
    # BytesList cannot unpack a string from an EagerTensor, so convert it first.
    if isinstance(x, type(tf.constant(0))):
        x = x.numpy()
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[x]))
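serialize_example also relies on _int64_feature and _float_feature, which are not shown in these notes. The sketch below assumes they follow the same pattern as _bytes_feature, then does a full round trip: build an Example, serialize it, and parse it back. Only tf.train calls are used; the helper bodies are an assumption.

```python
import tensorflow as tf


def _bytes_feature(value):
    """Wrap a bytes value (or a string EagerTensor) in a BytesList Feature."""
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy()
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _float_feature(value):
    """Wrap a float or double in a FloatList Feature (assumed helper)."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def _int64_feature(value):
    """Wrap a bool, enum, or integer in an Int64List Feature (assumed helper)."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


features = {
    "feature0": _int64_feature(3),
    "feature1": _int64_feature(True),
    "feature2": _bytes_feature(b"goal"),
    "feature3": _float_feature(0.999),
}
example = tf.train.Example(features=tf.train.Features(feature=features))

# Serialize to a binary string, then decode it back into a message.
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)
print(restored)
```

Note that the float comes back as float32, so it only approximately equals 0.999.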
Reading and writing tfrecord files
- Writing a file
# Write the `tf.train.Example` observations to the file.
with tf.io.TFRecordWriter(filename) as writer:
    for i in range(n_observations):
        example = serialize_example(feature0[i], feature1[i], feature2[i], feature3[i])
        writer.write(example)
- Reading a file
fn = "./Waymo.tfrecord"
raw_dataset = tf.data.TFRecordDataset(fn)
# Data format
feature_description = {
    'feature0': tf.io.FixedLenFeature([], tf.int64, default_value=0),
    'feature1': tf.io.FixedLenFeature([], tf.int64, default_value=0),
    'feature2': tf.io.FixedLenFeature([], tf.string, default_value=''),
    'feature3': tf.io.FixedLenFeature([], tf.float32, default_value=0.0),
}

def _parse_function(example_proto):
    # Parse the input `tf.train.Example` proto using the dictionary above.
    return tf.io.parse_single_example(example_proto, feature_description)

parsed_dataset = raw_dataset.map(_parse_function)
for parsed_record in parsed_dataset.take(10):
    print(repr(parsed_record))
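To see what the reader and writer handle on disk, here is a pure-Python sketch of the record framing. It relies on the layout given in the TensorFlow documentation: each record is a little-endian uint64 length, a uint32 masked CRC of the length, the payload bytes, and a uint32 masked CRC of the payload. The function names are made up for this example, the CRC fields are skipped rather than verified, and the demo framing writes zeroed CRCs, so its output is not a valid tfrecord file.

```python
import io
import struct


def iter_tfrecord_payloads(stream):
    """Yield the payload bytes of each record in a TFRecord byte stream.

    The two CRC fields are skipped, not verified.
    """
    while True:
        header = stream.read(12)  # uint64 length + uint32 masked CRC of length
        if not header:
            return
        if len(header) < 12:
            raise ValueError("truncated record header")
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        footer = stream.read(4)  # uint32 masked CRC of payload
        if len(payload) < length or len(footer) < 4:
            raise ValueError("truncated record body")
        yield payload


def frame_without_crc(payload):
    """Frame a payload with zeroed CRC fields (demo only, not a valid file)."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


demo = io.BytesIO(frame_without_crc(b"first") + frame_without_crc(b"second record"))
payloads = list(iter_tfrecord_payloads(demo))
print(payloads)
```

With a real file, each yielded payload would be a serialized message that can be passed to tf.train.Example.FromString.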
Waymo Open Dataset
It uses the tfrecord data protocol; for the Dataset structure, refer to
https://github.com/waymo-research/waymo-open-dataset/blob/master/waymo_open_dataset/dataset.proto
Using the Python library waymo-open-dataset
# Must match the tensorflow version, e.g. tf 2.3.0
pip3 install waymo-open-dataset-tf-2-3-0 --user
from waymo_open_dataset import dataset_pb2 as open_dataset

fn = ["/data/Waymo_training_segment-10023947602400723454_1120_000_1140_000_with_camera_labels.tfrecord"]
dataset = tf.data.TFRecordDataset(fn)
for data in dataset.take(1000):
    # Each record is a serialized Frame message, not a tf.train.Example.
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(data.numpy()))
    # plt.figure(figsize=(25, 20))
    # for index, image in enumerate(frame.images):
    #     show_camera_image(image, frame.camera_labels, [3, 3, index+1])
    # plt.show()
    ts = frame.timestamp_micros
    st_img = frame.images[0]
    for labels in frame.camera_labels:
        if labels.name == st_img.name:
            for label in labels.labels:
                # Convert the box center and size to a top-left corner plus width and height.
                x = int(label.box.center_x - 0.5 * label.box.length)
                y = int(label.box.center_y - 0.5 * label.box.width)
                width = int(label.box.length)
                height = int(label.box.width)
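The box arithmetic in the loop can be checked in isolation. The sketch below uses a small dataclass, Box2D, as a hypothetical stand-in for label.box with only the four fields used above; the function name to_top_left is also made up. It assumes, as the loop does, that length runs along the image x axis and width along the y axis.

```python
from dataclasses import dataclass


@dataclass
class Box2D:
    """Hypothetical stand-in for label.box, with only the fields used above."""
    center_x: float
    center_y: float
    length: float  # extent along the image x axis
    width: float   # extent along the image y axis


def to_top_left(box):
    """Convert a center-based box to (x, y, width, height) in integer pixels."""
    x = int(box.center_x - 0.5 * box.length)
    y = int(box.center_y - 0.5 * box.width)
    return x, y, int(box.length), int(box.width)


rect = to_top_left(Box2D(center_x=100.0, center_y=50.0, length=40.0, width=20.0))
print(rect)  # (80, 40, 40, 20)
```

int() truncates toward zero, so boxes with fractional coordinates can shift by up to one pixel.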
Reinventing the wheel: reading the dataset with tf.io.
Issues
https://stackoverflow.com/questions/61166864/tensorflow-python-framework-ops-eagertensor-object-has-no-attribute-in-graph
Parsing Waymo Open Dataset files: how to determine the dictionary structure
raw_image_dataset = tf.data.TFRecordDataset('images.tfrecords')

# Create a dictionary describing the features.
image_feature_description = {
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'depth': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
}

def _parse_image_function(example_proto):
    # Parse the input tf.train.Example proto using the dictionary above.
    return tf.io.parse_single_example(example_proto, image_feature_description)

parsed_image_dataset = raw_image_dataset.map(_parse_image_function)