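Is there a way to load an audio file (WAV) directly into a tensor in TensorFlow, and then convert the tensor back into an audio file? I have seen people convert audio to spectrograms, but I could not find anyone converting a spectrogram back to audio.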
The tf.contrib.ffmpeg.decode_audio() op can load audio data (including WAV) into a tensor, and tf.contrib.ffmpeg.encode_audio() can convert it back into audio data.
import tensorflow as tf  # TensorFlow 1.x; tf.contrib does not exist in 2.x

# Scalar string placeholders for the input and output file paths.
input_filename = tf.placeholder(tf.string, shape=[])
output_filename = tf.placeholder(tf.string, shape=[])

# Decode the WAV file into a 2-D tensor, [samples x channels].
input_signal = tf.contrib.ffmpeg.decode_audio(
    tf.read_file(input_filename), file_format="wav",
    samples_per_second=44100, channel_count=2)

# ...

output_signal = ...  # A 2-D tensor, [samples x channels]

# Encode the tensor back to WAV bytes and write them to disk.
encoded_audio_data = tf.contrib.ffmpeg.encode_audio(
    output_signal, file_format="wav", samples_per_second=44100)
write_file_op = tf.write_file(output_filename, encoded_audio_data)

with tf.Session() as sess:
    sess.run(write_file_op, {input_filename: "input.wav",
                             output_filename: "output.wav"})
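The tf.contrib module has been deprecated (and removed in TensorFlow 2.x), but you can still load and save audio files in 16-bit PCM WAV format with eager execution by using tf.audio, which provides tf.audio.decode_wav and tf.audio.encode_wav.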
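As a rough sketch of the tf.audio approach, assuming TensorFlow 2.x with eager execution enabled by default: the helper names `save_wav` and `load_wav`, the 440 Hz test tone, and the temporary file path are made up for illustration and are not from the original answer. It writes a mono signal to a WAV file and reads it back.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf  # assumes TensorFlow 2.x (eager execution by default)


def save_wav(path, signal, sample_rate):
    """Hypothetical helper: write a float32 [samples, channels] tensor as 16-bit PCM WAV.

    Values are expected to lie in [-1.0, 1.0].
    """
    encoded = tf.audio.encode_wav(signal, sample_rate)
    tf.io.write_file(path, encoded)


def load_wav(path):
    """Hypothetical helper: read a 16-bit PCM WAV file.

    Returns (audio, sample_rate), where audio is float32 [samples, channels].
    """
    contents = tf.io.read_file(path)
    audio, sample_rate = tf.audio.decode_wav(contents)
    return audio, sample_rate


# Illustrative input: one second of a 440 Hz tone, mono, at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate, dtype=np.float32) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
signal = tf.constant(tone[:, None], dtype=tf.float32)  # shape [16000, 1]

# Round trip through a temporary file.
wav_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
save_wav(wav_path, signal, sample_rate)
loaded, loaded_rate = load_wav(wav_path)
```

The decoded samples differ from the original only by 16-bit quantization error. Formats other than 16-bit PCM WAV are not handled by tf.audio and would need a separate decoder.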