How to highlight text on a website in real time, in sync with audio narration

GKV*_*GKV 2 annotations text-to-speech reactjs

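I'm trying to figure out which technique to use to highlight text in sync with audio, the way https://speechify.com/ does.

This assumes I can already run a TTS algorithm and convert the text to speech. I've looked at several sources but couldn't pin down the exact technique or approach for highlighting text as the audio speaks it.

Any help would be greatly appreciated. I've already wasted two days on the internet trying to solve this, with no luck :(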

oll*_*lle 6

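A simple approach is to listen for the boundary event fired by SpeechSynthesisUtterance and highlight the words with vanilla JS. The event gives us the character index, so there's no need to go crazy with regexes or super-AI stuff :)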

First, make sure the API is available:

    const synth = window.speechSynthesis
    if (!synth) {
      console.error('no tts for you!')
      // `return` assumes this code runs inside a function, as in the full example
      return
    }

The TTS utterance fires 'boundary' events, which we can use to highlight the text.

    // `highlight` is defined in the full example
    const text = document.getElementById('text')
    const originalText = text.innerText
    const utterance = new SpeechSynthesisUtterance(originalText)
    utterance.addEventListener('boundary', event => {
      // charIndex: where the spoken word starts; charLength: how long it is
      const { charIndex, charLength } = event
      text.innerHTML = highlight(originalText, charIndex, charIndex + charLength)
    })
    synth.speak(utterance)
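The handler above depends on the event carrying a usable `charLength`. If a speech engine you target reports only `charIndex` (worth checking, this is an assumption rather than something the answer states), a small fallback can guess the end of the word. A minimal sketch; `wordRange` is a hypothetical helper name, and treating whitespace as the word boundary is a simplification:

```javascript
// Hypothetical helper: resolve the [from, to) range of the word at charIndex.
// Uses charLength when the engine reports a positive value; otherwise
// assumes the word runs up to the next whitespace character.
const wordRange = (text, charIndex, charLength) => {
  if (Number.isFinite(charLength) && charLength > 0) {
    return [charIndex, Math.min(charIndex + charLength, text.length)]
  }
  const match = text.slice(charIndex).match(/^\S+/)
  const length = match ? match[0].length : 0
  return [charIndex, charIndex + length]
}
```

Inside the boundary listener this could be used as `const [from, to] = wordRange(originalText, event.charIndex, event.charLength)`, passing `from` and `to` on to the highlighting function.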

Full example:

    const btn = document.getElementById("btn")

    // Wrap a piece of text in a yellow-background span
    const highlightBackground = sample => `<span style="background-color:yellow;">${sample}</span>`

    // Return `text` with the characters in [from, to) highlighted
    const highlight = (text, from, to) => {
      const replacement = highlightBackground(text.slice(from, to))
      return text.substring(0, from) + replacement + text.substring(to)
    }

    btn && btn.addEventListener('click', () => {
      const synth = window.speechSynthesis
      if (!synth) {
        console.error('no tts')
        return
      }
      const text = document.getElementById('text')
      const originalText = text.innerText
      const utterance = new SpeechSynthesisUtterance(originalText)
      // Re-render the text with the current word highlighted on every boundary
      utterance.addEventListener('boundary', event => {
        const { charIndex, charLength } = event
        text.innerHTML = highlight(originalText, charIndex, charIndex + charLength)
      })
      synth.speak(utterance)
    })

CodeSandbox link

This is very basic, and you can (and should) improve on it.
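One possible improvement: the vanilla version writes the text back through `innerHTML`, so any `<` or `&` in the source text would be parsed as markup. A minimal sketch of an escaping variant follows; `escapeHtml` and `highlightSafe` are hypothetical names, not part of the original example:

```javascript
// Hypothetical helpers, not part of the original answer.
// Escape the characters that are significant in HTML.
const escapeHtml = (s) =>
  s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')

// Same contract as `highlight`, but every piece of text is escaped
// before the <span> markup is added around the [from, to) range.
const highlightSafe = (text, from, to) =>
  escapeHtml(text.slice(0, from)) +
  '<span style="background-color:yellow;">' +
  escapeHtml(text.slice(from, to)) +
  '</span>' +
  escapeHtml(text.slice(to))
```

The character offsets are applied to the raw text first and escaping happens afterwards, so the indexes from the boundary event still line up. The React version does not need this, because React escapes text children itself.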

Edit

Oops, I forgot this was tagged ReactJs. Here's the same example in React (the CodeSandbox link is in the comments):

    import React from "react";

    const ORIGINAL_TEXT =
      "Call me Ishmael. Some years ago\u2014never mind how long precisely\u2014having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world.";

    // Split the text into the parts before, inside, and after the highlight
    const splitText = (text, from, to) => [
      text.slice(0, from),
      text.slice(from, to),
      text.slice(to)
    ];

    const HighlightedText = ({ text, from, to }) => {
      const [start, highlight, finish] = splitText(text, from, to);
      return (
        <p>
          {start}
          <span style={{ backgroundColor: "yellow" }}>{highlight}</span>
          {finish}
        </p>
      );
    };

    export default function App() {
      // Character range of the word currently being spoken
      const [highlightSection, setHighlightSection] = React.useState({
        from: 0,
        to: 0
      });
      const handleClick = () => {
        const synth = window.speechSynthesis;
        if (!synth) {
          console.error("no tts");
          return;
        }

        const utterance = new SpeechSynthesisUtterance(ORIGINAL_TEXT);
        // Move the highlight each time the engine reaches a new word
        utterance.addEventListener("boundary", (event) => {
          const { charIndex, charLength } = event;
          setHighlightSection({ from: charIndex, to: charIndex + charLength });
        });
        synth.speak(utterance);
      };

      return (
        <div className="App">
          <HighlightedText text={ORIGINAL_TEXT} {...highlightSection} />
          <button onClick={handleClick}>click me</button>
        </div>
      );
    }
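Both examples rely on the browser's own speech engine. If the audio comes from a separate TTS pipeline, as the question assumes, there are no boundary events, so the highlight has to be driven from playback time instead. The sketch below assumes the pipeline can emit word-level timings; the `marks` shape (`start`, `end`, `from`, `to`) and the name `findActiveWord` are hypothetical and not taken from any particular TTS service:

```javascript
// Hypothetical timing data: one entry per word, sorted by start time (seconds),
// e.g. { start: 0.5, end: 0.9, from: 5, to: 7 } where from/to are character
// offsets into the narrated text.
// Binary search for the word whose [start, end) interval contains `time`.
const findActiveWord = (marks, time) => {
  let lo = 0
  let hi = marks.length - 1
  while (lo <= hi) {
    const mid = (lo + hi) >> 1
    if (time < marks[mid].start) hi = mid - 1
    else if (time >= marks[mid].end) lo = mid + 1
    else return marks[mid]
  }
  return null // between words, before the first word, or after the last
}
```

With an `<audio>` element, this could be called from a `timeupdate` listener with `audio.currentTime`, feeding the returned `from` and `to` into the same highlighting code used above.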