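I am trying to remove MS Word formatting information from a textarea, but I don't know how to do it.
The situation is that I need to copy and paste some content from MS Word into a textbox editor.
It copies fine, but all of the formatting is copied along with it, so my 300-character sentence expands into a 20,000-character formatted one.
Can anyone suggest what I should do?

OK, after some R&D I have made some progress.

This is the text I copied from the Word document: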

    Once the user clicks on the Cancel icon for a transaction on the Status of Business, and the transaction is eligible for cancellation, a new screen titled “Cancel Transaction” will appear, with the following fields:

This is what I get from $("#textAreaId").val():

    "

     Normal
     0

     false
     false
     false

     EN-US
     X-NONE
     X-NONE

    Once the user clicks on the Cancel icon for a
    transaction on the Status of Business, and the transaction is eligible for
    cancellation, a new screen titled “Cancel Transaction” will appear, with the
    following fields:

     /* Style Definitions */
     table.MsoNormalTable
     {mso-style-name:"Table Normal";
     mso-style-parent:"";
     line-height:115%;
     font-:11.0pt;"Calibri","sans-serif";
     mso-bidi-"Times New Roman";}

    "

I finally found a solution here:

// Removes MS Office generated markup from pasted HTML.
function cleanHTML(input) {
    // 1. Replace line breaks and Mso class attributes with a space.
    var stringStripper = /(\n|\r| class=(")?Mso[a-zA-Z]+(")?)/g;
    var output = input.replace(stringStripper, ' ');

    // 2. Strip Word-generated HTML comments.
    var commentStripper = new RegExp('<!--(.*?)-->', 'g');
    output = output.replace(commentStripper, '');

    // 3. Remove these tags but keep any content between them.
    var tagStripper = new RegExp('<(/)*(meta|link|span|\\?xml:|st1:|o:|font)(.*?)>', 'gi');
    output = output.replace(tagStripper, '');

    // 4. Remove these elements entirely, including everything between
    //    the opening and closing tags.
    var badTags = ['style', 'script', 'applet', 'embed', 'noframes', 'noscript'];
    for (var i = 0; i < badTags.length; i++) {
        tagStripper = new RegExp('<' + badTags[i] + '.*?' + badTags[i] + '(.*?)>', 'gi');
        output = output.replace(tagStripper, '');
    }

    // 5. Remove style="..." and start="..." attributes.
    var badAttributes = ['style', 'start'];
    for (var j = 0; j < badAttributes.length; j++) {
        var attributeStripper = new RegExp(' ' + badAttributes[j] + '="(.*?)"', 'gi');
        output = output.replace(attributeStripper, '');
    }
    return output;
}
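
The function above only cleans a string; it still has to be run on the textarea after a paste. One way to wire that up is sketched below. The helper name `attachPasteCleaner` is hypothetical and not part of the solution I found. It assumes a browser that reports `inputType === 'insertFromPaste'` on the `input` event, and it takes the cleaning function as a parameter so that `cleanHTML` can be passed in.

```javascript
// Hypothetical helper, not part of the original answer: runs a cleaning
// function over a textarea's value right after a paste lands in it.
function attachPasteCleaner(textarea, clean) {
    function onInput(event) {
        // Only react to pastes, so ordinary typing is left alone.
        if (event && event.inputType === 'insertFromPaste') {
            // Note: this cleans the whole value, not just the pasted part.
            textarea.value = clean(textarea.value);
        }
    }
    textarea.addEventListener('input', onInput);
    return onInput; // keep this if you later need removeEventListener
}

// In a page, assuming cleanHTML is defined as above:
// attachPasteCleaner(document.getElementById('textAreaId'), cleanHTML);
```

Because `cleanHTML` replaces every line break with a space, any text already in the textarea is flattened as well. Assigning to `value` also typically moves the caret to the end. If either matters, a variant that cleans only the pasted fragment would be needed.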