Posts by ks1*_*321

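I am using the R package edgarWebR to parse SEC filings, for example https://www.sec.gov/Archives/edgar/data/1060224/000090480206000008/sa10k306.htm. It returns a data frame in which one column (called "raw") contains HTML. It breaks the HTML page into paragraphs, one paragraph per row: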

other columns	raw	text
row 1	`<p id="PARA339" style="TEXT-ALIGN: left; MARGIN: 0pt; LINE-HEIGHT: 1.25"><font style="FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman, Times, serif"><i>We had a net loss of $1.</i><i><b>55</b></i><i> million for the year ended December 31, 201</i><i>6</i><i> and have an accumulated deficit of $</i><i>61.5</i><i> million as of December 31, 201</i><i>6</i><i>. To achieve sustainable profitability, we must generate increased revenue.</i></font></p>`	We had a net loss of $1.55 million for the year ended December 31, 2016, and as of December …
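
To make the relationship between the "raw" and "text" columns concrete, here is a minimal sketch of how the inline tags in one "raw" cell can be stripped to get plain text. It assumes the xml2 package is installed and uses a shortened, hypothetical fragment in the style of the row above. It is not necessarily how edgarWebR itself builds the "text" column.

```r
# Sketch only: assumes the xml2 package is installed.
library(xml2)

# Shortened sample in the style of the "raw" column (hypothetical, not the full cell).
raw <- '<p id="PARA339"><font><i>We had a net loss of $1.</i><i><b>55</b></i><i> million for the year ended December 31, 201</i><i>6</i><i>.</i></font></p>'

# read_html() parses the fragment as HTML; xml_text() then concatenates
# all text nodes in document order, dropping the <i>, <b>, and <font> tags.
text <- xml_text(read_html(raw))
print(text)
```

Because the number is split across several `<i>` and `<b>` elements, the pieces only read as one value once the text nodes are joined back together.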
