如何使用iTextSharp将HTML转换为PDF

我想使用iTextSharp将以下HTML转换为PDF,但不知道从哪里开始:

<style>
.headline{font-size:200%}
</style>
<p>
  This <em>is </em>
  <span class="headline" style="text-decoration: underline;">some</span>
  <strong>sample<em> text</em></strong>
  <span style="color: red;">!!!</span>
</p>

Run Code Online (Sandbox Code Playgroud)

c# pdf-generation itextsharp xmlworker

Chr*_*aas

lucky-day

66
推荐指数

3
解决办法

20万
查看次数

ItextSharp尝试解析html进行pdf转换时出错

我使用ItextSharp模块将下面列出的html转换为pdf页面.

        <div style="font-size: 18pt; font-weight: bold;">
    mma<br>mmar</div><br> <br>
    <div style="font-size: 14pt;">Click to View Pricing
    </div>
    <br>
    <div>
    <table>
    <tr><td> <a href="http://www.mma.com/fci" style="color: Blue; font-size: 10pt; text-decoration: underline;"> FCI</a>:</td> 
<td><a href="http://www.mma.com/access/?pn=78211-014" style="color: Blue; font-size: 10pt; text-decoration: underline;"> 78211-014</a></td></tr><tr><td></td> <td>
<a href="http://www.mma.com/access/?pn=78211-009" 
style="color: Blue; font-size: 10pt; text-decoration: underline;">78211-009</td></tr><tr><td></td> <td>
<a href="http://www.mma.com/access/?pn=78211-006" 
style="color: Blue; font-size: 10pt; text-decoration: underline;">78211-006</td></tr><tr><td></td> <td>
<a href="http://www.mma.com/access/?pn=78211-007" 
style="color: Blue; font-size: 10pt; text-decoration: underline;">78211-007</td></tr><tr><td></td> <td>
<a href="http://www.mma.com/access/?pn=78211-003" 
style="color: Blue; font-size: 10pt; text-decoration: underline;">78211-003</td></tr><tr><td></td> <td>
<a href="http://www.mma.com/access/?pn=78211-005" 
style="color: Blue; font-size: 10pt; …

Run Code Online (Sandbox Code Playgroud)

c# itext html-parsing html-agility-pack xmlworker

kar*_*rry

2017 03-04

12
推荐指数

1
解决办法

5万
查看次数

使用itextsharp xmlworker将html转换为pdf并垂直写入文本

是否有可能在xmlworker中实现自下而上的文本方向编写？我想在表格中使用它.我的代码是

     <table border=1>
     <tr>
     <td style="padding-right:18px">
          <p style="writing-mode:sideways-lr;text-align:center">First</p</td>
     <td style="padding-right:18px">
          <p style="writing-mode:sideways-lr;text-align:center">Second</p></td></tr>
     <tr><td><p style="text-align:center">1</p>  </td>
         <td><p style="text-align:center">2</p></td> 
     </tr>
        </table>

Run Code Online (Sandbox Code Playgroud)

但它从html转换为pdf后无法正常工作.文本FIRST和SECOND不是从下到上的方向.

html c# asp.net itextsharp xmlworker

Dan*_*iel

2016 03-27

10
推荐指数

1
解决办法

2万
查看次数

使用iText将HTML转换为PDF

我发布这个问题是因为许多开发人员以不同的形式提出或多或少相同的问题.我将自己回答这个问题(我是iText集团的创始人/首席技术官),因此它可以成为"维基答案".如果Stack Overflow"文档"功能仍然存在,那么这将是文档主题的一个很好的候选者.

源文件:

我想将以下HTML文件转换为PDF:

<html>
    <head>
        <title>Colossal (movie)</title>
        <style>
            .poster { width: 120px;float: right; }
            .director { font-style: italic; }
            .description { font-family: serif; }
            .imdb { font-size: 0.8em; }
            a { color: red; }
        </style>
    </head>
    <body>
        <img src="img/colossal.jpg" class="poster" />
        <h1>Colossal (2016)</h1>
        <div class="director">Directed by Nacho Vigalondo</div>
        <div class="description">Gloria is an out-of-work party girl
            forced to leave her life in New York City, and move back home.
            When reports surface that a giant creature is destroying Seoul,
            she …

Run Code Online (Sandbox Code Playgroud)

html pdf pdf-generation itext xmlworker

Bru*_*gie

2017 12-22

10
推荐指数

2
解决办法

2万
查看次数

iTextSharp HTMLWorker.ParseToList()抛出NullReferenceException

我正在使用iTextSharp v.4来合并一大堆html文件.它工作正常,直到我需要升级到iTextSharp的第5版.

当我将流读取器(读取html文件的内容)传递到HTMLWorker对象的ParseToList方法时,问题就出现了.它抛出一个空引用异常.在调试它时,我可以访问streamReader并确认读取了正确的文件内容.

这是代码:

List<IElement> objects;
try
{
    objects = HTMLWorker.ParseToList(new StringReader(htmlString), null);
}
catch (Exception e)
{
    htmlString = "<html><head></head><body><br/><br/><h2 style='color:#FF0000'>ERROR READING FILE!</h2><h3>File Excluded From Stitched Document!</h3><br/><br/><p>There was an error while trying to read the following file:</p><p><span style='color:#FF0000'>" + fileName + "</span></p></body></html>";
    objects = HTMLWorker.ParseToList(new StringReader(htmlString), null);
}

Run Code Online (Sandbox Code Playgroud)

在catch块中,您将看到我然后使用几乎相同的代码将文本添加到pdf中以表示存在问题.这段代码工作正常.这当然让我觉得问题在于原始html字符串的内容,所以这里是字符串的内容,因为它是在传递给解析器之前的内容:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <meta http-equiv="Pragma" content="no-cache" />
    <meta http-equiv="cache-control" content="no-cache" />
</head>
<body style="font-family: Arial, Helvetica, sans-serif; …

Run Code Online (Sandbox Code Playgroud)

.net html c# itext xmlworker

hof*_*lie

2017 03-04

9
推荐指数

1
解决办法

2万
查看次数

IText使用XML Worker防止跨多个页面的行划分

我们正在使用带有XML Worker的iText 5.5.7,并且遇到了长表的问题,其中在页面末尾运行的行被分成两个到下一页(见图).

我们尝试page-break-inside:avoid;按照预防页面中断的建议使用iText,XMLWorker和iText在HTML表格中的PDF页面之间切换但没有效果.

我们试过了

在a中包装每一行<tbody>并应用分页符避免(没有效果)
定位tr, td和应用分页符(无效)
将每个内容包装td在a中div并应用分页符(itext一旦到达页面结尾就停止处理行)

我们的印象page-break-inside:avoid是受到支持但尚未看到对此的确认.是否有使用XML worker创建此效果的示例或最佳实践,或者是执行此级别操作所需的Java API？

干杯

目前正在分页的行:

期望的效果:包含太多数据的行包装到下一页

java itext page-break-inside xmlworker

owl*_*ool

2018 07-11

9
推荐指数

1
解决办法

1934
查看次数

使用iText库将html转换为pdf时未应用hr的内联CSS

我正在使用Itext库为android将html转换为pdf,这是正常工作但在某些事情它没有正确解析.我想创建一个红色的虚线分隔符,但它总是给我一个深灰色的实线分隔符.

我的html标签是

<hr noshade style="border: 0; width:100%;border-bottom-width: 1px; border-bottom-style: dotted; border-bottom-color: red">

Run Code Online (Sandbox Code Playgroud)

我的转换代码

        Document document = new Document(PageSize.A4);

        //this sets the margin to the created pdf
        document.setMargins(35, 35, 150, 100);

        PdfWriter writer = PdfWriter.getInstance(document,
                new FileOutputStream(fileWithinMyDir));
        if (isPrescription) {
            HeaderFooterPageEvent event = new HeaderFooterPageEvent();
            writer.setPageEvent(event);
        } else {
            CertificateFooterPageEvent event = new CertificateFooterPageEvent();
            writer.setPageEvent(event);
        }
        document.open();
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.setImageProvider(new AbstractImageProvider() {
            public String getImageRootPath() {

                Uri uri = Uri.parse("file:///android_asset/");

                return uri.toString();
            }
        });


        CSSResolver cssResolver = …

Run Code Online (Sandbox Code Playgroud)

html android itext xmlworker

Vis*_*kar

2017 03-02

9
推荐指数

1
解决办法

1288
查看次数

iText7 将 HTML 转换为 PDF“System.NullReferenceException”。

旧标题：iTextSharp 将 HTML 转换为 PDF“文档没有页面。”

我正在使用 iTextSharp 和 xmlworker 将 html 从视图转换为 ASP.NET Core 2.1 中的 PDF

我尝试了在网上找到的许多代码片段，但都生成了异常：

该文档没有页面。

这是我当前的代码：

public static byte[] ToPdf(string html)
{
    byte[] output;
    using (var document = new Document())
    {
        using (var workStream = new MemoryStream())
        {
            PdfWriter writer = PdfWriter.GetInstance(document, workStream);
            writer.CloseStream = false;
            document.Open();
            using (var reader = new StringReader(html))
            {
               XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
               document.Close();
               output = workStream.ToArray();
            }
        }
   }
   return output;
}

Run Code Online (Sandbox Code Playgroud)

更新1

感谢 @Bruno Lowagie 的建议，我升级到了 iText7 和 pdfHTML，但我找不到太多关于它的教程。 …

pdf itext xmlworker .net-core itext7

Ste*_*tti

2022 06-04

8
推荐指数

1
解决办法

5236
查看次数

iText的XmlWorker无法识别表格单元格的边框底部

XmlWorker无法识别表格单元格上的border-bottom.

这是我的代码:

<table>
    <tbody>
        <tr>
            <th style="width: 20%; height: 40px; vertical-align: top; border-bottom: 1px solid gray">Your name</th>
            <td style="width: 80%; border-bottom: 1px solid gray"></td>
        </tr>
        <tr>
            <th style="height: 40px; vertical-align: top; border-bottom: 1px solid gray">Your lastname</th>
            <td style="border-bottom: 1px solid gray"></td>
        </tr>
    </tbody>
</table>

Run Code Online (Sandbox Code Playgroud)

我正在使用这个官方工具进行测试:
http://demo.itextsupport.com/xmlworker/ (点击"html"按钮,在"字体大小"下面,然后粘贴代码)

我也在使用iTextSharp + MvcRazorToPdf(两个C#库)进行测试.

问题:
如何让边框底部工作？

编辑:
根据此兼容性摘要:http://demo.itextsupport.com/xmlworker/itextdoc/CSS-conformance-list.htm,css 属性border-bottom应该可以正常使用html元素td(cell)

itext itextsharp xmlworker mvcrazortopdf

spo*_*rts

2014 12-17

7
推荐指数

1
解决办法

4616
查看次数

iTextSharp使用空白页面创建PDF

我刚刚将iTextSharp XMLWorker nuget包(及其依赖项)添加到我的项目中,并且我正在尝试将HTML从字符串转换为PDF文件,即使没有抛出异常,也会生成PDF文件两个空白页.为什么？

以前版本的代码只使用了带有HTMLWorker和ParseList方法的iTextSharp 5.5.8.0,然后我切换到

这是我正在使用的代码:

public void ExportToPdf() {
 string htmlString = "";

 Document document = new Document(PageSize.A4, 40, 40, 40, 40);
 var memoryStream = new MemoryStream();

 PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
 document.Open();

 htmlString = sbBodyMail.ToString();

 XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, new StringReader(htmlString));

 document.Close();
 DownloadFile(memoryStream);
}

public void DownloadFile(MemoryStream memoryStream) {
 //Clears all content output from Buffer Stream
 Response.ClearContent();
 //Clears all headers from Buffer Stream
 Response.ClearHeaders();
 //Adds an HTTP header to the output stream
 Response.AddHeader("Content-Disposition", "attachment;filename=Report_Diagnosis.pdf");
 //Gets or Sets the HTTP MIME …

Run Code Online (Sandbox Code Playgroud)

c# asp.net itextsharp xmlworker

Jul*_*ira

2016 02-17

7
推荐指数

1
解决办法

2566
查看次数