我正在使用iTextSharp从PDF中读取文本内容.我也能读到这一点.但我正在丢失文字格式,如字体,颜色等.有没有办法获得格式.
以下是我用于确切文本的代码段 -
PdfReader reader = new PdfReader("F:\\EBooks\\AspectsOfAjax.pdf");
textBox1.Text = ExtractTextFromPDFBytes(reader.GetPageContent(1));
private string ExtractTextFromPDFBytes(byte[] input)
{
if (input == null || input.Length == 0) return "";
try
{
string resultString = "";
// Flag showing if we are we currently inside a text object
bool inTextObject = false;
// Flag showing if the next character is literal e.g. '\\' to get a '\' character or '\(' to get '('
bool nextLiteral = false;
// () Bracket nesting level. Text appears …
Run Code Online (Sandbox Code Playgroud)