Executing a Jupyter notebook that contains inline Markdown with nbconvert

Question


Mol*_*lly 8 python jupyter jupyter-notebook

I have a Jupyter notebook whose Markdown cells contain Python variables, like this:

Code cell:

x = 10


Markdown cell:

The value of x is {{x}}.


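The Python Markdown extension for the IPython notebook displays these variables dynamically when I execute the Markdown cell in the notebook with Shift-Enter.

Markdown cell: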

The value of x is 10.


I would like to execute all cells in the notebook programmatically and save them to a new notebook, using something like the following:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Read the notebook, execute its code cells, then write the result out.
with open('report.ipynb') as f:
    nb = nbformat.read(f, as_version=4)

ep = ExecutePreprocessor(timeout=600, kernel_name='python3')
ep.preprocess(nb, {})

with open('report_executed.ipynb', 'wt') as f:
    nbformat.write(nb, f)


This executes the code cells but not the Markdown cells. They still look like this:

The value of x is {{x}}.


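I think the problem is that the notebook is not trusted. Is there a way to tell ExecutePreprocessor to trust the notebook? Is there another way to programmatically execute a notebook, including the Python variables in Markdown cells?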

Answer 1

Gor*_*ean 6

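ExecutePreprocessor only looks at code cells, so your Markdown cells are left completely untouched. As you said, to process the Markdown you need the Python Markdown preprocessor.

Unfortunately, the Python Markdown system only executes code in a live notebook, which it does by modifying the JavaScript involved in rendering cells. The modification stores the results of executing the code snippets in the cell metadata.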

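The PyMarkdownPreprocessor class (in pre_pymarkdown.py) is designed to be used by nbconvert on notebooks that have first been rendered in a live notebook session. It processes the Markdown cells, replacing each {{}} pattern with the value stored in the metadata.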
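
As a rough illustration of that substitution step, here is a minimal standard-library sketch. It is not the actual PyMarkdownPreprocessor code: the helper name substitute_variables is hypothetical, the cell is a plain dict rather than a notebook node, and the metadata key 'variables' is assumed to match the one used by the custom preprocessor in this answer.

```python
import re

def substitute_variables(source, variables):
    """Replace each {{expr}} in source with variables[expr], if present."""
    def replace(match):
        expr = match.group(1)
        # Leave the pattern untouched when no stored value exists.
        return variables.get(expr, match.group(0))
    return re.sub(r"\{\{(.*?)\}\}", replace, source)

# A plain dict standing in for a Markdown cell with stored results.
cell = {
    "cell_type": "markdown",
    "source": "The value of x is {{x}}.",
    "metadata": {"variables": {"x": "10"}},
}
cell["source"] = substitute_variables(cell["source"], cell["metadata"]["variables"])
print(cell["source"])  # The value of x is 10.
```

Without the stored values there is nothing to substitute, which is why a notebook that was never rendered live keeps its {{x}} placeholders.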

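In your case, however, you don't have the metadata from a live notebook. I had a similar problem and solved it by writing my own execute preprocessor that also includes logic to handle Markdown cells: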

import re  # needed by the Markdown handling below
from textwrap import dedent

import nbformat
import nbconvert
from nbconvert.preprocessors import ExecutePreprocessor

class ExecuteCodeMarkdownPreprocessor(ExecutePreprocessor):

    def __init__(self, **kw):
        self.sections = {'default': True} # maps section ID to true or false
        self.EmptyCell = nbformat.v4.nbbase.new_raw_cell("")

        return super().__init__(**kw)

    def preprocess_cell(self, cell, resources, cell_index):
        """
        Executes a single code cell. See base.py for details.
        To execute all cells see :meth:`preprocess`.
        """

        if cell.cell_type not in ['code','markdown']:
            return cell, resources

        if cell.cell_type == 'code':
            # Do code stuff
            return self.preprocess_code_cell(cell, resources, cell_index)

        elif cell.cell_type == 'markdown':
            # Do markdown stuff
            return self.preprocess_markdown_cell(cell, resources, cell_index)
        else:
            # Don't do anything
            return cell, resources

    def preprocess_code_cell(self, cell, resources, cell_index):
        ''' Process code cell.
        '''
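        # Assumes run_cell returns the list of outputs, as in the nbconvert
        # release this was written against; some later releases return a
        # (reply, outputs) tuple instead, which would need unpacking here.
        outputs = self.run_cell(cell)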
        cell.outputs = outputs

        if not self.allow_errors:
            for out in outputs:
                if out.output_type == 'error':
                    pattern = u"""\
                        An error occurred while executing the following cell:
                        ------------------
                        {cell.source}
                        ------------------
                        {out.ename}: {out.evalue}
                        """
                    msg = dedent(pattern).format(out=out, cell=cell)
                    raise nbconvert.preprocessors.execute.CellExecutionError(msg)

        return cell, resources

    def preprocess_markdown_cell(self, cell, resources, cell_index):
        # Find and execute snippets of code
        cell['metadata']['variables'] = {}
        for m in re.finditer("{{(.*?)}}", cell.source):
            # Execute code
            fakecell = nbformat.v4.nbbase.new_code_cell(m.group(1))
            fakecell, resources = self.preprocess_code_cell(fakecell, resources, cell_index)

            # Output found in cell.outputs
            # Put output in cell['metadata']['variables']
            for output in fakecell.outputs:
                html = self.convert_output_to_html(output)
                if html is not None:
                    cell['metadata']['variables'][fakecell.source] = html
                    break
        return cell, resources

    def convert_output_to_html(self, output):
        '''Convert IOpub output to HTML

        See https://github.com/ipython-contrib/IPython-notebook-extensions/blob/master/nbextensions/usability/python-markdown/main.js
        '''
        if output['output_type'] == 'error':
            text = '**' + output.ename + '**: ' + output.evalue
            return text
        elif output.output_type == 'execute_result' or output.output_type == 'display_data':
            data = output.data
            if 'text/latex' in data:
                html = data['text/latex']
                return html
            elif 'image/svg+xml' in data:
                # Not supported
                #var svg = ul['image/svg+xml'];
                #/* embed SVG in an <img> tag, still get eaten by sanitizer... */
                #svg = btoa(svg);
                #html = '<img src="data:image/svg+xml;base64,' + svg + '"/>';
                return None
            elif 'image/jpeg' in data:
                jpeg = data['image/jpeg']
                html = '<img src="data:image/jpeg;base64,' + jpeg + '"/>'
                return html
            elif 'image/png' in data:
                png = data['image/png']
                html = '<img src="data:image/png;base64,' + png + '"/>'
                return html
            elif 'text/markdown' in data:
                text = data['text/markdown']
                return text
            elif 'text/html' in data:
                html = data['text/html']
                return html
            elif 'text/plain' in data:
                text = data['text/plain']
                # Strip <p> and </p> tags
                # Strip quotes
                # html.match(/<p>([\s\S]*?)<\/p>/)[1]
                text = re.sub(r'<p>([\s\S]*?)<\/p>', r'\1', text)
                text = re.sub(r"'([\s\S]*?)'",r'\1', text)
                return text
            else:
                # Some tag we don't support
                return None
        else:
            return None
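
The two regular-expression steps in the class above can be tried in isolation, without a kernel. The sketch below uses only the standard library; clean_plain_text is a hypothetical helper name that mirrors the text/plain branch, and the pattern is written with escaped braces, which should match the same text as the unescaped form used above.

```python
import re

# Extract the code snippets embedded in a Markdown source string.
source = "The value of x is {{x}} and the name is {{name}}."
snippets = [m.group(1) for m in re.finditer(r"\{\{(.*?)\}\}", source)]
print(snippets)  # ['x', 'name']

def clean_plain_text(text):
    """Strip <p>...</p> wrappers and single quotes from a text/plain result."""
    text = re.sub(r"<p>([\s\S]*?)</p>", r"\1", text)
    text = re.sub(r"'([\s\S]*?)'", r"\1", text)
    return text

print(clean_plain_text("'hello'"))  # hello
print(clean_plain_text("10"))       # 10
```

The quote stripping matters because a string result arrives as its repr, so 'hello' would otherwise appear in the rendered Markdown with its quotes.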


You can then process your notebook with logic similar to the code you posted:

import nbformat
# Adjust both imports to wherever the modules live in your project.
from execute_code_markdown import ExecuteCodeMarkdownPreprocessor  # placeholder module name for the class above
from pre_pymarkdown import PyMarkdownPreprocessor  # from pre_pymarkdown.py

with open('report.ipynb') as f:
    nb = nbformat.read(f, as_version=4)
    ep = ExecuteCodeMarkdownPreprocessor(timeout=600, kernel_name='python3')
    ep.preprocess(nb, {})
    pymk = PyMarkdownPreprocessor()
    pymk.preprocess(nb, {})

with open('report_executed.ipynb', 'wt') as f:
    nbformat.write(nb, f)


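Note that by including the Python Markdown preprocessing, the resulting notebook file will no longer have the {{}} syntax in its Markdown cells; the Markdown will have static content. If the recipient of the resulting notebook changes the code and executes it again, the Markdown will not be updated. However, if you are exporting to a different format (such as HTML), you do want the {{}} syntax replaced by static content.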
