Welcome to reportsrender’s documentation!

Generate reproducible reports from Rmarkdown or jupyter notebooks

Reportsrender creates reproducible, consistent-looking HTML reports from both jupyter notebooks and Rmarkdown files. It uses papermill and Rmarkdown to execute notebooks and Pandoc to convert them to HTML.

Features:
  • two execution engines: papermill and Rmarkdown.

  • support any format supported by jupytext.

  • create self-contained HTML that can be shared easily.

  • hide inputs and/or outputs of cells.

  • parametrized reports

See the documentation for more details!

Getting started

  • Execute an rmarkdown document to HTML using the Rmarkdown engine

reportsrender --engine=rmd my_notebook.Rmd report.html

  • Execute a parametrized jupyter notebook with papermill

reportsrender --engine=papermill jupyter_notebook.ipynb report.html --params="data_file=table.tsv"
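
The same invocations can be scripted from Python. The following is a minimal sketch, assuming the reportsrender executable is on the PATH. The helper names build_command and run_report are hypothetical; they only assemble the command-line options shown above.

```python
import subprocess


def build_command(notebook, out_file, engine="auto", params=None):
    """Assemble the argument list for one reportsrender call (hypothetical helper)."""
    cmd = ["reportsrender", f"--engine={engine}", notebook, out_file]
    if params:
        # --params expects a single space-separated string of key=value pairs.
        pairs = " ".join(f"{key}={value}" for key, value in params.items())
        cmd.append(f"--params={pairs}")
    return cmd


def run_report(notebook, out_file, engine="auto", params=None):
    """Run reportsrender and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(notebook, out_file, engine, params), check=True)


# Mirrors the second command above; nothing is executed until run_report() is called.
print(build_command(
    "jupyter_notebook.ipynb", "report.html",
    engine="papermill", params={"data_file": "table.tsv"},
))
```

Passing the arguments as a list avoids shell quoting issues when a parameter value contains special characters.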

Usage from command line

reportsrender

Execute and render a jupyter/Rmarkdown notebook.
The `index` subcommand generates an index html
or markdown file that links to html documents.

Usage:
  reportsrender <notebook> <out_file> [--cpus=<cpus>] [--params=<params>] [--engine=<engine>]
  reportsrender index [--index=<index_file>] [--title=<title>] [--] <html_files>...
  reportsrender --help

Arguments and options:
  <notebook>            Input notebook to be executed. Can be any format supported by jupytext.
  <out_file>            Output HTML file.
  -h --help             Show this screen.
  --cpus=<cpus>         Number of CPUs to use for Numba/Numpy/OpenBLAS/MKL [default: 1]
  --params=<params>     space-separated list of key-value pairs that will be passed
                        to papermill/Rmarkdown.
                        E.g. "input_file=dir/foo.txt output_file=dir2/bar.html"
  --engine=<engine>     Engine to execute the notebook. [default: auto]

Arguments and options of the `index` subcommand:
  <html_files>          List of HTML files that will be included in the index. The tool
                        will generate relative links from the index file to these files.
  --index=<index_file>  Path to the index file that will be generated. Will be
                        overwritten if exists. Will auto-detect markdown (.md) and
                        HTML (.html) format based on the extension. [default: index.html]
  --title=<title>       Headline of the index. [default: Index]

Possible engines are:
  auto                  Use `rmd` engine for `*.Rmd` files, papermill otherwise.
  rmd                   Use `rmarkdown` to execute the notebook. Supports R and
                        python (through reticulate)
  papermill             Use `papermill` to execute the notebook. Works for every
                        kernel available in the jupyter installation.
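
To make the --params format concrete, here is a minimal sketch of how such a string could be split into key-value pairs. The function parse_params is hypothetical and is not taken from reportsrender; it assumes that neither keys nor values contain spaces.

```python
def parse_params(params):
    """Split 'key1=value1 key2=value2' into a dict of strings (illustrative only)."""
    result = {}
    for item in params.split():
        # Split at the first "=" so that values may themselves contain "=".
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        result[key] = value
    return result


parsed = parse_params("input_file=dir/foo.txt output_file=dir2/bar.html")
print(parsed)
```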

Installation

Manual installation:

Get dependencies:

Additionally, for the Rmarkdown rendering engine (you can skip these if you are not going to use it):

  • R and the following packages:

rmarkdown
reticulate

then,

Install from pip:

pip install reportsrender

or,

Install from github:

pip install flit
flit installfrom github:grst/reportsrender

Features

Execution engines

Reportsrender comes with two execution engines:

  • Rmarkdown. This engine makes use of the Rmarkdown package implemented in R. Essentially, this engine calls Rscript -e "rmarkdown::render()". It supports Rmarkdown notebooks (Rmd format) and python notebooks through reticulate.

  • Papermill. This engine combines papermill and nbconvert to parametrize and execute notebooks. It supports any programming language for which a jupyter kernel is installed.
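
With --engine=auto, the choice between the two engines follows the rule stated in the command-line help: rmd for *.Rmd files, papermill otherwise. A minimal sketch of that rule follows. The function select_engine is hypothetical, and matching the extension case-sensitively is an assumption.

```python
from pathlib import Path


def select_engine(notebook, engine="auto"):
    """Resolve the engine name for a notebook path (illustrative only)."""
    if engine in ("rmd", "papermill"):
        return engine  # an explicit choice always wins
    if engine != "auto":
        raise ValueError(f"unknown engine: {engine!r}")
    # Assumption: only the exact suffix .Rmd selects the Rmarkdown engine.
    return "rmd" if Path(notebook).suffix == ".Rmd" else "papermill"


for nb in ("my_notebook.Rmd", "jupyter_notebook.ipynb", "analysis.py"):
    print(nb, "->", select_engine(nb))
```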

Supported notebook formats

Reportsrender uses jupytext to convert between input formats. Here is the full list of supported formats.

So no matter if you want to run an Rmd file with papermill, an ipynb with Rmarkdown or a Hydrogen percent script, reportsrender has got you covered.

Hiding cell inputs/outputs

You can hide inputs and/or outputs of individual cells:

Papermill engine:

Within a jupyter notebook:

  • edit cell metadata

  • add one of the following tags: hide_input, hide_output, remove_cell

{
    "tags": [
        "remove_cell"
    ]
}
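
Because an .ipynb file is plain JSON, the tags can also be set from a script instead of the notebook editor. The following is a minimal sketch using only the standard library; the helper add_tag is hypothetical, and the notebook is a hand-written dictionary in the nbformat 4 layout.

```python
import json

ALLOWED_TAGS = {"hide_input", "hide_output", "remove_cell"}


def add_tag(notebook, cell_index, tag):
    """Add a tag to one cell of a notebook given as a parsed .ipynb dict."""
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"unsupported tag: {tag!r}")
    metadata = notebook["cells"][cell_index].setdefault("metadata", {})
    tags = metadata.setdefault("tags", [])
    if tag not in tags:  # avoid duplicate entries
        tags.append(tag)
    return notebook


# A tiny in-memory notebook with a single code cell.
nb = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import pandas as pd"],
        }
    ],
}
add_tag(nb, 0, "hide_input")
print(json.dumps(nb["cells"][0]["metadata"], indent=4))
```

For a notebook on disk, load it with json.load, apply add_tag and write it back with json.dump.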

Rmarkdown engine:

  • all native input control options (e.g. results='hide', include=FALSE, echo=FALSE) are supported. See the Rmarkdown documentation for more details.

Jupytext automatically converts the tags to Rmarkdown options for all supported formats.

Parametrized notebooks

Papermill engine:

Example:

  • Add the tag parameters to the metadata of a cell in a jupyter notebook.

  • Declare default parameters in that cell:

input_file = '/path/to/default_file.csv'

  • Use the variable as any other:

import pandas as pd
pd.read_csv(input_file)
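
Conceptually, papermill overrides these defaults by inserting a new cell with the supplied values directly after the cell tagged parameters. The following toy model illustrates only that ordering. It does not use papermill, and run_cells is a hypothetical helper that executes plain strings of Python code in sequence.

```python
def run_cells(cells, params=None):
    """Run code cells in order, injecting params after the parameters cell.

    cells is a list of (tags, source) tuples. Toy model only.
    """
    namespace = {}
    for tags, source in cells:
        exec(source, namespace)
        if "parameters" in tags and params:
            # The injected assignments run after the defaults and overwrite them.
            injected = "\n".join(
                f"{name} = {value!r}" for name, value in params.items()
            )
            exec(injected, namespace)
    return namespace


cells = [
    (["parameters"], "input_file = '/path/to/default_file.csv'"),
    ([], "message = 'reading ' + input_file"),
]
print(run_cells(cells)["message"])
print(run_cells(cells, {"input_file": "table.tsv"})["message"])
```

Cells that run after the parameters cell therefore see the supplied value, while an interactive run without parameters falls back to the default.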

Rmarkdown engine:

Example:

  • Declare the parameter in the yaml frontmatter.

  • You can set default parameters that will be used when the notebook is executed interactively in Rstudio. They will be overwritten when running through reportsrender.

---
title: My Document
output: html_document
params:
  input_file: '/path/to/default_file.csv'
---

  • Access the parameters from the code:

read_csv(params$input_file)

Be compatible with both engines:

Yes, it’s possible! You can execute the same notebook with both engines. Declaring parameters is a bit more cumbersome, though, because each engine passes them differently.

Example (Python notebook stored as .Rmd file using jupytext):

---
title: My Document
output: html_document
params:
  input_file: '/path/to/default_file.csv'
---

```{python tags=c("parameters")}
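try:
    # Under the Rmarkdown engine, reticulate exposes the R session as `r`.
    input_file = r.params["input_file"]
except Exception:
    # `r` does not exist when running with papermill. Re-declare the default
    # parameters here; papermill overrides them with the values it is given.
    input_file = "/path/to/default_file.csv"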
```

Sharing reports

Reportsrender creates self-contained HTML files that can be easily shared, e.g. via email.

I do, however, recommend using github pages to upload and share your reports. A central website serves as a single point of truth and eliminates the problem of different versions of your reports being emailed around.

You can make use of reportsrender index to automatically generate an index page listing multiple reports:

Say, you generated several reports and already put them into your github-pages directory:

gh-pages
├── 01_preprocess_data.html
├── 02_analyze_data.html
└── 03_visualize_data.html

Then you can generate an index file listing and linking to your reports by running

reportsrender index --index gh-pages/index.md gh-pages/*.html

For more details see Usage from command line and reportsrender.build_index()
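
To illustrate what "relative links from the index file" means, here is a minimal sketch that builds a markdown index by hand. It is not the implementation behind reportsrender index; the function index_markdown and the exact layout of the generated list are assumptions.

```python
import os
from pathlib import Path


def index_markdown(html_files, index_file, title="Index"):
    """Return markdown for an index that links to html_files relative to index_file."""
    index_dir = os.path.dirname(index_file) or "."
    lines = [f"# {title}", ""]
    for html_file in sorted(html_files):
        # Links are relative to the directory containing the index file.
        link = os.path.relpath(html_file, start=index_dir).replace(os.sep, "/")
        lines.append(f"- [{Path(html_file).stem}]({link})")
    return "\n".join(lines) + "\n"


print(index_markdown(
    ["gh-pages/01_preprocess_data.html", "gh-pages/02_analyze_data.html"],
    "gh-pages/index.md",
))
```

Because the links are relative, the gh-pages directory can be moved or published under any URL without regenerating the index.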

Password protection

Not all analyses can be shared publicly. Unfortunately, github-pages does not support password protection.

There is a workaround, though:

As github-pages doesn’t list directories, you can simply create a subdirectory with a long, cryptic name, e.g. t8rry6poj7ua6eujqpb57, and put your reports within. Only people with whom you share the exact link will be able to access the site.
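
Such a directory name is best generated rather than typed by hand. A minimal sketch, assuming a lowercase alphanumeric name of the same length as the example is wanted; the function random_dirname is hypothetical.

```python
import secrets
import string


def random_dirname(length=21):
    """Return a random lowercase alphanumeric name, e.g. for a report subdirectory."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


name = random_dirname()
print(name)
```

The secrets module is used instead of random because the name acts as a shared secret and should not be predictable.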

Combine notebooks into a pipeline

Reportsrender is built with pipelines in mind. You can easily combine individual analysis steps into a fully reproducible pipeline using workflow engines such as Nextflow or Snakemake.

A full example of what such a pipeline might look like is available in a dedicated GitHub repository: universal_analysis_pipeline. It’s based on Nextflow, but could easily be adapted to other pipelining engines.

Usage as Python library

Reportsrender provides a public API that can be used to execute and convert notebooks to HTML:

Execute and render notebooks as HTML reports.

render_rmd(input_file, output_file[, params])

Wrapper function to render an Rmarkdown document with the R rmarkdown package and convert it to HTML using pandoc and a custom template.

render_papermill(input_file, output_file[, …])

Wrapper function to render a jupytext/jupyter notebook with papermill and pandoc.

run_pandoc(in_file, out_file[, res_path, …])

Convert to HTML using pandoc.

build_index(html_files, output_file[, title])

Create an index file referencing all specified html files.
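
The functions above can be combined to render several notebooks and index them in one go. The following is a minimal sketch that uses only the arguments shown in the summary; that the functions can be imported from the package top level is an assumption, and render_all and report_path are hypothetical helpers.

```python
from pathlib import Path


def report_path(notebook, out_dir):
    """Map a notebook path to the HTML report path inside out_dir."""
    return (Path(out_dir) / (Path(notebook).stem + ".html")).as_posix()


def render_all(notebooks, out_dir, title="Index"):
    """Render each notebook and build an index (requires reportsrender)."""
    # Imported here so the helper above works without reportsrender installed.
    # Assumption: these functions are importable from the package top level.
    from reportsrender import build_index, render_papermill, render_rmd

    reports = []
    for notebook in notebooks:
        out_file = report_path(notebook, out_dir)
        if Path(notebook).suffix == ".Rmd":
            render_rmd(notebook, out_file)
        else:
            render_papermill(notebook, out_file)
        reports.append(out_file)
    build_index(reports, (Path(out_dir) / "index.html").as_posix(), title=title)
    return reports


print(report_path("notebooks/01_preprocess_data.Rmd", "gh-pages"))
```

Calling render_all(["notebooks/01_preprocess_data.Rmd"], "gh-pages") would then write the report and an index.html into gh-pages.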