changing pandas dataframe display style in Rmarkdown #783
Even leaving aside the styling, there are two things I find interesting with this issue:
I could see this getting much cleaner and customizable through an option somewhere, e.g. |
Did something change to align with this request? I can no longer get my pandas dataframes to print as plain output in my markdown files. It always converts it to an HTML table unless I wrap a |
It would be great if pandas data frames were shown as nicely in R Markdown (R notebooks) as they appear in Jupyter notebooks (or better, with an indicator of the data type for each column). The only reason I don't use RStudio for Python is that I cannot see full data frames: the output is not scrollable left and right. This simple feature is very important for data exploration. |
Would it be possible to change the class of the pandas DataFrame returned from Python and have some adapted methods for printing? When we do
We end up with an object of class
With an additional class, let's say |
If I understand correctly, this is an MRE:
When this document is rendered via and so you don't get the nice HTML rendering for the Pandas DataFrame you might've hoped for. |
This is where Pandas DataFrames get handled by the reticulate Python engine: Line 578 in a1d7f7f
Note that we don't do anything here; we just use the captured (default) print style for the DataFrame. We considered using the I'm not exactly sure what Jupyter is doing here when rendering DataFrames; presumably they're using their own tooling for rendering to HTML? Or maybe they're using |
Thanks @kevinushey for your detailed answer. In my case, moving to |
If it can help, in the past, |
I don't know how exactly Jupyter does it, but their output is equivalent to That would mean, in turn, that quarto gets df printing behavior that is consistent across engines (which is the cause of our upstream issue) |
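For context on the Jupyter side: IPython's rich display looks for `_repr_*_` methods such as `_repr_html_` on an object, and pandas DataFrames define `_repr_html_`. Below is a minimal sketch of that kind of lookup using a toy class. `ToyFrame` and `render_for_document` are hypothetical names made up for illustration; this is not how reticulate or knitr currently pick an output, only one way such a choice could look.

```python
import html


class ToyFrame:
    """Hypothetical stand-in for an object that offers a rich HTML representation."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def __repr__(self):
        # Plain-text fallback, similar in spirit to a default print.
        lines = ["  ".join(self.columns)]
        lines += ["  ".join(str(v) for v in row) for row in self.rows]
        return "\n".join(lines)

    def _repr_html_(self):
        # Hook name looked up by IPython's rich display protocol.
        head = "".join(f"<th>{html.escape(c)}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


def render_for_document(obj, target="html"):
    """Hypothetical dispatcher: prefer a rich repr when the output target supports it."""
    if target == "html":
        hook = getattr(obj, "_repr_html_", None)
        if callable(hook):
            rendered = hook()
            if rendered is not None:
                return "text/html", rendered
    # Anything else falls back to the plain repr.
    return "text/plain", repr(obj)


frame = ToyFrame(["size", "weight"], [[1.0, 3.0], [1.5, 5.0]])
mime, out = render_for_document(frame, target="html")
print(mime)
print(render_for_document(frame, target="latex")[0])
```

The point of the sketch is that the object itself only advertises what it can produce; the decision about which representation to emit sits with whatever is rendering the document.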
As this came up again on Quarto side, I looked into this a bit. Here are some thoughts and insights
Quarto and R Markdown will do different styling, but at the end this is a matter of printing method to do at knitr step. Currently it is default printing, but it could be improved. AFAIU Jupyter (or nbclient or anything in the stack) registers some representation like reticulate could do something similar to send information to knitr or do the choice itself based on Documenting how to explicitly style a Pandas table using
Going through this idea is also a good option for R Markdown.
@kevinushey @t-kalinowski hope this helps. Happy to help make this better. We would love for the Jupyter and knitr output for Python to be equivalent in Quarto! (part of quarto-dev/quarto-cli#3457)
Examples showing the different points mentioned above
Here are some tests I did with the rendering and different options with R Markdown
Rmd Source
---
title: "Pandas Printing"
author: "Kevin Ushey"
date: "`r Sys.Date()`"
output: html_document
---
```{r}
library(reticulate)
use_virtualenv("r-reticulate", required = TRUE)
py_install(c("pandas", "IPython", "tabulate"))
```
```{python, echo=FALSE}
import pandas as pd
data = {
'size': [1., 1.5, 1],
'weight': [3, 5, 2.5]
}
df = pd.DataFrame(data, index = ['cat', 'dog', 'koala'])
```
# Default render
```{python}
df
```
# Try HTML
Some quotes are still there, preventing correct printing
```{python}
df.to_html()
```
```{python, results = "asis"}
df.to_html()
```
So it requires some special processing
```{python}
df_html = df.to_html()
```
```{r, results='asis'}
cat(py$df_html)
```
# Using IPython Display helps
```{python, results = "asis"}
from IPython.display import HTML
HTML(df.to_html())
```
# Improve stylings using Bootstrap class
```{python}
df_html = df.to_html(classes = ["table", "table_condensed"])
```
```{r, results='asis'}
cat(py$df_html)
```
```{python, results = "asis"}
HTML(df.to_html(classes = ["table", "table_condensed"]))
```
# Try Markdown
Still quoting, so it requires some special printing
```{python}
df.to_markdown()
```
```{python, results = "asis"}
df.to_markdown()
```
```{python}
df_markdown = df.to_markdown()
```
```{r, results = "asis"}
cat(py$df_markdown)
```
# Using IPython Display helps
```{python, results = "asis"}
from IPython.display import Markdown
Markdown(df.to_markdown())
```
And same document in Quarto
Qmd Source
---
title: "Pandas Printing"
author: "Christophe Dervieux"
date: today
engine: knitr
format:
html:
code-tools:
source: true
---
```{r}
library(reticulate)
use_virtualenv("r-reticulate", required = TRUE)
py_install(c("pandas", "IPython", "tabulate"))
```
```{python}
import pandas as pd
data = {
'size': [1., 1.5, 1],
'weight': [3, 5, 2.5]
}
df = pd.DataFrame(data, index = ['cat', 'dog', 'koala'])
```
# Default render
```{python}
df
```
# Try HTML
Some quotes are still there, preventing correct printing
```{python}
df.to_html()
```
```{python}
#| output: asis
df.to_html()
```
So it requires some special processing
```{python}
df_html = df.to_html()
```
```{r}
#| output: asis
cat(py$df_html)
```
# Using IPython Display helps
```{python}
#| output: asis
from IPython.display import HTML
HTML(df.to_html())
```
# Improve stylings using Bootstrap class
```{python}
df_html = df.to_html(classes = ["table", "table_condensed"])
```
```{r}
#| output: asis
cat(py$df_html)
```
```{python}
#| output: asis
HTML(df.to_html(classes = ["table", "table_condensed"]))
```
# Try Markdown
Still quoting, so it requires some special printing
```{python}
df.to_markdown()
```
```{python}
#| output: asis
df.to_markdown()
```
```{python}
df_markdown = df.to_markdown()
```
```{r}
#| output: asis
cat(py$df_markdown)
```
# Using IPython Display helps
```{python}
#| output: asis
from IPython.display import Markdown
Markdown(df.to_markdown())
```
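A likely reading of the "quotes" seen in the chunks above: `df.to_html()` and `df.to_markdown()` return plain Python strings, and a string echoed as the last expression of a chunk is shown as its `repr`, with surrounding quotes and escaped newlines. Writing the string out instead (as `cat(py$df_html)` does from R) emits the raw markup. A minimal sketch of the difference, using a hand-written HTML string as a stand-in for the output of `to_html()`:

```python
import contextlib
import io

# Stand-in for the string returned by df.to_html(); any multi-line str behaves the same.
table_html = "<table>\n  <tr><td>cat</td><td>1.0</td></tr>\n</table>"

# What an auto-echoed string value looks like: quoted, with newlines escaped.
echoed = repr(table_html)

# What writing the string out produces: the raw markup, suitable for "asis" output.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print(table_html)
written = buffer.getvalue()

print(echoed)
print(written)
```

This is why the variants that route the string through `cat()` or an IPython display object render as a table, while the bare `df.to_html()` chunks do not.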
Note that I understand now reticulate is catching Pandas DataFrame before any Lines 576 to 580 in a1d7f7f
Regarding quarto-dev/quarto-cli#3457, if the I confirm that removing |
I'm leaning towards changing reticulate to produce the Markdown representation when running through knitr. With this change, tables would be displayed like this in RMarkdown and It would still not look exactly the same as in Quarto + Jupyter Engine, which is displayed like this: The pro of this approach is that it only requires changing reticulate, with no need for special handling from RMarkdown, which I think can be tricky to coordinate. Do you think this is a reasonable approach @cderv? |
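To make the Markdown idea concrete, here is a rough sketch of what a pipe-table representation of the example data could look like. `to_pipe_table` is a hypothetical helper written for this illustration; it is not reticulate's code and not pandas' `to_markdown()` (which relies on the `tabulate` package).

```python
def to_pipe_table(index, columns, rows):
    """Render labelled rows as a Markdown pipe table (hypothetical helper)."""
    header = [""] + [str(c) for c in columns]
    body = [[str(label)] + [str(v) for v in row] for label, row in zip(index, rows)]

    # Column width is the widest cell, with a floor of three for the dash row.
    widths = [
        max(3, *(len(r[i]) for r in [header] + body)) for i in range(len(header))
    ]

    def fmt(cells):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    delimiter = "| " + " | ".join("-" * w for w in widths) + " |"
    return "\n".join([fmt(header), delimiter] + [fmt(r) for r in body])


markdown = to_pipe_table(
    index=["cat", "dog", "koala"],
    columns=["size", "weight"],
    rows=[[1.0, 3.0], [1.5, 5.0], [1.0, 2.5]],
)
print(markdown)
```

A table in this form is handed to Pandoc as ordinary Markdown, so its final styling is left to the output format rather than to the Python side.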
About this, I don't think anything is needed in rmarkdown or knitr in general for what reticulate is doing. knitr is a toolbox for custom engines to use, and everything that reticulate does in a knitting context is defined inside reticulate. So regarding this printing issue, this is only happening based on how reticulate decided to print content, possibly in Usually any issue reported as a knitr issue but relevant to the reticulate Python engine is to be fixed in reticulate itself. However, I may be missing something...
I guess it would be fine to output a Markdown table only. Quarto parses Markdown tables through Pandoc and does a lot with them, but Quarto also parses HTML tables, so that would be fine too (https://quarto.org/docs/authoring/tables.html). I believe for the Jupyter engine, Quarto will select the HTML output as I explained above: #783 (comment) But regarding styling, this is only a matter of CSS. We can definitely fix that in Quarto to get the same styling. Hope it helps. Happy to discuss, help and test as needed. |
I have come across this issue lately and think it would be really helpful if reticulate supported rich display of pandas data frames. In addition to what has already been mentioned above and the helpful documents shared by @cderv, I wanted to note that <style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
    .dataframe tbody tr th {
        vertical-align: top;
    }
    .dataframe thead th {
        text-align: right;
    }
</style> |
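For reference, and as an assumption about what this comment is pointing at: in pandas, the notebook-style representation returned by `DataFrame._repr_html_()` carries a `<style scoped>` block like the one above, while plain `DataFrame.to_html()` returns only the table markup. A small sketch to check this locally (requires pandas; the data is the same toy example used earlier in the thread):

```python
import pandas as pd

df = pd.DataFrame(
    {"size": [1.0, 1.5, 1.0], "weight": [3.0, 5.0, 2.5]},
    index=["cat", "dog", "koala"],
)

# Representation picked up by Jupyter-style rich display.
notebook_html = df._repr_html_()

# Bare table markup, as used in the test documents earlier in the thread.
plain_html = df.to_html()

print("<style scoped>" in notebook_html)
print("<style" in plain_html)
```

So a renderer that reuses `_repr_html_()` inherits that scoped CSS, whereas one that calls `to_html()` has to bring its own styling, for example through the `classes` argument.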
I would like to be able to change the display style of a pandas data frame, this code works in Jupyter, would be awesome to get it to work in R markdown. Currently it displays an incomplete version of the html string instead of the nicely formatted html table. Rmarkdown file attached.
dframe.Rmd.zip
Displaying a pandas data frame nicely
OK, we have a complicated pandas data frame and we want to show it nicely. Passing it to R and using kable
or something like that is not an option, because when passing a pandas dataframe with a multi-index to R
those indexes will disappear. Let's start by displaying the dataframe:
OK not bad (what are those commas before and after the table btw?), but looks boring. Let's try to
beautify with some CSS. OOPS, but the resulting html is not rendered, why?