I have a large pandas DataFrame (about 1000 rows) that I am converting to an HTML table with dataframe.to_html(). Is there an easy way to add pagination so that I don't have to scroll through all 1000 rows? Say, view the first 50 rows, then click next to see the following 50 rows?

DougKruger
  • That's an interesting question indeed! If the "pagination" can be implemented using CSS classes, you can try to use [Style](http://pandas.pydata.org/pandas-docs/stable/style.html) conditionally (i.e. rows 0-49 - Style: page1, 50-99 - Style: page2, etc.). – MaxU - stop genocide of UA Aug 11 '16 at 19:25
  • Are you trying to view it within a Jupyter notebook, or as an independent HTML file? – Shovalt Mar 06 '18 at 14:39

1 Answer

The best solution I can think of involves a couple of external JS libraries: jQuery and its DataTables plugin. This allows for much more than pagination (sorting and searching, for example), with very little effort.

Let's set up some HTML, JS and Python:

from tempfile import NamedTemporaryFile
import webbrowser

base_html = """
<!doctype html>
<html><head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.2/jquery.min.js"></script>
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.10.16/css/jquery.dataTables.css">
<script type="text/javascript" src="https://cdn.datatables.net/1.10.16/js/jquery.dataTables.js"></script>
</head><body>%s<script type="text/javascript">$(document).ready(function(){$('table').DataTable({
    "pageLength": 50
});});</script>
</body></html>
"""

def df_html(df):
    """HTML table with pagination and other goodies"""
    # Use a distinct local name so it does not shadow the function itself
    table_html = df.to_html()
    return base_html % table_html

def df_window(df):
    """Open dataframe in browser window using a temporary file"""
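    # mode='w' opens the file in text mode; the default binary mode raises
    # "TypeError: a bytes-like object is required, not 'str'" in Python 3
    with NamedTemporaryFile(delete=False, suffix='.html', mode='w', encoding='utf-8') as f:
        f.write(df_html(df))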
    webbrowser.open(f.name)

And now we can load a sample dataset to test it:

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

df_window(df)

The beautiful result: [image: screenshot of the resulting table]

A few notes:

  • Notice the pageLength parameter in the base_html string. This is where I defined the default number of rows per page. You can find other optional parameters in the DataTables options page.
  • The df_window function was tested in a Jupyter Notebook, but should work in plain Python as well.
  • You can skip df_window and simply write the returned value from df_html into an HTML file.
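The last two notes can be sketched together as follows. This is only an illustration, not part of the original answer: the names PAGE_TEMPLATE, build_page and save_page are hypothetical, the CDN URLs are the same ones used in base_html, and pageLength and lengthMenu are DataTables options. The options are serialized with json.dumps, so the rows-per-page value can be changed without editing the template string.

```python
import json
from pathlib import Path

# Hypothetical template; same CDN URLs as base_html in the answer.
# Literal braces in the JS are doubled because str.format is used.
PAGE_TEMPLATE = """<!doctype html>
<html><head>
<meta charset="utf-8">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.2/jquery.min.js"></script>
<link rel="stylesheet" href="https://cdn.datatables.net/1.10.16/css/jquery.dataTables.css">
<script src="https://cdn.datatables.net/1.10.16/js/jquery.dataTables.js"></script>
</head><body>
{table}
<script>$(document).ready(function(){{$('table').DataTable({options});}});</script>
</body></html>
"""


def build_page(table_html, page_length=50, length_menu=(10, 25, 50, 100)):
    """Wrap an HTML table string in a page that loads DataTables."""
    # pageLength: rows shown per page; lengthMenu: choices in the dropdown
    options = {"pageLength": page_length, "lengthMenu": list(length_menu)}
    return PAGE_TEMPLATE.format(table=table_html, options=json.dumps(options))


def save_page(table_html, path, **kwargs):
    """Write the page to an HTML file and return its path."""
    path = Path(path)
    path.write_text(build_page(table_html, **kwargs), encoding="utf-8")
    return path
```

With a DataFrame, something like save_page(df.to_html(), "table.html", page_length=25) should produce a file you can open directly in a browser.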

Edit: how to make this work with a remote session (e.g. Colab)

When working in a remote notebook, such as Colab or Kaggle, the temporary file approach won't work, because the file is saved on the remote machine and is not accessible to your browser. A workaround is to download the constructed HTML and open it locally (adding to the previous code):

import base64
from IPython.display import display, HTML  # IPython.core.display is deprecated for these imports

my_html = df_html(df)
my_html_base64 = base64.b64encode(my_html.encode()).decode('utf-8')
display(HTML(f'<a download href="data:text/html;base64,{my_html_base64}" target="_blank">Download HTML</a>'))

This will result in a link containing the entire HTML encoded as a base64 string. Clicking it will download the HTML file and you can then open it directly and view the table.
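If you need such a link more than once, the same idea can be wrapped in a small helper. This is a minimal sketch using only the standard library; the function name download_link and its filename and label parameters are my own, not part of the code above.

```python
import base64
import html


def download_link(page_html, filename="table.html", label="Download HTML"):
    """Return an <a> tag that embeds page_html as a base64 data URI."""
    payload = base64.b64encode(page_html.encode("utf-8")).decode("ascii")
    # The download attribute suggests a file name for the saved file
    return (
        f'<a download="{html.escape(filename)}" '
        f'href="data:text/html;base64,{payload}" target="_blank">'
        f"{html.escape(label)}</a>"
    )
```

In a notebook, display(HTML(download_link(df_html(df), filename="iris.html"))) should render the link, assuming display and HTML are imported from IPython.display.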

Shovalt
  • I get error in my notebook while using it. `TypeError: a bytes-like object is required, not 'str'`. Do you have any idea? – Ronak Shah Jan 17 '19 at 10:11
  • @RonakShah, I assume you are using python3. Try adding `mode='w+'` to the `NamedTemporaryFile` parameters and let me know if it works. – Shovalt Jan 17 '19 at 10:28
  • There seems to be some issue from my end which I need to figure out. The code given by you works fine. Thank you for your help :) – Ronak Shah Jan 18 '19 at 08:17
  • I have a question. When I tried to run it in Google Colab, I couldn't get it to work. Is there any suggestion for getting it to work in Colab? Thanks in advance. – GSandro_Strongs Apr 10 '21 at 16:03
  • @GSandro_Strongs I've edited the answer with a solution for you. Let me know how it works. – Shovalt Apr 14 '21 at 06:16
  • @RonakShah thanks for your help. I added the line display(HTML(df_html(df))) and got the table in Google Colab without needing to download anything. – GSandro_Strongs Apr 15 '21 at 16:57
  • Not working for me. It's returning plain table/df when tried on Jupyter. I don't know much about JS and JQuery. Do I need to install anything on my venv to make this work? – Darshan Jun 25 '21 at 11:54
  • @Darshan - no need to install anything in normal environments. Are you trying the first (local) or second (colab) solution? Any way you can share your code? – Shovalt Jun 30 '21 at 05:42