format/round numerical legend label in GeoPandas

Question

I'm looking for a way to format/round the numerical legend labels in maps produced by the .plot() method in GeoPandas. For example:

gdf.plot(column='pop2010', scheme='QUANTILES', k=4)

This gives me a legend with many decimal places:

I want the legend labels to be integers.

Brendan · Accepted Answer · 2019-06-21T14:52:29.243

As I recently encountered the same issue, and a solution does not appear to be readily available on Stack Overflow or other sites, I thought I would post the approach I took in case it is useful.

First, a basic plot using the geopandas world map:

import geopandas
import matplotlib.pyplot as plt

# load world data set
world_orig = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world = world_orig[(world_orig['pop_est'] > 0) & (world_orig['name'] != "Antarctica")].copy()
world['gdp_per_cap'] = world['gdp_md_est'] / world['pop_est']

# basic plot
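# note: GeoDataFrame.plot() returns a matplotlib Axes, despite the name `fig`
fig = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks',
                 cmap='YlGnBu', legend=True)
leg = fig.get_legend()
leg._loc = 3  # location code 3 = 'lower left' (private attribute)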
plt.show()

The method I used relies on the get_texts() method of the matplotlib.legend.Legend object. I iterate over the items in leg.get_texts(), split each text element into its lower and upper bounds, build a new string with the formatting applied, and set it with the set_text() method.

# formatted legend
fig = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks', 
                 cmap='YlGnBu', legend=True)
leg = fig.get_legend()
leg._loc = 3

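for lbl in leg.get_texts():
    # assumes three whitespace-separated tokens: lower bound, separator, upper bound
    label_text = lbl.get_text()
    lower = label_text.split()[0]
    upper = label_text.split()[2]
    # thousands separators, no decimals
    new_text = f'{float(lower):,.0f} - {float(upper):,.0f}'
    lbl.set_text(new_text)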

plt.show()

This is very much a 'trial and error' approach, so I wouldn't be surprised if there were a better way. Still, perhaps this will be helpful.
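The splitting step above assumes every label has the form 'lower - upper'. As a rough sketch, the same reformatting can be moved into a small helper that pulls the two numbers out with a regular expression, so it does not depend on the exact separator (the label layout may differ between GeoPandas versions). The name format_legend_label and the regex are illustrative assumptions, not part of GeoPandas or matplotlib.

```python
import re

# Matches integers, decimals, and scientific notation, with an optional sign.
# Caveat: in 'a-b' with no spaces, the '-' would be read as a minus sign.
_NUMBER = re.compile(r'[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?')

def format_legend_label(text, fmt='{:,.0f}', sep=' - '):
    """Reformat a two-number legend label (hypothetical helper)."""
    numbers = _NUMBER.findall(text)
    if len(numbers) != 2:
        # Not a 'lower/upper' label: leave it untouched.
        return text
    lower, upper = (float(n) for n in numbers)
    return f'{fmt.format(lower)}{sep}{fmt.format(upper)}'

print(format_legend_label('140.00 - 1379302771.00'))      # 140 - 1,379,302,771
print(format_legend_label('74.47, 91.51', fmt='{:.0f}'))  # 74 - 92
```

Inside the loop over leg.get_texts(), this would be used as lbl.set_text(format_legend_label(lbl.get_text())).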

I finally got a chance to read the PySAL docs and update a solution. Please take a look. — steven, Jun 26 '19 at 14:20
The solution from steven below is perhaps more systematic, but I liked your solution as a small "fixup", only modifying the final plot. In case anyone is trying this with subplots, e.g. `fig, ax = plt.subplots(1, 1,figsize=(10,12))`, use `leg = ax.get_legend()` to get legend, not `leg = fig.get_legend()`. — Alex, Jul 10 '20 at 06:20
forgot to say: I also had to remove `leg._loc = 3`, otherwise I would get a ValueError: `too many values to unpack (expected 2)`. However, with `fig = world.plot(...legend=True, legend_kwds={'loc': 'lower right'})` it works. — Alex, Jul 10 '20 at 06:59

steven · Answer 2 · 2021-03-24T03:55:37.173

Method 1:

GeoPandas uses PySAL's mapclassify. Here's an example of a quantiles map (k=5).

import matplotlib.pyplot as plt
import numpy as np
import mapclassify   # 2.3.0
import geopandas as gpd   # 0.8.1

# load dataset
path = gpd.datasets.get_path('naturalearth_lowres')
gdf = gpd.read_file(path)
# generate a random column
np.random.seed(0)
gdf['random_col'] = np.random.normal(100, 10, len(gdf))

# plot quantiles map
fig, ax = plt.subplots(figsize=(10, 10))
gdf.plot(column='random_col', scheme='quantiles', k=5, cmap='Blues',
         legend=True, legend_kwds=dict(loc=6), ax=ax)

This gives us:

Assume that we want to round the numbers in the legend. We can get the classification via the Quantiles classifier in mapclassify.

mapclassify.Quantiles(gdf.random_col, k=5)

The call returns a mapclassify.classifiers.Quantiles object:

Quantiles               

    Interval       Count
------------------------
[ 74.47,  91.51] |    36
( 91.51,  97.93] |    35
( 97.93, 103.83] |    35
(103.83, 109.50] |    35
(109.50, 123.83] |    36

The object has an attribute bins, which holds an array containing the upper bound of each class.

array([ 91.51435701,  97.92957441, 103.83406507, 109.49954895,
       123.83144775])

Thus, we can use this attribute to get all the bounds of the classes, since the upper bound of each class equals the lower bound of the next higher class. The only one missing is the lower bound of the lowest class, which equals the minimum value of the column you are classifying in your DataFrame.
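The relationship between the upper bounds and the full set of class bounds can be sketched without any plotting. This is a minimal illustration only: bounds_to_labels is a hypothetical helper name, and the minimum of 74.47 is read off the rounded Quantiles output shown above rather than computed from the data.

```python
def bounds_to_labels(minimum, upper_bounds, fmt='{:.0f}', sep=' - '):
    """Build one 'lower - upper' label per class (hypothetical helper)."""
    upper_bounds = list(upper_bounds)
    # Lower bounds: the column minimum, then every upper bound except the last.
    lower_bounds = [minimum] + upper_bounds[:-1]
    return [f'{fmt.format(lo)}{sep}{fmt.format(hi)}'
            for lo, hi in zip(lower_bounds, upper_bounds)]

# Upper bounds copied from the bins array shown above.
bins = [91.51435701, 97.92957441, 103.83406507, 109.49954895, 123.83144775]
labels = bounds_to_labels(74.47, bins)
print(labels)  # ['74 - 92', '92 - 98', '98 - 104', '104 - 109', '109 - 124']
```

Pairing the shifted list with the original one via zip avoids the index bookkeeping of an explicit loop; a different fmt string (for example '{:,.1f}') changes the rounding without touching the logic.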

Here's an example to round all numbers to integers:

# get all upper bounds
upper_bounds = mapclassify.Quantiles(gdf.random_col, k=5).bins

# get and format all bounds
bounds = []
for index, upper_bound in enumerate(upper_bounds):
    if index == 0:
        lower_bound = gdf.random_col.min()
    else:
        lower_bound = upper_bounds[index-1]

    # format the numerical legend here
    bound = f'{lower_bound:.0f} - {upper_bound:.0f}'
    bounds.append(bound)

# get all the legend labels
legend_labels = ax.get_legend().get_texts()

# replace the legend labels
for bound, legend_label in zip(bounds, legend_labels):
    legend_label.set_text(bound)

We will eventually get:

Method 2:

In addition to GeoPandas' .plot() method, you can also consider the .choropleth() function offered by geoplot, which lets you choose among different classification schemes and numbers of classes while passing a legend_labels argument to set the legend labels. For example:

import geopandas as gpd
import geoplot as gplt

path = gpd.datasets.get_path('naturalearth_lowres')
gdf = gpd.read_file(path)

legend_labels = ['< 2.4', '2.4 - 6', '6 - 15', '15 - 38', '38 - 140 M']
gplt.choropleth(gdf, hue='pop_est', cmap='Blues', scheme='quantiles',
                legend=True, legend_labels=legend_labels)

which gives you

Indeed. You need to change to the corresponding classification function in PySAL. This GeoPandas [doc](http://geopandas.org/mapping.html#choosing-colors) explains — steven, Jun 26 '19 at 14:39
Method 1 worked for me - thanks! I converted it to "percentiles" which meant removing the k parameter and works perfectly. It's one of those problems you think should be much easier to solve! Great work around though. — c_m_conlan, Jan 30 '21 at 12:25
