If I understood the question correctly, you have trained an algorithm that splits your data into $N$ disjoint clusters. Now you want to assign prediction $1$ to some subset of the clusters and $0$ to the rest of them. Among those subsets, you want to find the Pareto-optimal ones, i.e. those that maximize the true positive rate for a fixed number of positive predictions (equivalently, that maximize PPV at that fixed count). Is that correct?
This sounds very much like the knapsack problem! Cluster sizes are the "weights", the numbers of positive samples in each cluster are the "values", and you want to fill a knapsack of fixed capacity with as much value as possible.
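To make the mapping concrete, here is a minimal sketch (the `clusters` array of integer cluster labels and the binary `target` array are assumptions; they match the arrays built in the full example below):

import numpy as np
weights = np.bincount(clusters)                 # knapsack "weights": cluster sizes
values = np.bincount(clusters, weights=target)  # knapsack "values": positive samples per cluster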
The knapsack problem has several algorithms for finding exact solutions (e.g. by dynamic programming). But a useful greedy solution is to sort your clusters in decreasing order of $\frac{value}{weight}$ (that is, of the share of positive samples) and take the first $k$. As you vary $k$ from $0$ to $N$, you sketch your ROC curve very cheaply.
And if you assign $1$ to the first $k-1$ clusters and to a random fraction $p\in[0,1]$ of the samples in the $k$-th cluster, you get an upper bound on the knapsack problem (its fractional relaxation). With this, you can draw the upper bound for your ROC curve.
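A minimal sketch of that fractional upper bound, assuming the same `clusters` and `target` arrays as above (the helper name is mine, not part of the example below):

import numpy as np
def fractional_roc_upper_bound(clusters, target):
    """ Piecewise-linear (fractional-knapsack) upper bound for the ROC curve """
    weights = np.bincount(clusters)                 # cluster sizes
    values = np.bincount(clusters, weights=target)  # positives per cluster
    order = np.argsort(values / weights)[::-1]      # decreasing share of positives
    pos, neg = values[order], (weights - values)[order]
    # vertices come from adding whole clusters one by one; taking a fraction p
    # of the next cluster moves along the straight segment between two vertices
    tpr = np.concatenate([[0.0], np.cumsum(pos) / target.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(neg) / (len(target) - target.sum())])
    return fpr, tpr

Connecting these vertices with straight lines gives essentially the red curve in the example below (plus the $(0,0)$ point).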
Here is a Python example:
import numpy as np
from itertools import combinations, chain
import matplotlib.pyplot as plt
np.random.seed(1)
n_obs = 1000
n = 10
# generate clusters as indices of tree leaves
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
X, target = make_classification(n_samples=n_obs)
raw_clusters = DecisionTreeClassifier(max_leaf_nodes=n).fit(X, target).apply(X)
recoding = {x:i for i, x in enumerate(np.unique(raw_clusters))}
clusters = np.array([recoding[x] for x in raw_clusters])
def powerset(xs):
    """ Get the set of all subsets """
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))
def subset_to_metrics(subset, clusters, target):
    """ Calculate FPR and TPR for a subset of clusters """
    prediction = np.zeros(n_obs)
    prediction[np.isin(clusters, subset)] = 1
    tpr = sum(target * prediction) / sum(target) if sum(target) > 0 else 1
    fpr = sum((1 - target) * prediction) / sum(1 - target) if sum(1 - target) > 0 else 1
    return fpr, tpr
# evaluate all subsets
all_fpr = []
all_tpr = []
for subset in powerset(range(n)):
    fpr, tpr = subset_to_metrics(subset, clusters, target)
    all_fpr.append(fpr)
    all_tpr.append(tpr)
# evaluate only the upper bound, using the greedy knapsack solution
ratios = [target[clusters == i].mean() for i in range(n)]
order = np.argsort(ratios)[::-1]
new_fpr = []
new_tpr = []
for i in range(n):
    subset = order[0:(i + 1)]
    fpr, tpr = subset_to_metrics(subset, clusters, target)
    new_fpr.append(fpr)
    new_tpr.append(tpr)
plt.figure(figsize=(5,5))
plt.scatter(all_fpr, all_tpr, s=3)
plt.plot(new_fpr, new_tpr, c='red', lw=1)
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('All and Pareto-optimal subsets')
plt.show()
This code will draw a nice picture for you:

The blue dots are the (FPR, TPR) points for all $2^{10}$ subsets, and the red line connects the (FPR, TPR) points of the Pareto-optimal subsets.
And now a pinch of salt: you did not have to bother with subsets at all! What I did was sort the tree leaves by the fraction of positive samples in each, but what I got is exactly the ROC curve of the tree's probabilistic prediction. This means you cannot outperform the tree by hand-picking its leaves based on the target frequencies in the training set.
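If you want to double-check that claim on the toy example above, here is a quick sketch (it reuses `X`, `target`, `n`, `new_fpr`, `new_tpr` and the imports from the example, and refits the tree, since the example only kept its leaf indices):

from sklearn.metrics import roc_curve
# ROC curve of the tree's own probabilistic prediction on the training set
tree = DecisionTreeClassifier(max_leaf_nodes=n).fit(X, target)
fpr_tree, tpr_tree, _ = roc_curve(target, tree.predict_proba(X)[:, 1])
plt.plot(fpr_tree, tpr_tree, c='black', ls='--', label='tree predict_proba')
plt.plot(new_fpr, new_tpr, c='red', lw=1, label='greedy over leaves')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.legend()
plt.show()

The two curves should coincide (up to possible tie-breaking differences when the tree is refit).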
You can relax and keep using ordinary probabilistic prediction :)