2

I want a subset of all simple paths in a relatively large network (~1k nodes and 5k-50k edges). Currently, I generate all simple paths (igraph::all_simple_paths) followed by subsetting. However, all_simple_paths is memory- and time-intensive for larger networks, and since I don't need all paths, running it seems like a waste.

My current strategy is randomly downsample network to a reasonable number of edges --> all_simple_paths --> subset. Repeating this a few times and combining seems to give reasonable results. However this still spends time generating paths I don't use. What are better ways to do this?

For clarity, any algorithm that generates a representative subset of all simple paths is sufficient. I assume that "representative" means an unbiased random sample.


Edit: I see this answer addressing a similar problem, but it's many times slower than all_simple_paths-->subset for my networks.

R Greg Stacey
  • 391
  • 3
  • 15
  • Is the problem memory or time? If memory, igraph itself could be changed to allow testing each path for inclusion in your subset before recoding it. This would not save time. Saving time is only possible by having a path-generation algorithm specific to your inclusion criterion, which is probably a mini research problem in itself. The criterion that igraph does already support is an upper limit on the path length. – Szabolcs Apr 09 '21 at 11:56
  • Thanks. How do I implement the upper limit path length in `all_simple_paths`? I don't have an inclusion criterion, just want a random sample of all the paths, so a path generating algorithm that "stops partway" would be good enough. – R Greg Stacey Apr 09 '21 at 19:04
  • "How do I implement the upper limit path length" Did you check the docs and look at the `cutoff` argument? – Szabolcs Apr 09 '21 at 19:47
  • 1
    An algorithm that stops partway will decidedly _not_ produce a random sample. "Random" is not the same as "arbitrary". – Szabolcs Apr 09 '21 at 19:48
  • Thanks @Szabolcs. I don't see `cutoff` in the R `all_simple_paths` arguments, and using it throws an error. Are you maybe using python igraph instead of R? – R Greg Stacey Apr 09 '21 at 22:37
  • I don't mean literally stopping partway, just that there are no inclusion criteria other than "is a simple path". But I see your point. For time-saving algorithms, would something like "ignore x% of starting nodes" be reasonable? Happy to clarify this in the post if I'm unclear. – R Greg Stacey Apr 09 '21 at 23:04
  • Oops, I didn't realize I had the development version installed. `cutoff` is not present in the released version. – Szabolcs Apr 10 '21 at 06:51
  • Please see https://github.com/igraph/rigraph/ on how to install the dev version. – Szabolcs Apr 10 '21 at 09:02

0 Answers0