With your submarine's subterranean subsystems subsisting suboptimally, the only way you're getting out of this cave anytime soon is by finding a path yourself. Not just a path - the only way to know if you've found the best path is to find all of them.
Fortunately, the sensors are still mostly working, and so you build a rough map of the remaining caves (your puzzle input). For example:
start-A
start-b
A-c
A-b
b-d
A-end
b-end
This is a list of how all of the caves are connected. You start in the cave named start, and your destination is the cave named end. An entry like b-d means that cave b is connected to cave d - that is, you can move between them.
So, the above cave system looks roughly like this:
    start
    /   \
c--A-----b--d
    \   /
     end
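As a quick illustration of what "connected" means here, the edge list above can be turned into an adjacency map using only the standard library. This is a minimal sketch; the names `EXAMPLE` and `build_adjacency` are made up for this illustration and are not part of the puzzle or of the solution further down.

```python
from collections import defaultdict

# The first example from the puzzle text
EXAMPLE = """\
start-A
start-b
A-c
A-b
b-d
A-end
b-end
"""


def build_adjacency(text):
    """Map each cave name to the set of caves it is connected to."""
    adjacency = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line:  # skip blank lines
            continue
        left, right = line.split("-")
        # Connections work in both directions
        adjacency[left].add(right)
        adjacency[right].add(left)
    return dict(adjacency)


adjacency = build_adjacency(EXAMPLE)
for cave in sorted(adjacency):
    print(cave, "->", sorted(adjacency[cave]))
```

Cave d appears with b as its only neighbour, which matches the dead end on the right of the diagram.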
Your goal is to find the number of distinct paths that start at start, end at end, and don't visit small caves more than once. There are two types of caves: big caves (written in uppercase, like A) and small caves (written in lowercase, like b). It would be a waste of time to visit any small cave more than once, but big caves are large enough that it might be worth visiting them multiple times. So, all paths you find should visit small caves at most once, and can visit big caves any number of times.
Given these rules, there are 10 paths through this example cave system:
start,A,b,A,c,A,end
start,A,b,A,end
start,A,b,end
start,A,c,A,b,A,end
start,A,c,A,b,end
start,A,c,A,end
start,A,end
start,b,A,c,A,end
start,b,A,end
start,b,end
(Each line in the above list corresponds to a single path; the caves visited by that path are listed in the order they are visited and separated by commas.)
Note that in this cave system, cave d is never visited by any path: to do so, cave b would need to be visited twice (once on the way to cave d and a second time when returning from cave d), and since cave b is small, this is not allowed.
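The rules so far can be checked against the first example with a small recursive walk. This is only a sketch for the seven-edge example, with the edges hard-coded; `EDGES`, `neighbours` and `walk` are names invented for this illustration. It assumes no two big caves are directly connected, since that would allow endless paths.

```python
# Edges of the first example, hard-coded for illustration
EDGES = [
    ("start", "A"), ("start", "b"), ("A", "c"), ("A", "b"),
    ("b", "d"), ("A", "end"), ("b", "end"),
]


def neighbours(cave):
    """Return every cave directly connected to `cave`."""
    found = set()
    for left, right in EDGES:
        if left == cave:
            found.add(right)
        elif right == cave:
            found.add(left)
    return found


def walk(path):
    """Yield every complete path that extends `path` to the end cave."""
    last = path[-1]
    if last == "end":
        yield path
        return
    for nxt in sorted(neighbours(last)):
        # Small caves (lowercase names, including start) are visited once
        if nxt.islower() and nxt in path:
            continue
        yield from walk(path + [nxt])


all_paths = list(walk(["start"]))
print(len(all_paths))                        # 10
print(any("d" in p for p in all_paths))      # False
```

The walk does step into d, but the only way back out is through b, which is already on the path, so that branch never reaches end.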
Here is a slightly larger example:
dc-end
HN-start
start-kj
dc-start
dc-HN
LN-dc
HN-end
kj-sa
kj-HN
kj-dc
The 19 paths through it are as follows:
start,HN,dc,HN,end
start,HN,dc,HN,kj,HN,end
start,HN,dc,end
start,HN,dc,kj,HN,end
start,HN,end
start,HN,kj,HN,dc,HN,end
start,HN,kj,HN,dc,end
start,HN,kj,HN,end
start,HN,kj,dc,HN,end
start,HN,kj,dc,end
start,dc,HN,end
start,dc,HN,kj,HN,end
start,dc,end
start,dc,kj,HN,end
start,kj,HN,dc,HN,end
start,kj,HN,dc,end
start,kj,HN,end
start,kj,dc,HN,end
start,kj,dc,end
Finally, this even larger example has 226 paths through it:
fs-end
he-DX
fs-he
start-DX
pj-DX
end-zg
zg-sl
zg-pj
pj-he
RW-he
fs-DX
pj-RW
zg-RW
start-pj
he-WI
zg-he
pj-fs
start-RW
How many paths through this cave system are there that visit small caves at most once?
# Python imports
from collections import defaultdict
from copy import copy
from pathlib import Path
from typing import Callable, Dict, Generator, Iterable, List, Tuple
import networkx as nx
# Paths to data
testpath1 = Path("day12_test.txt")
testpath2 = Path("day12_test2.txt")
testpath3 = Path("day12_test3.txt")
datapath = Path("day12_data.txt")
It seems natural to use networkx to represent the graph in the puzzle.
def load_input(fpath: Path) -> List[Tuple[str, str]]:
    """Return graph as a list of edges
    :param fpath: Path to data file
    """
    with fpath.open("r") as ifh:
        return [tuple(_.strip().split("-")) for _ in ifh.readlines() if len(_.strip())]

def graph_from_edges(edges: List[Tuple[str, str]]) -> nx.Graph:
    """Returns a graph corresponding to the edgelist
    :param edges: list of edges in puzzle data
    """
    gph = nx.Graph()  # empty Graph
    # Add edges
    for start, end in edges:
        gph.add_edge(start, end)
    return gph
Our strategy is to maintain multiple "active" paths, extending each one by a single neighbour at a time. The initial path contains only the start node. For each active path we maintain a list of the small caves it has visited.
A path becomes inactive and is discarded if extending it would visit a small cave twice. It also becomes inactive, but is kept as a "final path", once it reaches the end node.
def find_paths(gph: nx.Graph) -> List[List[str]]:
    """Returns all paths from start to end that never touch a small cave twice
    :param gph: puzzle graph
    """
    # Each entry is a (visited small caves, path so far) pair
    paths = [(["start"], ["start",])]  # initialise paths
    fullpaths = []  # will hold paths that reach the end
    # Iterate until all test paths are discarded, or found to be
    # valid paths from start to end
    while len(paths):
        newpaths = []  # List of valid paths found in this iteration
        for smallcaves, path in paths:  # Check each active path
            # Iterate over neighbour node labels at end of path, if not
            # an already visited small cave
            for nbr in [str(_) for _ in gph[path[-1]] if _ not in smallcaves]:
                if nbr == "end":  # neighbour is end
                    # Extend path to end and add to fullpaths
                    fullpaths.append(path[:] + [nbr])
                elif nbr.lower() == nbr:  # neighbour is small cave
                    # Extend list of small caves, extend path, and add
                    # to list of paths for next iteration
                    newpaths.append((smallcaves[:] + [nbr], path[:] + [nbr]))
                else:
                    # Do not change list of small caves, extend path,
                    # and add to list of paths for next iteration
                    newpaths.append((smallcaves[:], path[:] + [nbr]))
        # Update new paths for next iteration
        paths = newpaths[:]
    return fullpaths
We try this on each of the test sets in the puzzle:
edges = load_input(testpath1)
gph = graph_from_edges(edges)
len(find_paths(gph))
10
edges = load_input(testpath2)
gph = graph_from_edges(edges)
len(find_paths(gph))
19
edges = load_input(testpath3)
gph = graph_from_edges(edges)
len(find_paths(gph))
226
And then on the puzzle input:
edges = load_input(datapath)
gph = graph_from_edges(edges)
len(find_paths(gph))
4413
After reviewing the available paths, you realize you might have time to visit a single small cave twice. Specifically, big caves can be visited any number of times, a single small cave can be visited at most twice, and the remaining small caves can be visited at most once. However, the caves named start and end can only be visited exactly once each: once you leave the start cave, you may not return to it, and once you reach the end cave, the path must end immediately.
Now, the 36 possible paths through the first example above are:
start,A,b,A,b,A,c,A,end
start,A,b,A,b,A,end
start,A,b,A,b,end
start,A,b,A,c,A,b,A,end
start,A,b,A,c,A,b,end
start,A,b,A,c,A,c,A,end
start,A,b,A,c,A,end
start,A,b,A,end
start,A,b,d,b,A,c,A,end
start,A,b,d,b,A,end
start,A,b,d,b,end
start,A,b,end
start,A,c,A,b,A,b,A,end
start,A,c,A,b,A,b,end
start,A,c,A,b,A,c,A,end
start,A,c,A,b,A,end
start,A,c,A,b,d,b,A,end
start,A,c,A,b,d,b,end
start,A,c,A,b,end
start,A,c,A,c,A,b,A,end
start,A,c,A,c,A,b,end
start,A,c,A,c,A,end
start,A,c,A,end
start,A,end
start,b,A,b,A,c,A,end
start,b,A,b,A,end
start,b,A,b,end
start,b,A,c,A,b,A,end
start,b,A,c,A,b,end
start,b,A,c,A,c,A,end
start,b,A,c,A,end
start,b,A,end
start,b,d,b,A,c,A,end
start,b,d,b,A,end
start,b,d,b,end
start,b,end
The slightly larger example above now has 103 paths through it, and the even larger example now has 3509 paths through it.
Given these new rules, how many paths through this cave system are there?
The modification we make is to replace each path's list of visited small caves with a defaultdict(int) keyed by small cave label. This lets us keep a count of visits to each small cave and enforce the rule that at most one small cave is visited twice.
Otherwise, our strategy is the same: we maintain multiple "active" paths, spreading out one neighbour at a time. The initial path is the start node. We maintain a dict of visited small caves for each active path.
A path becomes inactive (and drops out) if extending it would mean a second small cave is visited twice, or any small cave is visited a third time. It becomes inactive (and a "final path") if the end node is visited.
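The counting idea can be seen in isolation with a few lines. This is a small sketch of how a defaultdict(int) behaves when used as a visit counter; `visits`, `branch` and `revisit_used` are names invented for this illustration. One detail worth knowing is that merely reading a missing key stores it with a count of zero.

```python
from collections import defaultdict
from copy import copy

visits = defaultdict(int)
visits["b"] += 1                 # first visit to small cave b

# Reading a missing key returns 0 and also stores the key
print(visits["c"] < 2)           # True
print(sorted(visits.items()))    # [('b', 1), ('c', 0)]

# A shallow copy is enough for a new branch: the values are plain ints
branch = copy(visits)
branch["b"] += 1                 # this branch visits b a second time
print(visits["b"], branch["b"])  # 1 2


def revisit_used(counts):
    """True once some small cave on this path has been visited twice."""
    return max(counts.values(), default=0) >= 2


print(revisit_used(visits), revisit_used(branch))  # False True
```

Because each branch gets its own copy, a second visit recorded on one path does not leak into its siblings.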
def find_paths_with_one_small_revisit(gph: nx.Graph) -> List[List[str]]:
    """Returns all paths from start to end that visit at most one small
    cave twice, and every other small cave at most once
    :param gph: puzzle graph
    """
    paths = [(defaultdict(int), ["start",])]  # initialise paths
    fullpaths = []  # will hold paths that reach the end
    # Iterate until all test paths are discarded, or found to be
    # valid paths from start to end
    while len(paths):
        newpaths = []  # list of valid paths found in this iteration
        for smallcaves, path in paths:  # Check each active path
            # Highest visit count of any small cave on this path
            maxvisits = max(smallcaves.values(), default=0)
            # Iterate over neighbour node labels at end of path, if not
            # a small cave that has already been visited twice
            for nbr in [str(_) for _ in gph[path[-1]] if smallcaves[_] < 2]:
                if nbr == "end":  # neighbour is end
                    # Extend path to end and add to fullpaths
                    fullpaths.append(path[:] + [nbr])
                elif nbr.upper() == nbr:  # neighbour is large cave
                    # Do not change dict of small caves, extend path,
                    # and add to list of paths for next iteration
                    newpaths.append((copy(smallcaves), path[:] + [nbr]))
                elif nbr == "start":  # do not revisit start
                    pass
                else:  # neighbour is small cave
                    # Extend dict of small caves, extend path, and add
                    # to list of paths for next iteration
                    newcaves = copy(smallcaves)
                    if maxvisits < 2:  # No cave has yet been visited twice
                        newcaves[nbr] += 1
                        newpaths.append((newcaves, path[:] + [nbr]))
                    elif newcaves[nbr] == 0:  # This cave has not been visited
                        newcaves[nbr] += 1
                        newpaths.append((newcaves, path[:] + [nbr]))
                    else:  # Cave has been visited, and max revisits already reached
                        pass
        # Update new paths for next iteration
        paths = newpaths[:]
    return fullpaths
Trying this on the test data:
edges = load_input(testpath1)
gph = graph_from_edges(edges)
len(find_paths_with_one_small_revisit(gph))
36
edges = load_input(testpath2)
gph = graph_from_edges(edges)
len(find_paths_with_one_small_revisit(gph))
103
edges = load_input(testpath3)
gph = graph_from_edges(edges)
len(find_paths_with_one_small_revisit(gph))
3509
And on the puzzle data:
edges = load_input(datapath)
gph = graph_from_edges(edges)
len(find_paths_with_one_small_revisit(gph))
118803
The solution seems a bit slow: the timing below shows it takes about four seconds on the puzzle data, and we should be able to get the answer in under a second.
Some improvement is needed.
%%timeit
edges = load_input(datapath)
gph = graph_from_edges(edges)
len(find_paths_with_one_small_revisit(gph))
4.08 s ± 153 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
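One possible direction, sketched here rather than measured, is to count paths instead of storing them. The number of ways to reach end depends only on the current cave, the set of small caves already visited, and whether the one permitted revisit is still available, so those three values can serve as a cache key. This is a different technique from the breadth-first search above, not a tuning of it. The names `count_paths` and `count_from` are invented for this sketch, and it assumes, as the code above effectively does, that no two big caves are directly connected.

```python
from functools import lru_cache


def count_paths(edges, allow_revisit=False):
    """Count start-to-end paths without building each path explicitly.

    :param edges: iterable of (cave, cave) pairs
    :param allow_revisit: if True, one small cave may be visited twice
    """
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)

    @lru_cache(maxsize=None)
    def count_from(cave, seen, revisit_left):
        # seen: frozenset of small caves already on the path
        # revisit_left: True while the single revisit is still unused
        if cave == "end":
            return 1
        total = 0
        for nxt in adjacency[cave]:
            if nxt == "start":  # never return to start
                continue
            if nxt.isupper():  # big cave: no bookkeeping needed
                total += count_from(nxt, seen, revisit_left)
            elif nxt not in seen:  # first visit to a small cave
                total += count_from(nxt, seen | {nxt}, revisit_left)
            elif revisit_left:  # spend the one permitted revisit
                total += count_from(nxt, seen, False)
        return total

    return count_from("start", frozenset(), allow_revisit)


# First example from the puzzle text
example_edges = [
    ("start", "A"), ("start", "b"), ("A", "c"), ("A", "b"),
    ("b", "d"), ("A", "end"), ("b", "end"),
]
print(count_paths(example_edges))                      # 10
print(count_paths(example_edges, allow_revisit=True))  # 36
```

With the helpers defined earlier, `count_paths(load_input(datapath), allow_revisit=True)` would be the equivalent call for the puzzle data. The trade-off is that the paths themselves are no longer available, only their number, which is all the puzzle asks for.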