Differences between revisions 17 and 18
Revision 17 as of 2021-06-16 08:17:10
Size: 16262
Editor: XmlRpcBot
Comment:
Revision 18 as of 2021-06-21 14:16:33
Size: 20446
Editor: XmlRpcBot
Comment:
Deletions are marked like this. Additions are marked like this.
Line 46: Line 46:
== hillclimbing == == Hill climbing ==
This algorithm uses hill climbing to generate patterns optimized for the [[Doc/Evaluator#Canonical_PDB|Canonical PDB]] heuristic. It it described in the following paper:

 * Patrik Haslum, Adi Botea, Malte Helmert, Blai Bonet and Sven Koenig.<<BR>>
 [[http://www.informatik.uni-freiburg.de/~ki/papers/haslum-etal-aaai07.pdf|Domain-Independent Construction of Pattern Database Heuristics for Cost-Optimal Planning]].<<BR>>
 In ''Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI 2007)'', pp. 1007-1012. AAAI Press, 2007.

For implementation notes, see:

 * Silvan Sievers, Manuela Ortlieb and Malte Helmert.<<BR>>
 [[https://ai.dmi.unibas.ch/papers/sievers-et-al-socs2012.pdf|Efficient Implementation of Pattern Database Heuristics for Classical Planning]].<<BR>>
 In ''Proceedings of the Fifth Annual Symposium on Combinatorial Search (SoCS 2012)'', pp. 105-111. AAAI Press, 2012.
Line 50: Line 62:
Line 58: Line 69:
'''Note:''' The pattern collection created by the algorithm will always contain all patterns consisting of a single goal variable, even if this violates the pdb_max_size or collection_max_size limits.

'''Note:''' This pattern generation method generates patterns optimized for use with the canonical pattern database heuristic.

=== Implementation Notes ===
The following will very briefly describe the algorithm and explain the differences between the original implementation from 2007 and the new one in Fast Downward.

The aim of the algorithm is to output a pattern collection for which the [[Doc/Evaluator#Canonical_PDB|Canonical PDB]] yields the best heuristic estimates.

The algorithm is basically a local search (hill climbing) which searches the "pattern neighbourhood" (starting initially with a pattern for each goal variable) for improving the pattern collection. This is done as described in the section "pattern construction as search" in the paper, except for the corrected search neighbourhood discussed below. For evaluating the neighbourhood, the "counting approximation" as introduced in the paper was implemented. An important difference however consists in the fact that this implementation computes all pattern databases for each candidate pattern rather than using A* search to compute the heuristic values only for the sample states for each pattern.

Also the logic for sampling the search space differs a bit from the original implementation. The original implementation uses a random walk of a length which is binomially distributed with the mean at the estimated solution depth (estimation is done with the current pattern collection heuristic). In the Fast Downward implementation, also a random walk is used, where the length is the estimation of the number of solution steps, which is calculated by dividing the current heuristic estimate for the initial state by the average operator costs of the planning task (calculated only once and not updated during sampling!) to take non-unit cost problems into account. This yields a random walk of an expected lenght of np = 2 * estimated number of solution steps. If the random walk gets stuck, it is being restarted from the initial state, exactly as described in the original paper.

The section "avoiding redundant evaluations" describes how the search neighbourhood of patterns can be restricted to variables that are relevant to the variables already included in the pattern by analyzing causal graphs. There is a mistake in the paper that leads to some relevant neighbouring patterns being ignored. See the [[https://ai.dmi.unibas.ch/research/publications.html|errata]] for details. This mistake has been addressed in this implementation. The second approach described in the paper (statistical confidence interval) is not applicable to this implementation, as it doesn't use A* search but constructs the entire pattern databases for all candidate patterns anyway.
The search is ended if there is no more improvement (or the improvement is smaller than the minimal improvement which can be set as an option), however there is no limit of iterations of the local search. This is similar to the techniques used in the original implementation as described in the paper.

Factory for pattern collections

combo

combo(max_states=1000000)
  • max_states (int [1, infinity]): maximum abstraction size for combo strategy

Genetic Algorithm Patterns

The following paper describes the automated creation of pattern databases with a genetic algorithm. Pattern collections are initially created with a bin-packing algorithm. The genetic algorithm is used to optimize the pattern collections with an objective function that estimates the mean heuristic value of the the pattern collections. Pattern collections with higher mean heuristic estimates are more likely selected for the next generation.

genetic(pdb_max_size=50000, num_collections=5, num_episodes=30, mutation_probability=0.01, disjoint=false, random_seed=-1)
  • pdb_max_size (int [1, infinity]): maximal number of states per pattern database

  • num_collections (int [1, infinity]): number of pattern collections to maintain in the genetic algorithm (population size)

  • num_episodes (int [0, infinity]): number of episodes for the genetic algorithm

  • mutation_probability (double [0.0, 1.0]): probability for flipping a bit in the genetic algorithm

  • disjoint (bool): consider a pattern collection invalid (giving it very low fitness) if its patterns are not disjoint

  • random_seed (int [-1, infinity]): Set to -1 (default) to use the global random number generator. Set to any other value to use a local random number generator with the given seed.

Note: This pattern generation method uses the zero/one pattern database heuristic.

Implementation Notes

The standard genetic algorithm procedure as described in the paper is implemented in Fast Downward. The implementation is close to the paper.

  1. Initialization
    In Fast Downward bin-packing with the next-fit strategy is used. A bin corresponds to a pattern which contains variables up to pdb_max_size. With this method each variable occurs exactly in one pattern of a collection. There are num_collections collections created.

  2. Mutation
    With probability mutation_probability a bit is flipped meaning that either a variable is added to a pattern or deleted from a pattern.

  3. Recombination
    Recombination isn't implemented in Fast Downward. In the paper recombination is described but not used.

  4. Evaluation
    For each pattern collection the mean heuristic value is computed. For a single pattern database the mean heuristic value is the sum of all pattern database entries divided through the number of entries. Entries with infinite heuristic values are ignored in this calculation. The sum of these individual mean heuristic values yield the mean heuristic value of the collection.

  5. Selection
    The higher the mean heuristic value of a pattern collection is, the more likely this pattern collection should be selected for the next generation. Therefore the mean heuristic values are normalized and converted into probabilities and Roulette Wheel Selection is used.

Language features supported:

  • action costs: supported

  • conditional effects: not supported

  • axioms: not supported

Hill climbing

This algorithm uses hill climbing to generate patterns optimized for the Canonical PDB heuristic. It it described in the following paper:

For implementation notes, see:

hillclimbing(pdb_max_size=2000000, collection_max_size=20000000, num_samples=1000, min_improvement=10, max_time=infinity, random_seed=-1)
  • pdb_max_size (int [1, infinity]): maximal number of states per pattern database

  • collection_max_size (int [1, infinity]): maximal number of states in the pattern collection

  • num_samples (int [1, infinity]): number of samples (random states) on which to evaluate each candidate pattern collection

  • min_improvement (int [1, infinity]): minimum number of samples on which a candidate pattern collection must improve on the current one to be considered as the next pattern collection

  • max_time (double [0.0, infinity]): maximum time in seconds for improving the initial pattern collection via hill climbing. If set to 0, no hill climbing is performed at all. Note that this limit only affects hill climbing. Use max_time_dominance_pruning to limit the time spent for pruning dominated patterns.

  • random_seed (int [-1, infinity]): Set to -1 (default) to use the global random number generator. Set to any other value to use a local random number generator with the given seed.

Note: The pattern collection created by the algorithm will always contain all patterns consisting of a single goal variable, even if this violates the pdb_max_size or collection_max_size limits.

Note: This pattern generation method generates patterns optimized for use with the canonical pattern database heuristic.

Implementation Notes

The following will very briefly describe the algorithm and explain the differences between the original implementation from 2007 and the new one in Fast Downward.

The aim of the algorithm is to output a pattern collection for which the Canonical PDB yields the best heuristic estimates.

The algorithm is basically a local search (hill climbing) which searches the "pattern neighbourhood" (starting initially with a pattern for each goal variable) for improving the pattern collection. This is done as described in the section "pattern construction as search" in the paper, except for the corrected search neighbourhood discussed below. For evaluating the neighbourhood, the "counting approximation" as introduced in the paper was implemented. An important difference however consists in the fact that this implementation computes all pattern databases for each candidate pattern rather than using A* search to compute the heuristic values only for the sample states for each pattern.

Also the logic for sampling the search space differs a bit from the original implementation. The original implementation uses a random walk of a length which is binomially distributed with the mean at the estimated solution depth (estimation is done with the current pattern collection heuristic). In the Fast Downward implementation, also a random walk is used, where the length is the estimation of the number of solution steps, which is calculated by dividing the current heuristic estimate for the initial state by the average operator costs of the planning task (calculated only once and not updated during sampling!) to take non-unit cost problems into account. This yields a random walk of an expected lenght of np = 2 * estimated number of solution steps. If the random walk gets stuck, it is being restarted from the initial state, exactly as described in the original paper.

The section "avoiding redundant evaluations" describes how the search neighbourhood of patterns can be restricted to variables that are relevant to the variables already included in the pattern by analyzing causal graphs. There is a mistake in the paper that leads to some relevant neighbouring patterns being ignored. See the errata for details. This mistake has been addressed in this implementation. The second approach described in the paper (statistical confidence interval) is not applicable to this implementation, as it doesn't use A* search but constructs the entire pattern databases for all candidate patterns anyway. The search is ended if there is no more improvement (or the improvement is smaller than the minimal improvement which can be set as an option), however there is no limit of iterations of the local search. This is similar to the techniques used in the original implementation as described in the paper.

manual_patterns

manual_patterns(patterns)
  • patterns (list of list of int): list of patterns (which are lists of variable numbers of the planning task).

Multiple CEGAR

This pattern collection generator implements the multiple CEGAR algorithm described in the paper

multiple_cegar(total_max_time=100.0, stagnation_limit=20.0, blacklist_trigger_percentage=0.75, enable_blacklist_on_stagnation=true, max_pdb_size=2000000, max_collection_size=20000000, use_wildcard_plans=true, max_time=infinity, verbosity=normal, random_seed=-1)
  • total_max_time (double [0.0, infinity]): maximum time in seconds for the multiple CEGAR algorithm. The algorithm will always execute at least one iteration, i.e., call the CEGAR algorithm once. This limit possibly overrides the limit specified for the CEGAR algorithm.

  • stagnation_limit (double [1.0, infinity]): maximum time in seconds the multiple CEGAR algorithm allows without generating a new pattern through the CEGAR algorithm. The multiple CEGAR algorithm terminates prematurely if this limit is hit unless enable_blacklist_on_stagnation is enabled.

  • blacklist_trigger_percentage (double [0.0, 1.0]): percentage of total_max_time after which the multiple CEGAR algorithm enables blacklisting for diversification

  • enable_blacklist_on_stagnation (bool): If true, the multiple CEGAR algorithm will enable blacklisting for diversification when stagnation_limit is hit for the first time (unless it was already enabled due to blacklist_trigger_percentage) and terminate when stagnation_limit is hit for the second time.

  • max_pdb_size (int [1, infinity]): maximum number of states per pattern database (ignored for the initial collection consisting of singleton patterns for each goal variable)

  • max_collection_size (int [1, infinity]): maximum number of states in the pattern collection (ignored for the initial collection consisting of singleton patterns for each goal variable)

  • use_wildcard_plans (bool): if true, compute wildcard plans which are sequences of sets of operators that induce the same transition; otherwise compute regular plans which are sequences of single operators

  • max_time (double [0.0, infinity]): maximum time in seconds for the CEGAR algorithm (ignored forcomputing initial collection)

  • verbosity ({silent, normal, verbose, debug}): Option to specify the verbosity level.

    • silent: only the most basic output

    • normal: relevant information to monitor progress

    • verbose: full output

    • debug: like full with additional debug output

  • random_seed (int [-1, infinity]): Set to -1 (default) to use the global random number generator. Set to any other value to use a local random number generator with the given seed.

Implementation Notes

The following describes differences of the implementation to the original implementation used and described in the paper.

Conceptually, there is one larger difference which concerns the computation of (regular or wildcard) plans for PDBs. The original implementation used an enforced hill-climbing (EHC) search with the PDB as the perfect heuristic, which ensured finding strongly optimal plans, i.e., optimal plans with a minimum number of zero-cost operators, in domains with zero-cost operators. The original implementation also slightly modified EHC to search for a best-improving successor, chosen uniformly at random among all best-improving successors.

In contrast, the current implementation computes a plan alongside the computation of the PDB itself. A modification to Dijkstra's algorithm for computing the PDB values stores, for each state, the operator leading to that state (in a regression search). This generating operator is updated only if the algorithm found a cheaper path to the state. After Dijkstra finishes, the plan computation starts at the initial state and iteratively follows the generating operator, computes all operators of the same cost inducing the same transition, until reaching a goal. This constitutes a wildcard plan. It is turned into a regular one by randomly picking a single operator for each transition.

Note that this kind of plan extraction does not consider all successors of a state uniformly at random but rather uses the previously deterministically chosen generating operator to settle on one successor state, which is biased by the number of operators leading to the same successor from the given state. Further note that in the presence of zero-cost operators, this procedure does not guarantee that the computed plan is strongly optimal because it does not minimize the number of used zero-cost operators leading to the state when choosing a generating operator. Experiments have shown (issue1007) that this speeds up the computation significantly while not having a strongly negative effect on heuristic quality due to potentially computing worse plans.

Two further changes fix bugs of the original implementation to match the description in the paper. The first concerns the single CEGAR algorithm: raise a flaw for all goal variables of the task if the plan for a PDB can be executed on the concrete task but does not lead to a goal state. Previously, such flaws would not have been raised because all goal variables are part of the collection from the start on and therefore not considered. This means that the original implementation accidentally disallowed merging patterns due to goal violation flaws. The second bug fix is to actually randomize the order of parallel operators in wildcard plan steps.

Single CEGAR

This pattern collection generator implements the single CEGAR algorithm described in the paper

single_cegar(max_pdb_size=2000000, max_collection_size=20000000, use_wildcard_plans=true, max_time=infinity, verbosity=normal, random_seed=-1)
  • max_pdb_size (int [1, infinity]): maximum number of states per pattern database (ignored for the initial collection consisting of singleton patterns for each goal variable)

  • max_collection_size (int [1, infinity]): maximum number of states in the pattern collection (ignored for the initial collection consisting of singleton patterns for each goal variable)

  • use_wildcard_plans (bool): if true, compute wildcard plans which are sequences of sets of operators that induce the same transition; otherwise compute regular plans which are sequences of single operators

  • max_time (double [0.0, infinity]): maximum time in seconds for the CEGAR algorithm (ignored forcomputing initial collection)

  • verbosity ({silent, normal, verbose, debug}): Option to specify the verbosity level.

    • silent: only the most basic output

    • normal: relevant information to monitor progress

    • verbose: full output

    • debug: like full with additional debug output

  • random_seed (int [-1, infinity]): Set to -1 (default) to use the global random number generator. Set to any other value to use a local random number generator with the given seed.

Implementation Notes

The following describes differences of the implementation to the original implementation used and described in the paper.

Conceptually, there is one larger difference which concerns the computation of (regular or wildcard) plans for PDBs. The original implementation used an enforced hill-climbing (EHC) search with the PDB as the perfect heuristic, which ensured finding strongly optimal plans, i.e., optimal plans with a minimum number of zero-cost operators, in domains with zero-cost operators. The original implementation also slightly modified EHC to search for a best-improving successor, chosen uniformly at random among all best-improving successors.

In contrast, the current implementation computes a plan alongside the computation of the PDB itself. A modification to Dijkstra's algorithm for computing the PDB values stores, for each state, the operator leading to that state (in a regression search). This generating operator is updated only if the algorithm found a cheaper path to the state. After Dijkstra finishes, the plan computation starts at the initial state and iteratively follows the generating operator, computes all operators of the same cost inducing the same transition, until reaching a goal. This constitutes a wildcard plan. It is turned into a regular one by randomly picking a single operator for each transition.

Note that this kind of plan extraction does not consider all successors of a state uniformly at random but rather uses the previously deterministically chosen generating operator to settle on one successor state, which is biased by the number of operators leading to the same successor from the given state. Further note that in the presence of zero-cost operators, this procedure does not guarantee that the computed plan is strongly optimal because it does not minimize the number of used zero-cost operators leading to the state when choosing a generating operator. Experiments have shown (issue1007) that this speeds up the computation significantly while not having a strongly negative effect on heuristic quality due to potentially computing worse plans.

Two further changes fix bugs of the original implementation to match the description in the paper. The first concerns the single CEGAR algorithm: raise a flaw for all goal variables of the task if the plan for a PDB can be executed on the concrete task but does not lead to a goal state. Previously, such flaws would not have been raised because all goal variables are part of the collection from the start on and therefore not considered. This means that the original implementation accidentally disallowed merging patterns due to goal violation flaws. The second bug fix is to actually randomize the order of parallel operators in wildcard plan steps.

Systematically generated patterns

Generates all (interesting) patterns with up to pattern_max_size variables. For details, see

systematic(pattern_max_size=1, only_interesting_patterns=true)
  • pattern_max_size (int [1, infinity]): max number of variables per pattern

  • only_interesting_patterns (bool): Only consider the union of two disjoint patterns if the union has more information than the individual patterns.

FastDownward: Doc/PatternCollectionGenerator (last edited 2024-01-11 22:26:38 by XmlRpcBot)