Experiment scripts
In the directory "new-scripts" you find some scripts that facilitate conducting experiments. An experiment is conducted in three stages: generation of experiments, fetching of results and production of reports. Each stage has its own generic main module: experiments.py, resultfetcher.py and reports.py. These modules provide useful classes and methods and can be imported by scripts that actually define concrete actions. For the Fast Downward planning system, the example scripts that use these modules are downward-experiments.py, downward-resultfetcher.py and downward-reports.py. Together they can be used to conduct Fast Downward experiments. Passing -h on the command line gives you an overview of each script's commands.
Generate an experiment
./downward-experiments.py test-exp -c downward_configs.py:cfg2 -s TEST
Generates a simple planning experiment for the suite TEST in the directory "test-exp". The planner will use the configuration string cfg2 found in the file downward_configs.py.
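For illustration, an entry in downward_configs.py might look roughly like the following. This is only a sketch: the assumption (not taken from the repository) is that each configuration is a plain module-level string holding the arguments for the search component, and that -c downward_configs.py:cfg2 picks the variable named cfg2.

# Hypothetical sketch of an entry in downward_configs.py; the variable name
# and the exact string format are assumptions for illustration only.
cfg2 = "--search 'astar(cea())'"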
Let's have a look at the file downward-experiments.py to learn about the methods that experiments.py provides (note: for easier understanding, this is actually an earlier version of the file that doesn't support loading configurations and instead has one hardcoded configuration):
#! /usr/bin/env python
"""
Simple script to demonstrate the use of experiments.py for planning experiments
"""
import os

import experiments
import planning_suites

# We can add our own commandline parameters
parser = experiments.ExpArgParser()
parser.add_argument('-s', '--suite', default=[], nargs='+',
                    help='tasks, domains or suites')

# Factory for experiments.
#
# Parses cmd-line options to decide whether this is a gkigrid
# experiment or a local experiment.
# NOTE: All parameters are given to the experiment instance
exp = experiments.build_experiment(parser=parser)

# Includes a "global" file, i.e., one needed for all runs, into the
# experiment archive. In case of GkiGridExperiment, copies it to the
# main directory of the experiment. The name "PLANNER" is an ID for
# this resource that can also be used to refer to it in shell scripts.
exp.add_resource("PLANNER", "../downward/search/downward", "downward")

problems = planning_suites.build_suite(exp.suite)

for problem in problems:
    # Adds a new run to the experiment and returns it
    run = exp.add_run()

    # Make the planner resource available for this run.
    # In environments like the argo cluster, this implies
    # copying the planner into each task. For the gkigrid, we merely
    # need to set up the PLANNER environment variable.
    run.require_resource('PLANNER')

    domain_file = problem.domain_file()
    problem_file = problem.problem_file()

    # Copy "../benchmarks/domain/domain.pddl" into the run
    # directory under name "domain.pddl" and make it available as
    # resource "DOMAIN" (usable as environment variable $DOMAIN).
    run.add_resource('DOMAIN', domain_file, 'domain.pddl')
    run.add_resource('PROBLEM', problem_file, 'problem.pddl')

    translator_path = '../downward/translate/translate.py'
    translator_path = os.path.abspath(translator_path)
    translate_cmd = '%s %s %s' % (translator_path, domain_file, problem_file)

    preprocessor_path = '../downward/preprocess/preprocess'
    preprocessor_path = os.path.abspath(preprocessor_path)
    preprocess_cmd = '%s < %s' % (preprocessor_path, 'output.sas')

    # Optionally, we can use run.set_preprocess() and
    # run.set_postprocess() to specify code that should be run
    # before the main command, i.e., outside the part for which we
    # restrict runtime and memory. For example, post-processing
    # could be used to rename result files or zip them up. The
    # postprocessing code can find out whether the command succeeded
    # or was aborted via the environment variable $RETURNCODE.
    run.set_preprocess('%s; %s' % (translate_cmd, preprocess_cmd))

    # A bash fragment that gives the code to be run when invoking
    # this job.
    run.set_command("$PLANNER --search 'astar(cea())' < output")

    # Specifies that all file names matching "plan.soln*" (using
    # shell-style glob patterns) are part of the experiment output.
    # There's a corresponding declare_required_output for output
    # files that must be present at the end or we have an error. A
    # specification like this is e.g. necessary for the Argo
    # cluster. On the gkigrid, this wouldn't do anything, although
    # the declared outputs are stored so that we
    # can later verify that all went according to plan.
    run.declare_optional_output('*.groups')
    run.declare_optional_output('output')
    run.declare_optional_output('output.sas')
    run.declare_optional_output('sas_plan')

    # Set some properties to be able to analyze the run correctly.
    # The properties are written into the "properties" file.
    run.set_property('config', 'astar-cea')
    run.set_property('domain', problem.domain)
    run.set_property('problem', problem.problem)
    # The run's id determines the directory it will be copied to by resultfetcher.
    run.set_property('id', ['astar-cea', problem.domain, problem.problem])

# Actually write and copy all the files
exp.build()
The file can also be seen as a reference example for your own experiments. As you can see, not many lines are needed to conduct a full-fledged experiment. When you invoke the script, you can specify on the command line whether you want the experiment to be run locally or on the gkigrid. You can also directly set the timeout, memory limit, number of processes, etc.
Local experiments can be started by running
./test-exp/run
Gkigrid experiments are submitted to the queue by running
qsub test-exp/test-exp.q
Fetch and parse results
./downward-resultfetcher.py test-exp
Traverses the directory tree under "test-exp" and parses each run's experiment files. The results are written into a new directory structure under "test-exp-eval". In the process each run's properties file is read and its "id" determines the run's destination directory in the new directory tree. By default only the properties file is copied and the parsed values are added to it. To copy all files you can pass the "-c" option.
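To make the role of the properties file concrete, here is an illustrative example of what one run's properties might look like after fetching. The domain, problem and expanded values are made up, and the exact on-disk format is an assumption; only the keys set in the example script above come from the source.

# Illustrative properties of a single run (hypothetical values). The 'id'
# list determines the run's destination directory, e.g.
# test-exp-eval/astar-cea/gripper/prob01.pddl/.
properties = {
    'config': 'astar-cea',
    'domain': 'gripper',            # hypothetical domain name
    'problem': 'prob01.pddl',       # hypothetical problem name
    'id': ['astar-cea', 'gripper', 'prob01.pddl'],
    'expanded': 1204,               # example of a value added by resultfetcher
}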
Combine results of multiple experiments
It is possible to combine the results of multiple experiments by running the above command on all the experiment directories while specifying the target evaluation directory. If you don't specify the evaluation directory it defaults to "exp-name-eval". An example would be
./downward-resultfetcher.py exp1 --dest my-eval-dir
./downward-resultfetcher.py exp2 --dest my-eval-dir
Make reports
./downward-reports.py test-exp-eval
Reads all properties files found under "test-exp-eval" and generates a big dataset from them. This dataset is serialized into the "test-exp-eval" directory for faster future reports. If you want to reload the information directly from the properties files, pass the "--reload" parameter.
The dataset is then used to generate a report. By default this report contains absolute numbers, writes a LaTeX file and analyzes all numeric attributes found in the dataset. You can, however, choose only a subset of attributes and also filter by configurations or suites. A detailed description of the available parameters can be obtained by invoking downward-reports.py -h.
Show significant changes only
You can also compare two configs/revisions and only show the rows that have changed significantly. To do so, select a relative report (-r rel) and specify a percentage (--change number). The report will then only contain those rows for which the two values in the row have changed by more than number percent. Here is an example:
./downward-reports.py exp-eval/ -a expanded -r rel --change 5
This will only show the rows where the number of expansions has changed by more than five percent.
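As a rough illustration of how such a row filter could work, the sketch below keeps a row only if the two values differ by more than the given percentage. It is not taken from downward-reports.py, and measuring the change relative to the first value is an assumption.

# Sketch of a significance check for "-r rel --change 5" (assumed semantics).
def changed_significantly(value_a, value_b, change_percent=5):
    if value_a == 0:
        return value_b != 0
    return abs(value_b - value_a) * 100.0 / value_a > change_percent

# changed_significantly(1000, 1030) -> False; changed_significantly(1000, 1100) -> True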
Making a problem suite from results
The downward-reports.py script also gives you the possibility to create a new problem suite based on the results of an experiment. To select a subset of problems you can specify filters for the set of runs. E.g. to get a list of problems that had more than 1000 states expanded in the ou config, you could issue the following command:
./downward-reports.py your-exp-name --filter config:eq:ou expanded:gt:1000
(Remember to pass the correct name for the config; it might not be just its nickname.) As you can see, the format of a filter is <attribute_name>:<operator from the operator module>:<value>. If the expression operator(run[attribute], value) evaluates to True, the run's planning problem is not removed from the result list.
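The sketch below shows how a filter string like expanded:gt:1000 could be interpreted with Python's operator module. It is not taken from downward-reports.py; the helper name and the crude value conversion are assumptions for illustration.

import operator

# Sketch: split the filter into attribute, operator name and value, look the
# operator up in the operator module and keep the run if the comparison holds.
def run_matches(run, filter_spec):
    attribute, op_name, value = filter_spec.split(':')
    op = getattr(operator, op_name)      # e.g. operator.gt, operator.eq
    if value.isdigit():
        value = int(value)               # crude conversion for numeric values
    return op(run[attribute], value)

# run_matches({'config': 'ou', 'expanded': 2500}, 'expanded:gt:1000') -> True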
Comparing different revisions
If you want to compare different revisions of fast-downward, you can use the Python module downward_comparisons.py. It provides an easy way to select and compare specific revisions of the three subsystems (translate, preprocess and search). It does so by using the experiments.py module. The usage is pretty simple. As an example, we will look at the code that was used to get some information about issue69 from the issue tracker (the code resides in issue69.py):
from downward_comparisons import *

combinations = [
    (TranslatorCheckout(), PreprocessorCheckout(), PlannerCheckout(rev=3612)),
    (TranslatorCheckout(), PreprocessorCheckout(), PlannerCheckout(rev=3613)),
    (TranslatorCheckout(), PreprocessorCheckout(), PlannerCheckout(rev='HEAD')),
]

build_comparison_exp(combinations)
This code builds an experiment that compares three revisions of the search component: rev 3612, rev 3613 and the latest (HEAD) revision. As you can see, the translation and preprocessing components have been assigned no explicit revision. This can be done since all the different Checkout classes default to the HEAD revision. The Checkout classes also have another keyword parameter called repo_url that can be used when you don't want to check out a subsystem from trunk.
One combination of three checkouts results in one run of the fast-downward system (translate -> preprocess -> search) for each problem and configuration. Obviously you should check out different revisions of the subsystems you want to compare and let the other subsystems have the same revisions in all runs.
As another example, if you want to compare your modified translator in your own branch with the one from trunk, you could do:
combinations = [
    (TranslatorCheckout(repo_url='svn+ssh://downward/branches/my-own-translator/downward/translate',
                        rev=1234),
     PreprocessorCheckout(), PlannerCheckout()),
    (TranslatorCheckout(), PreprocessorCheckout(), PlannerCheckout()),
]
When running your script, you'll be prompted to specify the suites and configurations. You have the same options here as for the downward-experiments.py script.
Example: Issue 7
You don't have to supply new Checkout instances for each combination. See the comparison experiment for issue7 as an example. This code compares three different revisions of the translator in an external branch with the checked-in translator from trunk. The revisions for the preprocessor and the search component remain the same for all combinations.
from downward_comparisons import *

branch = 'svn+ssh://downward/branches/translate-andrew/downward/translate'

preprocessor = PreprocessorCheckout()
planner = PlannerCheckout(rev=3842)

combinations = [
    (TranslatorCheckout(repo_url=branch, rev=3827), preprocessor, planner),
    (TranslatorCheckout(repo_url=branch, rev=3829), preprocessor, planner),
    (TranslatorCheckout(repo_url=branch, rev=3840), preprocessor, planner),
    (TranslatorCheckout(rev=4283), preprocessor, planner),
]

build_comparison_exp(combinations)
Example: Checking impacts of your changes
If you want to compare the impact of the changes in your working copy against the latest checked-in revision, you don't even have to write your own Python script. All you have to do is invoke the comparisons module and supply the appropriate command line parameters. An example could be:
./downward_comparisons.py your-exp-name -s TEST -c yY
This command creates an experiment that compares the subsystems in the working copy with the versions of the subsystems in the head revision for the TEST suite using the yY configuration. All you have to do to evaluate the results after that is:
qsub your-exp-name/your_exp_name.q   # or, for a local experiment: ./your-exp-name/run
./downward-resultfetcher.py your-exp-name
./downward-reports.py your-exp-name
You find the report in the reports directory.
Behind the scenes of this example
The code that is called by the downward_comparisons.py module when invoked as a script is pretty simple and shows more usage options of the module:
combinations = [
    get_same_rev_combo('HEAD'),
    get_same_rev_combo('WORK'),
]

build_comparison_exp(combinations)
As you can see, instead of writing
(TranslatorCheckout(rev='HEAD'), PreprocessorCheckout(rev='HEAD'), PlannerCheckout(rev='HEAD'))
we can write
get_same_rev_combo('HEAD')
You can also see that you can set rev to 'WORK' to use the working copy.
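For reference, get_same_rev_combo can be thought of as roughly the following helper. This is a sketch inferred from the two snippets above; the actual definition in downward_comparisons.py may differ.

# Sketch of the helper: one combination that uses the same revision for all
# three subsystems.
def get_same_rev_combo(rev):
    return (TranslatorCheckout(rev=rev),
            PreprocessorCheckout(rev=rev),
            PlannerCheckout(rev=rev))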