Draft:Quality-Diversity algorithms

Quality-Diversity (QD) algorithms are a class of evolutionary algorithms that simultaneously aim for high-quality and diverse solutions. Unlike traditional optimization algorithms, which focus solely on finding the best solution to a problem, QD algorithms explore a wide variety of solutions across a problem space and keep those that are not just high-performing but also diverse and unique. The essence of these algorithms is to strike a balance between exploration (searching for diverse solutions) and exploitation (refining high-quality solutions).

Background

Optimization algorithms have traditionally been designed to discover the single best solution to a problem. In many real-world scenarios, however, a diverse set of high-quality solutions is needed. Quality-Diversity algorithms emerged from this need, providing a method to navigate large search spaces and uncover varied solutions that are also highly effective.

Drawing a parallel from nature can help illustrate this concept. Consider the task of moving fast. Animals like the cheetah, ant, and hummingbird each excel at rapid motion, but in distinctly different contexts. A cheetah achieves remarkable speeds on the African plains, using its lithe body and powerful muscles to chase prey. In contrast, an ant, despite its diminutive size, can cover ground quickly relative to its body length, navigating intricate terrains with agility. Meanwhile, the hummingbird's rapid wingbeats allow it to hover and dart through the air with precision. Directly comparing their speeds is not only impractical but also overlooks the distinct challenges and environments each is adapted to. Similarly, Quality-Diversity algorithms aim to find a range of solutions optimized for varying conditions, rather than a single "best" one.

Principles of QD algorithms

Selection

In QD algorithms, selection is based on both the quality and the diversity of solutions. Typically, these algorithms maintain an archive of solutions, and when a new solution is generated, it is compared to those in the archive in terms of both quality and diversity.

  • Quality Evaluation: Each solution's effectiveness or suitability is assessed using an objective or fitness function. The higher the quality, the more desirable the solution.
  • Diversity Evaluation: Alongside quality, the uniqueness of the solution relative to other solutions is assessed. This ensures that the algorithm does not converge prematurely to a narrow region of the solution space.
  • Archive Maintenance: The selection process interacts closely with an archive, a stored set of solutions. Solutions that excel in both quality and diversity are more likely to be added to the archive, potentially replacing others; a minimal sketch of such an insertion rule follows this list.
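
As a concrete illustration, the insertion rule of a grid-based archive (in the style of MAP-Elites, described below) can be written in a few lines. The following Python sketch is illustrative only, not taken from any particular library; the names (Solution, cell_index, try_insert) are hypothetical, and each descriptor dimension is assumed to lie in [0, 1].

    from dataclasses import dataclass

    @dataclass
    class Solution:
        params: list        # the solution itself (the "genotype")
        descriptor: tuple   # behavioral characteristics, each in [0, 1]
        fitness: float      # quality; higher is better

    def cell_index(descriptor, bins_per_dim=10):
        """Discretize a behavioral descriptor into a grid cell."""
        return tuple(min(int(x * bins_per_dim), bins_per_dim - 1)
                     for x in descriptor)

    def try_insert(archive, candidate):
        """Keep the candidate if its cell is empty (it is novel) or if
        it beats the cell's current occupant (it is higher quality)."""
        cell = cell_index(candidate.descriptor)
        incumbent = archive.get(cell)
        if incumbent is None or candidate.fitness > incumbent.fitness:
            archive[cell] = candidate
            return True
        return False

Under such a rule, solutions compete only against others with similar behavioral characteristics, so covering a new region of the behavior space is always rewarded.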

Diversity measurement

Diversity ensures that the algorithm explores a broad section of the solution space, avoiding convergence to a single solution. This is vital for QD algorithms, since the aim is to find multiple, varied solutions that are all of high quality.

  • Spatial Metrics: These measure the "distance" between solutions. In a simple numeric representation, this might be Euclidean distance. The idea is that solutions further apart in the solution space are more diverse.
  • Behavioral Characteristics: For more complex problems, especially in domains like robotics or AI, diversity might be measured in terms of behavior. For instance, two AI agents might solve a problem differently, and their methods (or behaviors) can be a basis for measuring diversity; a sketch of one such behavioral measure follows this list.
  • Phenotypic Differences: In some contexts, especially those borrowing from genetic algorithms, the "phenotype" (or expressed traits) of a solution might be used to gauge diversity. Two solutions with different phenotypes are considered diverse, even if their underlying "genetic" representation is similar.
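
One widely used behavioral diversity measure is the novelty score of novelty search: the average distance from a solution's behavioral descriptor to its k nearest neighbours among the descriptors seen so far[18]. The sketch below is a minimal, illustrative Python implementation; the function names and the default k are assumptions rather than a fixed standard.

    import math

    def euclidean(a, b):
        """Euclidean distance between two behavioral descriptors."""
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def novelty(descriptor, known_descriptors, k=15):
        """Average distance to the k nearest known behaviors; a higher
        score means the behavior is farther from anything seen so far."""
        distances = sorted(euclidean(descriptor, other)
                           for other in known_descriptors)
        nearest = distances[:k]
        return sum(nearest) / len(nearest) if nearest else float("inf")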

Search strategy

The search strategy dictates how new potential solutions are generated from existing ones. It is the mechanism of exploration and innovation in QD algorithms.

  • Mutation: This involves introducing small, random changes to a solution to produce a new variant, akin to genetic mutation in biology. Through mutation, the algorithm can explore the solution space around known solutions.
  • Crossover or Recombination: Borrowed from evolutionary algorithms, crossover combines parts of two or more "parent" solutions to produce "offspring" solutions. It is a way of blending the characteristics of known solutions to potentially discover new, high-quality ones.
  • Exploration vs. Exploitation: The balance between exploring new regions of the solution space (through diverse solutions) and exploiting known good regions (focusing on high-quality solutions) is crucial. The search strategy, through its mutation and crossover parameters, can influence this balance; both operators are sketched below.
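
For solutions encoded as vectors of real numbers, the two variation operators above are commonly implemented along the following lines. This Python sketch is illustrative; Gaussian mutation and uniform crossover are common choices, but many alternatives exist, and the parameter values shown are assumptions.

    import random

    def mutate(params, sigma=0.1):
        """Gaussian mutation: perturb each gene slightly. A larger
        sigma shifts the balance toward exploration."""
        return [x + random.gauss(0.0, sigma) for x in params]

    def crossover(parent_a, parent_b):
        """Uniform crossover: take each gene from one of the two
        parents at random, blending their characteristics."""
        return [a if random.random() < 0.5 else b
                for a, b in zip(parent_a, parent_b)]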

Applications

Quality-Diversity algorithms find significant applications in fields like robotics[1][2][3][4][5][6], games[7][8][9], open-ended learning[10][11][12][13], and artificial intelligence[14][15][16][17], where they are used to generate diverse sets of high-performing strategies, designs, or behaviors.

Notable algorithms

  • Novelty Search[18]: This algorithm operates on the premise that directly pursuing objective performance can sometimes prevent the search from reaching the global optimum. Instead of rewarding performance, it rewards novelty, encouraging exploration of the problem space.
  • MAP-Elites[19]: This algorithm, short for Multi-dimensional Archive of Phenotypic Elites, discretizes a space of solution characteristics into cells and keeps the highest-performing solution found in each cell, producing a map of elites. Each cell represents a unique combination of characteristics; a minimal sketch of its main loop appears after this list.
  • NEAT (NeuroEvolution of Augmenting Topologies)[20]: While NEAT is not a QD algorithm per se, it has been used in conjunction with QD methods. NEAT evolves neural networks by starting with small, simple networks and expanding the network structure over generations. Coupling it with QD algorithms promotes the evolution of diverse, high-quality network topologies.
  • QD-RL (Quality Diversity-Reinforcement Learning) algorithms, which integrate aspects of reinforcement learning into Quality-Diversity algorithms, have also been on the rise[21][22][23].
  • Recent works have incorporated foundation models into Quality-Diversity algorithms[24][25][26].
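
To show how selection, diversity, and variation fit together, the following is a minimal, self-contained Python sketch of a MAP-Elites-style loop on a toy problem. The task, the descriptor, and all parameter values are illustrative assumptions, not part of the original algorithm's specification.

    import random

    BINS, DIM, ITERATIONS = 10, 2, 10_000

    def evaluate(params):
        """Toy task: quality is the negated sphere function; the
        behavioral descriptor is the parameter vector clipped to [0, 1)."""
        fitness = -sum(x * x for x in params)
        descriptor = tuple(min(max(x, 0.0), 0.999) for x in params)
        return fitness, descriptor

    archive = {}  # maps each grid cell to the best (fitness, params) found

    for _ in range(ITERATIONS):
        if archive:
            # Select a random elite and mutate it.
            _, parent = random.choice(list(archive.values()))
            child = [x + random.gauss(0.0, 0.1) for x in parent]
        else:
            child = [random.random() for _ in range(DIM)]
        fitness, descriptor = evaluate(child)
        cell = tuple(int(x * BINS) for x in descriptor)
        # Keep the child if its cell is empty or it beats the occupant.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)

    print(len(archive), "cells filled; best fitness:",
          max(f for f, _ in archive.values()))

Each cell of the resulting map holds the best solution found for one combination of behavioral characteristics, "illuminating" how quality varies across the space of behaviors[19].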

References

  1. ^ Cully, Antoine; Clune, Jeff; Tarapore, Danesh; Mouret, Jean-Baptiste (2015-05-27). "Robots that can adapt like animals". Nature. 521 (7553): 503–507. arXiv:1407.3501. Bibcode:2015Natur.521..503C. doi:10.1038/nature14422. ISSN 0028-0836. PMID 26017452. S2CID 3467239.
  2. ^ Cully, Antoine; Mouret, Jean-Baptiste (2013-07-06). "Behavioral repertoire learning in robotics". Proceedings of the 15th annual conference on Genetic and evolutionary computation (PDF). New York, NY, USA: ACM. pp. 175–182. doi:10.1145/2463372.2463399. hdl:10044/1/48639. ISBN 9781450319638. S2CID 3476705.
  3. ^ Lehman, Joel; Stanley, Kenneth O. (2011-07-12). "Evolving a diversity of virtual creatures through novelty search and local competition". Proceedings of the 13th annual conference on Genetic and evolutionary computation. New York, NY, USA: ACM. pp. 211–218. doi:10.1145/2001576.2001606. ISBN 9781450305570. S2CID 17338175.
  4. ^ Cully, A.; Mouret, J.-B. (March 2016). "Evolving a Behavioral Repertoire for a Walking Robot". Evolutionary Computation. 24 (1): 59–88. arXiv:1308.3689. doi:10.1162/evco_a_00143. ISSN 1063-6560. PMID 25585055. S2CID 15575131.
  5. ^ Kasap, Zerrin; Magnenat-Thalmann, Nadia (September 2010). "Towards episodic memory-based long-term affective interaction with a human-like robot". 19th International Symposium in Robot and Human Interactive Communication. IEEE. pp. 452–457. doi:10.1109/roman.2010.5598644. ISBN 978-1-4244-7991-7. S2CID 2416683.
  6. ^ Gaier, Adam; Asteroth, Alexander; Mouret, Jean-Baptiste (2017-06-02). "Aerodynamic Design Exploration through Surrogate-Assisted Illumination". 18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference. Reston, Virginia: American Institute of Aeronautics and Astronautics. doi:10.2514/6.2017-3330. ISBN 978-1-62410-507-4. S2CID 114951289.
  7. ^ Liapis, Antonios; Yannakakis, Georgios N.; Togelius, Julian (March 2015). "Constrained Novelty Search: A Study on Game Content Generation". Evolutionary Computation. 23 (1): 101–129. doi:10.1162/evco_a_00123. ISSN 1063-6560. PMID 24605847. S2CID 2614242.
  8. ^ Khalifa, Ahmed; Lee, Scott; Nealen, Andy; Togelius, Julian (2018-07-02). "Talakat: Bullet hell generation through constrained map-elites". Proceedings of the Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 1047–1054. doi:10.1145/3205455.3205470. ISBN 9781450356183. S2CID 49182728.
  9. ^ Gravina, Daniele; Khalifa, Ahmed; Liapis, Antonios; Togelius, Julian; Yannakakis, Georgios N. (August 2019). "Procedural Content Generation through Quality Diversity". 2019 IEEE Conference on Games (CoG). IEEE. pp. 1–8. arXiv:1907.04053. doi:10.1109/cig.2019.8848053. ISBN 978-1-7281-1884-0. S2CID 195848208.
  10. ^ Nguyen, A.; Yosinski, J.; Clune, J. (September 2016). "Understanding Innovation Engines: Automated Creativity and Improved Stochastic Optimization via Deep Learning". Evolutionary Computation. 24 (3): 545–572. doi:10.1162/evco_a_00189. ISSN 1063-6560. PMID 27367139. S2CID 27515105.
  11. ^ Wang, Rui; Lehman, Joel; Clune, Jeff; Stanley, Kenneth O. (2019-07-13). "POET: Open-ended coevolution of environments and their optimized solutions". Proceedings of the Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 142–151. doi:10.1145/3321707.3321799. ISBN 9781450361118.
  12. ^ Haserbek, Tastulek; Wen, Ziyang; Xie, Xiaobin; Zhao, Peng; An, Wenfu (2022-07-17). "Model-free End-to-end Learning of Agile Quadrupedal Locomotion over Challenging Terrain". 2022 IEEE International Conference on Real-time Computing and Robotics (RCAR). IEEE. pp. 699–703. doi:10.1109/rcar54675.2022.9872190. ISBN 978-1-6654-6983-8. S2CID 252103573.
  13. ^ Open Ended Learning Team (2021). "Open-ended learning leads to generally capable agents". arXiv:2107.12808 [cs.LG].
  14. ^ Costa, Victor; Lourenço, Nuno; Correia, João; Machado, Penousal (2020-06-25). "Exploring the evolution of GANs through quality diversity". Proceedings of the 2020 Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 297–305. doi:10.1145/3377930.3389824. ISBN 9781450371285.
  15. ^ Gaier, Adam; Asteroth, Alexander; Mouret, Jean-Baptiste (July 2017). "Data-efficient exploration, optimization, and modeling of diverse designs through surrogate-assisted illumination". Proceedings of the Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 99–106. doi:10.1145/3071178.3071282. ISBN 9781450349208.
  16. ^ Vassiliades, Vassilis; Chatzilygeroudis, Konstantinos; Mouret, Jean-Baptiste (2017-07-15). "A comparison of illumination algorithms in unbounded spaces". Proceedings of the Genetic and Evolutionary Computation Conference Companion (PDF). New York, NY, USA: ACM. pp. 1578–1581. doi:10.1145/3067695.3082531. ISBN 9781450349390. S2CID 28054250.
  17. ^ Cully, Antoine; Demiris, Yiannis (2018-07-02). "Hierarchical behavioral repertoires with unsupervised descriptors". Proceedings of the Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 69–76. arXiv:1804.07127. doi:10.1145/3205455.3205571. hdl:10044/1/58755. ISBN 9781450356183. S2CID 4997173.
  18. ^ Lehman, Joel; Stanley, Kenneth O. (2011), "Novelty Search and the Problem with Objectives", Genetic and Evolutionary Computation, New York, NY: Springer New York, pp. 37–56, doi:10.1007/978-1-4614-1770-5_3, ISBN 978-1-4614-1769-9, retrieved 2023-07-24
  19. ^ Mouret, Jean-Baptiste; Clune, Jeff (2015). "Illuminating search spaces by mapping elites". arXiv:1504.04909 [cs.AI].
  20. ^ Stanley, Kenneth O.; Miikkulainen, Risto (June 2002). "Evolving Neural Networks through Augmenting Topologies". Evolutionary Computation. 10 (2): 99–127. doi:10.1162/106365602320169811. ISSN 1063-6560. PMID 12180173. S2CID 498161.
  21. ^ Nilsson, Olle; Cully, Antoine (2021-06-26). "Policy gradient assisted MAP-Elites". Proceedings of the Genetic and Evolutionary Computation Conference (PDF). New York, NY, USA: ACM. pp. 866–875. doi:10.1145/3449639.3459304. ISBN 9781450383509. S2CID 235495145.
  22. ^ Tjanaka, Bryon; Fontaine, Matthew C.; Togelius, Julian; Nikolaidis, Stefanos (2022-07-08). "Approximating gradients for differentiable quality diversity in reinforcement learning". Proceedings of the Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM. pp. 1102–1111. doi:10.1145/3512290.3528705. ISBN 9781450392372.
  23. ^ Lim, Bryan; Grillotti, Luca; Bernasconi, Lorenzo; Cully, Antoine (2022-05-23). "Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires". 2022 International Conference on Robotics and Automation (ICRA). IEEE. pp. 5360–5366. arXiv:2109.08522. doi:10.1109/icra46639.2022.9811559. ISBN 978-1-7281-9681-7. S2CID 237344265.
  24. ^ Ding, Li; Zhang, Jenny; Clune, Jeff; Spector, Lee; Lehman, Joel (18 Oct 2023). "Quality Diversity through Human Feedback". arXiv:2310.12103 [cs.AI].
  25. ^ Bradley, Herbie; Dai, Andrew; Teufel, Hannah; Zhang, Jenny; Oostermeijer, Koen; Bellagente, Marco; Clune, Jeff; Stanley, Kenneth; Schott, Gregory; Lehman, Joel (19 Oct 2023). "Quality-Diversity through AI Feedback". arXiv:2310.13032 [cs.CL].
  26. ^ Wang, Ren-Jian; Xue, Ke; Wang, Yutong; Yang, Peng; Fu, Haobo; Fu, Qiang; Qian, Chao (2023). "Diversity from Human Feedback". arXiv:2310.06648 [cs.LG].