Winning 1,000 in 1 Hour


Published: 2024-09-28 23:45

Xinzhiyuan original

Source: Reddit

��༭��

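[Xinzhiyuan digest] Texas hold'em has been called the hardest game-playing challenge for artificial intelligence. Yet Pluribus, the poker AI built jointly by CMU and Facebook, cost about $150 to train, needed only 8 days of training to beat professional players, and wins about $1,000 an hour. How was such a strong AI built, and where should you start if you want to study its algorithms? The researchers behind it answer.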

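In games of exact calculation, and even in real-time strategy games with highly varied tactics, humans can no longer beat machines that hold an overwhelming computational advantage. Hope shifted to Texas hold'em, a game with a large element of luck that also demands psychological play.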

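Texas hold'em (sometimes written Hold'em or Holdem) is the most popular community-card poker game and one of the official events in international poker competition. It is one of the poker games most affected by position, because the betting order stays the same in every round. It is also the most popular poker game in most U.S. casinos and is widely played outside the United States. In theory a table can seat up to 22 players (23 if no cards are burned), but tables usually have two to ten players. https://zh.wikipedia.org/wiki/%E5%BE%B7%E5%B7%9E%E6%92%B2%E5%85%8B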

Winning 7,000 yuan in 1 hour, an expert within a week

��˿��ǵ��͵Ĳ��Ϣ��Ϸ��˿��У��޷��֪�ѷ��¼��ȫ��Ϣ��һ��һ��ע�а��10^160��ߵ㣨decision points��

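Every decision depends on how the hand has been played so far, and different lines of play reveal different partial information. This is what makes Texas hold'em one of the hardest game-playing challenges for AI.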

Even so, for some 40 years scientists have never stopped studying Texas hold'em.

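More than 10 years ago, AI beat many top players at a simplified form of Texas hold'em. Four years ago, a research team at the University of Alberta in Canada developed Cepheus, a hold'em bot billed as unbeatable. Two years ago, in 2017, scientists from Canada and the Czech Republic posted a paper on arXiv describing an algorithm named DeepStack for heads-up play.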

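A few days ago, thanks to the work of CMU scientists, AI beat many top players at no-limit hold'em. The "God of Gamblers" who used to exist only in films and TV dramas has appeared in real life.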

https://www.nature.com/articles/d41586-019-02156-9

https://science.sciencemag.org/content/early/2019/07/10/science.aay2400

https://www.techmeme.com/

Pluribus's upbringing, however, is an underdog story: training it cost less than 1,000 yuan, and it runs on just 2 CPUs.

Figure: how Pluribus's blueprint strategy improved during training on a 64-core CPU. Performance is measured against the final snapshot of training.

With such modest hardware, Pluribus won roughly 7,000 yuan an hour from top players. As for speed, the AI needed less than a week to become a master.


Video: the strategies Pluribus used in hands against several professional players (cards shown face up).

How was it built? The scientists behind it answer

��Ȼ��AI��˴��Ա��ĵģ��ӮǮ�⣬��¾��ˡ�

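Recently, the creators of the poker AI Pluribus, Noam Brown (research scientist at Facebook AI Research and PhD student in computer science at CMU) and CMU professor Tuomas Sandholm, answered questions about the AI together on Reddit. The thread drew more than 130 responses.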

Impact on online poker sites

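Texas hold'em is one of the most popular poker games in the world and has a huge player base. Many people want to know whether this AI will soon affect online hold'em (the implication being: will people use AI to cheat?). Reddit user DlC3R asked the question on many minds: when does the battle between algorithms begin?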

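Noam believes that poker sites already have mature, advanced bot-detection technology, so cheating with a bot is too risky to be worth it. He does expect an effect on professional poker, though: AI will be used to train professional players.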

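Noam added that he focuses only on AI, not on the poker world. Beyond that, he is simply someone keen on the technology, and he really has no time or energy to look into the rest.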

Using AIVAT to reduce variance

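Noam and his colleagues estimate the bot's win rate at 5 bb/100. In other words, with blinds of $50/$100 and a stack of $10,000, if each chip were worth $1, Pluribus would win an average of $5 per hand, which works out to about $1,000 an hour (roughly 7,000 yuan).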
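The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted above. The hands-per-hour figure is not stated in the article; it is only the value implied by dividing the hourly winnings by the per-hand winnings, and the function names are illustrative.

```python
# Worked arithmetic for the quoted win rate.
# Inputs come from the article; hands per hour is NOT stated there and is
# only the value implied by the other numbers.

BIG_BLIND_USD = 100        # blinds of $50/$100, one chip worth one dollar
WIN_RATE_BB_PER_100 = 5    # 5 bb/100: five big blinds won per 100 hands
HOURLY_USD = 1000          # hourly winnings quoted in the article


def usd_per_hand(bb_per_100: float, big_blind_usd: float) -> float:
    """Convert a win rate in big blinds per 100 hands to dollars per hand."""
    return bb_per_100 * big_blind_usd / 100


def implied_hands_per_hour(hourly_usd: float, per_hand_usd: float) -> float:
    """Hands per hour needed for the hourly figure to follow from the per-hand one."""
    return hourly_usd / per_hand_usd


per_hand = usd_per_hand(WIN_RATE_BB_PER_100, BIG_BLIND_USD)
hands = implied_hands_per_hour(HOURLY_USD, per_hand)
print(f"${per_hand:.2f} per hand, about {hands:.0f} hands per hour implied")
```

So 5 bb/100 at a $100 big blind is $5 per hand, and the $1,000 hourly figure is consistent with a pace of about 200 hands per hour.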

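Poker winnings are measured in big blinds won per 100 hands (BB/100). The reported p-value is 0.021. Professional players typically reach 3-7 BB/100, so the AI's win rate is clearly already very high.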

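Without variance reduction, the professionals would have had to play 8 hours a day, 5 days a week, for 4 months to produce a statistically meaningful result.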

Thanks go to the university researchers who developed AIVAT, a variance-reduction algorithm for poker, which gave a saving of roughly 12.5 times.

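AIVAT effectively strips out the luck component. For example, when the bot is dealt a very strong hand, AIVAT subtracts a baseline value from its winnings to offset the good luck.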
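The baseline idea can be illustrated with a toy simulation. This is not AIVAT itself: the game model, the `skill_edge` value, and the perfect baseline are all assumptions made for illustration. It only shows why subtracting a zero-mean baseline that tracks luck keeps the average unchanged while shrinking the spread.

```python
import random
import statistics


def simulate(num_hands: int, seed: int = 0):
    """Return (raw, adjusted) per-hand winnings for a toy game.

    Toy model (an assumption, not real poker): each hand's result is a
    small skill edge plus zero-mean luck plus a little residual noise.
    """
    rng = random.Random(seed)
    skill_edge = 0.05                     # hypothetical true win rate per hand
    raw, adjusted = [], []
    for _ in range(num_hands):
        luck = rng.gauss(0.0, 1.0)        # value of the cards dealt, zero mean
        noise = rng.gauss(0.0, 0.2)       # what the baseline cannot explain
        winnings = skill_edge + luck + noise
        baseline = luck                   # idealized baseline estimate of the luck
        raw.append(winnings)
        # The baseline has zero mean, so subtracting it leaves the expected
        # value unchanged while removing most of the variance.
        adjusted.append(winnings - baseline)
    return raw, adjusted


raw, adjusted = simulate(10_000)
print("raw:      mean=%.3f stdev=%.3f" % (statistics.mean(raw), statistics.stdev(raw)))
print("adjusted: mean=%.3f stdev=%.3f" % (statistics.mean(adjusted), statistics.stdev(adjusted)))
```

Since the number of hands needed for a given confidence grows with the variance, a lower spread per hand means far fewer hands to play. A real baseline is only an estimate, so the reduction in practice is smaller than in this idealized case.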


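Video: how the Monte Carlo CFR algorithm updates the traverser's strategy by assessing the value of real and hypothetical moves. In Pluribus, this traversal is actually done depth-first for optimization purposes.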
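A minimal sketch of regret matching, the update rule at the core of CFR-style algorithms, may help make "assessing real and hypothetical moves" concrete. It covers a single decision point only, not Pluribus's implementation, and the action values used in the example are hypothetical.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Turn cumulative regrets into action probabilities."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        # Play each action in proportion to its positive regret.
        return [p / total for p in positive]
    # No action has positive regret: fall back to a uniform strategy.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


def update_regrets(cumulative_regrets: List[float],
                   action_values: List[float],
                   strategy: List[float]) -> List[float]:
    """Add each action's regret: its value minus the value of the current strategy."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    return [r + (v - expected) for r, v in zip(cumulative_regrets, action_values)]


# One iteration with three actions (fold, call, raise) and hypothetical values.
regrets = [0.0, 0.0, 0.0]
strategy = regret_matching(regrets)          # uniform at the start
regrets = update_regrets(regrets, [0.0, 1.0, 3.0], strategy)
new_strategy = regret_matching(regrets)      # shifts toward the best action
print(strategy, new_strategy)
```

The value of each action that was not actually taken is the "hypothetical move" part: an action that would have done better than the current strategy gains regret and is played more often in later iterations.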

Where should you start if you want to study the Pluribus algorithms?

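A user named smoke_carrot is clearly keen to learn. He wants to study Pluribus's algorithms, but the methods Pluribus uses differ from what he normally works with, so he asked the researchers for guidance, such as which papers to read and which books are worth a look.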

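Professor Tuomas first confirmed smoke_carrot's inference: Pluribus's algorithms are indeed completely different from reinforcement learning and MCTS. Moreover, there is currently no good textbook on solving imperfect-information games. One reason is that the field is moving so fast that material from 2010 to 2015 is already out of date.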

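Readers interested in this research should go to the research papers. For now, the papers can still be obtained for free. Which papers are the main ones?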

Professor Tuomas picked out some of his own talks and papers and suggested studying them:

Keynote "New Results for Solving Imperfect-Information Games" at the Association for the Advancement of Artificial Intelligence Annual Conference (AAAI), 2019, available on Vimeo. (https://vimeo.com/313942390)

Keynote "Super-Human AI for Strategic Reasoning: Beating Top Pros in Heads-Up No-Limit Texas Hold'em" at the International Joint Conference on Artificial Intelligence (IJCAI), available on YouTube. (https://www.youtube.com/watch?v=xrWulRY_t1o)

Solving Imperfect-Information Games. (~sandholm/Solving%20games.Science-2015.pdf) Science 347(6218), 122-123, 2015.

Abstraction for Solving Large Incomplete-Information Games. (~sandholm/game%20abstraction.aaai15SMT.pdf) In AAAI, Senior Member Track, 2015.

The State of Solving Large Incomplete-Information Games, and Application to Poker. (~sandholm/solving%20games.aimag11.pdf) AI Magazine, special issue on Algorithmic Game Theory, Winter, 13-32, 2010.

Brown, N. and Sandholm, T. 2019. Superhuman AI for multiplayer poker. (https://science.sciencemag.org/content/early/2019/07/10/science.aay2400) Science, July 11th.

Farina, G., Kroer, C., and Sandholm, T. 2019. Regret Circuits: Composability of Regret Minimizers. In Proceedings of the International Conference on Machine Learning (ICML), 2019. arXiv version. (https://arxiv.org/abs/1811.02540)

Farina, G., Kroer, C., Brown, N., and Sandholm, T. 2019. Stable-Predictive Optimistic Counterfactual Regret Minimization. In ICML. arXiv version. (https://arxiv.org/pdf/1902.04982.pdf)

Brown, N., Lerer, A., Gross, S., and Sandholm, T. 2019. Deep Counterfactual Regret Minimization. In ICML. Early version (https://arxiv.org/pdf/1811.00164.pdf) in NeurIPS-18 Deep RL Workshop, 2018.

Brown, N. and Sandholm, T. 2019. Solving Imperfect-Information Games via Discounted Regret Minimization (https://arxiv.org/pdf/1809.04040.pdf). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Outstanding Paper Honorable Mention, one of four papers receiving special recognition out of 1,150 accepted papers and 7,095 submissions.

Farina, G., Kroer, C., and Sandholm, T. 2019. Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games (~gfarina/2018/laminar-regret-aaai19/). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Marchesi, A., Farina, G., Kroer, C., Gatti, N., and Sandholm, T. 2019. Quasi-Perfect Stackelberg Equilibrium (~gfarina/2018/qp-stackelberg-aaai19/). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Farina, G., Kroer, C., Brown, N., and Sandholm, T. 2019. Stable-Predictive Optimistic Counterfactual Regret Minimization (https://arxiv.org/pdf/1902.04982.pdf). arXiv.

Brown, N. and Sandholm, T. 2018. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. () Science, full Research Article.

Brown, N., Lerer, A., Gross, S., and Sandholm, T. 2018. Deep Counterfactual Regret Minimization (https://arxiv.org/pdf/1811.00164.pdf). NeurIPS Deep Reinforcement Learning Workshop. Oral Presentation.

Kroer, C., Waugh, K., Kilinc-Karzan, F., and Sandholm, T. 2018. Faster algorithms for extensive-form game solving via improved smoothing functions. (https://rdcu.be/8EyP) Mathematical Programming, Series A. Abstract published in EC-17.

Brown, N., Sandholm, T., and Amos, B. 2018. Depth-Limited Solving for Imperfect-Information Games. (https://arxiv.org/pdf/1805.08195.pdf) In Proc. Neural Information Processing Systems (NeurIPS).

Kroer, C. and Sandholm, T. 2018. A Unified Framework for Extensive-Form Game Abstraction with Bounds. In NIPS. Early version (~ckroer/papers/unified_abstraction_framework_ai_cubed.pdf) in IJCAI-18 AI^3 workshop.

Farina, G., Gatti, N., and Sandholm, T. 2018. Practical Exact Algorithm for Trembling-Hand Equilibrium Refinements in Games. (~gfarina/2017/trembling-lp-refinements-nips18/) In NeurIPS.

Kroer, C., Farina, G., and Sandholm, T. 2018. Solving Large Sequential Games with the Excessive Gap Technique. (https://arxiv.org/abs/1810.03063) In NeurIPS. Also Spotlight presentation.

Farina, G., Celli, A., Gatti, N., and Sandholm, T. 2018. Ex Ante Coordination and Collusion in Zero-Sum Multi-Player Extensive-Form Games. (~gfarina/2018/collusion-3players-nips18/) In NeurIPS.

Farina, G., Marchesi, A., Kroer, C., Gatti, N., and Sandholm, T. 2018. Trembling-Hand Perfection in Extensive-Form Games with Commitment. (~ckroer/papers/stackelberg_perfection_ijcai18.pdf) In IJCAI.

Kroer, C., Farina, G., and Sandholm, T. 2018. Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead. (~ckroer/papers/robust.aaai18.pdf) In Proc. AAAI Conference on AI (AAAI).

Brown, N., and Sandholm, T. 2017. Safe and Nested Subgame Solving for Imperfect-Information Games. (https://www.cs.cmu.edu/~noamb/papers/17-NIPS-Safe.pdf) In NIPS. Best Paper Award, out of 3,240 submissions.

Farina, G., Kroer, C., Sandholm, T. 2017. Regret Minimization in Behaviorally-Constrained Zero-Sum Games. (~sandholm/behavioral.icml17.pdf) In Proc. International Conference on Machine Learning (ICML).

Brown, N. and Sandholm, T. 2017. Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning. (~sandholm/reducedSpace.icml17.pdf) In ICML.

Kroer, C., Farina, G., Sandholm, T. 2017. Smoothing Method for Approximate Extensive-Form Perfect Equilibrium. (~sandholm/smoothingEFPE.ijcai17.pdf) In IJCAI. ArXiv version. ()

Brown, N., Kroer, C., and Sandholm, T. 2017. Dynamic Thresholding and Pruning for Regret Minimization. (~sandholm/dynamicThresholding.aaai17.pdf) In AAAI.

Kroer, C. and Sandholm, T. 2016. Imperfect-Recall Abstractions with Bounds in Games. (~sandholm/imperfect-recall-abstraction-with-bounds.ec16.pdf) In Proc. ACM Conference on Economics and Computation (EC).

Noam Brown and Tuomas Sandholm. 2016. Strategy-Based Warm Starting for Regret Minimization in Games. In AAAI. Extended version with appendix. (~sandholm/warmStart.aaai16.withAppendixAndTypoFix.pdf)

Noam Brown and Tuomas Sandholm. 2015. Regret-Based Pruning in Extensive-Form Games. (~sandholm/cs15-892F15) In NIPS. Extended version. (~sandholm/regret-basedPruning.nips15.withAppendix.pdf)

Brown, N. and Sandholm, T. 2015. Simultaneous Abstraction and Equilibrium Finding in Games. (~sandholm/simultaneous.ijcai15.pdf) In IJCAI.

Kroer, C. & Sandholm, T. 2015. Limited Lookahead in Imperfect-Information Games. (~sandholm/limited-look-ahead.ijcai15.pdf) IJCAI.

Kroer, C., Waugh, K., Kilinc-Karzan, F., and Sandholm, T. 2015. Faster First-Order Methods for Extensive-Form Game Solving. (~sandholm/faster.ec15.pdf) In EC.

Brown, N., Ganzfried, S., and Sandholm, T. 2015. Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent. (~sandholm/hierarchical.aamas15.pdf) In Proc. Internat. Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Kroer, C. and Sandholm, T. 2015. Discretization of Continuous Action Spaces in Extensive-Form Games. (~sandholm/discretization.aamas15.fromACM.pdf) In AAMAS.

Ganzfried, S. and Sandholm, T. 2015. Endgame Solving in Large Imperfect-Information Games. (~sandholm/endgame.aamas15.fromACM.pdf) In AAMAS.

Kroer, C. and Sandholm, T. 2014. Extensive-Form Game Abstraction With Bounds. (~sandholm/extensiveGameAbstraction.ec14.pdf) In EC.

Brown, N. and Sandholm, T. 2014. Regret Transfer and Parameter Optimization. (~sandholm/regret_transfer.aaai14.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2014. Potential-Aware Imperfect-Recall Abstraction with Earth Mover's Distance in Imperfect-Information Games. (~sandholm/potential-aware_imperfect-recall.aaai14.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2013. Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping. (~sandholm/reverse%20mapping.ijcai13.pdf) In IJCAI.

Sandholm, T. and Singh, S. 2012. Lossy Stochastic Game Abstraction with Bounds. (~sandholm/lossyStochasticGameAbstractionWBounds.ec12.pdf) In EC.

Gilpin, A., Peña, J., and Sandholm, T. 2012. First-Order Algorithm with O(ln(1/epsilon)) Convergence for epsilon-Equilibrium in Two-Person Zero-Sum Games. (~sandholm/restart.MathProg12.pdf) Mathematical Programming 133(1-2), 279-298. Subsumes our AAAI-08 paper.

Ganzfried, S., Sandholm, T., and Waugh, K. 2012. Strategy Purification and Thresholding: Effective Non-Equilibrium Approaches for Playing Large Games. (~sandholm/StrategyPurification_AAMAS2012_camera_ready_2.pdf) In AAMAS.

Ganzfried, S. and Sandholm, T. 2012. Tartanian5: A Heads-Up No-Limit Texas Hold'em Poker-Playing Program. (~sandholm/Tartanian_ACPC12_CR.pdf) Computer Poker Symposium at AAAI.

Hoda, S., Gilpin, A., Peña, J., and Sandholm, T. 2010. Smoothing techniques for computing Nash equilibria of sequential games. (~sandholm/proxtreeplex.MathOfOR.pdf) Mathematics of Operations Research 35(2), 494-512.

Ganzfried, S. and Sandholm, T. 2010. Computing Equilibria by Incorporating Qualitative Models (~sandholm/qualitative.aamas10.pdf). In AAMAS. Extended version (~sandholm/qualitative.TR10.pdf): CMU technical report CMU-CS-10-105.

Gilpin, A. and Sandholm, T. 2010. Speeding Up Gradient-Based Algorithms for Sequential Games (Extended Abstract) (~sandholm/speedup.aamas10.pdf). In AAMAS.

Ganzfried, S. and Sandholm, T. 2009. Computing Equilibria in Multiplayer Stochastic Games of Imperfect Information (~sandholm/stochgames.ijcai09.pdf). In IJCAI.

Selected papers on imperfect-information games from 2008 and earlier:

Gilpin, A. and Sandholm, T. 2008. Expectation-Based Versus Potential-Aware Automated Abstraction in Imperfect Information Games: An Experimental Comparison Using Poker. (~sandholm/expectation-basedVsPotential-Aware.AAAI08.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2008. Computing an Approximate Jam/Fold Equilibrium for 3-Agent No-Limit Texas Hold'em Tournaments. (~sandholm/3-player%20jam-fold.AAMAS08.pdf) In AAMAS.

Gilpin, A., Sandholm, T., and Sørensen, T. 2008. A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs. (~sandholm/tartanian.AAMAS08.pdf) In AAMAS.

Gilpin, A. and Sandholm, T. 2007. Lossless abstraction of imperfect information games (~sandholm/extensive.jacm07.pdf). Journal of the ACM, 54 (5). Early versions in EC-06.

Gilpin, A., Sandholm, T., and Sørensen, T. 2007. Potential-Aware Automated Abstraction of Sequential Games, and Holistic Equilibrium Analysis of Texas Hold'em Poker. (~sandholm/gs3.aaai07.pdf) In AAAI.

Gilpin, A. and Sandholm, T. 2007. Better automated abstraction techniques for imperfect information games, with application to Texas Hold'em poker. (~sandholm/gs2.aamas07.pdf) In AAMAS.

Gilpin, A. and Sandholm, T. 2006. A competitive Texas Hold'em Poker player via automated abstraction and real-time equilibrium computation. (~sandholm/texas.aaai06.pdf) In AAAI.

Readers who want more can head over to Reddit:

https://www.reddit.com/r/MachineLearning/comments/ceece3/ama_we_are_noam_brown_and_tuomas_sandholm/

��׷��΢�Ź��ںţ��Ԫ��߸��˹۵㣬��Ѷ��Ͷ��߾ݴ˲��Ե��

Responsible editor: HN003