Winning $1,000 in One Hour

Published: 2024-09-28 23:45

Xinzhiyuan (新智元) original content

Source: Reddit

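Xinzhiyuan digest: Texas hold'em is regarded as one of the hardest games to crack and is a key target for AI research. Yet Pluribus, the poker AI built jointly by CMU and Facebook, cost only about $150 to train, needed just 8 days of training to beat professional players, and wins $1,000 an hour. How was such a strong AI built? Where should you start if you want to study the algorithm? The researchers behind it explain.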

ǷֹȷϷʹǼսֶļʱսϷҲ޷սʤӵѹƵļ˼ϣڴɷ֡Ҫսĵ˿ˡ

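Texas hold'em (sometimes shortened to Hold'em or Holdem) is the most popular community-card poker game in the world and one of the official events at international poker tournaments. It is one of the poker variants in which position matters most, because the betting order stays the same in every round. It is also the most popular poker game in casinos and is widely played elsewhere. In theory one table can seat up to 22 players (23 if no cards are burned), but games usually have two to ten players. (Source: https://zh.wikipedia.org/wiki/%E5%BE%B7%E5%B7%9E%E6%92%B2%E5%85%8B)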

Winning 7,000 yuan in an hour

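Texas hold'em is a classic imperfect-information game: players cannot see all of the information their opponents hold. A one-on-one (heads-up) betting game contains 10^160 decision points.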

ÿҪݳƷ⣬ͬ·ֲϢʣʹõ˿˳ΪѶԶϷ˹սĿ

Ȼʵڰбעˡ40ѧҾһֱûֹͣԵݵо

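Ten years ago, a program beat top human players at a limit version of Texas hold'em. About four years ago, a research team at Canada's University of Alberta released Cepheus, a poker bot billed as unbeatable. Two years ago, in 2017, scientists from Canada and the Czech Republic posted a paper on arXiv describing DeepStack, an algorithm that brought AI to professional-beating play in heads-up no-limit hold'em.

A few days ago, thanks to the work of CMU scientists, AI beat top human players at six-player no-limit Texas hold'em. What used to exist only in films and television has become real: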

https://www.nature.com/articles/d41586-019-02156-9

https://science.sciencemag.org/content/early/2019/07/10/science.aay2400

https://www.techmeme.com/

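The making of Pluribus, however, is something of an underdog story: training Pluribus cost less than 1,000 yuan, and it runs on just 2 CPUs.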

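Figure: How Pluribus's strategy improved during training on a 64-core CPU. Performance is measured against the final training snapshot.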

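Even with such modest hardware, Pluribus can win close to 7,000 yuan in an hour. It is also fast, typically acting in a fraction of the time a human player takes.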

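Video: Pluribus's strategy in hands played against several professional players. (Cards are shown face up.)

How was it done? The scientists behind it answer questions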

ȻAI˴ԱĵģӮǮ⣬¾ˡ

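Recently the two people behind Pluribus, Noam Brown (research scientist at Facebook AI Research and PhD student at CMU) and CMU professor Tuomas Sandholm, answered questions about the poker AI together on Reddit. The thread has drawn more than 130 comments.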

Impact on online poker sites

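Texas hold'em is one of the most popular poker games in the world and has a huge player base. Now that this AI exists, will it affect online Texas hold'em in the short term (in other words, will people start cheating with it)? Reddit user DlC3R asked a question many people care about: when does the war between algorithms begin?

Noam's view is that the bot-detection technology on online poker sites is already very mature, so cheating with a bot is too risky to be worth it. He does expect an effect on the poker community, though, such as professional players using AI to train.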

Noamһ䣺ֻע˹ܶ˿ˣ֮⣬ֻdz뼼еˣģҲʵûʱ;ȥ˼

Scoring with AIVAT

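Noam and his colleagues put it this way: the win rate is 5 bb/100. In other words, with $50/$100 blinds and a 10,000-chip stack, where each chip is worth $1, Pluribus wins an average of $5 per hand. At that rate it earns $1,000 an hour (about 7,000 yuan).

The unit for measuring profit in Texas hold'em is "big blinds won per 100 hands" (BB/100). The p-value is 0.021. Professional players can reach 3-7 BB/100, so the AI's win rate is clearly very high.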
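The win-rate figures quoted above fit together with a little arithmetic. The sketch below is illustrative only: the function names are invented for this example, and the hands-per-hour pace is not stated in the article. It is simply what the $5-per-hand and $1,000-per-hour figures imply when taken together.

```python
def dollars_per_hand(bb_per_100: float, big_blind: float) -> float:
    """Convert a win rate in big blinds per 100 hands into money per hand."""
    return bb_per_100 * big_blind / 100


def implied_hands_per_hour(dollars_per_hour: float, per_hand: float) -> float:
    """Pace of play implied by an hourly figure and a per-hand figure."""
    return dollars_per_hour / per_hand


# Figures from the article: 5 bb/100 at $50/$100 blinds, $1,000 per hour.
per_hand = dollars_per_hand(bb_per_100=5, big_blind=100)
pace = implied_hands_per_hour(1000, per_hand)
print(f"${per_hand:.2f} per hand, about {pace:.0f} hands per hour")
```

The same conversion works for any stake: the win rate in BB/100 is independent of the blind size, which is why it is the standard way to compare players.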

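Without variance reduction, the professionals would have had to play for 4 months, 5 days a week and 8 hours a day, to produce statistically meaningful results.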

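AIVAT, a variance-reduction algorithm for Texas hold'em developed by researchers including a team at the University of Alberta, cut the number of hands required by a factor of roughly 12.5.

AIVAT effectively removes the luck component from the results. For example, if a bot is dealt a very strong hand, AIVAT subtracts a baseline value from its winnings to cancel out the lucky part.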
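The baseline-subtraction idea can be illustrated with a toy simulation. This is not AIVAT itself, only a minimal sketch of the general principle it relies on: if part of each result is luck with a known zero-mean value, subtracting that part leaves the average unchanged while shrinking the variance. All numbers here (the edge, the size of the luck and noise terms, the seed) are assumptions chosen for illustration.

```python
import random
import statistics


def simulate(num_hands=10_000, true_edge=0.05, seed=0):
    """Return (raw, corrected) per-hand results for a hypothetical player."""
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(num_hands):
        luck = rng.gauss(0.0, 10.0)  # swing from the cards dealt; assumed known, zero mean
        noise = rng.gauss(0.0, 1.0)  # whatever the baseline cannot explain
        winnings = true_edge + luck + noise
        raw.append(winnings)
        corrected.append(winnings - luck)  # subtract the zero-mean baseline
    return raw, corrected


raw, corrected = simulate()
for name, data in (("raw", raw), ("corrected", corrected)):
    print(f"{name:>9}: mean {statistics.fmean(data):+.3f}, "
          f"variance {statistics.pvariance(data):.2f}")
```

Because the number of hands needed for a given confidence level scales with the variance, a lower-variance estimate of the same win rate translates directly into a shorter experiment.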

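Video: how the Monte Carlo CFR algorithm updates the traverser's strategy by assessing the value of real and hypothetical moves. In Pluribus, for optimization purposes, this traversal is actually done in a depth-first manner.

Where should you start if you want to study the Pluribus algorithm?

A user named smoke_carrot is clearly an eager learner. He wants to study the Pluribus algorithm, but the methods Pluribus uses are unlike the ones he normally works with, so he hoped the researchers could offer some guidance, such as how to get started and which books to read.

Tuomas confirmed smoke_carrot's assessment: the Pluribus algorithm is indeed completely different from reinforcement learning and MCTS, and there is currently no good textbook on solving imperfect-information games. The field is also moving quickly, and material from 2010 to 2015 is already out of date.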
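CFR is built on a simpler update rule called regret matching: each action is played in proportion to how much the player regrets not having played it earlier. The sketch below applies that rule to rock-paper-scissors in self-play. It is a toy under stated assumptions, not Pluribus's algorithm: there is no game tree, no sampling and no abstraction, and the payoff table, function names and the small initial regret (added only so the dynamics are not trivially uniform) are all made up for this example.

```python
# Payoff to the row player; the column player receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]
N = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / N] * N


def train(iterations=20_000):
    """Run regret matching in self-play and return both average strategies."""
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # assumed head start toward rock
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        s0 = strategy_from_regrets(regrets[0])
        s1 = strategy_from_regrets(regrets[1])
        # Expected value of every action against the opponent's current mix.
        v0 = [sum(PAYOFF[a][b] * s1[b] for b in range(N)) for a in range(N)]
        v1 = [sum(-PAYOFF[a][b] * s0[a] for a in range(N)) for b in range(N)]
        ev0 = sum(s0[a] * v0[a] for a in range(N))
        ev1 = sum(s1[b] * v1[b] for b in range(N))
        for a in range(N):
            regrets[0][a] += v0[a] - ev0  # regret: hypothetical minus actual value
            regrets[1][a] += v1[a] - ev1
            strategy_sums[0][a] += s0[a]
            strategy_sums[1][a] += s1[a]
    return [[x / iterations for x in sums] for sums in strategy_sums]


average = train()
print([round(p, 3) for p in average[0]])
```

The strategy played on any single iteration can swing widely; it is the average strategy over all iterations that approaches equilibrium, which for rock-paper-scissors means playing each action about a third of the time.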

ȤоͬѧӦȥĶоġĿǰ·ĻǿѻȡҪжģ

Tuomas picked out a selection of talks and papers that he recommends studying:

Keynote New Results for Solving Imperfect-Information Games at the Association for the Advancement of Artificial Intelligence Annual Conference (AAAI), 2019, available on Vimeo. (https://vimeo.com/313942390)

Keynote Super-Human AI for Strategic Reasoning: Beating Top Pros in Heads-Up No-Limit Texas Holdem at the International Joint Conference on Artificial Intelligence (IJCAI), available on YouTube. (https://www.youtube.com/watch?v=xrWulRY_t1o)

Solving Imperfect-Information Games. (~sandholm/Solving%20games.Science-2015.pdf) Science 347(6218), 122-123, 2015.

Abstraction for Solving Large Incomplete-Information Games. (~sandholm/game%20abstraction.aaai15SMT.pdf) In AAAI, Senior Member Track, 2015.

The State of Solving Large Incomplete-Information Games, and Application to Poker. (~sandholm/solving%20games.aimag11.pdf) AI Magazine, special issue on Algorithmic Game Theory, Winter, 13-32, 2010.

Brown, N. and Sandholm, T. 2019. Superhuman AI for multiplayer poker. (https://science.sciencemag.org/content/early/2019/07/10/science.aay2400) Science, July 11th.

Farina, G., Kroer, C., and Sandholm, T. 2019. Regret Circuits: Composability of Regret Minimizers. In Proceedings of the International Conference on Machine Learning (ICML), 2019. arXiv version. (https://arxiv.org/abs/1811.02540)

Farina, G., Kroer, C., Brown, N., and Sandholm, T. 2019. Stable-Predictive Optimistic Counterfactual Regret Minimization. In ICML. arXiv version. (https://arxiv.org/pdf/1902.04982.pdf)

Brown, N, Lerer, A., Gross, S., and Sandholm, T. 2019. Deep Counterfactual Regret Minimization In ICML. Early version (https://arxiv.org/pdf/1811.00164.pdf) in NeurIPS-18 Deep RL Workshop, 2018.

Brown, N. and Sandholm, T. 2019. Solving Imperfect-Information Games via Discounted Regret Minimization (https://arxiv.org/pdf/1809.04040.pdf). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Outstanding Paper Honorable Mention, one of four papers receiving special recognition out of 1,150 accepted papers and 7,095 submissions.

Farina, G., Kroer, C., and Sandholm, T. 2019. Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games (~gfarina/2018/laminar-regret-aaai19/). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Marchesi, A., Farina, G., Kroer, C., Gatti, N., and Sandholm, T. 2019. Quasi-Perfect Stackelberg Equilibrium (~gfarina/2018/qp-stackelberg-aaai19/). In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Farina, G., Kroer, C., Brown, N., and Sandholm, T. 2019. Stable-Predictive Optimistic Counterfactual Regret Minimization (https://arxiv.org/pdf/1902.04982.pdf). arXiv.

Brown, N. and Sandholm, T. 2018. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science, full Research Article.

Brown, N., Lerer, A., Gross, S., and Sandholm, T. 2018. Deep Counterfactual Regret Minimization (https://arxiv.org/pdf/1811.00164.pdf). NeurIPS Deep Reinforcement Learning Workshop. *Oral Presentation*.

Kroer, C., Waugh, K., Kilinc-Karzan, F., and Sandholm, T. 2018. Faster algorithms for extensive-form game solving via improved smoothing functions. (https://rdcu.be/8EyP) Mathematical Programming, Series A. Abstract published in EC-17.

Brown, N., Sandholm, T., and Amos, B. 2018. Depth-Limited Solving for Imperfect-Information Games. (https://arxiv.org/pdf/1805.08195.pdf) In Proc. Neural Information Processing Systems (NeurIPS).

Kroer, C. and Sandholm, T. 2018. A Unified Framework for Extensive-Form Game Abstraction with Bounds. In NIPS. Early version (~ckroer/papers/unified_abstraction_framework_ai_cubed.pdf) in IJCAI-18 AI^3 workshop.

Farina, G., Gatti, N., and Sandholm, T. 2018. Practical Exact Algorithm for Trembling-Hand Equilibrium Refinements in Games. (~gfarina/2017/trembling-lp-refinements-nips18/) In NeurIPS.

Kroer, C., Farina, G., and Sandholm, T. 2018. Solving Large Sequential Games with the Excessive Gap Technique. (https://arxiv.org/abs/1810.03063) In NeurIPS. Also Spotlight presentation.

Farina, G., Celli, A., Gatti, N., and Sandholm, T. 2018. Ex Ante Coordination and Collusion in Zero-Sum Multi-Player Extensive-Form Games. (~gfarina/2018/collusion-3players-nips18/) In NeurIPS.

Farina, G., Marchesi, A., Kroer, C., Gatti, N., and Sandholm, T. 2018. Trembling-Hand Perfection in Extensive-Form Games with Commitment. (~ckroer/papers/stackelberg_perfection_ijcai18.pdf) In IJCAI.

Kroer, C., Farina, G., and Sandholm, T. 2018. Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead. (~ckroer/papers/robust.aaai18.pdf) In Proc. AAAI Conference on AI (AAAI).

Brown, N., and Sandholm, T. 2017. Safe and Nested Subgame Solving for Imperfect-Information Games. (https://www.cs.cmu.edu/~noamb/papers/17-NIPS-Safe.pdf) In NIPS. Best Paper Award, out of 3,240 submissions.

Farina, G., Kroer, C., Sandholm, T. 2017. Regret Minimization in Behaviorally-Constrained Zero-Sum Games. (~sandholm/behavioral.icml17.pdf) In Proc. International Conference on Machine Learning (ICML).

Brown, N. and Sandholm, T. 2017. Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning. (~sandholm/reducedSpace.icml17.pdf) In ICML.

Kroer, C., Farina, G., Sandholm, T. 2017. Smoothing Method for Approximate Extensive-Form Perfect Equilibrium. (~sandholm/smoothingEFPE.ijcai17.pdf) In IJCAI. ArXiv version.

Brown, N., Kroer, C., and Sandholm, T. 2017. Dynamic Thresholding and Pruning for Regret Minimization. (~sandholm/dynamicThresholding.aaai17.pdf) In AAAI.

Kroer, C. and Sandholm, T. 2016. Imperfect-Recall Abstractions with Bounds in Games. (~sandholm/imperfect-recall-abstraction-with-bounds.ec16.pdf) In Proc. ACM Conference on Economics and Computation (EC).

Noam Brown and Tuomas Sandholm. 2016. Strategy-Based Warm Starting for Regret Minimization in Games. In AAAI. Extended version with appendix. (~sandholm/warmStart.aaai16.withAppendixAndTypoFix.pdf)

Noam Brown and Tuomas Sandholm. 2015. Regret-Based Pruning in Extensive-Form Games. (~sandholm/cs15-892F15) In NIPS. Extended version. (~sandholm/regret-basedPruning.nips15.withAppendix.pdf)

Brown, N. and Sandholm, T. 2015. Simultaneous Abstraction and Equilibrium Finding in Games. (~sandholm/simultaneous.ijcai15.pdf) In IJCAI.

Kroer, C. & Sandholm, T. 2015. Limited Lookahead in Imperfect-Information Games. (~sandholm/limited-look-ahead.ijcai15.pdf) IJCAI.

Kroer, C., Waugh, K., Kilinc-Karzan, F., and Sandholm, T. 2015. Faster First-Order Methods for Extensive-Form Game Solving. (~sandholm/faster.ec15.pdf) In EC.

Brown, N., Ganzfried, S., and Sandholm, T. 2015. Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Holdem Agent. (~sandholm/hierarchical.aamas15.pdf) In Proc. Internat. Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Kroer, C. and Sandholm, T. 2015. Discretization of Continuous Action Spaces in Extensive-Form Games. (~sandholm/discretization.aamas15.fromACM.pdf) In AAMAS.

Ganzfried, S. and Sandholm, T. 2015. Endgame Solving in Large Imperfect-Information Games. (~sandholm/endgame.aamas15.fromACM.pdf) In AAMAS.

Kroer, C. and Sandholm, T. 2014. Extensive-Form Game Abstraction With Bounds. (~sandholm/extensiveGameAbstraction.ec14.pdf) In EC.

Brown, N. and Sandholm, T. 2014. Regret Transfer and Parameter Optimization. (~sandholm/regret_transfer.aaai14.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2014. Potential-Aware Imperfect-Recall Abstraction with Earth Movers Distance in Imperfect-Information Games. (~sandholm/potential-aware_imperfect-recall.aaai14.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2013. Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping. (~sandholm/reverse%20mapping.ijcai13.pdf) In IJCAI.

Sandholm, T. and Singh, S. 2012. Lossy Stochastic Game Abstraction with Bounds. (~sandholm/lossyStochasticGameAbstractionWBounds.ec12.pdf) In EC.

Gilpin, A., Peña, J., and Sandholm, T. 2012. First-Order Algorithm with O(ln(1/epsilon)) Convergence for epsilon-Equilibrium in Two-Person Zero-Sum Games. (~sandholm/restart.MathProg12.pdf) Mathematical Programming 133(1-2), 279-298. Subsumes our AAAI-08 paper.

Ganzfried, S., Sandholm, T., and Waugh, K. 2012. Strategy Purification and Thresholding: Effective Non-Equilibrium Approaches for Playing Large Games. (~sandholm/StrategyPurification_AAMAS2012_camera_ready_2.pdf) In AAMAS.

Ganzfried, S. and Sandholm, T. 2012. Tartanian5: A Heads-Up No-Limit Texas Hold'em Poker-Playing Program. (~sandholm/Tartanian_ACPC12_CR.pdf) Computer Poker Symposium at AAAI.

Hoda, S., Gilpin, A., Peña, J., and Sandholm, T. 2010. Smoothing techniques for computing Nash equilibria of sequential games. (~sandholm/proxtreeplex.MathOfOR.pdf) Mathematics of Operations Research 35(2), 494-512.

Ganzfried, S. and Sandholm, T. 2010. Computing Equilibria by Incorporating Qualitative Models (~sandholm/qualitative.aamas10.pdf). In AAMAS. Extended version (~sandholm/qualitative.TR10.pdf): CMU technical report CMU-CS-10-105.

Gilpin, A. and Sandholm, T. 2010. Speeding Up Gradient-Based Algorithms for Sequential Games (Extended Abstract) (~sandholm/speedup.aamas10.pdf). In AAMAS.

Ganzfried, S. and Sandholm, T. 2009. Computing Equilibria in Multiplayer Stochastic Games of Imperfect Information (~sandholm/stochgames.ijcai09.pdf). In IJCAI.

Selected papers on imperfect-information games from 2008 and earlier:

Gilpin, A. and Sandholm, T. 2008. Expectation-Based Versus Potential-Aware Automated Abstraction in Imperfect Information Games: An Experimental Comparison Using Poker. (~sandholm/expectation-basedVsPotential-Aware.AAAI08.pdf) In AAAI.

Ganzfried, S. and Sandholm, T. 2008. Computing an Approximate Jam/Fold Equilibrium for 3-Agent No-Limit Texas Hold'em Tournaments. (~sandholm/3-player%20jam-fold.AAMAS08.pdf) In AAMAS.

Gilpin, A., Sandholm, T., and Sørensen, T. 2008. A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs. (~sandholm/tartanian.AAMAS08.pdf) In AAMAS.

Gilpin, A. and Sandholm, T. 2007. Lossless abstraction of imperfect information games (~sandholm/extensive.jacm07.pdf). Journal of the ACM, 54 (5). Early versions in EC-06.

Gilpin, A., Sandholm, T., and Sørensen, T. 2007. Potential-Aware Automated Abstraction of Sequential Games, and Holistic Equilibrium Analysis of Texas Hold'em Poker. (~sandholm/gs3.aaai07.pdf) In AAAI.

Gilpin, A. and Sandholm, T. 2007. Better automated abstraction techniques for imperfect information games, with application to Texas Hold'em poker. (~sandholm/gs2.aamas07.pdf) In AAMAS.

Gilpin, A. and Sandholm, T. 2006. A competitive Texas Hold'em Poker player via automated abstraction and real-time equilibrium computation. (~sandholm/texas.aaai06.pdf) In AAAI.

If you are interested and want to join the discussion, head over to Reddit:

https://www.reddit.com/r/MachineLearning/comments/ceece3/ama_we_are_noam_brown_and_tuomas_sandholm/


Editor in charge: HN003