Pokémon Team Optimization

(nchagnet.pages.dev)

93 points | by nchagnet 5 days ago

17 comments

  • HelloUsername 1 hour ago
    Right on time for the 30th anniversary! https://xcancel.com/Pokemon_cojp/status/2006379822012911872

    Translated text:

    The 30th anniversary of Pokémon begins! It's been 30 years since the release of "Pokémon Red and Green." Pokémon will celebrate its 30th anniversary on Friday, February 27, 2026. This year is going to be the best year yet! Stay tuned! #Pokémon30thAnniversary

  • huevosabio 2 hours ago
    Love this!

    My general problem with Pokémon (at least the older versions, haven't played the latest) is that when playing against others it frequently just boils down to the same set of legendary and overpowered mons.

    You sort of addressed this by running the MILP without certain mons as options, which makes sense.

    But you already have the machinery for a better constraint: max total base stat. You could think of it as "weight classes" in boxing.

    So, for a given weight class, your team can only add up to Y in total base stat. You can squeeze one of the OP mons, but then the rest are slackers. Or you could balance them.

    It makes it a lot more interesting and invites diversity. And you could run it for many different values of Y.
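
    A minimal sketch of the weight-class idea, assuming a tiny made-up roster: the names, base stat totals and "score" values below are hypothetical placeholders, not real Pokémon data, and a brute-force search stands in for the article's MILP. It picks the highest-scoring team whose total base stat stays under the cap Y.

```python
from itertools import combinations

# Hypothetical roster: (name, base stat total, score). All values are made up.
ROSTER = [
    ("A", 680, 10),
    ("B", 600, 8),
    ("C", 540, 7),
    ("D", 500, 6),
    ("E", 450, 5),
    ("F", 400, 3),
]


def best_team(roster, team_size, max_total_bst):
    """Return (score, names) of the best team under the cap, or None if none fits."""
    best = None
    for team in combinations(roster, team_size):
        total_bst = sum(bst for _, bst, _ in team)
        if total_bst > max_total_bst:
            continue  # team is over the weight class limit
        score = sum(s for _, _, s in team)
        if best is None or score > best[0]:
            best = (score, tuple(name for name, _, _ in team))
    return best


# Sweep a few weight classes (values of Y) to see how the team changes.
for cap in (1000, 1500, 1600):
    print(cap, best_team(ROSTER, team_size=3, max_total_bst=cap))
```

    In a MILP this would be a single extra linear constraint (sum of base stat totals of the selected mons <= Y); the brute force is only practical for small rosters.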

    • j-bos 49 minutes ago
      It's a good idea but base stats aren't everything: https://youtu.be/gEkMi_y3Wzo

      That's why the competitive scene maintains a listing of tiers across generations, derived from analyzing actual usage in thoughtful battles. https://www.smogon.com/sm/articles/sm_tiers

    • oceansky 1 hour ago
      In competitive Pokémon there are usually different tiers that determine which Pokémon are allowed. In most of them, legendaries are limited or fully banned.

      For the mainline games it usually does not matter. You can beat it with any single Pokémon pretty much.

    • nchagnet 2 hours ago
      That's a great idea!
    • TZubiri 1 hour ago
      That property is true of most games though, unless they are heavily optimized for 'balance'. Consider chess, poker, or soccer: if both players play correctly, a huge range of strategies is easily exploitable and thus goes unplayed.

      That said, complexity emerges and explodes in the tiny differences. Even if there are only 8 Pokémon that are in 95% of teams, and 10 situational Pokémon that appear 5% of the time, that's C(8,6) = 28 possible teams of Uber Pokémon, and a buttload of possible teams with situational Pokémon like Choice Scarf Ditto, Eviolite Chansey, Nuzzle U-turn Super Fang Pachirisu, etc...

      Even if the teams were the same, just the possible differences in movesets create a lot of different sets. Suppose there are 6 possible moves for each mon: you have C(6,4) = 15 sets for each mon, each with its own probability weight as well.
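
      The counts can be checked directly, reading them as binomial coefficients ("choose 6 of 8" for teams, "choose 4 of 6" for movesets). The variable names are illustrative, and treating each mon's moveset as an independent choice is an assumption.

```python
from math import comb

core_teams = comb(8, 6)         # teams of 6 drawn from 8 core Pokémon
movesets_per_mon = comb(6, 4)   # 4 moves chosen from 6 candidate moves
# Assumption: each of the 6 team members picks its moveset independently.
movesets_per_team = movesets_per_mon ** 6

print(core_teams, movesets_per_mon, movesets_per_team)
# prints: 28 15 11390625
```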

    • huevosabio 2 hours ago
      Follow-up thought: it would be cool to imagine a match consisting of three sets, each for a different class.

      So each player comes with three teams of small, mid and large base stat classes. You can't repeat a monster across teams. Whoever wins 2/3 wins the match.

      ...

      And if this was my college house, we would have a price system for the mons so you wouldn't be able to repeat mons even between players. But that's a different thing altogether.

  • reddalo 4 hours ago
    An interesting thing about this article is that the SVG image of the type matchup [1] has embedded automatic translation.

    The type labels will be displayed in the language your browser is set to. I didn't even know this was possible.

    [1] https://upload.wikimedia.org/wikipedia/commons/9/97/Pokemon_...

    • nchagnet 4 hours ago
      Oh that's really cool, I didn't know about this! I just linked to the wikimedia-hosted illustration, but that's a good perk too.
    • coolness 3 hours ago
      Wow, that's very cool, I was puzzled at first as to why the Pokémon types were in Finnish!
    • scrollaway 3 hours ago
      It's using the <switch> tag for this

      https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/E...

      However, as with many of these obscure features, I am not so sure it works well in practice. The Windows 11 laptop I'm viewing that SVG from has support enabled for English, French and Russian, and among the mostly English labels I'm getting a few stray "Psychique" and "Привидение" types in the SVG. I have no idea how it chooses which one to show there.

  • oceansky 3 hours ago
    Base stat total alone is a bad metric, because stat distribution is equally important.

    If the stats are distributed heavily across both attack and special attack, it's usually bad, because you generally want specialist attackers and those points would be better spent somewhere else, like speed.

    • nchagnet 3 hours ago
      Absolutely! In general I would expect a better model to incorporate a lot of weighted terms in the objective to choose less "extreme" solutions, but here I was mostly interested in illustrating the method.
      • oceansky 3 hours ago
        It was very impressive at that, congratulations.
    • TZubiri 1 hour ago
      You can go for Smogon tiers as a proxy for Pokémon strength.
  • yunusabd 2 hours ago
    If you find the base game too easy, I can recommend the IronMON challenge: You can only use one mon, permadeath, stats are randomized, all trainer levels are buffed by 1.5x and you can't level up on wilds. Along with numerous other rules to make it harder. There are variants that are borderline impossible to beat, like Super Kaizo IronMON. Out of hundreds of thousands of attempts, it has only been beaten once. Would make for an interesting optimization problem.

    https://github.com/PyroMikeGit/SuperKaizoIronMON

  • s_trumpet 2 hours ago
    Great article, but I’d say you’re optimizing for the wrong metric here. For in-game playthroughs, offense > defense, and speedy offense in particular beats anything else.

    I’d state it as: given any type, we should be able to hit it for super-effective damage with at least 1 move. And instead of taking raw BST, I’d take Max(SPD+ATK, SPD+SPA) to favour speedy offense.

    Of course this does not take into account the thorny question of availability. Metagross is top tier but only available post-game in its debut. On the other hand, Crobat and Gyarados are readily available early on in many of the games and evolve fairly quickly.

    Please look into the competitive Nuzlocke community, there are a lot of damage calculations and viability spreadsheets all around, you’ll find it interesting.
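
    The coverage rule and the speedy-offense score suggested above can be sketched as follows. This is an illustration under assumptions: the type chart is a small hand-picked subset rather than the full chart, the base stats are quoted from memory and should be checked against a proper data source, SPD is read as Speed, and all function names are hypothetical.

```python
# Partial type chart: attacking type -> defending types it hits super-effectively.
SUPER_EFFECTIVE = {
    "Water": {"Fire", "Ground", "Rock"},
    "Fire": {"Grass", "Ice", "Bug", "Steel"},
    "Grass": {"Water", "Ground", "Rock"},
    "Electric": {"Water", "Flying"},
}

# Illustrative base stats (speed, attack, special attack), quoted from memory.
STATS = {
    "Crobat": {"speed": 130, "attack": 90, "sp_attack": 70},
    "Gyarados": {"speed": 81, "attack": 125, "sp_attack": 60},
    "Metagross": {"speed": 70, "attack": 135, "sp_attack": 95},
}


def speedy_offense(stats):
    """Max(SPD+ATK, SPD+SPA), i.e. speed plus the better attacking stat."""
    return stats["speed"] + max(stats["attack"], stats["sp_attack"])


def uncovered_types(move_types, defending_types):
    """Defending types that no available move type hits super-effectively."""
    covered = set()
    for move_type in move_types:
        covered |= SUPER_EFFECTIVE.get(move_type, set())
    return [t for t in defending_types if t not in covered]


ranking = sorted(STATS, key=lambda name: speedy_offense(STATS[name]), reverse=True)
print(ranking)
print(uncovered_types({"Water", "Fire", "Grass"}, ["Fire", "Water", "Grass", "Flying"]))
```

    In a MILP, the coverage rule would become one constraint per defending type (at least one selected move hits it super-effectively), and the score would replace raw BST in the objective.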

    • nchagnet 2 hours ago
      Thank you for your suggestion. I agree with you (and another commenter) that base stat is not that useful, and availability is actually what I would prioritise in the next iteration. I tried to keep it simple here, mostly because it was interesting enough as an analysis. But if I were to redo this to get _the best_ team in a generation, I'd definitely go with what you suggested!
  • tweakimp 5 hours ago
    Why is y+2x optimal at (0,3) with a value of 3? Isn't it (3,0) with a value of 6?
    • nchagnet 5 hours ago
      Good catch! Especially since I ended up drawing y - x = C but didn't update the legend. I updated it!
      • tomtom1337 4 hours ago
        Haha, I started reading this, got interrupted, came back and got confused by the graph. Then came to the comments, saw your comment, reloaded the post and voila!

        Thank you for a lovely post!

    • abhishekbasu 5 hours ago
      you're right, it should be (3,0) with optimal obj value of 6.
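
      A quick check of the arithmetic, assuming (the thread does not state the article's actual constraints) that the feasible region is the triangle x >= 0, y >= 0, x + y <= 3. A linear objective attains its maximum over such a region at a vertex, so comparing the corners is enough.

```python
def objective(x, y):
    """The objective discussed in the thread: y + 2x."""
    return y + 2 * x


# Vertices of the assumed feasible region x >= 0, y >= 0, x + y <= 3.
vertices = [(0, 0), (0, 3), (3, 0)]

best_vertex = max(vertices, key=lambda v: objective(*v))
print(best_vertex, objective(*best_vertex))
# prints: (3, 0) 6
```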
  • mparnisari 3 hours ago
    My uni course on optimization was so much fun but I forgot all of it. This was a nice reminder that I should probably revisit the basics :)
  • saagarjha 2 hours ago
    I'm curious how it would rank existing teams–for example, are there trainers who pick better teams (of course, I am sure the bug catchers get soundly trounced). Surely Cynthia or Red have a strong team?
  • the-smug-one 1 hour ago
    The SVG chart has internationalization built-in, with multiple languages available. I thought that was cool.
  • abhishekbasu 5 hours ago
    this was a great read to start the new year! having worked extensively with mixed integer programs, it is always a bit disheartening to see them not used enough for everyday decision-making. one of my goals this year is to create a layer to make it easier to formulate mips and test them, via plain text input. this would hopefully increase adoption through a lower barrier to entry.
  • stevekemp 5 hours ago
    Lots of people working in IT have tattoos, I like to see what theme/image overlap they have.

    Three people in my current workplace have a balloon tattoo (interestingly all of them are red balloons). Five people in my current workplace have a Pokémon tattoo that is easily visible.

    Edit: Including myself, on both counts, I should have said.

    • u8080 4 hours ago
      >balloon tattoo

      What does it mean?

      • stevekemp 3 hours ago
        A tattoo of a balloon! Unless you meant what the meaning of the design was, and in that case different people have different associations and meanings.

        One of my forearms is covered in things my son used to be obsessed by when he was young, which is why I have a lego figure, a pikachu, and a red balloon as depicted in the book "Goodnight Moon" which I read to him every night for 3+ years.

    • 867-5309 4 hours ago
      which Pokémon? gotta name them all! (5)
      • stevekemp 3 hours ago
        I wish I could remember, but offhand all I can say is that we definitely have two pikachus and one snorlax.
  • arealaccount 2 hours ago
    Slaking can only attack every other turn making it a bad choice outside of niche teams.
    • nchagnet 2 hours ago
      I do comment on that in the article, I think it's a nice example of how your model can only know what you tell it (the one I used in the article doesn't know about abilities).
  • HelloUsername 4 hours ago
    I would've liked to see in conclusion a recommended starter team per generation! Very nice article!
    • nchagnet 4 hours ago
      I was planning in a future sequel/update to do this but with "better" constraints like only including Pokémon available in a game, etc... Maybe even separate it into early/mid/late-game availability since most optimal Pokémon are late-game anyway.
  • TZubiri 1 hour ago
    For games that are as complex as pokemon, it's usually necessary to restrict analysis to some subset. In this case team typing was used.

    I personally like restricting to generation 1, as it is very canonical, very static, and one of the simplest.

    Furthermore, I like the 1v1 format, where instead of a team it's just one Pokémon against another. Otherwise you have to resort to heuristics.

    But even with a 1v1 and generation 1 restriction it still isn't solved!

    Even for a single matchup, it's very complex to reduce it to a theoretical mathematical problem, and still quite burdensome to write a Monte Carlo simulation.

    For example:

    Tauros vs Gengar (not an uncommon matchup in competitive gen 1)

    Hypnosis has 60% accuracy, and Tauros can sleep for 1 to 6 turns with equal probability. Tauros can 2HKO with Earthquake, but can also crit. Gengar can 4HKO, with each crit counting as a double hit (both mons having roughly a 20% crit chance).

    The question of who has the advantage is to my knowledge unsolved (also consider that in 1v1 the answer is different, as in teams you only have 1 sleep, so Gengar wastes it). It's also different from the problem of choosing the actual correct move, not only do you need to find the best first move, but in the game decision tree, you need a decision for each node. For example, if Tauros has 60% HP and Gengar has 100%HP, is it still better to go for hypnosis, or better to go for damage and hope for 1 out of 2 crits. This is all made more complex by the fact that both mons have a speed tie, so it's yet another probability event of who will attack first.

    https://www.smogon.com/forums/threads/gengar-vs-tauros-1v1-w...

    For a simple gen 1 with hidden teams, I think there's a bigger game tree than chess, and even poker. The fact that it's stochastic with hidden information makes it very similar to poker analysis-wise; I bet Counterfactual Regret Minimization approaches would work as well.
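
    A rough Monte Carlo sketch of the Tauros vs Gengar matchup described above, using only the numbers quoted in this comment. Everything else is a simplifying assumption: damaging moves never miss, HP is tracked in "hits left", Gengar follows a fixed policy (Hypnosis whenever Tauros is awake, attack otherwise), and real gen 1 sleep mechanics are not modelled. It estimates a win rate under that one policy and does not solve the matchup.

```python
import random

HYPNOSIS_ACCURACY = 0.6  # from the comment
CRIT_CHANCE = 0.2        # "roughly 20%" for both mons, from the comment


def hits_dealt(rng):
    """Return 2 for a critical hit (counted as a double hit), otherwise 1."""
    return 2 if rng.random() < CRIT_CHANCE else 1


def simulate_battle(rng):
    """Play one simplified 1v1 and return the winner's name."""
    gengar_hits_left = 2  # Tauros 2HKOs Gengar with Earthquake
    tauros_hits_left = 4  # Gengar 4HKOs Tauros
    sleep_turns = 0       # actions Tauros still has to skip
    while True:
        # Speed tie: a coin flip decides who moves first each turn.
        order = ["tauros", "gengar"]
        rng.shuffle(order)
        for mon in order:
            if mon == "tauros":
                if sleep_turns > 0:
                    sleep_turns -= 1  # asleep: lose this action
                    continue
                gengar_hits_left -= hits_dealt(rng)
                if gengar_hits_left <= 0:
                    return "tauros"
            elif sleep_turns == 0:
                # Assumed policy: use Hypnosis whenever Tauros is awake.
                if rng.random() < HYPNOSIS_ACCURACY:
                    sleep_turns = rng.randint(1, 6)  # uniform 1 to 6 turns
            else:
                tauros_hits_left -= hits_dealt(rng)
                if tauros_hits_left <= 0:
                    return "gengar"


def gengar_win_rate(trials=10_000, seed=0):
    """Estimate Gengar's win probability under the fixed policy above."""
    rng = random.Random(seed)
    wins = sum(simulate_battle(rng) == "gengar" for _ in range(trials))
    return wins / trials


print(f"Estimated Gengar win rate: {gengar_win_rate():.3f}")
```

    Comparing policies (for example, attacking instead of using Hypnosis once Tauros is at 60% HP) would mean swapping out the policy branch and re-running the estimate for each candidate.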

  • yjftsjthsd-h 5 hours ago
    Small typo(?):

    > Mewtwo (#151)

    Should be 150

    • nchagnet 5 hours ago
      Thank you, you're right! For some reason I always forget Mew comes after Mewtwo in the Pokédex...
  • unpopularopp 5 hours ago
    Now all we need is a quick vibe coded web GUI front end