Scratch Chess Engine - Game of Kings

HasiLover_Test

birdracerthree wrote:
HasiLover_Test wrote:
Scurious 2.1 just Drew Element 5+8!!! https://lichess.org/zaloM9Gg#97 Scurious 2.1 was black.
Interesting game… Element should have played Qd8+ on move 39 to force a queen trade. This required a depth of 6, but Element gets a ply extension here (at least on my device). I’ll have to see why Element didn’t force a queen trade here.

Funnily enough, I actually thought Element was black because it doesn’t play a 4 knights Spanish on 6+8. I saw the Petrov and then I realized that Element was white in the game.

This Lucky Draw by Scurious 2.1 inspired me to finally implement Quiescence Search, I will be working on it now.

ArnoHu

ArnoHu wrote:
ArnoHu wrote:
S_P_A_R_T wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
White Dove v7.06 Has Been Released!

After fixing the progression & eval bars, WD v7.06 improved the S3 performance massively!

https://scratch.mit.edu/projects/858052938/
White Dove is scary fast on S3 runtime! Why is the NPS so low on Scratch and Turbowarp?

I’ll work on stalemate detection relatively soon. I was going to remove mate evaluations, but I don’t think that’s an option. It looks like a virtual king capture is cut off by a non-mate evaluation somehow. I’m clueless…

There are cat blocks on Turbowarp now

I'm not 100% sure why WD is so slow NPS-wise (it gets about 14k NPS on the starting board with d3 as black, and keep in mind that WD includes q-search nodes in this count!), but I have a feeling it's down to how I'm implementing legal move generation: every time the full set of legal moves is generated, it's actually doing 2 pseudo-legal move gens.

And according to the profiler, the actual logic of the legality check is only 10% of the total, whereas the pseudo-legal move gen accounts for the other 40-50%. This means that if only one pseudo-legal move gen were used, you could expect to nearly double performance (NPS-wise).
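
To illustrate the difference described above, here is a minimal sketch in Python (not WD's actual code, which is written in Scratch). It assumes a hypothetical mini-board with only kings and rooks. `legal_two_gens` checks legality by running a second pseudo-legal generation for the opponent after every move, while `legal_one_gen` replaces that second generation with a direct attack test on the king square. All names and the board layout are made up for the example.

```python
# Hypothetical mini-board: dict mapping (file, rank) -> piece code such as "wK" or "bR".
# Only kings and rooks are modelled; this is an illustration, not WD's code.
ROOK_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
KING_DIRS = ROOK_DIRS + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def on_board(sq):
    return 0 <= sq[0] < 8 and 0 <= sq[1] < 8

def pseudo_legal(board, side):
    """All moves for `side`, ignoring whether its own king is left in check."""
    moves = []
    for src, piece in board.items():
        if piece[0] != side:
            continue
        dirs, slide = (ROOK_DIRS, True) if piece[1] == "R" else (KING_DIRS, False)
        for dx, dy in dirs:
            dst = (src[0] + dx, src[1] + dy)
            while on_board(dst):
                target = board.get(dst)
                if target is None or target[0] != side:
                    moves.append((src, dst))
                if target is not None or not slide:
                    break
                dst = (dst[0] + dx, dst[1] + dy)
    return moves

def make(board, move):
    """Return a copy of the board with the move applied (captures overwrite)."""
    new = dict(board)
    new[move[1]] = new.pop(move[0])
    return new

def king_square(board, side):
    return next(sq for sq, p in board.items() if p == side + "K")

def is_attacked(board, sq, by):
    """Direct attack test: look outward from `sq` instead of generating moves."""
    for dx, dy in KING_DIRS:
        if board.get((sq[0] + dx, sq[1] + dy)) == by + "K":
            return True
    for dx, dy in ROOK_DIRS:
        cur = (sq[0] + dx, sq[1] + dy)
        while on_board(cur):
            piece = board.get(cur)
            if piece is not None:
                if piece == by + "R":
                    return True
                break
            cur = (cur[0] + dx, cur[1] + dy)
    return False

def legal_two_gens(board, side):
    """Legality via a second pseudo-legal generation for the opponent."""
    other = "b" if side == "w" else "w"
    result = []
    for move in pseudo_legal(board, side):
        after = make(board, move)
        king = king_square(after, side)
        if all(dst != king for _, dst in pseudo_legal(after, other)):
            result.append(move)
    return result

def legal_one_gen(board, side):
    """Legality via one generation plus a targeted attack test."""
    other = "b" if side == "w" else "w"
    result = []
    for move in pseudo_legal(board, side):
        after = make(board, move)
        if not is_attacked(after, king_square(after, side), other):
            result.append(move)
    return result

# White king e1, white rook e2 (pinned by the black rook on e8), black king h8.
BOARD = {(4, 0): "wK", (4, 1): "wR", (4, 7): "bR", (7, 7): "bK"}
```

Both functions should return the same move list for `BOARD`; the second one simply avoids building the opponent's whole move list for every candidate move. Whether this matches the cost split seen in WD's profiler is an assumption.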

Overall, the depth WD gets (ply 7-9 during the middlegame) is only because of the LMR and NMP, which, thankfully, don't seem to have a very negative influence on performance.

Wow, what a performance improvement on S3, congrats - and it was there the whole time, just the progress bar slowed it down?!

It runs at the same search depth as GoK now on S3, and IMHO is just a question of time until it will win games. Here is one, against GoK (Medium, white), WD was even until move 21, then it started blundering: https://lichess.org/WPMXBRA3#59

For GoK to stay ahead, I am afraid I might have to sacrifice some nice but runtime-intensive features: q-search check sequences, some types of search extensions that can have spectacular results but only seldom and always cost time, and even some evaluation features. That would regain the advantage in search depth, which is more essential for winning in this mode.

Game #2, GoK black, similar story: https://lichess.org/FKfFL5D8#64 It took GoK a near-perfect game to win, and that will not always be the case.

I might also have to disable some pruning at low depth, as it might be more costly than beneficial there.

Game #3, GoK (Medium, white) at the brink of defeat against WD (P2) on Scratch 3, but WD blundered during endgame: https://lichess.org/HWfVTMNB#73

birdracerthree

ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
S_P_A_R_T wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
White Dove v7.06 Has Been Released!

After fixing the progression & eval bars, WD v7.06 improved the S3 performance massively!

https://scratch.mit.edu/projects/858052938/
White Dove is scary fast on S3 runtime! Why is the NPS so low on Scratch and Turbowarp?

I'm not 100% sure why WD is so slow NPS-wise (it gets about 14k NPS on the starting board with d3 as black, and keep in mind that WD includes q-search nodes in this count!), but I have a feeling it's down to how I'm implementing legal move generation: every time the full set of legal moves is generated, it's actually doing 2 pseudo-legal move gens.

*snip*

Wow, what a performance improvement on S3, congrats - and it was there the whole time, just the progress bar slowed it down?!

It runs at the same search depth as GoK now on S3, and IMHO is just a question of time until it will win games. Here is one, against GoK (Medium, white), WD was even until move 21, then it started blundering: https://lichess.org/WPMXBRA3#59

*snip*

Game #2, GoK black, similar story: https://lichess.org/FKfFL5D8#64 It took GoK a near-perfect game to win, and that will not always be the case.

I might also have to disable some pruning at low depth, as it might be more costly than beneficial there.

Game #3, GoK (Medium, white) at the brink of defeat against WD (P2) on Scratch 3, but WD blundered during endgame: https://lichess.org/HWfVTMNB#73

I believe that a White Dove win will take longer than you expect. White Dove has a really bad habit of blundering winning endgames on Turbowarp. S3 runtime makes the situation a lot worse.

I was really hoping Element would beat GoK on S3 first. If only I could get the TTable to work…

Last edited by birdracerthree (April 2, 2024 19:54:12)

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
S_P_A_R_T wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
White Dove v7.06 Has Been Released!

After fixing the progression & eval bars, WD v7.06 improved the S3 performance massively!

https://scratch.mit.edu/projects/858052938/
White Dove is scary fast on S3 runtime! Why is the NPS so low on Scratch and Turbowarp?

I'm not 100% sure why WD is so slow NPS-wise (it gets about 14k NPS on the starting board with d3 as black, and keep in mind that WD includes q-search nodes in this count!), but I have a feeling it's down to how I'm implementing legal move generation: every time the full set of legal moves is generated, it's actually doing 2 pseudo-legal move gens.

*snip*

Wow, what a performance improvement on S3, congrats - and it was there the whole time, just the progress bar slowed it down?!

It runs at the same search depth as GoK now on S3, and IMHO is just a question of time until it will win games. Here is one, against GoK (Medium, white), WD was even until move 21, then it started blundering: https://lichess.org/WPMXBRA3#59

*snip*

Game #2, GoK black, similar story: https://lichess.org/FKfFL5D8#64 It took GoK a near-perfect game to win, and that will not always be the case.

I might also have to disable some pruning at low depth, as it might be more costly than beneficial there.

Game #3, GoK (Medium, white) at the brink of defeat against WD (P2) on Scratch 3, but WD blundered during endgame: https://lichess.org/HWfVTMNB#73
I believe that a White Dove win will take longer than you expect. White Dove has a really bad habit of blundering winning endgames on Turbowarp. S3 runtime makes the situation a lot worse.

I was really hoping Element would beat GoK on S3 first. If only I could get the TTable to work…

True, it could be Element as well. Here Element (3+8, black) plays GoK (Medium) on Scratch3 - balanced game until move 27: https://lichess.org/lyMvRAWj#89

And Element (3+8) just defeated WD (P2) on S3, in a very close game. Unfortunately I don't have the complete PGN data, as I had to re-import once. Here is the data export from WD:
FEN “r1bq1rk1/pp1pBppp/4pn2/8/Q3P3/2PB1N2/P1P2PPP/R3K2R b KQhq - 0 4”
1. Qd8e7 O-O 2. Nf6g4 e4e5 3. f7f5 h2h3 4. Ng4h6 Qa4b4 5. Qe7b4 c3b4 6. a7a5 b4b5 7. Nh6f7 b5b6 8. h7h5 a2a3 9. g7g5 Ra1e1 10. f5f4 Bd3g6 11. g5g4 h3g4 12. h5g4 Nf3d4 13. Ra8a6 Bg6f7 14. Rf8f7 Re1b1 15. Rf7h7 Nd4e2 16. f4f3 g2f3 17. g4f3 Ne2d4 18. Rh7g7 Kg1h2 19. Rg7g2 Kh2h1 20. Rg2g4 Nd4f3 21. a5a4 Kh1h2 22. Kg8f7 Rf1d1 23. Kf7e7 Rd1d6 24. Ke7f8 Nf3d4 25. Ra6a5 f2f3 26. Rg4h4 Kh2g3 27. Rh4h5 f3f4 28. Ra5a8 Kg3g4 29. Rh5h2 Kg4g3 30. Rh2h5 Kg3f2 31. Rh5h3 Rb1a1 32. Rh3h4 Kf2g3 33. Rh4h5 Kg3g4 34. Rh5h6 Kg4f3 35. Rh6h3 Kf3e4 36. Kf8g8 c2c4 37. Rh3c3 Ra1g1 38. Kg8h7 Rg1h1 39. Kh7g8 Rh1a1 40. Kg8h7 Ra1a2 41. Rc3c4 Ke4d3 42. Rc4c1 Ra2g2 43. Ra8a5 Rg2g3 44. Ra5c5 Rg3h3 45. Kh7g8 Rh3g3 46. Kg8h7 Kd3d2 47. Kh7h8 Rg3g4 48. Rc1c3 Rg4h4 49. Kh8g8 Rh4g4 50. Kg8f8 Nd4b5 51. Rc3b3 Nb5a7 52. Rb3a3 Na7c8 53. Rc5c8 Rd6d7 54. Ra3a2 Kd2d3 55. Ra2a3 Kd3d4 56. Ra3c3 Rd7d8 57. Rc8d8 Kd4c3 58. a4a3 Rg4g2 59. Kf8f7 Rg2h2 60. Kf7g6 Rh2a2 61. Rd8f8 Ra2a3 62. Rf8f4 Ra3a7 63. Rf4f7 Kc3d3 64. Kg6f5 Ra7a5 65. Kf5f4 Ra5c5 66. Kf4g3 Rc5c7 67. Rf7f3 Kd3e4 68. Rf3f4 Ke4e3 69. Rf4f3 Ke3d4 70. Rf3f4 Kd4c5 71. Rf4f5 Kc5d6 72. Kg3f4 Rc7b7 73. Kf4e4 Rb7e7 74. Ke4d4 b6b7 75. Rf5f8 Kd6e6 76. Rf8g8 Ke6d6 77. Kd4e4 Kd6c7 78. Ke4e3 b7b8=Q 79. Rg8b8 Kc7b8 80. Ke3e4 Kb8c7 81. Ke4d3 Kc7d6 82. Kd3e2 Re7h7 83. Ke2e1 Rh7h2 84. Ke1f1 Kd6d5 85. Kf1g1 Rh2a2 86. Kg1h1 Kd5e4 87. Kh1g1 e5e6 88. Kg1h1 e6e7 89. Kh1g1 e7e8=Q 90. Kg1f1 Ke4f3 91. Kf1g1 Qe8e1

Last edited by ArnoHu (April 2, 2024 21:39:34)

birdracerthree

ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
S_P_A_R_T wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
White Dove v7.06 Has Been Released!

After fixing the progression & eval bars, WD v7.06 improved the S3 performance massively!

https://scratch.mit.edu/projects/858052938/
White Dove is scary fast on S3 runtime! Why is the NPS so low on Scratch and Turbowarp?

I'm not 100% sure why WD is so slow NPS-wise (it gets about 14k NPS on the starting board with d3 as black, and keep in mind that WD includes q-search nodes in this count!), but I have a feeling it's down to how I'm implementing legal move generation: every time the full set of legal moves is generated, it's actually doing 2 pseudo-legal move gens.

*snip*

Wow, what a performance improvement on S3, congrats - and it was there the whole time, just the progress bar slowed it down?!

It runs at the same search depth as GoK now on S3, and IMHO is just a question of time until it will win games. Here is one, against GoK (Medium, white), WD was even until move 21, then it started blundering: https://lichess.org/WPMXBRA3#59

*snip*

Game #2, GoK black, similar story: https://lichess.org/FKfFL5D8#64 It took GoK a near-perfect game to win, and that will not always be the case.

I might also have to disable some pruning at low depth, as it might be more costly than beneficial there.

Game #3, GoK (Medium, white) at the brink of defeat against WD (P2) on Scratch 3, but WD blundered during endgame: https://lichess.org/HWfVTMNB#73
I believe that a White Dove win will take longer than you expect. White Dove has a really bad habit of blundering winning endgames on Turbowarp. S3 runtime makes the situation a lot worse.

I was really hoping Element would beat GoK on S3 first. If only I could get the TTable to work…

True, it could be Element as well. Here Element (3+8, black) plays GoK (Medium) on Scratch3 - balanced game until move 27: https://lichess.org/lyMvRAWj#89

And Element (3+8) just defeated WD (P2) on S3, in a very close game. Unfortunately I don't have the complete PGN data, as I had to re-import once. Here is the data export from WD:
*snip*

Element has PGN. Hit the j key to toggle, k to remove
What was the opening? Why the re-import?

Edit : I recovered the opening!
1. e2e4 c7c5 2. Ng1f3 e7e6 3. d2d4 c5d4 4. Qd1d4 Nb8c6 5. Qd4a4 Ng8f6 6. Bf1d3 Bf8b4 7. Nb1c3 Bb4c3 8. b2c3 O-O 9. Bc1a3 Nc6e7 10. Ba3e7

Last edited by birdracerthree (April 2, 2024 22:43:22)

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
S_P_A_R_T wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
White Dove v7.06 Has Been Released!

After fixing the progression & eval bars, WD v7.06 improved the S3 performance massively!

https://scratch.mit.edu/projects/858052938/
White Dove is scary fast on S3 runtime! Why is the NPS so low on Scratch and Turbowarp?

I'm not 100% sure why WD is so slow NPS-wise (it gets about 14k NPS on the starting board with d3 as black, and keep in mind that WD includes q-search nodes in this count!), but I have a feeling it's down to how I'm implementing legal move generation: every time the full set of legal moves is generated, it's actually doing 2 pseudo-legal move gens.

*snip*

Wow, what a performance improvement on S3, congrats - and it was there the whole time, just the progress bar slowed it down?!

It runs at the same search depth as GoK now on S3, and IMHO is just a question of time until it will win games. Here is one, against GoK (Medium, white), WD was even until move 21, then it started blundering: https://lichess.org/WPMXBRA3#59

*snip*

Game #2, GoK black, similar story: https://lichess.org/FKfFL5D8#64 It took GoK a near-perfect game to win, and that will not always be the case.

I might also have to disable some pruning at low depth, as it might be more costly than beneficial there.

Game #3, GoK (Medium, white) at the brink of defeat against WD (P2) on Scratch 3, but WD blundered during endgame: https://lichess.org/HWfVTMNB#73
I believe that a White Dove win will take longer than you expect. White Dove has a really bad habit of blundering winning endgames on Turbowarp. S3 runtime makes the situation a lot worse.

I was really hoping Element would beat GoK on S3 first. If only I could get the TTable to work…

True, it could be Element as well. Here Element (3+8, black) plays GoK (Medium) on Scratch3 - balanced game until move 27: https://lichess.org/lyMvRAWj#89

And Element (3+8) just defeated WD (P2) on S3, in a very close game. Unfortunately I don't have the complete PGN data, as I had to re-import once. Here is the data export from WD:
*snip*
Element has PGN. Hit the j key to toggle, k to remove
What was the opening? Why the re-import?

Edit : I recovered the opening!
1. e2e4 c7c5 2. Ng1f3 e7e6 3. d2d4 c5d4 4. Qd1d4 Nb8c6 5. Qd4a4 Ng8f6 6. Bf1d3 Bf8b4 7. Nb1c3 Bb4c3 8. b2c3 O-O 9. Bc1a3 Nc6e7 10. Ba3e7

Thanks, didn't know. Re-import because I mis-clicked.

ArnoHu

Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86

birdracerthree

ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86

I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then, or maybe reacts differently due to TT state (GoK sometimes sees that too, but just minor eval differences, no blunders, hopefully). It was running on a fast system.

Update: I replayed, and WD played Be2 again.

Last edited by ArnoHu (April 3, 2024 04:03:22)

birdracerthree

ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then. It was running on a fast system.

That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Last edited by birdracerthree (April 3, 2024 03:56:06)

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then. It was running on a fast system.
That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Please read my edited post. I replayed, and WD made the same mistake, on P3/S3.

Last edited by ArnoHu (April 3, 2024 04:03:43)

birdracerthree

ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then, or maybe reacts differently due to TT state (GoK sometimes sees that too, but just minor eval differences, no blunders, hopefully). It was running on a fast system.
That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Please read my edited post. I replayed, and WD made the same mistake, on P3/S3.

I tested WD 3 times; all 3 times it played Qf4. Ply 5 is completed in less than 16 seconds. I don’t know how this is possible (unless Be2 is found on ply 6 somehow).

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then, or maybe reacts differently due to TT state (GoK sometimes sees that too, but just minor eval differences, no blunders, hopefully). It was running on a fast system.
That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Please read my edited post. I replayed, and WD made the same mistake, on P3/S3.
I tested WD 3 times; all 3 times it played Qf4. Ply 5 is completed in less than 16 seconds. I don’t know how this is possible (unless Be2 is found on ply 6 somehow).

I still had the browser window open. At the end of search at 20 seconds, WD is still running 5+8, and reached move 7 out of 39, evaluation -0.7.

There were several browser windows running, and the external cooler was turned off. So I re-ran it now, and WD reached move 38 of 39 for 5+8, and had switched to Qf4. Evaluation -0.69. Seems more like good luck, as it did not see the potential queen loss before, maybe due to LMR.

Last edited by ArnoHu (April 3, 2024 04:17:22)

birdracerthree

ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then, or maybe reacts differently due to TT state (GoK sometimes sees that too, but just minor eval differences, no blunders, hopefully). It was running on a fast system.
That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Please read my edited post. I replayed, and WD made the same mistake, on P3/S3.
I tested WD 3 times; all 3 times it played Qf4. Ply 5 is completed in less than 16 seconds. I don’t know how this is possible (unless Be2 is found on ply 6 somehow).

I still had the browser window open. At the end of search at 20 seconds, WD is still running 5+8, and reached move 7 out of 39, evaluation -0.7.

I had several browser windows open, and the external cooler was turned off. So I re-ran it now, and WD reached move 38 of 39 for 5+8, and had switched to Qf4. Evaluation -0.69. Seems more like good luck, as it did not see the potential queen loss before, maybe due to LMR.

I managed to finish 5+8, so your device is slower. However, White Dove should still see the queen trap before then (Element 3+8 plays Nf3).
That is a good point, Qf4 might be a lucky move.

Last edited by birdracerthree (April 3, 2024 04:18:00)

ArnoHu

birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
birdracerthree wrote:
ArnoHu wrote:
Nice 98% accuracy game by GoK (Medium) on S3, of all levels, against WD (P3) on S3 - this one was decided during early midgame: https://lichess.org/wUGoIzHY#86
I was surprised to see White Dove’s trapped queen, so I tested the position r1bq1rk1/ppp3pp/3pp3/2nP1p2/1bPQ4/2N1P3/PP1N1PPP/R3KB1R w KQ - 0 11 and WD played Qf4 on P3 and P2 (panic time was allocated on P2).

Strange - some time during that game I had another re-import, maybe it reacts differently then, or maybe reacts differently due to TT state (GoK sometimes sees that too, but just minor eval differences, no blunders, hopefully). It was running on a fast system.
That is very strange. There is a very small period (1/2 a second) where WD has Be2 selected. If the game was run on P2, there’s a chance White Dove may have selected it. This shouldn’t happen on P3, especially on your extremely fast system (although it could happen).

Please read my edited post. I replayed, and WD made the same mistake, on P3/S3.
I tested WD 3 times; all 3 times it played Qf4. Ply 5 is completed in less than 16 seconds. I don’t know how this is possible (unless Be2 is found on ply 6 somehow).

I still had the browser window open. At the end of search at 20 seconds, WD is still running 5+8, and reached move 7 out of 39, evaluation -0.7.
I managed to finish 5+8, so your device is slower. However, White Dove should still see the queen trap before then (Element 3+8 plays Nf3).

There were several browser windows running, and the external cooler was turned off. So I re-ran it now, and WD reached move 38 of 39 for 5+8, and had switched to Qf4. Evaluation -0.69. Seems more like good luck, as it did not see the potential queen loss before, maybe due to LMR.

ArnoHu

HasiLover_Test wrote:
birdracerthree wrote:
HasiLover_Test wrote:
Scurious 2.1 just Drew Element 5+8!!! https://lichess.org/zaloM9Gg#97 Scurious 2.1 was black.
Interesting game… Element should have played Qd8+ on move 39 to force a queen trade. This required a depth of 6, but Element gets a ply extension here (at least on my device). I’ll have to see why Element didn’t force a queen trade here.

Funnily enough, I actually thought Element was black because it doesn’t play a 4 knights Spanish on 6+8. I saw the Petrov and then I realized that Element was white in the game.
This Lucky Draw by Scurious 2.1 inspired me to finally implement Quiescence Search, I will be working on it now.

That might turn out to be a great improvement, looking forward to the new version!

S_P_A_R_T

What NPS should WD be aiming for on S3?

Because right now, even though its search depth can get pretty far, it's mostly because of the aggressive LMR. I'm working on an update to hopefully add specialized extensions to help out in this area, but at the end of the day, NPS / move gen speed is going to be more important.

Check out Space Program Simulator!

In it, you can build your own rockets from a variety of parts!
Then fly it with realistic orbital mechanics.

Go to orbit, explore different planets, share your save codes, and do so much more!

If you would like to help out on the project or chat about space or really anything else, check out the official SPS Studio!

For more information & tutorials, check out the official forum post!

birdracerthree

S_P_A_R_T wrote:
What NPS should WD be aiming for on S3?

Because right now, even though its search depth can get pretty far, it's mostly because of the aggressive LMR. I'm working on an update to hopefully add specialized extensions to help out in this area, but at the end of the day, NPS / move gen speed is going to be more important.

I would guess double what it is now, to start with. WD's NPS is already poor (<20k NPS on TW is not good, 100 NPS on S3 isn't any better), so maybe you should look into the double move generation. That should not be necessary.

Update on mate eval TT bug : The incorrect TT value is coming from the same depth as the correct one. No more info is known.

Last edited by birdracerthree (April 3, 2024 22:14:52)

ArnoHu

birdracerthree wrote:
S_P_A_R_T wrote:
What NPS should WD be aiming for on S3?

Because right now, even though its search depth can get pretty far, it's mostly because of the aggressive LMR. I'm working on an update to hopefully add specialized extensions to help out in this area, but at the end of the day, NPS / move gen speed is going to be more important.
I would guess double what it is now, to start with. WD's NPS is already poor (<20k NPS on TW is not good, 100 NPS on S3 isn't any better), so maybe you should look into the double move generation. That should not be necessary.

Update on mate eval TT bug : The incorrect TT value is coming from the same depth as the correct one. No more info is known.

Good question. As mentioned before, NPS must be taken with a grain of salt. The whole goal of pruning is to visit as few nodes as necessary to reach a high search depth. An engine with no or very limited pruning will have a high NPS without getting to any significant search depth. When you compare the NPS of professional engines, you know they have great pruning in place, so there is no need to discuss it.

E.g. I just imported “r3kb1r/pp1n1ppp/q1p1p3/3pPn2/3P2P1/1QN2N2/PPPB1P1P/2KR3R b kq - 0 11” into our engines on TurboWarp. Scurious is known to have the highest NPS. It finished searching ply 5 in 3.2 seconds for that board, without quiescence, and was “only” running at 175k NPS here. GoK also had 175k NPS, but finished the ply 5 search, including quiescence, in 0.3 seconds. That gap will widen with every additional ply. Element finished in 2.2 seconds at 85k NPS, WD in 0.8 seconds at 16k NPS (although I doubt our NPS calculations are equivalent).
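
The point that equal NPS can hide very different times to depth can be shown with a toy experiment. The sketch below is only an illustration under stated assumptions: it searches a synthetic tree with a fixed branching factor and made-up leaf scores, once with plain negamax and once with alpha-beta, and counts visited nodes. None of the names come from GoK, Scurious, Element or WD.

```python
def leaf_value(path):
    """Deterministic pseudo-random score in [-100, 100] for a leaf (made-up data)."""
    x = 0
    for move in path:
        x = (x * 31 + move + 1) * 2654435761 % (2 ** 32)
    return x % 201 - 100

def negamax(path, depth, branching, counter):
    """Plain negamax without pruning; counter[0] tracks visited nodes."""
    counter[0] += 1
    if depth == 0:
        return leaf_value(path)
    return max(-negamax(path + (m,), depth - 1, branching, counter)
               for m in range(branching))

def alphabeta(path, depth, alpha, beta, branching, counter):
    """Negamax with alpha-beta pruning; same result, usually far fewer nodes."""
    counter[0] += 1
    if depth == 0:
        return leaf_value(path)
    best = -float("inf")
    for m in range(branching):
        score = -alphabeta(path + (m,), depth - 1, -beta, -alpha, branching, counter)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the remaining siblings cannot change the result
    return best

plain_nodes, pruned_nodes = [0], [0]
plain_score = negamax((), 5, 4, plain_nodes)
pruned_score = alphabeta((), 5, -float("inf"), float("inf"), 4, pruned_nodes)
print(plain_score == pruned_score, plain_nodes[0], pruned_nodes[0])
```

If both searches ran at the same NPS, the time to finish the ply would scale with the node counts, so the pruned search would finish sooner while reporting the same speed.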

If you are referring to improving the move generator, moves per second (MPS) is an important metric. As a baseline, I can tell you that GoK's core move generator, when running standalone (no other code executed, no staged generation, no hash moves, etc.), reaches 800,000 generated moves per second for the start board. So an MPS in the 100,000s would be a good starting point.
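
One possible way to measure MPS is to call the generator in a loop for a fixed time and divide the number of generated moves by the elapsed time. The Python sketch below assumes this approach; it is not how GoK measures it. `toy_generate_moves` is a hypothetical stand-in that lists knight moves on an empty board, and a real test would pass the engine's own generator and the start board instead.

```python
import time

KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def toy_generate_moves(_position):
    """Stand-in generator: knight moves from every square of an empty board."""
    moves = []
    for f in range(8):
        for r in range(8):
            for df, dr in KNIGHT_STEPS:
                if 0 <= f + df < 8 and 0 <= r + dr < 8:
                    moves.append((f, r, f + df, r + dr))
    return moves

def moves_per_second(generate, position, duration=0.5):
    """Run `generate` repeatedly for about `duration` seconds and return moves/s."""
    generated = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        generated += len(generate(position))
    elapsed = time.perf_counter() - start
    return generated / elapsed

print(round(moves_per_second(toy_generate_moves, None)))
```

The result depends heavily on the machine and runtime, so it is only meaningful when compared against other generators measured the same way on the same system.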

When it comes to pruning via good move ordering, I would look at the average move list index of the best move found. I think you have that already in WD, right? And if there is a staged move generator, you can compare NPS and MPS. You want NPS to be close to MPS, because then you did not generate a lot of moves that were never applied.
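
Both metrics from the paragraph above can be collected with a few counters. This is a minimal sketch with hypothetical names; it assumes the best-move index is counted from 1 and that the search reports, per node, how many moves were generated and how many were actually searched.

```python
from dataclasses import dataclass

@dataclass
class OrderingStats:
    """Hypothetical counters for judging move ordering quality."""
    best_index_sum: int = 0
    nodes_with_best: int = 0
    moves_generated: int = 0
    moves_searched: int = 0

    def record_node(self, generated, searched, best_index):
        """Call once per node; best_index is the 1-based list position of the best move."""
        self.moves_generated += generated
        self.moves_searched += searched
        self.best_index_sum += best_index
        self.nodes_with_best += 1

    def average_best_index(self):
        """Closer to 1 means the best move is usually tried first."""
        if self.nodes_with_best == 0:
            return 0.0
        return self.best_index_sum / self.nodes_with_best

    def searched_fraction(self):
        """Share of generated moves that were searched; low values suggest wasted generation."""
        if self.moves_generated == 0:
            return 0.0
        return self.moves_searched / self.moves_generated

stats = OrderingStats()
stats.record_node(generated=30, searched=3, best_index=1)
stats.record_node(generated=20, searched=2, best_index=2)
print(stats.average_best_index(), stats.searched_fraction())
```

A low searched fraction is the case where a staged move generator would help most, since most of the generated moves are discarded by a cutoff.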

With all that in place, just verify the search depth you reach. You are right that LMR can be misleading: depth might be great, but it could prune things it should not. To check that, we can compare depth with LMR disabled, or with the same LMR configuration in place.

We must also keep in mind that system speed and the current board have a lot of impact on NPS, and so does quiescence search. In any case, based on where it is now, I guess 50k NPS might be a good initial goal for WD. Why do you think NPS is low now? How can WD reach its search depth now? Are we sure it is not also about the way NPS is calculated?

@birdracerthree, why don't you simply disable checkmate evals in the TT for the time being? I don't think they have such a high impact. I have even seen professional engines disable them, at least for quiescence search, and so does GoK.
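
A minimal sketch of what "disable checkmate evals in the TT" could look like, assuming a dictionary-based table and made-up score constants (`MATE_SCORE` and `MATE_THRESHOLD` are hypothetical, not Element's or GoK's values). Bound flags and replacement schemes are left out to keep it short.

```python
MATE_SCORE = 100_000                 # hypothetical score for "mate in 0"
MATE_THRESHOLD = MATE_SCORE - 1_000  # scores at or beyond this count as mate scores

class TranspositionTable:
    """Simplified table that refuses to store mate evaluations."""

    def __init__(self):
        self.entries = {}

    def store(self, key, depth, score, best_move):
        """Store an entry and return True, unless it is a mate score or shallower than the old one."""
        if abs(score) >= MATE_THRESHOLD:
            return False  # skip mate evaluations entirely
        old = self.entries.get(key)
        if old is not None and old[0] > depth:
            return False  # keep the deeper entry
        self.entries[key] = (depth, score, best_move)
        return True

    def probe(self, key, depth):
        """Return (depth, score, best_move) if an entry of sufficient depth exists."""
        entry = self.entries.get(key)
        if entry is not None and entry[0] >= depth:
            return entry
        return None

tt = TranspositionTable()
tt.store(key=1, depth=4, score=35, best_move="e2e4")
tt.store(key=2, depth=6, score=MATE_SCORE - 3, best_move="d1h5")  # rejected
print(tt.probe(1, 4), tt.probe(2, 1))
```

The cost is that mate lines have to be searched again each time, which is the trade-off suggested above as a temporary workaround.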

Last edited by ArnoHu (April 4, 2024 05:09:30)

S_P_A_R_T

ArnoHu wrote:
birdracerthree wrote:
S_P_A_R_T wrote:
What NPS should WD be aiming for on S3?

Because right now, even though its search depth can get pretty far, it's mostly because of the aggressive LMR. I'm working on an update to hopefully add specialized extensions to help out in this area, but at the end of the day, NPS / move gen speed is going to be more important.
I would guess double what it is now, to start with. WD's NPS is already poor (<20k NPS on TW is not good, 100 NPS on S3 isn't any better), so maybe you should look into the double move generation. That should not be necessary.

Update on mate eval TT bug : The incorrect TT value is coming from the same depth as the correct one. No more info is known.

Good question. As mentioned before, NPS must be taken with a grain of salt. The whole goal of pruning is to visit as few nodes as necessary to reach a high search depth. An engine with no or very limited pruning will have a high NPS without getting to any significant search depth. When you compare the NPS of professional engines, you know they have great pruning in place, so there is no need to discuss it.

E.g. I just imported “r3kb1r/pp1n1ppp/q1p1p3/3pPn2/3P2P1/1QN2N2/PPPB1P1P/2KR3R b kq - 0 11” into our engines on TurboWarp. Scurious is known to have the highest NPS. It finished searching ply 5 in 3.2 seconds for that board, without quiescence, and was “only” running at 175k NPS here. GoK also had 175k NPS, but finished the ply 5 search, including quiescence, in 0.3 seconds. That gap will widen with every additional ply. Element finished in 2.2 seconds at 85k NPS, WD in 0.8 seconds at 16k NPS (although I doubt our NPS calculations are equivalent).

If you are referring to improving the move generator, moves per second (MPS) is an important metric. As a baseline, I can tell you that GoK's core move generator, when running standalone (no other code executed, no staged generation, no hash moves, etc.), reaches 800,000 generated moves per second for the start board. So an MPS in the 100,000s would be a good starting point.

When it comes to pruning via good move ordering, I would look at the average move list index of the best move found. I think you have that already in WD, right? And if there is a staged move generator, you can compare NPS and MPS. You want NPS to be close to MPS, because then you did not generate a lot of moves that were never applied.

With all that in place, just verify the search depth you reach. You are right that LMR can be misleading: depth might be great, but it could prune things it should not. To check that, we can compare depth with LMR disabled, or with the same LMR configuration in place.

We must also keep in mind that system speed and the current board have a lot of impact on NPS, and so does quiescence search. In any case, based on where it is now, I guess 50k NPS might be a good initial goal for WD. Why do you think NPS is low now? How can WD reach its search depth now? Are we sure it is not also about the way NPS is calculated?

@birdracerthree, why don't you simply disable checkmate evals in the TT for the time being? I don't think they have such a high impact. I have even seen professional engines disable them, at least for quiescence search, and so does GoK.

Alright, thanks!

WD on the starting board has an average-move-list-index of around 1.5-1.6.

WD also calculates the NPS as the total “normal nodes” + the q-search nodes, and then divides that by the time used. (The normal-node counter is increased by one at the start of every “minmax” call, and the q-search node counter is increased at the start of every “q-search” call.)
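
Written out as code, the calculation described above looks roughly like this. It is a Python sketch with hypothetical names and made-up numbers, not WD's Scratch code. The `include_qsearch` switch is an added assumption, there only to show how much the reported NPS changes depending on whether q-search nodes are counted.

```python
def nodes_per_second(normal_nodes, qsearch_nodes, elapsed_seconds, include_qsearch=True):
    """NPS as (normal nodes + q-search nodes) / time; q-search nodes are optional."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total = normal_nodes + (qsearch_nodes if include_qsearch else 0)
    return total / elapsed_seconds

# Made-up counter values for illustration only.
print(nodes_per_second(1000, 3000, 0.25))                         # with q-search nodes
print(nodes_per_second(1000, 3000, 0.25, include_qsearch=False))  # without them
```

Two engines that count nodes differently can therefore report very different NPS for the same amount of work.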

I'm not totally sure why the NPS is so low (even doubling the move gen speed would put it at less than half of Element's), but I have a feeling it's just some really poor code somewhere (similar to the progress/eval bar fix on S3).

(Also, how do you calculate Moves Per Second on the starting board?)

(Also also, here's a really cool 99% accuracy game I played yesterday IRL https://lichess.org/nwfYC30o#65 )
