birdracerthree
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

S_P_A_R_T wrote:

ArnoHu wrote:

Scratch Chess Engine Ranking (Scratch 3 Runtime)

Rank  Name             Elo    +    -  games  score  oppo.  draws
1     GoK             1713  160  124     25   100%     28     0%
2     Element         1542  210  246      6    33%    283     0%
3     Bonsai          1524  180  168     10    60%    125     0%
4     White Dove      1422  180  185      9    44%    122     0%
5     Archimedes      1400  167  165     10    50%     87    20%
6     HarleyK         1340  260  311      4    25%    184     0%
7     The Turk        1335  204  238      7    29%    128     0%
8     Shallow Blue    1331  204  204      5    50%      9    20%
9     Frenchgamerlol  1317  242  251      4    38%     51    25%
10    LowDoor         1315  220  228      5    40%     46     0%
11    Chip            1307  196  229      6    25%     96    17%
12    Scurious        1299  190  190      5    50%    -49    60%
13    Wolverine       1275  305  470      3     0%    231     0%
14    Pseudo          1271  331  479      2     0%    172     0%
15    U0              1237  342  481      2     0%    164     0%
16    Mystery         1185  273  402      3     0%    107     0%
17    Midecah         1136  253  410      4     0%     87     0%

Scratch Chess Engine Ranking (TurboWarp Runtime)
Rank  Name             Elo    +    -  games  score  oppo.  draws
1     GoK             2114  118   91     52    90%     59     4%
2     White Dove      1796   64   66     72    40%    184    17%
3     Element         1766   64   66     72    38%    172    15%
4     Bonsai          1649  156  144     11    59%    -53    27%
5     Thundershark    1543  146  163     12    25%     50    17%
6     Shallow Blue    1529  258  298      3    17%     69    33%
7     Scurious        1502  143  164     10    25%    -59    30%

Interesting stuff! I wonder how WD vs Bonsai games will go, especially considering that I've fixed a few S3 WD related issues.
We’ll see, but those WD 6.5 games against Element really lowered its rating (pre 6+7 WD didn’t have limited quiescence on S3 runtime, causing losing captures). I’m running some S3 runtime games right now. I think that some S3 vs TurboWarp games might help to adjust GoK’s rating on S3.
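
To illustrate why the capture search matters here, below is a minimal sketch of a depth-limited quiescence search with standing pat. It is not White Dove's code: the `Position` class, the `max_plies` limit, and the evaluation numbers are all made up for the example. It only shows how stopping the capture search too early can make a losing capture look like a won pawn.

```python
INF = 10**9

class Position:
    """Hypothetical toy position: a static eval plus the capture replies."""
    def __init__(self, static_eval, captures=()):
        self.static_eval = static_eval  # from the side to move's point of view
        self.captures = list(captures)  # positions reachable by one capture

def quiesce(pos, alpha, beta, max_plies):
    """Fail-soft quiescence search over captures only, limited to max_plies."""
    stand_pat = pos.static_eval
    # Standing pat: the side to move may decline every capture.
    if max_plies == 0 or stand_pat >= beta:
        return stand_pat
    alpha = max(alpha, stand_pat)
    best = stand_pat
    for after_capture in pos.captures:
        score = -quiesce(after_capture, -beta, -alpha, max_plies - 1)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

# Queen takes a defended pawn: +100 at first, then the queen is recaptured.
after_recapture = Position(static_eval=-800)                 # we are down Q for P
after_qxp = Position(static_eval=-100, captures=[after_recapture])
root = Position(static_eval=0, captures=[after_qxp])

print(quiesce(root, -INF, INF, max_plies=1))  # 100: the recapture is not seen
print(quiesce(root, -INF, INF, max_plies=2))  # 0: standing pat beats QxP
```

With one ply of captures the search stops right after QxP and reports a pawn won; with two plies it sees the recapture and prefers to stand pat.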

Last edited by birdracerthree (March 26, 2024 20:59:38)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

birdracerthree wrote:

S_P_A_R_T wrote:

ArnoHu wrote:

Scratch Chess Engine Ranking (Scratch 3 Runtime)

Rank  Name             Elo    +    -  games  score  oppo.  draws
1     GoK             1713  160  124     25   100%     28     0%
2     Element         1542  210  246      6    33%    283     0%
3     Bonsai          1524  180  168     10    60%    125     0%
4     White Dove      1422  180  185      9    44%    122     0%
5     Archimedes      1400  167  165     10    50%     87    20%
6     HarleyK         1340  260  311      4    25%    184     0%
7     The Turk        1335  204  238      7    29%    128     0%
8     Shallow Blue    1331  204  204      5    50%      9    20%
9     Frenchgamerlol  1317  242  251      4    38%     51    25%
10    LowDoor         1315  220  228      5    40%     46     0%
11    Chip            1307  196  229      6    25%     96    17%
12    Scurious        1299  190  190      5    50%    -49    60%
13    Wolverine       1275  305  470      3     0%    231     0%
14    Pseudo          1271  331  479      2     0%    172     0%
15    U0              1237  342  481      2     0%    164     0%
16    Mystery         1185  273  402      3     0%    107     0%
17    Midecah         1136  253  410      4     0%     87     0%

Scratch Chess Engine Ranking (TurboWarp Runtime)
Rank  Name             Elo    +    -  games  score  oppo.  draws
1     GoK             2114  118   91     52    90%     59     4%
2     White Dove      1796   64   66     72    40%    184    17%
3     Element         1766   64   66     72    38%    172    15%
4     Bonsai          1649  156  144     11    59%    -53    27%
5     Thundershark    1543  146  163     12    25%     50    17%
6     Shallow Blue    1529  258  298      3    17%     69    33%
7     Scurious        1502  143  164     10    25%    -59    30%

Interesting stuff! I wonder how WD vs Bonsai games will go, especially considering that I've fixed a few S3 WD related issues.
We’ll see, but those WD 6.5 games against Element really lowered its rating (pre 6+7 WD didn’t have limited quiescence on S3 runtime, causing losing captures). I’m running some S3 runtime games right now. I think that some S3 vs TurboWarp games might help to adjust GoK’s rating on S3.

I will apply a sliding window approach: over time, as more games are played, I will age out older ones. GoK also has some draws in there which were caused by a regression.
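
A sliding window over game results could look roughly like the sketch below. This is only an illustration under assumptions: the class name, the window size, and the sample results are hypothetical, and it computes a plain score percentage rather than the Elo figures from the ranking tables.

```python
from collections import deque

class SlidingWindowResults:
    """Keep only the most recent `window` game results (hypothetical helper)."""

    def __init__(self, window):
        # deque(maxlen=...) drops the oldest entry once the window is full.
        self.games = deque(maxlen=window)

    def add(self, white, black, white_score):
        # white_score: 1.0 win, 0.5 draw, 0.0 loss, from White's point of view.
        self.games.append((white, black, white_score))

    def score_percent(self, engine):
        """Score of `engine` over the games still inside the window."""
        points = 0.0
        played = 0
        for white, black, white_score in self.games:
            if engine == white:
                points += white_score
                played += 1
            elif engine == black:
                points += 1.0 - white_score
                played += 1
        return 100.0 * points / played if played else None


# Made-up results, only to show the ageing-out behaviour.
results = SlidingWindowResults(window=3)
results.add("GoK", "Element", 0.5)      # old draw, e.g. caused by a regression
results.add("GoK", "Element", 1.0)
results.add("Element", "GoK", 0.0)
results.add("GoK", "White Dove", 1.0)   # pushes the old draw out of the window
print(results.score_percent("GoK"))     # 100.0
```

Once the fourth game is added, the old draw no longer counts towards the score.
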
ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

I constantly run into the missing-a1-rook-on-startup problem with Element. It is a race condition during startup (when-flag-clicked). As a quick fix I added a wait(1) there in the Element v1.4862 sprite, which is ugly but works. Maybe you want to consider introducing structured startup handling through one central controller that broadcasts the startup messages in a fixed order?
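
The central-controller idea can be sketched as follows. Scratch itself is block-based, so this Python version is only an analogy for a chain of "broadcast and wait" messages; the phase names and handlers are hypothetical and not taken from Element.

```python
class StartupController:
    """Runs startup handlers phase by phase, in one fixed order."""

    def __init__(self, phases):
        self.phases = list(phases)
        self.handlers = {phase: [] for phase in self.phases}

    def on(self, phase, handler):
        # Comparable to a "when I receive <phase>" hat block in a sprite.
        self.handlers[phase].append(handler)

    def start(self):
        # Comparable to "broadcast <phase> and wait", one phase after another.
        for phase in self.phases:
            for handler in self.handlers[phase]:
                handler()


board = {}
log = []

def reset_board():
    board.clear()
    log.append("reset")

def place_pieces():
    board["a1"] = "R"
    log.append("pieces")

def build_eval_board():
    # Reads the board, so it must not run before the pieces are placed.
    log.append("eval sees a1=" + board.get("a1", "missing"))

controller = StartupController(["reset", "setup pieces", "build eval"])
# Registration order deliberately differs from execution order.
controller.on("build eval", build_eval_board)
controller.on("setup pieces", place_pieces)
controller.on("reset", reset_board)
controller.start()
print(log)  # ['reset', 'pieces', 'eval sees a1=R']
```

Because the controller decides the order, no sprite depends on which when-flag-clicked script happens to run first.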

Two good games by both engines, GoK and Element:

Game #1 (Scratch 3): GoK (Medium) vs. Element (3+8), GoK wins in 58 moves, 96% vs. 89% accuracy: https://lichess.org/study/v3EKTlR2/lxH2g7Gh
Game #2 (TurboWarp): GoK (Medium) vs. Element (5+8), GoK wins in 37 moves, 98% vs. 88% accuracy: https://lichess.org/study/oWyPldeN/F482kGTb

Last edited by ArnoHu (March 27, 2024 01:59:18)

birdracerthree
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

I constantly run into the missing-a1-rook-on-startup problem with Element. It is a race condition during startup (when-flag-clicked). As a quick fix I added a wait(1) there in the Element v1.4862 sprite, which is ugly but works. Maybe you want to consider introducing structured startup handling through one central controller that broadcasts the startup messages in a fixed order?

Two good games by both engines, GoK and Element:

Game #1 (Scratch 3): GoK (Medium) vs. Element (3+8), GoK wins in 58 moves, 96% vs. 89% accuracy: https://lichess.org/study/v3EKTlR2/lxH2g7Gh
Game #2 (TurboWarp): GoK (Medium) vs. Element (5+8), GoK wins in 37 moves, 98% vs. 88% accuracy: https://lichess.org/study/oWyPldeN/F482kGTb
I’ll fix it tomorrow. I should be able to generate the evaluation board first on startup; that will be enough to stop the race condition.
As promised, I have fixed the issue.

Why Element 5+8 over 6+8?

I have put 3 Element vs WD games into the Element vs Engines study (S3 runtime). Result is 1.5-1.5, a lot better than last time. 6.93 helped WD a lot.

Last edited by birdracerthree (March 27, 2024 13:15:57)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

birdracerthree wrote:

ArnoHu wrote:

I constantly run into the missing-a1-rook-on-startup problem with Element. It is a race condition during startup (when-flag-clicked). As a quick fix I added a wait(1) there in the Element v1.4862 sprite, which is ugly but works. Maybe you want to consider introducing structured startup handling through one central controller that broadcasts the startup messages in a fixed order?

Two good games by both engines, GoK and Element:

Game #1 (Scratch 3): GoK (Medium) vs. Element (3+8), GoK wins in 58 moves, 96% vs. 89% accuracy: https://lichess.org/study/v3EKTlR2/lxH2g7Gh
Game #2 (TurboWarp): GoK (Medium) vs. Element (5+8), GoK wins in 37 moves, 98% vs. 88% accuracy: https://lichess.org/study/oWyPldeN/F482kGTb
I’ll fix it tomorrow. I should be able to generate the evaluation board first on startup; that will be enough to stop the race condition.

Why Element 5+8 over 6+8?

Because GoK was also on Medium, hence similar think time. 6+8 frequently runs into browser timeouts (although it recovers). I will let it play against GoK Difficult next.

Here it is:

Game #3 (TurboWarp): GoK (Difficult) vs. Element (6+8), GoK wins in 44 moves, 95% vs. 87% accuracy: https://lichess.org/study/oWyPldeN/gDiUUu4Z

I paid closer attention: Element is faster now than it used to be.

Last edited by ArnoHu (March 27, 2024 02:57:10)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

Hi all,

I was finally able to address the issue with dynamic evaluations, and with the way they bubble up the node evaluation tree and are stored in the transposition table. Dynamic evaluations (AKA eval fixups) mainly compensate for limited search depth, for example in lines where bishops and knights are thrown against pawn shelters (maybe capturing pawn + rook), with short-term positional gains but long-term material loss. Any evaluation containing such dynamic components cannot be reused in the next search run (e.g. because the capture sequence might have started already). The main issue was with standing pat, which by definition is often dynamic.
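
As a rough illustration of the idea (not GoK's actual implementation), the sketch below tags each evaluation with a `dynamic` flag, lets the flag bubble up through the best child, and refuses to store flagged results in the transposition table. All names, the toy tree, and the rule that the flag follows the best child are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Eval:
    score: int
    dynamic: bool = False  # True if a dynamic fixup contributed to the score

def negate(ev):
    # Switching sides flips the score but keeps the dynamic flag.
    return Eval(-ev.score, ev.dynamic)

class TranspositionTable:
    def __init__(self):
        self.entries = {}
        self.hits = 0

    def probe(self, key, depth):
        entry = self.entries.get(key)
        if entry is not None and entry[0] >= depth:
            self.hits += 1
            return entry[1]
        return None

    def store(self, key, depth, ev):
        if ev.dynamic:
            return  # assumed policy: dynamic results are never reused
        self.entries[key] = (depth, ev)

def negamax(tree, tt, key, depth):
    cached = tt.probe(key, depth)
    if cached is not None:
        return cached
    static_eval, children = tree[key]
    if depth == 0 or not children:
        result = static_eval  # standing pat on the static evaluation
    else:
        # The best child's flag bubbles up together with its score.
        result = max(
            (negate(negamax(tree, tt, child, depth - 1)) for child in children),
            key=lambda ev: ev.score,
        )
    tt.store(key, depth, result)
    return result

# Toy tree: key -> (static eval for the side to move, child keys).
tree = {
    "root": (Eval(0), ["a", "b"]),
    "a": (Eval(-30), []),
    "b": (Eval(-80, dynamic=True), []),
}
tt = TranspositionTable()
print(negamax(tree, tt, "root", 1))  # Eval(score=80, dynamic=True)
print(sorted(tt.entries))            # ['a']
```

Only the static leaf ends up in the table, so a later search run can reuse it, while the dynamic line is searched again.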

I have now fixed that in the GoK Dev version at https://scratch.mit.edu/projects/828094886. The change was not trivial, and it is still undergoing testing, so I would appreciate data on any test games you might be running in the meantime, as well as any mistakes you encounter.

The gain so far seems to be a ~30% speedup during midgame on TurboWarp (mainly when there are enough capture sequences at some point during search), which translates to 0.5 plies of search depth gained on average, thanks to improved transposition table node eval cache hits.
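
The relation between speedup and depth can be worked out under one common assumption: search time grows roughly like b**depth for an effective branching factor b. That assumption and the function names below are not from the post; the sketch only shows that a 1.3x speedup and 0.5 plies are consistent with b of about 1.7.

```python
import math

def depth_gain(speedup, branching_factor):
    """Extra plies reachable in the same time if time grows as b**depth."""
    return math.log(speedup) / math.log(branching_factor)

def implied_branching_factor(speedup, plies_gained):
    """Branching factor b for which `speedup` buys exactly `plies_gained`."""
    return speedup ** (1.0 / plies_gained)

# A 1.3x speedup that buys half a ply implies b = 1.3 squared.
print(round(implied_branching_factor(1.3, 0.5), 2))  # 1.69
# For comparison: the same speedup with b = 2 buys fewer plies.
print(round(depth_gain(1.3, 2.0), 2))                # 0.38
```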

Thank you!
HasiLover
Scratcher
100+ posts

Scratch Chess Engine - Game of Kings

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

New GoK version beats Scurious 2.1 in 50 moves without a queen: https://lichess.org/SSI22lCx#99

Last edited by HasiLover (March 27, 2024 10:02:21)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

HasiLover wrote:

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

Oh no! I am very interested in PGN data. Scratch 3 I suppose? Scurious on 4 or 5 plies?

Material-wise RBP > Q. Stockfish likely sees something far beyond their search horizon.
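
The material comparison can be checked with a few lines of arithmetic. The two value sets below are assumptions for illustration (classic textbook points and a commonly used centipawn set), not the piece values of GoK or Scurious.

```python
# Classic textbook values in pawns.
CLASSIC = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0, "Q": 9.0}
# Centipawn values often used in simple engines (illustrative only).
ENGINE_STYLE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900}

def material(pieces, values):
    """Sum the values of a string of piece letters, e.g. 'RBP'."""
    return sum(values[piece] for piece in pieces)

for name, values in (("classic", CLASSIC), ("engine-style", ENGINE_STYLE)):
    print(name, material("RBP", values), "vs", material("Q", values))
# classic 9.0 vs 9.0
# engine-style 930 vs 900
```

With the classic values the trade is level; once the bishop is valued slightly above 3 pawns, rook + bishop + pawn comes out ahead of the queen.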

Last edited by ArnoHu (March 27, 2024 10:00:31)

HasiLover
Scratcher
100+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

HasiLover wrote:

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

Oh no! I am very interested in PGN data. Scratch 3 I suppose? Scurious on 4 or 5 plies?

Material-wise RBP > Q. Stockfish likely sees something far beyond their search horizon.
I ran it on TW, but with my PC that's like the same thing as S3.
ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

HasiLover wrote:

ArnoHu wrote:

HasiLover wrote:

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

Oh no! I am very interested in PGN data. Scratch 3 I suppose? Scurious on 4 or 5 plies?

Material-wise RBP > Q. Stockfish likely sees something far beyond their search horizon.
I ran it on TW, but with my PC that's like the same thing as S3.

Do you have PGN export?
HasiLover
Scratcher
100+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

HasiLover wrote:

ArnoHu wrote:

HasiLover wrote:

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

Oh no! I am very interested in PGN data. Scratch 3 I suppose? Scurious on 4 or 5 plies?

Material-wise RBP > Q. Stockfish likely sees something far beyond their search horizon.
I ran it on TW, but with my PC that's like the same thing as S3.

Do you have PGN export?
What? The link is in the old message: https://lichess.org/SSI22lCx#99
birdracerthree
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

birdracerthree wrote:

ArnoHu wrote:

I constantly run into the missing-a1-rook-on-startup problem with Element. It is a race condition during startup (when-flag-clicked). As a quick fix I added a wait(1) there in the Element v1.4862 sprite, which is ugly but works. Maybe you want to consider introducing structured startup handling through one central controller that broadcasts the startup messages in a fixed order?

Two good games by both engines, GoK and Element:

Game #1 (Scratch 3): GoK (Medium) vs. Element (3+8), GoK wins in 58 moves, 96% vs. 89% accuracy: https://lichess.org/study/v3EKTlR2/lxH2g7Gh
Game #2 (TurboWarp): GoK (Medium) vs. Element (5+8), GoK wins in 37 moves, 98% vs. 88% accuracy: https://lichess.org/study/oWyPldeN/F482kGTb
I’ll fix it tomorrow. I should be able to generate the evaluation board first on startup; that will be enough to stop the race condition.

Why Element 5+8 over 6+8?

Because GoK was also on Medium, hence similar think time. 6+8 frequently runs into browser timeouts (although it recovers). I will let it play against GoK Difficult next.

Here it is:

Game #3 (TurboWarp): GoK (Difficult) vs. Element (6+8), GoK wins in 44 moves, 95% vs. 87% accuracy: https://lichess.org/study/oWyPldeN/gDiUUu4Z

I paid closer attention: Element is faster now than it used to be.
That’s strange… I checked my notes, and it looks like the last speed upgrade to Element was v1.48 Full Release (that was months ago). I’ll have to look into better King’s Gambit lines in the meantime.
ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

HasiLover wrote:

ArnoHu wrote:

HasiLover wrote:

ArnoHu wrote:

HasiLover wrote:

Scurious 2.1 may just beat GoK Difficult; it's up a queen against rook, bishop, and pawn. Stockfish says it's -5 for GoK.
Sadly, it blundered a pawn, and GoK will have 3 strong passed pawns 3 squares away from promotion.

Oh no! I am very interested in PGN data. Scratch 3 I suppose? Scurious on 4 or 5 plies?

Material-wise RBP > Q. Stockfish likely sees something far beyond their search horizon.
I ran it on TW, but with my PC that's like the same thing as S3.

Do you have PGN export?
What? The link is in the old message: https://lichess.org/SSI22lCx#99

Thanks, I didn't re-read.

GoK's blunder at move 9 would have taken another 19 plies to lead to any material loss, far beyond the search horizon of any Scratch chess engine. The worst deficit was 3.2; Element and WD have held similar and larger advantages during midgame.

Last edited by ArnoHu (March 27, 2024 21:21:55)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

Hi all,

I was finally able to address the issue with dynamic evaluations, and with the way they bubble up the node evaluation tree and are stored in the transposition table. Dynamic evaluations (AKA eval fixups) mainly compensate for limited search depth, for example in lines where bishops and knights are thrown against pawn shelters (maybe capturing pawn + rook), with short-term positional gains but long-term material loss. Any evaluation containing such dynamic components cannot be reused in the next search run (e.g. because the capture sequence might have started already). The main issue was with standing pat, which by definition is often dynamic.

I have now fixed that in the GoK Dev version at https://scratch.mit.edu/projects/828094886. The change was not trivial, and it is still undergoing testing, so I would appreciate data on any test games you might be running in the meantime, as well as any mistakes you encounter.

The gain so far seems to be a ~30% speedup during midgame on TurboWarp (mainly when there are enough capture sequences at some point during search), which translates to 0.5 plies of search depth gained on average, thanks to improved transposition table node eval cache hits.

Thank you!

Game #1: GoK 6.405 (TW, Medium) wins against GoK 6.404 in 31 (!) moves, 96% vs. 86% accuracy: https://lichess.org/UPPIDAq5#62
Game #2: GoK 6.405 (S3, Medium) wins against GoK 6.404 in 62 moves, 93% vs. 89% accuracy: https://lichess.org/N9hmF029#123

Last edited by ArnoHu (March 27, 2024 20:52:28)

S_P_A_R_T
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!
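
For readers unfamiliar with the two techniques, here is a minimal sketch of late move reductions in a fail-soft negamax, plus the fail-soft versus fail-hard return value for a null-move cutoff. It is not White Dove's code: the toy `Node` tree, `MIN_LMR_DEPTH`, and the one-ply reduction are assumptions; only the "first 3 moves at full depth" rule comes from the post.

```python
import random

FULL_DEPTH_MOVES = 3   # first 3 moves always get full depth (from the post)
MIN_LMR_DEPTH = 3      # assumption: no reductions close to the leaves
INF = 10**9

class Node:
    """Hypothetical toy tree node; children are assumed ordered best-first."""
    def __init__(self, value, children=()):
        self.value = value              # static eval for the side to move
        self.children = list(children)

def search(node, depth, alpha, beta):
    """Fail-soft negamax with late move reductions (LMR)."""
    if depth <= 0 or not node.children:
        return node.value
    best = -INF
    for index, child in enumerate(node.children):
        if index >= FULL_DEPTH_MOVES and depth >= MIN_LMR_DEPTH:
            # Late move: one ply less, null window just above alpha.
            score = -search(child, depth - 2, -alpha - 1, -alpha)
            if score > alpha:
                # Better than expected: re-search at full depth and window.
                score = -search(child, depth - 1, -beta, -alpha)
        else:
            score = -search(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best  # fail-soft: may lie outside the original window

def null_move_result(null_score, beta, fail_soft=True):
    """Value returned when a null-move search fails high (null_score >= beta)."""
    return null_score if fail_soft else beta

def negamax(node, depth):
    """Plain full-width negamax, used as a reference."""
    if depth <= 0 or not node.children:
        return node.value
    return max(-negamax(child, depth - 1) for child in node.children)

def random_tree(rng, depth, width):
    children = [random_tree(rng, depth - 1, width) for _ in range(width)] if depth > 0 else []
    return Node(rng.randint(-100, 100), children)

tree = random_tree(random.Random(1), depth=4, width=5)
# Below MIN_LMR_DEPTH nothing is reduced, so the result matches plain negamax.
print(search(tree, 2, -INF, INF) == negamax(tree, 2))
# At depth 4 late moves are reduced, so the two results may differ.
print(search(tree, 4, -INF, INF), negamax(tree, 4))
```

The reduced searches are where poor move ordering hurts: a strong move that is sorted late is only seen at lower depth unless it beats alpha there and triggers the re-search.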

Check out Space Program Simulator!

In it, you can build your own rockets from a variety of parts!
Then fly it with realistic orbital mechanics.

Go to orbit, explore different planets, share your save codes, and do so much more!

If you would like to help out on the project or chat about space or really anything else, check out the official SPS Studio!

For more information & tutorials, check out the official forum post!

birdracerthree
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

S_P_A_R_T wrote:

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!
You don't think that the LMR is too aggressive with WD's (relatively) poor move ordering?

Flashback : 1r1k2r1/p2b2b1/Q1np3p/1ppN1p1B/2P1P2P/1P4P1/Pq1N1P2/3KRR2 b - - 2 35

Last edited by birdracerthree (March 27, 2024 22:11:52)

S_P_A_R_T
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

birdracerthree wrote:

S_P_A_R_T wrote:

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!
You don't think that the LMR is too aggressive with WD's (relatively) poor move ordering?

Flashback : 1r1k2r1/p2b2b1/Q1np3p/1ppN1p1B/2P1P2P/1P4P1/Pq1N1P2/3KRR2 b - - 2 35

WD never really was able to solve this position, so I'm not super concerned. I also think that this more aggressive LMR (hopefully) shouldn't have that much of an impact on tactical positions, but I guess only time will tell…

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

S_P_A_R_T wrote:

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!

Congrats, zero-mistake game by WD (black) until move 51: https://lichess.org/study/oWyPldeN/adThgqCX . GoK won it at the end at 96% vs. 92% accuracy.
S_P_A_R_T
Scratcher
500+ posts

Scratch Chess Engine - Game of Kings

ArnoHu wrote:

S_P_A_R_T wrote:

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!

Congrats, zero-mistake game by WD (black) until move 51: https://lichess.org/study/oWyPldeN/adThgqCX . GoK won it at the end at 96% vs. 92% accuracy.

Cool game!

(Also, could you turn on analysis & export on this study too? Thx!)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine - Game of Kings

S_P_A_R_T wrote:

ArnoHu wrote:

S_P_A_R_T wrote:

White Dove v7.0 has been released!

This version tuned LMR (now it's only forced to search the first 3 moves at full depth), and also fixed the NMP by changing it to fail-soft, instead of fail-hard, because WD itself is fail-soft.

This should hopefully make WD stronger on both S3 and TW!

Congrats, zero-mistake game by WD (black) until move 51: https://lichess.org/study/oWyPldeN/adThgqCX . GoK won it at the end at 96% vs. 92% accuracy.

Cool game!

(Also, could you turn on analysis & export on this study too? Thx!)

Done!
