HasiLover_Test
Scratcher
100+ posts

Scratch Chess Engine - Game of Kings

GoK logfile: already sent.

Last edited by HasiLover_Test (March 15, 2024 15:22:41)


I am deeply regretting naming my chess engine SCURIOUS. The name is so bad, and I am forced to stare at it every time I try to make progress. What was I thinking a few months ago?
ArnoHu
Scratcher
1000+ posts

Thanks for the logfile. Wow, it finished ply 9 in 22 seconds. That takes 3.5 seconds on my system. Are you sure you don't have other stuff running? Another parallel user session, maybe? High memory or CPU usage by other processes? Or is the engine's browser window overlapped during execution? It must be in the foreground, with nothing else running: no other windows, no other tabs. Is the power cable plugged in? Which browser are you using?

About the blunder: the move is not considered that bad at ply 9, and internal caching might lead to differences of +/- 10 centipawns, which is why I cannot reproduce it even at ply 9.

Last edited by ArnoHu (March 15, 2024 15:52:37)

HasiLover_Test
Scratcher
100+ posts

I still use an old laptop. It sometimes even takes a few minutes to open a browser tab. My Wi-Fi also isn't the best.

ArnoHu
Scratcher
1000+ posts

Well that's OK, engines should also work fine on a slow system, but doesn't it take a long time for projects that have hardcoded depth instead of think time management?
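
To illustrate the difference between a hardcoded depth and think time management, here is a minimal Python sketch of iterative deepening under a time budget. It is not the code of GoK, Scurious or any other engine in this thread: `iterative_deepening`, `SimulatedMachine`, the `growth` factor of 3 and the per-ply costs are all assumptions invented for the illustration.

```python
import time
from typing import Callable, Optional, Tuple


def iterative_deepening(
    search: Callable[[int], str],
    budget_s: float,
    growth: float = 3.0,
    max_depth: int = 64,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Tuple[int, str]]:
    """Deepen one ply at a time until the next ply is unlikely to fit the budget.

    `growth` is an assumed cost ratio between consecutive plies.
    Returns (depth reached, best move at that depth), or None if max_depth < 1.
    """
    start = clock()
    best = None
    last_cost = 0.0
    for depth in range(1, max_depth + 1):
        elapsed = clock() - start
        # Skip the next ply if its estimated cost would exceed the budget.
        if best is not None and elapsed + last_cost * growth > budget_s:
            break
        t0 = clock()
        move = search(depth)
        last_cost = clock() - t0
        best = (depth, move)
    return best


class SimulatedMachine:
    """Hypothetical stand-in for a real search: ply d costs base * 3**(d-1) seconds."""

    def __init__(self, base_cost_s: float) -> None:
        self.base_cost_s = base_cost_s
        self.now = 0.0  # simulated clock, so the demo runs instantly

    def clock(self) -> float:
        return self.now

    def search(self, depth: int) -> str:
        self.now += self.base_cost_s * 3 ** (depth - 1)
        return f"best-move-at-depth-{depth}"


fast = SimulatedMachine(base_cost_s=0.1)
slow = SimulatedMachine(base_cost_s=0.5)
fast_result = iterative_deepening(fast.search, budget_s=5.0, clock=fast.clock)
slow_result = iterative_deepening(slow.search, budget_s=5.0, clock=slow.clock)
print(fast_result, slow_result)
```

With the same 5-second budget, the simulated fast machine completes depth 4 and the slow one stops after depth 2. A hardcoded depth would instead reach the same depth on both machines and simply take longer on the slow one.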

Nice tournament / games BTW!

And: How is your engine-ELO-rating project going?

Last edited by ArnoHu (March 17, 2024 14:05:58)

HasiLover_Test
Scratcher
100+ posts

I will need more games; that's why I'm going to be hosting more tournaments.

birdracerthree
Scratcher
1000+ posts

ArnoHu wrote:

HasiLover_Test wrote:

Gok Logfile: Already sent

Thanks for logfile. Wow, it finished ply 9 in 22 seconds. That takes 3.5 seconds on my system. Are you sure you don't have other stuff running? Another parallel user session maybe? High memory / CPU usage by other processes? Or is the engine's browser window overlapped during execution? Must be in foreground, without anything else running, no other windows, no other tabs. Power cable plugged in? Which browser are you using?

About the blunder, the move is not considered that bad at ply 9, and internal caching might lead to differences of +/- 10 centipawns, that is why I cannot reproduce even on ply 9.
22 seconds is really bad. The first time I ran GoK, it did consider Qb5 midway through ply 9, but it switched to Kf8 on ply 10 in the final few seconds (23.932).

The second time, I played music on YT and re-ran the position on a new GoK instance. This time, it switched to Rg7 during ply 9 instead of Qb5 and it stayed that way. GoK reached ply 11 “21.891: 11 : Search start, depth = 11”. The second instance started ply 9 faster than the first (3.681 instead of 3.472).

Quick note: I opened a new tab and went to GoK's URL immediately to minimize the memory footprint from the saved history.

Last edited by birdracerthree (March 15, 2024 22:25:17)


Hello, I am @birdracerthree, the creator of the fourth strongest chess engine on Scratch, Element.
Element’s approximate rating is 1800 FIDE or 2050 on chess.com.
ArnoHu
Scratcher
1000+ posts

Thanks. I also get different results, which is explainable within small bounds given GoK's incremental / dynamic evaluation approach and caching. One of the fastest runs on my system was this:

2.057: 9 : Search start, depth = 9
2.306: 9 : 2734 : 121
5.250: 9 : 0715 : 127
5.536: 10 : Search start, depth = 10
10.087: 10 : 0715 : 142
11.440: 11 : Search start, depth = 11
14.829: 11 : 0715 : 136
18.826: 12 : Search start, depth = 12
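
A log like the one above can be turned into per-ply timings with a few lines of Python. This is only a sketch: it assumes every line has the form `seconds: ply : ...`, and that the last two fields of a result line are a move code and a score in centipawns, which is a guess based on the output rather than a documented format. The names `ply_durations` and `last_result` are made up for the example.

```python
import re
from typing import Dict, Tuple

LOG = """\
2.057: 9 : Search start, depth = 9
2.306: 9 : 2734 : 121
5.250: 9 : 0715 : 127
5.536: 10 : Search start, depth = 10
10.087: 10 : 0715 : 142
11.440: 11 : Search start, depth = 11
14.829: 11 : 0715 : 136
18.826: 12 : Search start, depth = 12
"""

# Assumed line shapes (not an official log specification).
START = re.compile(r"^(\d+\.\d+): (\d+) : Search start, depth = \d+$")
RESULT = re.compile(r"^(\d+\.\d+): (\d+) : (\S+) : (-?\d+)$")


def ply_durations(log: str) -> Dict[int, float]:
    """Seconds from the start of each ply to the start of the next one."""
    starts = []
    for line in log.splitlines():
        match = START.match(line.strip())
        if match:
            starts.append((int(match.group(2)), float(match.group(1))))
    return {
        ply: round(next_t - t, 3)
        for (ply, t), (_, next_t) in zip(starts, starts[1:])
    }


def last_result(log: str) -> Dict[int, Tuple[str, int]]:
    """Last (move code, score) reported for each ply."""
    results = {}
    for line in log.splitlines():
        match = RESULT.match(line.strip())
        if match:
            results[int(match.group(2))] = (match.group(3), int(match.group(4)))
    return results


durations = ply_durations(LOG)
results = last_result(LOG)
print(durations)
print(results)
```

For this run, the sketch reports 3.479 s for ply 9, 5.904 s for ply 10 and 7.386 s for ply 11; ply 12 has no duration because the log ends before the next ply starts.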
ArnoHu
Scratcher
1000+ posts

Looking into NPS (nodes per second) rates, I discovered something strange when comparing Scurious and GoK. While GoK turned out to run pretty stably at 200k NPS during all game stages, for the same boards Scurious was between 200k and 500k for opening boards, and at 100k during midgame and endgame. Checking 1-second timeframes, GoK was between 150k and 300k.

One example is r3kb1r/pp1n1ppp/q1p1p3/3pPn2/3P2P1/1QN2N2/PPPB1P1P/2KR3R b kq - 0 11, which GoK handles at 255k NPS, Scurious at 80k. Maybe the comparison is difficult to make, given Scurious only searches 5 plies.
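
For anyone who wants to measure this in the same way, here is a small Python sketch of how an average NPS and per-timeframe NPS can be derived from a running node counter. The sample values are invented purely for illustration (they are not measurements from GoK or Scurious), and the function names are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (seconds since search start, cumulative node count)


def nps_per_interval(samples: List[Sample]) -> List[float]:
    """Nodes per second between each pair of consecutive samples."""
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        rates.append((n1 - n0) / (t1 - t0))
    return rates


def overall_nps(samples: List[Sample]) -> float:
    """Average nodes per second over the whole sampled period."""
    (t_first, n_first), (t_last, n_last) = samples[0], samples[-1]
    return (n_last - n_first) / (t_last - t_first)


# Invented samples, taken once per second.
samples = [(0.0, 0), (1.0, 150_000), (2.0, 450_000), (3.0, 650_000)]
rates = nps_per_interval(samples)
average = overall_nps(samples)
print(rates, round(average))
```

An engine can average a little over 200k NPS while individual one-second windows swing between 150k and 300k, which is why the overall figure and the per-timeframe figures are worth reporting separately.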

Last edited by ArnoHu (March 16, 2024 04:10:58)

ArnoHu
Scratcher
1000+ posts

Scratch Chess Engine ELO Ratings

As I was curious about that subject, I started to play around with BayesElo. I fed it 81 games, mostly from birdracerthree's excellent lichess.org study, plus games from this forum since the beginning of the year. The games feature Element, White Dove and GoK running on TurboWarp at their highest level, against an opponent also at its highest level. This was the result:

Rank Name         Elo    +    - games score oppo. draws
   1 GoK          189  100   78    34   85%   -93    6%
   2 White Dove   -75   51   52    65   45%   -30   18%
   3 Element     -113   52   54    63   37%    -8   16%

If I apply the average rating from ScratchChessChampion's Rating project (2220) as baseline, we get:

Rank Name         Elo    +    - games score oppo. draws
   1 GoK         2409  100   78    34   85%   -93    6%
   2 White Dove  2145   51   52    65   45%   -30   18%
   3 Element     2107   52   54    63   37%    -8   16%

If we talk about FIDE ratings, I consider a baseline of 2000 more realistic:

Rank Name         Elo    +    - games score oppo. draws
   1 GoK         2189  100   78    34   85%   -93    6%
   2 White Dove  1925   51   52    65   45%   -30   18%
   3 Element     1887   52   54    63   37%    -8   16%
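
The baseline shift used in the tables above is plain addition, and it leaves the rating differences untouched. The Python sketch below reproduces the numbers and checks that a predicted result does not depend on the chosen baseline. `expected_score` uses the standard logistic Elo curve, which is an assumption for illustration and not necessarily the exact model BayesElo applies internally (BayesElo also models draws).

```python
from typing import Dict

# Relative ratings from the first table.
relative: Dict[str, int] = {"GoK": 189, "White Dove": -75, "Element": -113}


def apply_baseline(ratings: Dict[str, int], baseline: int) -> Dict[str, int]:
    """Shift every rating by the same offset; differences stay the same."""
    return {name: elo + baseline for name, elo in ratings.items()}


def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400))


scc = apply_baseline(relative, 2220)   # ScratchChessChampion average as baseline
fide = apply_baseline(relative, 2000)  # more conservative FIDE-style baseline

print(scc)
print(fide)
print(round(expected_score(scc["GoK"], scc["White Dove"]), 2))
```

Only the differences between engines carry information from the games; the absolute level is set entirely by whichever baseline is chosen.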

Disclaimer: I had to run some semi-automated search/replace to unify the engine names, derive from the forum post text which side was Black / White, manually enter draw results, then concatenate everything into one file, and so on, which makes the process error-prone. I will try to publish it as a study later, and would be glad if someone could verify it.

I plan to continue creating ratings from an updated study in the future. Please feel free to submit additional games. Preconditions: average or above-average hardware, TurboWarp, the engines at their highest level (unless the lower level wins anyway), less than 1 minute average think time, and the same engine names as in the main study. No pre-selection of games: either submit all or none. I will also add other engines.

Update: I uploaded to lichess.org and learned that there is a limit of 64 chapters per study, so here we are with two studies:

I added some Scurious, Thundershark and Bonsai games, but for a solid baseline we would need several more games between those three and White Dove and Element. birdracerthree's study has some, but there Element is at a low search depth (it still won, though).

Last edited by ArnoHu (March 16, 2024 13:15:18)

ArnoHu
Scratcher
1000+ posts

Updated rankings after adding Scurious, Thundershark, Bonsai:

Rank Name           Elo    +    - games score oppo. draws
   1 GoK           2201  119   93    43   88%    30    5%
   2 White Dove    1903   63   65    68   41%   129   18%
   3 Element       1872   62   64    69   38%   123   16%
   4 Bonsai        1740  171  163     6   58%  -136   50%
   5 Thundershark  1709  193  218     6   25%    44   17%
   6 Scurious      1675  143  157     8   31%   -85   38%

With so few games played by Bonsai, Thundershark and Scurious, the numbers don't mean a lot yet. I expect the gap between them and White Dove and Element to be larger.
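
One rough way to see how little the lower half of the table says so far is to compare the error margins. The Python sketch below treats the `+` and `-` columns as upper and lower margins around each Elo value (an assumption about how to read the BayesElo output) and lists which pairs of engines have intervals that do not overlap. Interval overlap is only a crude check, not a proper significance test.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# (Elo, plus margin, minus margin), copied from the table above.
ratings: Dict[str, Tuple[int, int, int]] = {
    "GoK": (2201, 119, 93),
    "White Dove": (1903, 63, 65),
    "Element": (1872, 62, 64),
    "Bonsai": (1740, 171, 163),
    "Thundershark": (1709, 193, 218),
    "Scurious": (1675, 143, 157),
}


def interval(name: str) -> Tuple[int, int]:
    """Lower and upper bound of an engine's rating interval."""
    elo, plus, minus = ratings[name]
    return (elo - minus, elo + plus)


def overlaps(a: str, b: str) -> bool:
    """True if the rating intervals of engines a and b intersect."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


separated: List[Tuple[str, str]] = [
    (a, b) for a, b in combinations(ratings, 2) if not overlaps(a, b)
]
for pair in separated:
    print(pair)
```

With these numbers, GoK is separated from all five other engines, and beyond that only White Dove and Scurious are separated. The intervals of Bonsai, Thundershark and Scurious all overlap one another, so their order could easily change with more games.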

Last edited by ArnoHu (March 16, 2024 13:15:31)

HasiLover_Test
Scratcher
100+ posts

I have released a version of Scurious 1.2 without iterative deepening. Could someone give me feedback on whether it's faster?

ArnoHu
Scratcher
1000+ posts

I just ran 1.2 against Thundershark. I can't say whether it is faster than before (I usually let the old and new versions play the same boards and compare), but it is certainly fast enough for me. I thought Scurious would improve its rating against Thundershark because it had more difficult opponents so far in the study. It was clearly ahead, but it allowed a KQ fork to happen, did not see a pinned queen it could have taken right after that, did not care about trapped rooks, and decided to get its king out of shelter prematurely, which resulted in a loss: https://lichess.org/study/oWyPldeN/tphwThXS

Rank Name           Elo    +    - games score oppo. draws
   1 GoK           2206  118   92    45   89%    29    4%
   2 White Dove    1904   64   65    68   41%   131   18%
   3 Element       1873   62   64    70   38%   128   16%
   4 Thundershark  1735  184  193     7   36%     3   14%
   5 Bonsai        1732  169  165     7   50%   -76   43%
   6 Scurious      1653  138  155     9   28%   -86   33%

Last edited by ArnoHu (March 16, 2024 14:39:13)

HasiLover_Test
Scratcher
100+ posts

The version without iterative deepening is not the main DEV project. You just tested the normal version.
Also, there is no way Thundershark is that good.

Last edited by HasiLover_Test (March 16, 2024 15:21:29)


ArnoHu
Scratcher
1000+ posts

True, but you saw the last game, and as mentioned, we need more games.
HasiLover_Test
Scratcher
100+ posts

Scurious isn't normally made to play as White, as that messes up its piece-square tables.
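
A common way to let one set of piece-square tables serve both colours is to mirror the square index vertically for the other side. The Python sketch below shows the idea. It is not Scurious's implementation: the table values, the a1 = 0 ... h8 = 63 indexing and the function names are assumptions for the example, and Scratch lists are 1-based, so the index arithmetic would need adjusting there.

```python
from typing import List

# Hypothetical pawn table from White's point of view, indexed a1=0 ... h8=63.
# The bonus simply grows by 10 per rank advanced.
PAWN_PST: List[int] = [rank * 10 for rank in range(8) for _ in range(8)]


def square_index(name: str) -> int:
    """Convert a square name such as 'e2' to an index with a1=0 and h8=63."""
    file = ord(name[0]) - ord("a")
    rank = int(name[1]) - 1
    return rank * 8 + file


def pst_value(table: List[int], square: int, is_white: bool) -> int:
    """Look up a piece-square value, mirroring the rank for Black.

    XOR with 56 flips the rank (1<->8, 2<->7, ...) and keeps the file.
    """
    index = square if is_white else square ^ 56
    return table[index]


white_on_e7 = pst_value(PAWN_PST, square_index("e7"), is_white=True)
black_on_e2 = pst_value(PAWN_PST, square_index("e2"), is_white=False)
white_on_e2 = pst_value(PAWN_PST, square_index("e2"), is_white=True)
print(white_on_e7, black_on_e2, white_on_e2)
```

A white pawn on e7 and a black pawn on e2 are both one step from promotion, so they should get the same bonus; the mirrored lookup gives them the same value without a second table.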

ArnoHu
Scratcher
1000+ posts

OK, please let me know when this is fixed, I can then schedule a re-match.
ArnoHu
Scratcher
1000+ posts

Rank Name           Elo    +    - games score oppo. draws
   1 GoK           2235  118   91    47   89%    45    4%
   2 White Dove    1934   63   64    69   42%   154   17%
   3 Element       1901   62   64    71   39%   150   15%
   4 Bonsai        1753  160  149    10   55%   -73   30%
   5 Thundershark  1661  156  182    11   23%    40    9%
   6 Scurious      1616  139  159    10   25%   -95   30%

Calculated using the BayesElo tool, based on these lichess.org studies:

Last edited by ArnoHu (March 16, 2024 22:41:55)

ArnoHu
Scratcher
1000+ posts

ArnoHu wrote:

Scratch Chess Engine ELO Ranking
Rank Name           Elo    +    - games score oppo. draws
   1 GoK           2235  118   91    47   89%    45    4%
   2 White Dove    1934   63   64    69   42%   154   17%
   3 Element       1901   62   64    71   39%   150   15%
   4 Bonsai        1753  160  149    10   55%   -73   30%
   5 Thundershark  1661  156  182    11   23%    40    9%
   6 Scurious      1616  139  159    10   25%   -95   30%

Calculated using BayesELO tool, based on these lichess.org studies:

Just for the fun of it, I merged ScratchChessChampion's SCF 2023 and 2024 tournament studies, did some cleanup, and fed it into BayesELO, which produced the following result (ELOs are relative numbers, I did not provide an ELO baseline in this case):

SCF 2023 + 2024 Combined Tournament Ranking
Rank Name                             Elo    +    - games score oppo. draws
   1 GoK Chess (Medium)               479  202  176     9   72%   324   11%
   2 GoK Chess (Difficult)            468  163  143    19   68%   350   21%
   3 WhiteDove Chess Engine (P3)      437  212  191     7   71%   289   29%
   4 WhiteDove Chess Engine (P4)      362  176  170     9   56%   302   22%
   5 GoK Chess (Blitz 1)              310  147  143    17   59%   233    0%
   6 Element Chess Engine (Depth 4)   244  178  191     9   39%   315   11%
   7 Element Chess Engine (Depth 5)   139  160  168    13   42%   188   23%
   8 Bonsai Chess (Blue Belt)          23  158  184    11   18%   266   18%
   9 Element Chess Engine (Depth 3)   -65  275  322     2   25%    23   50%
  10 Bonsai Chess (Green Belt)       -190  153  151    11   45%   -92   18%
The data was limited, but the result is not completely off IMHO. Funny that the lower-depth engine versions are ahead, but you might remember that GoK Difficult lost one decisive game against GoK Medium. Blitz 1 searches depth 4 + quiescence on TurboWarp, which explains why, with some luck in the draw and in the games, it might show up relatively high. BTW, GoK had a severe regression at the time of SCF 2024.

Last edited by ArnoHu (March 17, 2024 13:48:45)

HasiLover_Test
Scratcher
100+ posts

Check out the Shallow Blue chess engine: https://scratch.mit.edu/projects/958201361/ Even though its creator describes it as bad, it plays very good chess, and in most games against Scurious it is borderline winning at a depth of 2 plies.

HasiLover_Test
Scratcher
100+ posts

Shallow Blue (ply 2) draws Scurious 1.3 (ply 5): https://lichess.org/V5UYtRzx#46 Shallow Blue is White.
