Scratch Chess Engine - Game of Kings

iceysnowman

waabooboo wrote:
grkw2020 wrote:
waabooboo wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

I can't reproduce the strategically stupid 15… Rf8 in the second game. From 1-30 seconds Wolverine always prefers to castle, although Rf8 is sometimes considered the second best. I guess maybe the history ordering affects Wolverine's decision there.

I may need some extra castling incentives, I've seen it do that kind of thing more than once
Maybe you just need to change the king's piece square table?

King piece square table already gives a better score for king on g8 than king on e8. Usually that alone is enough to entice Wolverine to castle, but in some cases, like this one, it isn't.

Maybe try reducing the score for the rook on f8 so it’s not as enticed to move the rook w/o castling? It’ll usually want to move to a central file later anyway.

waabooboo

iceysnowman wrote:
waabooboo wrote:
grkw2020 wrote:
waabooboo wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

I can't reproduce the strategically stupid 15… Rf8 in the second game. From 1-30 seconds Wolverine always prefers to castle, although Rf8 is sometimes considered the second best. I guess maybe the history ordering affects Wolverine's decision there.

I may need some extra castling incentives, I've seen it do that kind of thing more than once
Maybe you just need to change the king's piece square table?

King piece square table already gives a better score for king on g8 than king on e8. Usually that alone is enough to entice Wolverine to castle, but in some cases, like this one, it isn't.
Maybe try reducing the score for the rook on f8 so it’s not as enticed to move the rook w/o castling? It’ll usually want to move to a central file later anyway.

Castling replaces the scores for king on e8 and rook on h8 with scores for king on g8 and rook on f8. The rook on f8 gets scored either way.

I know how I'll fix it, I'm not going to fiddle with the piece-square tables.
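
To illustrate the point that the rook on f8 gets scored either way, here is a minimal Python sketch. The table values are made up for illustration and are not Wolverine's actual piece-square tables; only the structure of the comparison matters.

```python
# Hypothetical black-side piece-square values in centipawns (illustrative only).
KING_PST = {"e8": 0, "g8": 30}
ROOK_PST = {"h8": 0, "f8": 15}

def pst_score(placement):
    """Sum piece-square values for a {square: piece} mapping."""
    tables = {"K": KING_PST, "R": ROOK_PST}
    return sum(tables[piece][square] for square, piece in placement.items())

before = {"e8": "K", "h8": "R"}
after_castling = {"g8": "K", "f8": "R"}  # O-O: king to g8, rook to f8
after_rf8 = {"e8": "K", "f8": "R"}       # plain Rf8: king stays on e8

castle_gain = pst_score(after_castling) - pst_score(before)
rf8_gain = pst_score(after_rf8) - pst_score(before)

# The rook term appears in both gains, so it cancels in the comparison.
preference_for_castling = castle_gain - rf8_gain
```

Whatever value the rook table assigns to f8, `preference_for_castling` only depends on the king entries for g8 and e8, so lowering the rook's f8 score would not change which of the two moves the tables favor.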

ArnoHu

ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

GoK Classic vs. White Dove, 96% vs. 97%. That was a great game to watch!
https://lichess.org/study/J41kjRR3/OcOnnCZZ

GoK Classic vs. Delta, 96% vs. 91%:
https://lichess.org/study/J41kjRR3/aDABBbjK

GoK Classic vs. White Dove, 89% vs. 87% (given the clear GoK win, with both of its mistakes not seen by WD, the accuracy is a bit strange):
https://lichess.org/study/J41kjRR3/uTytLBS0

GoK Classic vs. Wolverine 2, 92% vs. 89%:
https://lichess.org/study/J41kjRR3/uhjdOorE

Could it be that Wolverine LMR is too aggressive? Search depth during endgame was 12 for GoK, 16 for Wolverine, still GoK saw promotion and checkmate several moves earlier.

Last edited by ArnoHu (Sept. 3, 2025 16:20:02)

internet44

grkw2020 wrote:
internet44 wrote:
grkw2020 wrote:
Does anyone know why Storm runs so slowly? Like, in terms of NPS? waabooboo says it's because I use string operations, but I feel like there's something else slowing it down.

do you know how to profile your engine? I only had a quick look and apparently “hash board” and your eval script take up the most time. the hash board thing makes sense: you should look into incremental hash updates, which are much faster. index to file and rank can easily become a lookup table, same for its counterpart, file and rank to index. in addition to that, waabooboo is right of course, string ops are slow. I'm looking at that ordered moves to variable block in your negamax, that's going to be pretty tough for performance. there are better (i.e. faster) ways to store moves by ply, but I have to say that's a really creative solution lol
Wow, thanks! I'll try those!

no problem. but don't expect a magical 5x speedup or something- I don't think there's one single thing that slows Storm down, but more like a lot of little things combined that together make for low nps. what I said are the first things I would start with personally, but that's not all by any means

you'll figure it out- I have no doubt about that
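
For reference, the incremental hash update and the lookup tables mentioned above could look roughly like this. It is a Python sketch rather than Scratch blocks, and the board representation (a dict from square index 0..63 to a piece letter) and all names are assumptions, not Storm's actual data layout. Side-to-move, castling and en passant keys are left out to keep it short.

```python
import random

rng = random.Random(2025)  # fixed seed so the keys are reproducible
PIECES = "PNBRQKpnbrqk"
# One random 64-bit key per (piece, square).
ZOBRIST = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}

# Precomputed lookup tables instead of computing these on every call.
FILE_OF = [sq % 8 for sq in range(64)]
RANK_OF = [sq // 8 for sq in range(64)]
SQUARE_AT = [[rank * 8 + file for file in range(8)] for rank in range(8)]

def full_hash(board):
    """Hash from scratch: XOR the key of every piece on the board."""
    h = 0
    for sq, piece in board.items():
        h ^= ZOBRIST[(piece, sq)]
    return h

def update_hash(h, piece, from_sq, to_sq, captured=None):
    """Incremental update for a plain move: XOR out old terms, XOR in new ones."""
    h ^= ZOBRIST[(piece, from_sq)]  # piece leaves its origin square
    h ^= ZOBRIST[(piece, to_sq)]    # piece arrives on the target square
    if captured is not None:
        h ^= ZOBRIST[(captured, to_sq)]  # captured piece disappears
    return h
```

The incremental version touches two or three keys per move, while `full_hash` loops over every piece, which is where the saving comes from.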

ArnoHu

ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

GoK Classic vs. White Dove, 96% vs. 97%. That was a great game to watch!
https://lichess.org/study/J41kjRR3/OcOnnCZZ

GoK Classic vs. Delta, 96% vs. 91%:
https://lichess.org/study/J41kjRR3/aDABBbjK

GoK Classic vs. White Dove, 89% vs. 87% (given the clear GoK win, with both of its mistakes not seen by WD, the accuracy is a bit strange):
https://lichess.org/study/J41kjRR3/uTytLBS0

GoK Classic vs. Wolverine 2, 92% vs. 89%:
https://lichess.org/study/J41kjRR3/uhjdOorE

Could it be that Wolverine LMR is too aggressive? Search depth during endgame was 12 for GoK, 16 for Wolverine, still GoK saw promotion and checkmate several moves earlier.

GoK Classic vs. Shallow Blue 3, 94% vs. 89%:
https://lichess.org/study/J41kjRR3/Csii0mTG

waabooboo

ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

GoK Classic vs. White Dove, 96% vs. 97%. That was a great game to watch!
https://lichess.org/study/J41kjRR3/OcOnnCZZ

GoK Classic vs. Delta, 96% vs. 91%:
https://lichess.org/study/J41kjRR3/aDABBbjK

GoK Classic vs. White Dove, 89% vs. 87% (given the clear GoK win, with both of its mistakes not seen by WD, the accuracy is a bit strange):
https://lichess.org/study/J41kjRR3/uTytLBS0

GoK Classic vs. Wolverine 2, 92% vs. 89%:
https://lichess.org/study/J41kjRR3/uhjdOorE

Could it be that Wolverine LMR is too aggressive? Search depth during endgame was 12 for GoK, 16 for Wolverine, still GoK saw promotion and checkmate several moves earlier.

I don't use check extensions, I just don't allow reductions in check – that may be part of it. But yes, I do think my LMR is too aggressive. It needs some fine-tuning.
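
A rough sketch of what "no reductions in check" plus a tunable reduction amount could look like, in Python. The logarithmic formula is a commonly used shape for LMR, but the constants, the `divisor` parameter, the thresholds and the quiet-move condition are all assumptions here, not Wolverine's actual values.

```python
import math

def lmr_reduction(depth, move_number, in_check, is_quiet, divisor=2.25):
    """Return how many plies to reduce a late move by (0 = full depth)."""
    # Never reduce while in check, and (assumption) never reduce tactical moves.
    if in_check or not is_quiet:
        return 0
    # Only reduce late moves at sufficient remaining depth (assumed thresholds).
    if depth < 3 or move_number < 4:
        return 0
    reduction = int(0.75 + math.log(depth) * math.log(move_number) / divisor)
    # Keep at least two plies of search after reducing.
    return max(0, min(reduction, depth - 2))
```

A larger `divisor` gives smaller reductions, so it is one possible knob for making LMR less aggressive when the engine misses tactics such as promotions that a shallower but unreduced search still finds.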

ArnoHu

ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
ArnoHu wrote:
GoK Classic vs. Wolverine 2, 94% vs. 90%:
https://lichess.org/study/J41kjRR3/KA4ZYsjT

GoK Classic vs. Wolverine 2, 91% vs. 85%:
https://lichess.org/study/J41kjRR3/u73LI2Ga

GoK Classic vs. White Dove, 96% vs. 97%. That was a great game to watch!
https://lichess.org/study/J41kjRR3/OcOnnCZZ

GoK Classic vs. Delta, 96% vs. 91%:
https://lichess.org/study/J41kjRR3/aDABBbjK

GoK Classic vs. White Dove, 89% vs. 87% (given the clear GoK win, with both of its mistakes not seen by WD, the accuracy is a bit strange):
https://lichess.org/study/J41kjRR3/uTytLBS0

GoK Classic vs. Wolverine 2, 92% vs. 89%:
https://lichess.org/study/J41kjRR3/uhjdOorE

Could it be that Wolverine LMR is too aggressive? Search depth during endgame was 12 for GoK, 16 for Wolverine, still GoK saw promotion and checkmate several moves earlier.

GoK Classic vs. Shallow Blue 3, 94% vs. 89%:
https://lichess.org/study/J41kjRR3/Csii0mTG

GoK Classic vs. Black Crow, 96% vs. 96%:
https://lichess.org/study/J41kjRR3/8oarBjzd

GoK NNUE vs. Black Crow, 93% vs. 89%:
https://lichess.org/study/J41kjRR3/KnL7D401

GoK NNUE vs. Black Crow, 98% vs. 92%:
https://lichess.org/study/J41kjRR3/Gz51awDM

Last edited by ArnoHu (Sept. 4, 2025 05:22:40)

grkw2020

I heard somewhere that it's not worth adding positions from the quiescence search to the TT. Is that true?

ArnoHu

grkw2020 wrote:
I heard somewhere that it's not worth adding positions from the quiescence search to the TT. Is that true?

No

ArnoHu

GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

waabooboo

ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame

Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

ArnoHu

waabooboo wrote:
ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame. Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

Yes, additive. GoK Classic evaluates passed, isolated, doubled, connected, blocked pawns and phalanxes, pawn storms, pawn shelters, applying tapered evaluation. And of course everything related to threats (safe pawn pushes, pawns as defenders, etc). And it stores pawn-eval to TT using a pawns+kings Zobrist hash, which can then be applied to a lot of boards where only major + minor positions have changed.
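
The pawns+kings hash idea described above can be sketched like this in Python. It is only an illustration under assumptions: the dict-based board, the cache as a plain dictionary, and the placeholder pawn evaluation are hypothetical and not GoK's implementation.

```python
import random

rng = random.Random(7)  # fixed seed so the keys are reproducible
PAWN_KING_PIECES = "PpKk"
KEYS = {(p, sq): rng.getrandbits(64) for p in PAWN_KING_PIECES for sq in range(64)}

def pawn_king_key(board):
    """XOR the keys of pawns and kings only; other pieces do not affect the key."""
    key = 0
    for sq, piece in board.items():
        if piece in PAWN_KING_PIECES:
            key ^= KEYS[(piece, sq)]
    return key

pawn_cache = {}
cache_stats = {"hits": 0, "misses": 0}

def evaluate_pawns_slow(board):
    """Placeholder pawn evaluation: only the pawn count difference."""
    white = sum(1 for p in board.values() if p == "P")
    black = sum(1 for p in board.values() if p == "p")
    return 100 * (white - black)

def evaluate_pawns(board):
    """Return the cached pawn score when the pawn/king skeleton was seen before."""
    key = pawn_king_key(board)
    if key in pawn_cache:
        cache_stats["hits"] += 1
        return pawn_cache[key]
    cache_stats["misses"] += 1
    score = evaluate_pawns_slow(board)
    pawn_cache[key] = score
    return score

# Two positions that differ only in where a knight stands share one cache entry.
board_a = {4: "K", 60: "k", 12: "P", 1: "N"}
board_b = {4: "K", 60: "k", 12: "P", 18: "N"}
score_a = evaluate_pawns(board_a)
score_b = evaluate_pawns(board_b)
```

Since most moves in a search tree are piece moves, the pawn/king skeleton repeats very often, which is why a cache keyed this way tends to get a high hit rate.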

Last edited by ArnoHu (Sept. 5, 2025 21:11:45)

waabooboo

ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame. Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

Yes, additive. GoK Classic evaluates passed, isolated, doubled, connected, blocked pawns and phalanxes, pawn storms, pawn shelters, applying tapered evaluation. And of course everything related to threats (safe pawn pushes, pawns as defenders, etc). And it stores pawn-eval to TT using a pawns+kings Zobrist hash, which can then be applied to a lot of boards where only major + minor positions have changed.

Interesting, maybe the code I referenced is using a scale that's not centipawn… I will have to take another look.

Storm/shelter I will implement later, I haven't done anything with king safety yet. Connected pawns seemed most critical here, Wolverine pushed pawns to where it eventually couldn't defend them (with moves like h5 and c5).

ArnoHu

ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Last edited by ArnoHu (Sept. 6, 2025 15:13:15)

ArnoHu

waabooboo wrote:
ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame. Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

Yes, additive. GoK Classic evaluates passed, isolated, doubled, connected, blocked pawns and phalanxes, pawn storms, pawn shelters, applying tapered evaluation. And of course everything related to threats (safe pawn pushes, pawns as defenders, etc). And it stores pawn-eval to TT using a pawns+kings Zobrist hash, which can then be applied to a lot of boards where only major + minor positions have changed.

Interesting, maybe the code I referenced is using a scale that's not centipawn… I will have to take another look.

Storm/shelter I will implement later, I haven't done anything with king safety yet. Connected pawns seemed most critical here, Wolverine pushed pawns to where it eventually couldn't defend them (with moves like h5 and c5).

Yes, often those are not centipawns, and go through a scale factor transformation later.
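
One common reason evaluation constants in engine source code do not read as plain centipawns is that a middlegame and an endgame value are packed into a single integer and only blended at the end (tapered evaluation). The Python sketch below shows that general technique; whether the file referenced above works exactly this way is an assumption, and the 0..24 phase range and helper names are illustrative.

```python
MAX_PHASE = 24  # assumed: 24 = all pieces on the board, 0 = bare endgame

def S(mg, eg):
    """Pack a middlegame and an endgame value into one integer."""
    return (eg << 16) + mg

def _to_signed16(v):
    return v - 0x10000 if v >= 0x8000 else v

def mg_of(s):
    """Extract the middlegame half (low 16 bits, signed)."""
    return _to_signed16(s & 0xFFFF)

def eg_of(s):
    """Extract the endgame half; the +0x8000 undoes the borrow from a negative mg."""
    return _to_signed16(((s + 0x8000) >> 16) & 0xFFFF)

def tapered(s, phase):
    """Blend both halves by game phase into a single score."""
    return (mg_of(s) * phase + eg_of(s) * (MAX_PHASE - phase)) // MAX_PHASE

# Packed terms can simply be added: additive bonuses stay additive per half.
passed_bonus = S(20, 60)
phalanx_bonus = S(10, 15)
total = passed_bonus + phalanx_bonus
```

Read as a raw number, `total` looks nothing like a centipawn value, but `tapered(total, phase)` does once both halves are unpacked and blended.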

Last edited by ArnoHu (Sept. 7, 2025 12:29:05)

ArnoHu

GoK Classic - White Dove, 95% vs. 92%:
https://lichess.org/study/J41kjRR3/6YDnZ4yO

GoK Classic - Wolverine 2, 92% vs. 84%:
https://lichess.org/study/J41kjRR3/u9l8j6cF

GoK Classic - Delta, 87% vs. 83%:
https://lichess.org/study/J41kjRR3/jUi7q7EV

ArnoHu

For those interested in such implementation details, this is how GoK Classic spends its think time:

43% Evaluation
    17% Attack tables (threats precondition), mobility
    17% Threats
     4% Pawns
     2% Majors/minors
     3% Other
33% MoveGen (+ MakeMove)
 8% Transposition table read/write
 6% Search self-time
10% Other

internet44

first test of pawnstructure + king shelter eval for SB, looking good so far
https://lichess.org/i70wVQYB (5s) first try win against W2, tho surprisingly not related to eval improvements (?) Wolverine just didn't see that Qxf6 is losing in time, shallow blue saw that pretty much instantly. LMR moment?

anyway, it's still a bit rough around the edges- you can see how it doesn't know what to do in the endgame. I haven't needed special endgame eval so far but now I need to disable the king shield bonus, so I guess I might as well just do a full endgame evaluation tomorrow. for now it's in its own project for testing but if nothing goes wrong, I'll polish it up a bit and then upload to the main project.

Last edited by internet44 (Sept. 7, 2025 17:20:09)

waabooboo

ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame. Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

Yes, additive. GoK Classic evaluates passed, isolated, doubled, connected, blocked pawns and phalanxes, pawn storms, pawn shelters, applying tapered evaluation. And of course everything related to threats (safe pawn pushes, pawns as defenders, etc). And it stores pawn-eval to TT using a pawns+kings Zobrist hash, which can then be applied to a lot of boards where only major + minor positions have changed.

Interesting, maybe the code I referenced is using a scale that's not centipawn… I will have to take another look.

Storm/shelter I will implement later, I haven't done anything with king safety yet. Connected pawns seemed most critical here, Wolverine pushed pawns to where it eventually couldn't defend them (with moves like h5 and c5).

Yes, often those are not centipawns, and go through a scale factor transformation later.

Ok, I see. Do you use Texel's Tuning Method or something similar for GoK? How do you figure out how much each evaluation term should be worth?

I'm trying to add more cheap knowledge to Wolverine, without using attack tables. It's getting harder and harder to make progress…
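
For readers unfamiliar with the method named in the question: Texel's Tuning Method picks evaluation weights that minimize the squared error between game results and a sigmoid of the static score. Below is a tiny Python sketch of that idea. The linear feature model, the sample format and the simple one-step local search are simplifying assumptions, and it says nothing about how GoK is actually tuned.

```python
def win_probability(score_cp, k=1.0):
    """Map a centipawn score to an expected result between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def mean_squared_error(samples, weights, k=1.0):
    """samples: list of (feature_counts, result), result in {0, 0.5, 1}."""
    total = 0.0
    for features, result in samples:
        score = sum(w * f for w, f in zip(weights, features))
        total += (result - win_probability(score, k)) ** 2
    return total / len(samples)

def tune(samples, weights, step=1, max_passes=100):
    """Local search: nudge each weight by +/- step while the error improves."""
    weights = list(weights)
    best = mean_squared_error(samples, weights)
    for _ in range(max_passes):
        improved = False
        for i in range(len(weights)):
            for delta in (step, -step):
                weights[i] += delta
                err = mean_squared_error(samples, weights)
                if err < best:
                    best = err
                    improved = True
                    break
                weights[i] -= delta  # undo the nudge if it did not help
        if not improved:
            break
    return weights, best
```

In practice the samples would be many thousands of quiet positions labeled with their game results, and each feature would be something like "number of white phalanx pawns minus black phalanx pawns".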

Destructor_chess

waabooboo wrote:
ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
waabooboo wrote:
ArnoHu wrote:
GoK NNUE vs. Black Crow, 96% vs. 90%:
https://lichess.org/study/J41kjRR3/tBcrrvzB

GoK Classic vs. White Dove, 89% vs. 85%:
https://lichess.org/study/J41kjRR3/KyFnJLtj

GoK Classic vs. Wolverine 2, 96% vs. 92%:
https://lichess.org/study/J41kjRR3/KIBWEuz8

Again Wolverine doesn't understand the pawn structure well enough and loses an endgame. Right now he only understands passed pawns, doubled pawns, and isolated pawns.

I looked at https://github.com/TerjeKir/weiss/blob/v2.0/src/evaluate.c for the phalanx bonus but didn't quite understand it – are phalanx and passed pawn bonuses additive? I played around a bit with a phalanx bonus, but adding that on to the passed pawn bonus caused Wolverine to seriously misevaluate positions with connected passed pawns.

I imagine GoK utilizes some extra tricks?

Yes, additive. GoK Classic evaluates passed, isolated, doubled, connected, blocked pawns and phalanxes, pawn storms, pawn shelters, applying tapered evaluation. And of course everything related to threats (safe pawn pushes, pawns as defenders, etc). And it stores pawn-eval to TT using a pawns+kings Zobrist hash, which can then be applied to a lot of boards where only major + minor positions have changed.

Interesting, maybe the code I referenced is using a scale that's not centipawn… I will have to take another look.

Storm/shelter I will implement later, I haven't done anything with king safety yet. Connected pawns seemed most critical here, Wolverine pushed pawns to where it eventually couldn't defend them (with moves like h5 and c5).

Yes, often those are not centipawns, and go through a scale factor transformation later.

Ok, I see. Do you use Texel's Tuning Method or something similar for GoK? How do you figure out how much each evaluation term should be worth?

I'm trying to add more cheap knowledge to Wolverine, without using attack tables. It's getting harder and harder to make progress…

The first question is: do you want a fully incremental evaluation?
