Again about the benchmark #5157

amchess · 2024-04-06T22:41:43Z

Describe the issue

No: imho, the problem isn't the network size.
I think the refactoring with the worker isn't no funcitonal changes.
For example, in position.cpp there are the following lines:

    // Update material hash key and prefetch access to materialTable
    k ^= Zobrist::psq[captured][capsq];
    st->materialKey ^= Zobrist::psq[captured][pieceCount[captured]];

but the materialTable has been deleted and there is no prefetch.

Expected behavior

not this behaviour

Steps to reproduce

./Stockfish16 bench
./Stockfish16.1 bench

Anything else?

No response

Operating system

All

Stockfish version

S

The text was updated successfully, but these errors were encountered:

PGG106 · 2024-04-06T22:44:10Z

Yes the comment is outdated but a table not being used anymore and a useless prefetch not being present doesn't make the engine slower, the change was non functional.
The newer versions are slower because the net arch changed, the layers are bigger and the inference has to do more work, hence it's slower (multiplications aren't free), this issue is just a duplicate of the previous one.

Disservin · 2024-04-06T22:59:46Z

SF 16:      
 L1:  1536, nn-5af11540bbfe.nnue
 nps: 1823913

SF 16.1:
 L1:  2560, nn-b1a57edbea57.nnue
 nps: 1210743 

(with smaller network)
SF 16.1:
 L1: 1536, nn-5af11540bbfe.nnue
 nps: 1472458

master: 1096393

SF 16, still used a mixed of classical and main net, where the classical part is probably giving the other additional speedup.

SF 16 bench was a mix of positions with NNUE and classical, I think 50/50 split.
SF 16.1 is 100% NNUE in bench.

SF 16.1 uses the main net and the small net, with classical completely removed...

mstembera · 2024-04-06T23:25:36Z

We should probably simplify materialKey away though.
[Edit] Never mind. Looks like we still use it for syzygy.

amchess · 2024-04-07T11:37:02Z

Before removing classical eval:Simplify PvNode reduction Jul 11,2023
6a8767a
Nodes/second : 1502286

With removing classical eval: Jul 11, 2023
https://github.com/official-stockfish/Stockfish/commits/master/?before=0716b845fdef8a20102b07eaec074b8da8162523+315
Nodes/second : 885742

Same net

I don't understand why.

dav1312 · 2024-04-07T11:39:33Z

As Disservin already said, it is because of 50% of the bench positions used only the classical evaluation, which was much faster
To compare only the "real" Stockfish, use bench 16 1 13 default depth NNUE for commits before the removal of classical

amchess · 2024-04-07T11:57:18Z

Done
bench 16 1 13 default depth NNUE
before
Nodes/second : 924580
after
Nodes/second : 769656
I'm very sorry, but yet I don't understand.

peregrineshahin · 2024-04-07T12:00:13Z

I will answer you, though I'm already regretting it (interaction with you is sickening).
Classical Eval role, was mianly about delevering not so good eval in compansation of speed, so even with using NNUE as default you still use Classical Eval in winning positions.

dav1312 · 2024-04-07T12:07:08Z

Stockfish/src/evaluate.cpp

Lines 1056 to 1059 in 6a8767a

    
           // We use the much less accurate but faster Classical eval when the NNUE 
        
           // option is set to false. Otherwise we use the NNUE eval unless the 
        
           // PSQ advantage is decisive. (~4 Elo at STC, 1 Elo at LTC) 
        
           bool useClassical = !useNNUE || abs(psq) > 2048;

Disservin · 2024-04-07T12:11:36Z

command: ./stockfish bench 16 1 13 default depth NNUE;
commit : 6a8767a0d5d9502e6d4de1bef97468b5d6fab80a (Simplify PvNode reduction)
changes: with classical part removed in evaluate
nps    : 1111266

command: ./stockfish bench 16 1 13 default depth NNUE;
commit : 6a8767a0d5d9502e6d4de1bef97468b5d6fab80a (Simplify PvNode reduction)
nps    : 1180150

command: ./stockfish bench 16 1 13 default depth NNUE;
commit : af110e02ec96cdb46cf84c68252a1da15a902395 (Remove classical evaluation)
nps    : 1112121

amchess · 2024-04-07T12:16:33Z

OK: so, for two elo points, do you degrade performance so fearfully?

Disservin · 2024-04-07T12:18:14Z

What

amchess · 2024-04-07T12:20:18Z

XInTheDark · 2024-04-07T12:22:05Z

how is this relevant to your problem about the benchmark nps?

PGG106 · 2024-04-07T12:22:11Z

hce wasn't removed to gain 2 Elo, it was removed even if removing it lost 2 Elo, because with long enough tc the impact was 0.
A 0 impact on performance is not performance degradation, the fact the engine reports a slower bench speed because hce isn't there to inflate the metric doesn't mean anything.

PGG106 · 2024-04-07T12:23:16Z

In general arguing about hce removal now is useless, it's a done deal, new and better solutions now fill the void the hce removal left, devs are working on sf 17 and stuff that was definitive in sf 16 is definitely not relevant now.

amchess · 2024-04-07T12:25:51Z

For me, it is crazy and I don't agree at all, but clearly do as you wish.

PGG106 · 2024-04-07T12:28:31Z

you don't understand anything you talk about, that's all there is to it, as usual.
Ofc switching over to hce inflates the speed, you are avoiding the nnue slowdown by not using the nnue at all, switching back to fully hce eval would make the engine faster too, but so what?
speed is just a metric, you can sacrifice speed to gain elo, it's what the hce to nnue switch did, moreover removing hce isn't that much of a speed drop, bench shouldn't have been mixed in the first place and the metrics it reports are inflated by the positions that are > 2048 cp in value.
Moreover part 2 current sf dev has a solution to speed up the engine in won positions, it's a smaller net that takes the place of what was hce before (without the need to keep all that hce code around).
For once you should try to understand what you are talking about before second guessing tens of devs that all know far more than you.

Disservin · 2024-04-07T12:35:08Z

For me, it is crazy and I don't agree at all, but clearly do as you wish.

I don't even get your point... on my benchmarks #5157 (comment),
it shows that the actual true bench speed of stockfish was 1180150 and after removing HCE it was 1112121.
That difference is marginal... for some reason people hate it when sf has less nps because it is not using hce anymore, but it is fine for you to have less speed because of a marginally better larger network and some even use this network without any clear evidence that it is stronger even before Stockfish master has it! (looking at you talkchess)

HCE was removed with some loss at STC, the higher marginal higher NPS which Stockfish had before were countervailed with the stronger evaluation.

amchess · 2024-04-07T12:45:25Z

I develop a fork of stockfish with the goal of making it understandable/useful to the OTB player, not necessarily, therefore, stronger, and I wondered why this frightening degradation of performance. That's all.

PGG106 · 2024-04-07T12:48:36Z

how is a 2 Elo impact at bullet time controls a "frightening degradation of performance"?, why are we acting like the basically 0 Elo impact (at the very very very long time controls you otb players love) wasn't made even less impactful by the fact smallnet exists? the truth of the matter is that sf is now stronger than when hce was still present and it's stronger than it would be if hce was still present.
You are wondering wrong things based off of wrong assumptions and either outdated or wrong "knowledge" (or lack thereof).

Disservin · 2024-04-07T12:53:19Z

I develop a fork of stockfish with the goal of making it understandable/useful to the OTB player, not necessarily, therefore, stronger, and I wondered why this frightening degradation of performance. That's all.

NPS doesn't mean anything, if you want high nps go back to pure alpha beta but lose accuracy of the evaluation. Your speed comparison that you posted earlier were wrong since it used a mixed bench, which is not even what stockfish did during regular play. In my two tests, the difference was like 5%, which is not "frightening degradation of performance" imo.
What we aim for (and everyone else in the community who wrote their own engine which competes at a high level) is elo gain. Over the time we have merged many new networks which made STC a bit weaker (reduced nps) but overall scaled rather well to high time controls... which is also what you are after...

PGG106 · 2024-04-07T13:04:53Z

It doesn't matter what anyone thinks, this is about numbers, nps, elo and time controls. This is equivalent to math, it doesn't matter if i think that 3 + 2 is 7 or if my dream is for 12 * 10 to be 97, this is factual information that can only be correctly or incorrectly interpreted, there's no room for opinions here.
That being said it's pretty evident that you are a troll so there's no use engaging with you and i won't do so any further.

amchess · 2024-04-07T13:46:02Z

Commits on Jan 14, 2024
Remove the dependency on a Worker from evaluate

before refactor
Nodes/second : 826439

after refactor
Nodes/second : 822816

Same net and bench
The refactoring isn't no functional changes.
Just about facts, numbers...

PGG106 · 2024-04-07T13:54:35Z

that's a supposedly 0.4% speed drop, it's too small of a drop to be measured with a single bench run or even 3 or 5, it's easily explained away by noise and error bars in the bench measurement.
If you want an accurate way to measure speed use the script that is customarily used to measure for speed gains/losses. you'll notice that with enough runs the change is just neutral.
The refactoring is definitely non functional

Disservin · 2024-04-07T14:00:05Z

Commits on Jan 14, 2024 Remove the dependency on a Worker from evaluate

before refactor Nodes/second : 826439

after refactor Nodes/second : 822816

Same net and bench The refactoring isn't no functional changes. Just about facts, numbers...

A single bench run isn't very reliable. You could probably redo a single bench at some point and get reversed results. You should be using one of the many speedup scripts to better determine the effect of a patch on the overall performance.

PGG106 · 2024-04-07T14:00:12Z

Here you can find all the info on how to properly measure the speed of stockfish: https://github.com/official-stockfish/Stockfish/wiki/Advanced-topics#measure-the-speed-of-stockfish, coincidentally it also says A speedup of 0.3% could be meaningless (i.e. within the measurement noise) which is exactly what you are measuring.

PGG106 mentioned this issue Apr 6, 2024

Issue with the benchmark #5156

Closed

Disservin closed this as not planned Won't fix, can't repro, duplicate, stale Apr 6, 2024

This comment was marked as off-topic.

Sign in to view

This comment was marked as spam.

Sign in to view

This was referenced Apr 8, 2024

Collaboration for a chess engine amchess/StockfishShashGenerator#1

Closed

‏‏ amchess/ShashChess#53

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Again about the benchmark #5157

Again about the benchmark #5157

amchess commented Apr 6, 2024

PGG106 commented Apr 6, 2024 •

edited

Loading

Disservin commented Apr 6, 2024

mstembera commented Apr 6, 2024 •

edited

Loading

amchess commented Apr 7, 2024

dav1312 commented Apr 7, 2024 •

edited

Loading

amchess commented Apr 7, 2024

peregrineshahin commented Apr 7, 2024

dav1312 commented Apr 7, 2024

Disservin commented Apr 7, 2024

amchess commented Apr 7, 2024

Disservin commented Apr 7, 2024

amchess commented Apr 7, 2024 •

edited

Loading

XInTheDark commented Apr 7, 2024

PGG106 commented Apr 7, 2024

PGG106 commented Apr 7, 2024 •

edited

Loading

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024

This comment was marked as off-topic.

Disservin commented Apr 7, 2024 •

edited

Loading

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024 •

edited

Loading

Disservin commented Apr 7, 2024

This comment was marked as spam.

PGG106 commented Apr 7, 2024

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024

Disservin commented Apr 7, 2024

PGG106 commented Apr 7, 2024

Again about the benchmark #5157

Again about the benchmark #5157

Comments

amchess commented Apr 6, 2024

Describe the issue

Expected behavior

Steps to reproduce

Anything else?

Operating system

Stockfish version

PGG106 commented Apr 6, 2024 • edited Loading

Disservin commented Apr 6, 2024

mstembera commented Apr 6, 2024 • edited Loading

amchess commented Apr 7, 2024

dav1312 commented Apr 7, 2024 • edited Loading

amchess commented Apr 7, 2024

peregrineshahin commented Apr 7, 2024

dav1312 commented Apr 7, 2024

Disservin commented Apr 7, 2024

amchess commented Apr 7, 2024

Disservin commented Apr 7, 2024

amchess commented Apr 7, 2024 • edited Loading

XInTheDark commented Apr 7, 2024

PGG106 commented Apr 7, 2024

PGG106 commented Apr 7, 2024 • edited Loading

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024

This comment was marked as off-topic.

Disservin commented Apr 7, 2024 • edited Loading

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024 • edited Loading

Disservin commented Apr 7, 2024

This comment was marked as spam.

PGG106 commented Apr 7, 2024

amchess commented Apr 7, 2024

PGG106 commented Apr 7, 2024

Disservin commented Apr 7, 2024

PGG106 commented Apr 7, 2024

PGG106 commented Apr 6, 2024 •

edited

Loading

mstembera commented Apr 6, 2024 •

edited

Loading

dav1312 commented Apr 7, 2024 •

edited

Loading

amchess commented Apr 7, 2024 •

edited

Loading

PGG106 commented Apr 7, 2024 •

edited

Loading

Disservin commented Apr 7, 2024 •

edited

Loading

PGG106 commented Apr 7, 2024 •

edited

Loading