BrandonSi wrote: not all indexes would be loaded into memory at the same time,
Gian-Carlo Pascutto wrote: They are. It's a limitation of the Nalimov code.

I don't think of it as a "limitation". Who would want to page the indices in and out, so that you can then figure out what to page in / out from a specific EGTB file? I/O is already slow enough. Constantly reading indices before reading data would make it two times slower...
Nalimov and memory for indexes (are you aware?)
Moderator: Ras
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Nalimov and memory for indexes (are you aware?)
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Nalimov and memory for indexes (are you aware?)
Harvey Williamson wrote: Rybka is very bad at this: whatever you set as tb cache, it will use that times the number of cores, so on an 8-core machine, if you set the cache to 64, Rybka will take 8x64. I have not seen other engines do this.
Gian-Carlo Pascutto wrote: Probably an issue with every engine that is multiprocessed. Zappa will likely be affected too.
It's not only the caches, the indexes will also get replicated. Nalimov wrote his code for Crafty at the moment Crafty was multithreaded, and the code just sucks for multiprocessed engines.
michiguel wrote: I guess that at least they could ask for EGTB Cache/core rather than EGTB Cache, to let the user know what is going on.
Anyway, I am very interested in your expert opinion:
I think it is not a good idea to have (for instance) 4 caches of 32 MiB, one for each thread. It would be better to have a 128 MiB cache shared by all threads, and properly protected for read/write operations (that is what I am doing currently with Gaviota TBs). Of course, this creates a problem when all threads start to hit the cache, and it could potentially degrade the parallel scalability. However, the EGTB probe is dominated so much by HD access that anything else is almost irrelevant. Having a bigger shared cache significantly decreases the likelihood of HD access. I prefer to reduce HD accesses rather than the potential overlap of two threads trying to access the EGTB cache. Most of those problems are already faced by the hash table probe.
What do you think?
Miguel

That's why threads are better when using Eugene's code. The indices are shared among all threads, as are the cache blocks. If you use processes instead, you duplicate _everything_, which is a poor use of memory.
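For reference, the single shared cache protected by a lock that Miguel describes could be sketched roughly as below. This is only an illustration assuming POSIX threads. The names (shared_cache, cache_probe, load_block_from_disk), the direct-mapped layout and the tiny sizes are hypothetical and are not taken from the Gaviota or Nalimov code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  64   /* bytes per cached block (illustrative) */
#define CACHE_SLOTS 4    /* tiny on purpose, so evictions are easy to see */

typedef struct {
    int      valid;
    uint32_t key;                 /* hypothetical block id */
    uint8_t  data[BLOCK_SIZE];
} cache_slot;

typedef struct {
    pthread_mutex_t lock;         /* one lock guards the whole cache */
    cache_slot      slot[CACHE_SLOTS];
    unsigned long   hits, misses;
} shared_cache;

/* Stand-in for the slow disk read: fills the block with a key-derived byte. */
static void load_block_from_disk(uint32_t key, uint8_t *out)
{
    memset(out, (int)(key & 0xff), BLOCK_SIZE);
}

void cache_init(shared_cache *c)
{
    memset(c, 0, sizeof *c);
    pthread_mutex_init(&c->lock, NULL);
}

/* Return byte `offset` of block `key`. Every thread calls this on the
   same cache object, so a block loaded by one thread is a hit for all. */
uint8_t cache_probe(shared_cache *c, uint32_t key, unsigned offset)
{
    assert(offset < BLOCK_SIZE);
    pthread_mutex_lock(&c->lock);
    cache_slot *s = &c->slot[key % CACHE_SLOTS];   /* direct-mapped slot */
    if (s->valid && s->key == key) {
        c->hits++;
    } else {
        c->misses++;
        /* Simplest possible policy: the "disk" read happens under the lock. */
        load_block_from_disk(key, s->data);
        s->key = key;
        s->valid = 1;
    }
    uint8_t value = s->data[offset];
    pthread_mutex_unlock(&c->lock);
    return value;
}
```

Holding the lock across the disk read keeps the sketch short but serializes all probes during a miss. Finer-grained locking would be the obvious refinement, but the trade-off Miguel argues for (fewer disk reads from one big cache versus some lock contention) is the same either way.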
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Nalimov and memory for indexes (are you aware?)
Werner wrote: Hi,
I think in this case Windows task manager does not show it correctly. If you look at the rest of free memory you see that only 1 times the 64 MB is used.

Here is what ought to happen, and what _does_ happen under Linux. If you initialize the EGTBs _before_ you use fork() to create new processes, all is well. The indices are allocated and filled in, and then after a fork() the memory is shared among all new processes via the "copy-on-write" logic: all memory is shared by just duplicating the page table entries, so that each process has a separate page table, but with identical contents. Each writable page of memory is temporarily flagged as no-write. If one of those gets modified, the O/S first duplicates the unmodified data by copying it to a new page of RAM, then updates the page table of the writing process so that it points to that new page of RAM with write permission, and then continues.
Since the indices never get modified, they should be shared among all processes just as if they were using threads. And since there is no modification, there are no race issues and no need for any locks. So it really doesn't get "duplicated" under Linux; you get exactly one copy no matter how many processes you run. For Windows, I don't know if this is true, but would certainly expect it to work like that.
So the duplication is "virtual" but not "physical", and there really is only one copy of the indices in RAM, and everyone is sharing them without knowing they are doing so.
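The "initialize first, fork afterwards" order can be sketched as follows. This is a minimal illustration on a POSIX system, not code from Crafty or the Nalimov probing code. The table `indices`, its size and the helper names are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define N_INDEX 1024

static unsigned *indices;   /* stand-in for the EGTB index tables */

/* Build the table once, in the parent, before any fork(). */
void init_indices(void)
{
    indices = malloc(N_INDEX * sizeof *indices);
    assert(indices != NULL);
    for (unsigned i = 0; i < N_INDEX; i++)
        indices[i] = i * 3u;
}

/* Fork `n` workers that only read the table. Returns how many of them
   saw the contents the parent wrote before forking. */
int run_readers(int n)
{
    int forked = 0, ok = 0;

    for (int w = 0; w < n; w++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* could not create more workers */
        if (pid == 0) {
            /* Child: it only reads the table, so with copy-on-write
               the index pages never need to be copied. */
            unsigned long sum = 0;
            for (unsigned i = 0; i < N_INDEX; i++)
                sum += indices[i];
            _exit(sum == 3ul * N_INDEX * (N_INDEX - 1) / 2 ? 0 : 1);
        }
        forked++;
    }

    for (int w = 0; w < forked; w++) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Calling init_indices() and then run_readers(4) should report four successful readers. Whether the pages are physically shared is up to the kernel and cannot be seen from this code, which only shows that every child sees the parent's table without loading it again.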
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Nalimov and memory for indexes (are you aware?)
BrandonSi wrote: not all indexes would be loaded into memory at the same time,
Gian-Carlo Pascutto wrote: They are. It's a limitation of the Nalimov code.
bob wrote: I don't think of it as a "limitation". Who would want to page the indices in and out, so that you can then figure out what to page in / out from a specific EGTB file? I/O is already slow enough. Constantly reading indices before reading data would make it two times slower...

It will be two times slower if you read the indexes every single time you probe. However, you can cache the ones you read most often. For a given position you hit only ~20% of all the files. So, I think that keeping only 20% of the indexes in cache should be very safe (even less may suffice). The performance penalty of fetching the indexes on rare occasions should be negligible.
BTW, there is an advantage to keeping wtm and btm positions in the same file. Nalimov EGTBs keep separate files for those.
Miguel
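Keeping only a fraction of the index tables resident could look something like the sketch below. It is a toy least-recently-used cache assuming 10 files with room for 2 indexes (about 20%). All names and sizes are hypothetical, and the actual disk read is left as a comment.

```c
#include <assert.h>

#define N_FILES  10   /* tablebase files on disk (illustrative) */
#define N_CACHED 2    /* index tables kept in RAM, about 20% */

typedef struct {
    int           file_id;    /* -1 while the slot is empty */
    unsigned long last_used;  /* tick of the most recent probe */
} index_slot;

typedef struct {
    index_slot    slot[N_CACHED];
    unsigned long tick;
    unsigned long loads;      /* times an index had to be fetched from disk */
} index_cache;

void index_cache_init(index_cache *c)
{
    for (int i = 0; i < N_CACHED; i++) {
        c->slot[i].file_id = -1;
        c->slot[i].last_used = 0;
    }
    c->tick = 0;
    c->loads = 0;
}

/* Make the index of `file_id` resident and return the slot holding it. */
int index_cache_get(index_cache *c, int file_id)
{
    assert(file_id >= 0 && file_id < N_FILES);
    int victim = 0;

    c->tick++;
    for (int i = 0; i < N_CACHED; i++) {
        if (c->slot[i].file_id == file_id) {
            c->slot[i].last_used = c->tick;   /* hit: no disk access */
            return i;
        }
        if (c->slot[i].last_used < c->slot[victim].last_used)
            victim = i;                       /* remember the oldest slot */
    }

    /* Miss: real code would read the index of `file_id` from disk here,
       replacing the least recently used one. */
    c->loads++;
    c->slot[victim].file_id = file_id;
    c->slot[victim].last_used = c->tick;
    return victim;
}
```

With the probe sequence 0, 1, 0, 2, 0 only three index loads happen, because file 0 stays resident throughout. The extra read bob worries about is paid only on a miss.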
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Nalimov and memory for indexes (are you aware?)
M ANSARI wrote: Well I have to agree that there is something different with R3 and Nalimov memory usage. I noticed that when I put EGTB's on a USB drive it takes ages for the engine to load and unload. For some reason if EGTB's are on HDD then this is not a problem. Once loaded things are OK. I don't see this behaviour with other engines and to be honest I have never figured this one out.
michiguel wrote: Each of the threads load their own indexes? Then it is 20 MiB x cores?
Miguel
Gian-Carlo Pascutto wrote: Each *process*, see my comment above.
There's a well-known bug where SMP Rybka doesn't use tablebases correctly, and it's closely related: none of the Nalimov stuff is shared and Vasik forgot to pass a parameter from the master process to the slaves.
I'm glad I got rid of sh*t like that in DS 3.0

Oops, now I get it. Mmmhhh... I have to think if Gaviota TBs are MP friendly or not... I think they should be if everything is initialized before forking.
Miguel
-
- Posts: 154
- Joined: Fri Mar 10, 2006 1:20 am
- Location: Sonora, Mexico
Re: Nalimov and memory for indexes (are you aware?)
bob wrote: For windows, I don't know if this is true, but would certainly expect it to work like that.

Interestingly, although Windows supports copy-on-write semantics for many things, there isn't a Windows API method for creating a process that behaves the way fork() does in *nix. Specifically, there isn't anything at the Windows API level that copies the page tables into the newly created process. The Windows CreateProcess() API method creates a brand new fresh process, analogous to fork() followed by exec().
Windows prefers using threads for this scenario, but I realize that this is different, and has both pros and cons.
"The foundation of morality is to have done, once for all, with lying; to give up pretending to believe that for which there is no evidence, and repeating unintelligible propositions about things beyond the possibilities of knowledge." - T. H. Huxley
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Nalimov and memory for indexes (are you aware?)
bob wrote: For windows, I don't know if this is true, but would certainly expect it to work like that.
lmader wrote: Interestingly, although Windows supports copy-on-write semantics for many things, there isn't a Windows API method for creating a process that behaves the way fork() does in *nix. Specifically, there isn't anything at the Windows API level that copies the page tables into the newly created process. The Windows CreateProcess() API method creates a brand new fresh process, analogous to fork() followed by exec().
Windows prefers using threads for this scenario, but I realize that this is different, and has both pros and cons.

This is simply a far better idea. No idea why Windows would not use this, since it is well-known and has been in Linux for several years... Threads are not equivalent, since they share _everything_. Copy-on-write shares everything until it is modified and slowly builds up copies of modified data, while still sharing that which has not been (or cannot be) modified, including instructions. Far cache-friendlier as well.
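The difference from threads shows up as soon as a process writes. A small sketch on a POSIX system, with a made-up variable `counter`: the child's write lands in its own copy of the page, and the parent's value is untouched. The code only demonstrates the visible semantics. The lazy page copying itself happens inside the kernel.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 100;   /* lives in a writable data page */

/* Fork a child that modifies `counter` and reports its new value through
   the exit status (101 fits in the 8 bits available there). */
int child_value_after_write(void)
{
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        counter += 1;        /* first write: the child gets a private page */
        _exit(counter);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* The parent's copy, which the child's write must not have changed. */
int parent_value(void)
{
    return counter;
}
```

With threads the same increment would be visible to everyone and would need a lock. Here child_value_after_write() gives 101 while parent_value() still gives 100.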
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Nalimov and memory for indexes (are you aware?)
M ANSARI wrote: Well I have to agree that there is something different with R3 and Nalimov memory usage. I noticed that when I put EGTB's on a USB drive it takes ages for the engine to load and unload. For some reason if EGTB's are on HDD then this is not a problem. Once loaded things are OK. I don't see this behaviour with other engines and to be honest I have never figured this one out.
michiguel wrote: Each of the threads load their own indexes? Then it is 20 MiB x cores?
Miguel
Gian-Carlo Pascutto wrote: Each *process*, see my comment above.
There's a well-known bug where SMP Rybka doesn't use tablebases correctly, and it's closely related: none of the Nalimov stuff is shared and Vasik forgot to pass a parameter from the master process to the slaves.
I'm glad I got rid of sh*t like that in DS 3.0
michiguel wrote: Oops, now I get it. Mmmhhh... I have to think if Gaviota TBs are MP friendly or not... I think they should be if everything is initialized before forking.
Miguel

Don't bet on it if you use windows. Someone pointed out that windows doesn't do copy-on-write as unix does... which would produce duplicates of everything when you fork().
-
- Posts: 154
- Joined: Fri Mar 10, 2006 1:20 am
- Location: Sonora, Mexico
Re: Nalimov and memory for indexes (are you aware?)
bob wrote: Don't bet on it if you use windows. Someone pointed out that windows doesn't do copy-on-write as unix does... which would produce duplicates of everything when you fork().

Well, it's not really that Windows doesn't do copy-on-write in similar ways to *nix, it's that the Windows API doesn't support creating a process with a copy of the parent's data. There is no equivalent of fork() in Windows.
Which is a little weird and unfortunate.
So doing this with multiple processes in Windows would be harder. There are probably a million ways to share a cache of memory between processes in Windows, you just can't do it with the fork() semantics.
"The foundation of morality is to have done, once for all, with lying; to give up pretending to believe that for which there is no evidence, and repeating unintelligible propositions about things beyond the possibilities of knowledge." - T. H. Huxley
-
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Nalimov and memory for indexes (are you aware?)
Quote: There is no equivalent of fork() in Windows.

I was told that the windows kernel supports it. It is just not documented.