tablebase caching / mmap() / page cache



Re: tablebase caching / mmap() / page cache

Post by bob »

syzygy wrote:
Cardoso wrote: So with mmap(), since we can't know in advance whether the data is in cache, we can only use the conditions depth, score, static eval...
Actually, we can. I just discovered that Linux has mincore():


       mincore() returns a vector that indicates whether pages of the calling
       process's virtual memory are resident in core (RAM), and so will not
       cause a disk access (page fault) if referenced. The kernel returns
       residency information about the pages starting at the address addr, and
       continuing for length bytes.

       The addr argument must be a multiple of the system page size. The
       length argument need not be a multiple of the page size, but since
       residency information is returned for whole pages, length is effectively
       rounded up to the next multiple of the page size. One may obtain the
       page size (PAGE_SIZE) using sysconf(_SC_PAGESIZE).

       The vec argument must point to an array containing at least
       (length+PAGE_SIZE-1) / PAGE_SIZE bytes. On return, the least significant
       bit of each byte will be set if the corresponding page is currently
       resident in memory, and be clear otherwise. (The settings of the other
       bits in each byte are undefined; these bits are reserved for possible
       later use.) Of course the information returned in vec is only a
       snapshot: pages that are not locked in memory can come and go at any
       moment, and the contents of vec may already be stale by the time this
       call returns.
Windows has VirtualQuery().

These functions are undoubtedly too expensive to call on every probe, but maybe some balance can be found.
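A minimal sketch of the mincore() check described in the man page excerpt above, assuming Linux with glibc. The helper name page_is_resident() is hypothetical and not taken from any engine's probing code; it only shows the page alignment and the least-significant-bit test the man page describes.

```c
#define _DEFAULT_SOURCE 1   /* glibc: expose the mincore() declaration */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 1 if the page containing addr is resident
 * in RAM, 0 if it is not, and -1 on error (for example ENOMEM when the
 * address is not mapped). The answer is only a snapshot and may be stale
 * by the time the caller acts on it. */
static int page_is_resident(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    /* mincore() requires addr to be a multiple of the page size. */
    uintptr_t base = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

    unsigned char vec;   /* one byte of result per page queried */
    if (mincore((void *)base, 1, &vec) != 0)
        return -1;

    return vec & 1;      /* only the least significant bit is defined */
}
```

Each call is a system call, so a probe routine would presumably use it only where the search conditions (depth, score, static eval) leave the decision to probe open, not on every tablebase access.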

What would be ideal is an x86-64 load instruction that would not result in a page fault if the required memory page is not in RAM, but that would set an error flag. I don't think such an instruction exists in the x86-64 instruction set, but I see no reason why it could not be added. So we have to ask Intel and AMD.
I haven't done much reading, but I wonder whether one can peek directly at the page tables to see if the valid bit is set. I think the CR3 register points to the top-level page table; I just don't know whether you can access it from user mode. This got me interested, so I will look and report back. If it can be done, it would be a VERY fast bit of asm code to answer the question "will this page fault or not?"

Re: tablebase caching / mmap() / page cache

Post by syzygy »

Cardoso wrote: Just an additional question:
Since there can be many TB files, when you load (mmap) a subdatabase, do you keep it mmap()ed until the engine is shut down or a new game is started, or do you use the LRU principle to munmap() files and free up memory during normal engine play?
I keep everything mmap()d until the engine shuts down. Note that mmap()ing a file does not use any physical memory. It just allocates a virtual address range through which the engine can access the file's data. Portions of the file that are actually accessed and thereby mapped into the page cache do use memory, but they would use that memory anyway.
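A minimal sketch of that approach, assuming a POSIX system. The helper name map_file_readonly() is hypothetical and this is not the actual probing code; it just maps a whole file read-only and leaves it mapped.

```c
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: maps a whole file read-only. Returns the base
 * address and stores the file size in *size_out, or returns NULL on
 * failure. The caller keeps the mapping for as long as it needs it. */
static const unsigned char *map_file_readonly(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* This only reserves a virtual address range; no file data is read
     * until a page in the range is actually touched. */
    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return NULL;

    *size_out = (size_t)st.st_size;
    return base;
}
```

Because the descriptor is closed right after mmap(), keeping many files mapped costs address space but no open file descriptors, and the kernel remains free to evict untouched or cold pages from the page cache.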