Replies: 1 comment
-
Hi @Dwedit, thanks for bringing up this subject. Surprisingly, there is actually no issue open for this feature. It is something I have had in mind for a while. It would be awesome to have, and a great opportunity to review and rewrite the kernel's read and write code. The right way to implement this, without trying to be smarter than the system, is simply to use the Windows Cache Manager. Regarding invalidation, the userland filesystem already has access to an API to notify that a file's content has changed. We could just hook this up to the cache manager to discard the cached data. Otherwise it is fine to let the cache live with the
Indeed and I believe there is an issue for this somewhere if someone wants to take a look at it. That said, I can directly say that I will not have time to implement it. If someone wants it, this person will have to contribute to the project and I will be super glad to provide guidance and review the contribution 💪 Small data point: |
-
I was reading a 2017 article on FUSE from the "Proceedings of the 15th USENIX Conference on File and Storage Technologies (FAST ’17)", available at https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf
According to the article, FUSE uses a 128K read-ahead buffer, and "Thanks to read-ahead, sequential read performance...was as good as Ext4 for both HDD and SSD"
Having at least 128K of data available would greatly speed up programs that make many small sequential reads. Far fewer calls would need to reach the Dokan Filesystem program if it only had to read 128K blocks at a time.
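As a rough illustration of the savings, here is a small sketch that counts how many read requests would reach the file system program for one sequential pass over a file. The 1 MiB file size and 4 KiB application read size are assumed example values, not measurements; only the 128K block size comes from the article.

```python
import math

def backend_calls(file_size: int, request_size: int) -> int:
    """Number of read requests needed to cover file_size bytes sequentially."""
    return math.ceil(file_size / request_size)

FILE_SIZE = 1024 * 1024   # assumed example: 1 MiB file
SMALL_READ = 4 * 1024     # assumed example: the application reads 4 KiB at a time
READ_AHEAD = 128 * 1024   # 128K block, the size cited from the FUSE article

without_read_ahead = backend_calls(FILE_SIZE, SMALL_READ)
with_read_ahead = backend_calls(FILE_SIZE, READ_AHEAD)
print(without_read_ahead, with_read_ahead)
```

For these assumed sizes that is 256 requests without read-ahead versus 8 with it.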
The question now is how to actually implement this in Dokany. The proper way to do it would be to interact with the kernel's Cache Manager.
There are about 61 API functions that deal with the Cache Manager (they start with "Cc"), so it's not a simple task at all.
Microsoft provides sample code for their CDFS and FastFat file systems. CDFS makes 8 calls to the cache manager API, and implements 8 callback functions for caching. FastFat makes 24 API calls, and implements 8 callbacks.
There's also a 'less proper' alternative: use a simple read-ahead mini-cache (maybe a single 128K block) tied to an open file. When there's a read, fill the buffer, then serve data from the buffer until the next block needs to be read from the open file. This would not interact with the kernel's cache manager, and could work as a proof of concept.
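A minimal sketch of that mini-cache idea, written in Python for brevity rather than as Dokany code. The class name `ReadAheadFile`, the `backend_reads` counter, and the use of an in-memory `io.BytesIO` as the "open file" are all assumptions made for illustration; nothing here is an existing Dokany API.

```python
import io

BLOCK_SIZE = 128 * 1024  # 128K read-ahead block

class ReadAheadFile:
    """Hypothetical wrapper: serves reads from one cached block per open file."""

    def __init__(self, backend, block_size=BLOCK_SIZE):
        self.backend = backend        # any seekable binary file-like object
        self.block_size = block_size
        self.block_start = None       # file offset of the cached block, None if empty
        self.block = b""
        self.backend_reads = 0        # how many reads actually reached the backend

    def _fill(self, offset):
        """Load the block-aligned region that contains offset."""
        self.block_start = (offset // self.block_size) * self.block_size
        self.backend.seek(self.block_start)
        self.block = self.backend.read(self.block_size)
        self.backend_reads += 1

    def _cached(self, offset):
        return (self.block_start is not None
                and self.block_start <= offset < self.block_start + len(self.block))

    def read(self, offset, length):
        out = bytearray()
        while length > 0:
            if not self._cached(offset):
                self._fill(offset)
                if not self._cached(offset):
                    break             # offset is at or past end of file
            pos = offset - self.block_start
            chunk = self.block[pos:pos + length]
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)

# Usage: read a 512 KiB "file" in 4 KiB pieces.
data = bytes(range(256)) * 2048
f = ReadAheadFile(io.BytesIO(data))
result = b"".join(f.read(off, 4096) for off in range(0, len(data), 4096))
print(result == data, f.backend_reads)
```

In this run the 128 small reads are served by 4 backend reads, one per 128K block. The sketch has no invalidation at all, so it is only safe as written for data that cannot change.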
Regardless of what is used, Cache Invalidation becomes a concern.
One possible approach to Cache Invalidation would be that the Read-Ahead Buffer can only be read once, and data is immediately invalidated as it is read. It would still need to be invalidated if there is a write operation on that file.
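Here is one way the read-once idea could look, again as a Python sketch under assumptions rather than Dokany code. The class name `ReadOnceBuffer` and its methods are hypothetical; bytes are dropped from the buffer as soon as they are returned, and any write through the same object discards whatever read-ahead data is left.

```python
import io

class ReadOnceBuffer:
    """Hypothetical read-ahead buffer whose bytes can be consumed only once."""

    def __init__(self, backend, block_size=128 * 1024):
        self.backend = backend        # any seekable binary file-like object
        self.block_size = block_size
        self.start = 0                # file offset of the first byte still buffered
        self.data = b""               # read-ahead bytes not yet handed out

    def invalidate(self):
        self.data = b""

    def read(self, offset, length):
        if offset != self.start:
            # Non-sequential access: the buffered bytes do not apply, drop them.
            self.invalidate()
            self.start = offset
        if len(self.data) < length:
            # Refill from the backend, reading ahead by at least one block.
            missing = length - len(self.data)
            self.backend.seek(self.start + len(self.data))
            self.data += self.backend.read(max(self.block_size, missing))
        out = self.data[:length]
        self.data = self.data[len(out):]   # consumed bytes are invalidated immediately
        self.start += len(out)
        return out

    def write(self, offset, payload):
        self.backend.seek(offset)
        self.backend.write(payload)
        self.invalidate()                  # a write makes any read-ahead data suspect

# Usage: a write lands inside the region that was already read ahead.
backend = io.BytesIO(b"a" * 1000)
buf = ReadOnceBuffer(backend, block_size=100)
first = buf.read(0, 10)        # buffers bytes 0..99, returns the first 10
buf.write(10, b"ZZZZ")         # overwrites bytes that were sitting in the buffer
second = buf.read(10, 4)       # refilled from the backend, so it sees the write
print(first, second)
```

This only covers writes that go through the same object. Changes made to the underlying file by someone else are a separate problem.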
Cache Invalidation would be tricky for the Mirror filesystem. If a file is opened with Read/Write share mode, the contents of the underlying file can change. Stale data sitting in a cache would then be returned instead of the new file contents. There are Win32 API functions to monitor a file for changes, and those could be used to trigger cache invalidation for Mirror.
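To make the stale-data problem concrete, here is a sketch that uses a much simpler stand-in for change monitoring: before serving cached data it compares the file's modification time and size against the values recorded when the cache was filled. This is polling with `os.stat`, not the Win32 change notification API, and the names `fingerprint` and `StampedCache` are hypothetical.

```python
import os
import tempfile

def fingerprint(path):
    """Cheap change indicator: (modification time in ns, size in bytes)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)

class StampedCache:
    """Hypothetical whole-file cache that reloads when the fingerprint changes."""

    def __init__(self, path):
        self.path = path
        self.stamp = None
        self.data = None
        self.loads = 0                # how many times the file was actually read

    def read_all(self):
        current = fingerprint(self.path)
        if self.data is None or current != self.stamp:
            with open(self.path, "rb") as f:
                self.data = f.read()
            self.stamp = current
            self.loads += 1
        return self.data

# Usage: the underlying file changes behind the cache's back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mirror.bin")
    with open(path, "wb") as f:
        f.write(b"old")
    cache = StampedCache(path)
    first = cache.read_all()          # loads from disk
    again = cache.read_all()          # served from the cache
    loads_before_change = cache.loads
    with open(path, "ab") as f:       # someone else appends to the file
        f.write(b"+new")
    second = cache.read_all()         # fingerprint differs, so the cache reloads
print(first, second, cache.loads)
```

A check like this costs a metadata query on every read and can miss a same-size rewrite that happens within one timestamp tick, which is part of why a real notification mechanism would be preferable for Mirror.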
Meanwhile, a read-only immutable file system needs no cache invalidation.
Support for a Read-Ahead buffer or Caching would need to be Opt-In, and the Dokan Filesystem program could request the read-ahead size.