On 27 Apr 2026, at 4:06, David Hildenbrand (Arm) wrote:

> On 4/25/26 14:00, Zi Yan wrote:
>> On 17 Apr 2026, at 22:44, Zi Yan wrote:
>>
>>> Replace it with a check on the max folio order of the file's address space
>>> mapping, making sure PMD THP is supported. Also remove the read-only fd
>>> check, since collapse_file() now makes sure all to-be-collapsed folios are
>>> clean and the created PMD file THP can be handled by FSes properly.
>>>
>>> Signed-off-by: Zi Yan <[email protected]>
>>> ---
>>>  mm/huge_memory.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 970e077019b7..7e9cf8c0985f 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -86,9 +86,6 @@ static inline bool file_thp_enabled(struct vm_area_struct 
>>> *vma)
>>>  {
>>>     struct inode *inode;
>>>
>>> -   if (!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS))
>>> -           return false;
>>> -
>>>     if (!vma->vm_file)
>>>             return false;
>>>
>>> @@ -97,7 +94,10 @@ static inline bool file_thp_enabled(struct 
>>> vm_area_struct *vma)
>>>     if (IS_ANON_FILE(inode))
>>>             return false;
>>>
>>> -   return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
>>
>> Hi Matthew, David, and Lorenzo,
>>
>> After some discussions on irc, I feel that we probably should not allow
>> read-write fd for PMD THP collapse at the moment. Combining with the
>> filemap_flush() under folio_dirty() check from collapse_file(), khugepaged
>> would become a kwritebackd that scans pagecache folios and writes them back.
>> If we limit it to read-only fds, at least khugepaged would only write
>> back once for the pagecache folios from these fds.
>
> Makes sense. The comment above the filemap_flush() is valuable :)
>
>>
>> I am planning to restore inode_is_open_for_write() check in the next version.
>> Let me know your thoughts.
>
> That would be better. Do we plan on handling races with concurrent 
> writable-opening?
>

Probably not worth the effort, READ_ONLY_THP_FOR_FS does it using 
filemap_nr_thps()
and cost 8B in struct address_space. And read-only -> read->write race window 
might
not be that a big issue, since it does not cause repeatedly write back.

> Or could we simply gate the filemap_flush() by a inode_is_open_for_write(), to
> not have it turn into a kwritebackd()?

This is a great idea. With it, not dirty folios from read-write fds can be 
collapsed.
I will do it at the end.

Thanks.


--
Best Regards,
Yan, Zi

Reply via email to