On Tue, Sep 29, 2015 at 6:05 AM, Ehsan Akhgari <ehsan.akhg...@gmail.com>
wrote:

> On 2015-09-29 12:52 AM, Gregory Szorc wrote:
>
>> On Mon, Sep 28, 2015 at 6:45 PM, Ehsan Akhgari <ehsan.akhg...@gmail.com>
>> wrote:
>>
>>     On 2015-09-28 5:41 PM, Gregory Szorc wrote:
>>
>>         When writing thousands of files in rapid succession, this 1+ms
>>         pause (assuming synchronous I/O) piles up. Assuming a 1ms pause,
>>         writing 100,000 files spends 100s in CloseFile()! The process
>>         profile also shows the bulk of the time in CloseFile(), so this
>>         is a real hot spot.
>>
>>
>>     There is no CloseFile() on Windows.  Did you mean CloseHandle()?
>>
>>
>> While this is probably something I should know, I confess to blindly
>> copying results from Sysinternals' procmon utility, which reports file
>> closes as the "CloseFile()" "operation." I reckon it is being
>> intelligent and converting CloseHandle() to something more useful for
>> reporting purposes. In my defense, procmon does report "operations" that
>> I know are actual Windows functions. Kinda weird it is inconsistent. Who
>> knows.
>>
>
> Fair!  Honestly, I haven't used procmon in years; I don't even remember it
> having any profiling tools when I last saw it.  :-)  But it probably tracks
> which handles are being passed to CloseHandle().
>

It has some very limited profiling tools built in. I had to dump the output
and write a script to perform the analysis I needed :) It does in fact
track various arguments so you can get filename-level activity for all I/O
operations.
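
A minimal sketch of that kind of analysis script (not the actual script I
used), assuming the procmon events were exported to CSV with the optional
"Duration" column enabled. The column names "Operation" and "Duration" and
the helper names are assumptions for illustration:

```python
import csv
from collections import defaultdict


def summarize_durations(rows):
    """Sum the per-event duration for each operation name.

    `rows` is any iterable of dicts, e.g. a csv.DictReader over an export.
    The "Operation" and "Duration" column names are assumptions.
    """
    totals = defaultdict(lambda: [0, 0.0])  # operation -> [count, seconds]
    for row in rows:
        try:
            seconds = float(row["Duration"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable duration
        entry = totals[row.get("Operation", "")]
        entry[0] += 1
        entry[1] += seconds
    return {op: (count, total) for op, (count, total) in totals.items()}


def summarize_csv(path):
    """Convenience wrapper (hypothetical): summarize a CSV export on disk."""
    # utf-8-sig strips a byte-order mark if the export starts with one.
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return summarize_durations(csv.DictReader(fh))
```

Sorting the result by total seconds gives a quick view of which operation
dominates the run.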


>>
>>     The reason I'm asking is that CloseHandle() can close various types
>>     of kernel objects, and if that is showing up in profiles, it's worth
>>     verifying that the handle passed to it is actually coming from
>>     CreateFile(Ex).
>>
>>
>> Procmon is reporting lots of CreateFile() calls. And I'm 100% certain
>> the underlying C code is calling CreateFile().
>>
>
> Good.  I'm assuming you mean CreateFile() directly, not wrappers such as
> _open or fopen.
>

We're calling CreateFile() or CreateFileA() directly. However...


>>
>>     Closing handles on a background thread doesn't help with performance
>>     if you're invoking sub-processes that need to close a handle and
>>     wait for the operation to finish.  It would help if you provided
>>     more details on the constraints you're dealing with, e.g., where do
>>     these handles come from?  Are they being created by one long running
>>     process or by several short lived ones?  etc.  Another idea to
>>     experiment with is leaking the handles and letting the kernel close
>>     them for you when your process is terminated.  I _think_ (but I'm
>>     not sure) that this cleanup won't count toward the time it takes for
>>     the process handle to become signaled, so if you're spawning a
>>     process that needs to close the file and wait for that to finish,
>>     that may be faster.
>>
>>
>> I'm dealing with a single threaded single long-running process that
>> performs synchronous I/O, 1 open file at a time. CreateFile,
>> CloseHandle, CreateFile, CloseHandle, ... I'm pretty sure leaking
>> handles is out of the question, as we need to write to thousands or even
>> tens of thousands of files and this will exhaust open files limits.
>>
>
> You'd be surprised.  :-)
>
> Windows doesn't really have a notion of open file limits similar to Unix.
> File handles opened using _open can go up to a maximum of 2048. fopen has a
> cap of 512 which can be raised up to 2048 using _setmaxstdio().  *But*
> these are just CRT limits, and if you use Win32 directly, you can open up
> to 2^24 handles all at once <
> https://technet.microsoft.com/en-us/library/bb896645.aspx>.  Since we
> will never need to open that many file handles, you may very well be able
> to use this approach.
>

I experimented with a background thread for just processing file closes.
This drastically increases performance! However, the queue periodically
accumulates and I was seeing errors for too many open files - despite using
CreateFile()! We do make a call to _open_osfhandle() after CreateFile().
I'm guessing the file limit is on file descriptors (not handles) and
_open_osfhandle() triggers the 512 default ceiling? This call is necessary
because Python file objects speak in terms of file descriptors. Not calling
_open_osfhandle() would mean re-implementing Python's file object, which
I'm going to say is too much work for the interim.

Buried in that last paragraph is that a background thread closing files
resulted in significant performance wins: ~5:00 wall on an operation that
was previously ~16:00 wall! And I'm pretty sure it would go faster if
multiple closing threads were used. Still not as fast as Linux, but much
better than the 3x increase from before.
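
A rough sketch of the closer-thread idea in Python, not the code I actually
ran. The class name, the `threads` and `max_pending` parameters, and their
defaults are all made up for illustration. The one deliberate design choice
is the bounded queue: once `max_pending` files are waiting, the writer
blocks instead of opening more, which should keep the number of open
descriptors at roughly `max_pending + threads + 1` and under whatever
ceiling applies.

```python
import queue
import threading


class BackgroundCloser:
    """Close file objects on worker threads, with a cap on pending closes."""

    _STOP = object()  # sentinel telling a worker to exit

    def __init__(self, threads=1, max_pending=256):
        # A bounded queue makes close_later() block once max_pending files
        # are waiting, so open descriptors cannot pile up without limit.
        self._queue = queue.Queue(maxsize=max_pending)
        self.errors = []  # exceptions raised by close(), for later inspection
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(threads)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        while True:
            fh = self._queue.get()
            if fh is self._STOP:
                return
            try:
                fh.close()
            except OSError as exc:
                self.errors.append(exc)

    def close_later(self, fh):
        """Hand a file object to the workers; blocks while the queue is full."""
        self._queue.put(fh)

    def shutdown(self):
        """Queue one stop sentinel per worker, then wait for all of them."""
        for _ in self._workers:
            self._queue.put(self._STOP)
        for worker in self._workers:
            worker.join()
```

The writer would call close_later(fh) where it used to call fh.close(), and
shutdown() once at the end. Because the sentinels are queued after every
file, all pending closes finish before shutdown() returns.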
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
