On Saturday, March 18, 2017 at 14:37:11 UTC+3, Konstantin Khomoutov wrote:
>
> On Sat, 18 Mar 2017 03:50:39 -0700 (PDT)
> Vitaly Isaev <[email protected]> wrote:
>
> [...]
> > Assume that application does some heavy lifting with multiple file
> > descriptors (e.g., opening - writing data - syncing - closing), what
> > actually happens to Go runtime? Does it block all the goroutines at
> > the time when expensive syscall occurs (like syscall.Fsync)? Or only
> > the calling goroutine is blocked while the others are still operating?
>
> IIUC, since there's no general mechanism to have the kernel somehow
> notify the process of the completion of any generic syscall, when a
> goroutine enters a syscall, it essentially locks its underlying OS
> thread and waits until the syscall completes. The scheduler detects
> the goroutine is about to sleep in the syscall and schedules other
> goroutine(s) to run, but the underlying OS thread is not freed.
>
> This is in contrast to network I/O which uses the platform-specific
> poller (such as IOCP on Windows, epoll on Linux, kqueue on FreeBSD and
> so on) so when an I/O operation on a socket is about to block, the
> goroutine which performed that syscall is suspended, put on the wait
> list, its socket is added to the set the poller monitors and its
> underlying OS thread is freed to be able to serve a runnable goroutine.
>
> > So does it make sense to write programs with multiple workers that do
> > a lot of user space - kernel space context switching? Does it make
> > sense to use multithreading patterns for disk input?
>
> It may or may not. A syscall-heavy workload might degrade the
> goroutine scheduling to actually be N×N (an OS thread per goroutine)
> instead of M×N.
> This might not be a problem in itself (not counting a big number of
> OS threads allocated and mostly sleeping) but concurrent access to the
> same slow resource such as a rotating medium is almost always a bad
> idea: say, your HDD (and the file system on it) might only provide
> such and such read bandwidth, so spreading the processing of the data
> being read across multiple goroutines is only worth it if this
> processing is so computationally complex that a single goroutine
> can't keep up with that full bandwidth. If one goroutine can keep up
> with the full bandwidth, having two goroutines read that same data
> will make each deal with only half the bandwidth, so they will sleep
> 50% of the time.
> Note that reading two files in parallel off a filesystem located on
> the same rotating medium will usually result in lower total bandwidth
> due to the seek times required to jump around the blocks of different
> files.
>
> SSDs and other kinds of media might have way better performance
> characteristics, so it is worth measuring.
>
> IOW, I'd say that trying to parallelize might be a premature
> optimization. It is worth keeping in mind that goroutines serve two
> separate purposes: 1) they allow you to write natural sequential
> control flow instead of callback-ridden spaghetti code; 2) they allow
> performing tasks truly in parallel--if the hardware supports it
> (multiple CPUs and/or cores).
>
> This (2) is tricky because it assumes such goroutines have something to
> do; if they instead contend on some shared resource, the parallelization
> won't really happen.
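For anyone who wants to check the "only the calling goroutine is blocked" part on their own machine, here is a minimal sketch. It assumes a recent Go toolchain (os.CreateTemp needs Go 1.16+); the helper names syncWrites and countWhileSyncing are made up for illustration. One goroutine does write+fsync in a loop while the calling goroutine counts how often it still gets to run, with the runtime limited to a single P. How large the count gets depends on how slow fsync is on the storage in use, so treat the number as an indication only.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// syncWrites (hypothetical helper) writes n 4 KiB chunks to a temporary
// file and calls Sync (fsync) after every write.
func syncWrites(n int) error {
	f, err := os.CreateTemp("", "fsync-demo-*")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	buf := make([]byte, 4096)
	for i := 0; i < n; i++ {
		if _, err := f.Write(buf); err != nil {
			return err
		}
		if err := f.Sync(); err != nil {
			return err
		}
	}
	return nil
}

// countWhileSyncing (hypothetical helper) runs syncWrites in a separate
// goroutine and counts how many times the calling goroutine gets
// scheduled before the writer finishes.
func countWhileSyncing(n int) (int64, error) {
	// Restrict the runtime to one P so that progress of the counting
	// loop cannot be explained by a second P running it in parallel.
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	done := make(chan error, 1)
	go func() { done <- syncWrites(n) }()

	var ticks int64
	for {
		select {
		case err := <-done:
			return ticks, err
		default:
			ticks++
			runtime.Gosched() // yield so the writer gets a turn
		}
	}
}

func main() {
	ticks, err := countWhileSyncing(200)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("counting loop ran", ticks, "times while the writer was syncing")
}
```

If the count is far larger than the number of fsync calls, the counting goroutine was clearly being run while the writer's OS thread sat in the syscall, which matches the description quoted above.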
Thanks, that's a very good point.
