On 2014-05-13, 2:42 PM, Rik Cabanier wrote:
On Tue, May 13, 2014 at 10:43 AM, Ehsan Akhgari <ehsan.akhg...@gmail.com> wrote:

On 2014-05-13, 9:25 AM, Rik Cabanier wrote:

Web applications can already do this today. There's nothing stopping them from figuring out the CPU's and trying to use them all. Worse, I think they will likely optimize for popular platforms which either overtax or underutilize non-popular ones.

Can you please provide some examples of actual web applications that do this, and what they're exactly trying to do with the number once they estimate one? (Eli's timing attack demos don't count. ;-)

Eli's listed some examples: http://wiki.whatwg.org/wiki/NavigatorCores#Example_use_cases

That is a list of use cases which could use better ways of supporting a worker pool that actually scales to how many cores you have available at any given point in time. That is *not* what navigator.hardwareConcurrency gives you, so I don't find those examples very convincing.

That is not the point of this attribute. It's just a hint for the author so he can tune his application accordingly. Maybe the application is tuned to use fewer cores, or maybe more. It all depends...
The problem is that the API doesn't really make it obvious that you're not supposed to take the value that the getter returns and just spawn N workers. IOW, the API encourages the wrong behavior by design.
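To make the concern concrete, here is a minimal sketch (not taken from any real application) of the pattern the getter invites next to a more defensive one. The helper name `pickPoolSize`, the default cap of 4, and the "leave one core spare" rule are assumptions chosen purely for illustration.

```javascript
// Hypothetical helper: choose a worker-pool size from a core-count hint.
// `hint` would come from navigator.hardwareConcurrency where it is exposed;
// `cap` is an arbitrary, application-chosen upper bound (an assumption here).
function pickPoolSize(hint, cap = 4) {
  // Treat a missing or nonsensical hint as "one core".
  const cores = Number.isInteger(hint) && hint > 0 ? hint : 1;
  // Leave one core for the UA/main thread when more than one is reported.
  const spare = cores > 1 ? cores - 1 : 1;
  return Math.min(spare, cap);
}

// The naive pattern simply spawns `hint` workers:
//   for (let i = 0; i < navigator.hardwareConcurrency; i++) new Worker("w.js");
// The defensive pattern would spawn pickPoolSize(hint) workers instead.
```

Note that even the defensive variant still relies on a static number, so it does not address the objection that the count says nothing about how many cores are actually free at a given moment.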
(Note that I would be very eager to discuss a proposal that actually tries to solve that problem.)

You should do that! People have brought this up in the past but no progress has been made in the last 2 years. However, if this simple attribute is able to stir people's emotions, can you imagine what would happen if you propose something complex? :-)
Sorry, but I have a long list of things on my todo list, and honestly this one is not nearly close to the top of the list, because I'm not aware of people asking for this feature very often. I'm sure there are some people who would like it, but there are many problems that we are trying to solve here, and this one doesn't look very high priority.
I don't have any other cases where this is done.

That really makes me question the "positive feedback from web developers" cited in the original post on this thread. Can you please point us to places where that feedback is documented?

That was from the email to blink-dev where Adam Barth stated this. I'll ask him where this came from.
Thanks!
I looked at other interpreted languages and they all seem to give you access to the CPU count. Then I searched on GitHub to see the popularity:

Python: multiprocessing.cpu_count()
11,295 results
https://github.com/search?q=multiprocessing.cpu_count%28%29+extension%3Apy&type=Code&ref=advsearch&l=

Perl: use Sys::Info; use Sys::Info::Constants qw( :device_cpu ); my $info = Sys::Info->new; my $cpu = $info->device( CPU => %options );
7 results
https://github.com/search?q=device_cpu+extension%3Apl&type=Code&ref=searchresults

Java: Runtime.getRuntime().availableProcessors()
23,967 results
https://github.com/search?q=availableProcessors%28%29+extension%3Ajava&type=Code&ref=searchresults

Ruby: Facter.processorcount
115 results
https://github.com/search?q=processorcount+extension%3Arb&type=Code&ref=searchresults

C#: Environment.ProcessorCount
5,315 results
https://github.com/search?q=Environment.ProcessorCount&type=Code&ref=searchresults

I also searched for JavaScript files that contain "cpu" and "core":
21,487 results
https://github.com/search?q=core+cpu+extension%3Ajs&type=Code&ref=searchresults

The results are mixed. Some projects seem to hard code CPU cores while others are not about workers at all. A search for "worker" and "cpu" gets more consistent results:
2,812 results
https://github.com/search?q=worker+cpu+extension%3Ajs&type=Code&ref=searchresults

node.js is also exposing it: require('os').cpus()
4,851 results
https://github.com/search?q=require%28%27os%27%29.cpus%28%29+extension%3Ajs&type=Code&ref=searchresults
I don't view platform parity as a checklist of features, so I really have no interest in "checking this checkbox" just so that the Web platform can be listed in these kinds of lists. Honestly a list of github hits without more information on what this value is actually used for etc. is not really that helpful. We're not taking a vote of popularity here. ;-)
Everyone is in agreement that that is a hard problem to fix and that there is no clear answer. Whatever solution is picked (maybe like Grand Central or Intel TBB), most solutions will still want to know how many cores are available. Looking at the native platform (and Adobe's applications), many query the operating system for this information to balance the workload. I don't see why this would be different for the web platform.

I don't think that the value exposed by the native platforms is particularly useful. Really if the use case is to try to adapt the number of workers to a number that will allow you to run them all concurrently, that is not the same number as reported traditionally by the native platforms.

Why not? How is the web platform different?

Here's why I find the native platform parity argument unconvincing here. This is not the only primitive that native platforms expose to make it possible for you to write apps that scale to the number of available cores. For example, OS X provides GCD. Windows provides at least two threadpool APIs. Not sure if Linux directly addresses this problem right now.

I'm not familiar with the success of those frameworks. Asking around at Adobe, so far I haven't found anyone that has used them. Tuning the application depending on the number of CPU's is done quite often.
But do you have arguments on the specific problems I brought up which make this a bad idea? "Others do this" is just not going to convince me here.
Another very important distinction between the Web platform and native platforms which is relevant here is the amount of abstraction that each platform provides on top of hardware. Native platforms provide a much lower level of abstraction, and as a result, on such platforms at the very least you can control how many threads your own application spawns and keeps active. We don't even have this level of control on the Web platform (applications are typically even unaware that you have multiple copies running in different tabs for example.)

I'm unsure how tabs are different from different processes. As an author, I would certainly want my web workers to run in parallel. Why else would I use workers to do number crunching? Again, this is a problem that already exists and we're not trying to solve it here.
What _is_ the problem that you're trying to solve here then? I thought that this API is supposed to give you a number of workers that the application should start so that it can keep all of the cores busy?
Also, please note that there are use cases on native platforms which don't really exist on the Web. For example, on a desktop OS you might want to write a "system info" application which actually wants to list information about the hardware installed on the system.

If you try Eli's test case in Firefox under different workloads (for example, while building Firefox, doing a disk intensive operation, etc.), the utter inaccuracy of the results is proof of the ineffectiveness of this number in my opinion.

As Eli mentioned, you can run the algorithm for longer and get a more accurate result.

I tried <http://wg.oftn.org/projects/customized-core-estimator/demo/> which is supposed to give you a more accurate estimate. Have you tried that page when the system is under load in Firefox?
So did you try this? :-)
Again, if the native platform didn't support this, doing this in C++ would result in the same.

Yes, exactly. Which is why I don't really buy the argument that we should do this because native platforms do this.

I don't follow. Yes, the algorithm is imprecise and it would be just as imprecise in C++. There is no difference in behavior between the web platform and native.
My point is, I think you should have some evidence indicating why this is a good idea. So far I think the only argument has been the fact that this is exposed by other platforms.
Also, I worry that this API is too focused on the past/present. For example, I don't think anyone sufficiently addressed Boris' concern on the whatwg thread about AMP vs SMP systems.

Can you provide a link to that?

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2014-May/296737.html

Are there systems that expose this to the user? (AFAIK slow cores are substituted with fast ones on the fly.)

I'm not sure about the details of how these cores are controlled, whether the control happens in hardware or in the OS, etc. This is one aspect of this problem which needs more research before we can decide to implement and ship this, IMO.

Does Firefox behave differently on such systems? (Is it even supported on these systems?) If so, how are workers scheduled? In the end, even if the cores are heterogeneous, knowing the number of them will keep them ALL busy (which means more work is getting done).
I don't know the answer to any of these questions. I was hoping that you would do the research here. :-)
This proposal also assumes that the UA itself is mostly content with using a single core, which is true for the current browser engines, but we're working on changing that assumption in Servo. It also doesn't take into account the possibility of several of these web applications running at the same time.

How is this different from the native platform?

On the first point, I hope the difference is obvious. Native apps don't typically run in a VM which provides highly sophisticated functionality for them.

See my long list of interpreted languages earlier in this email. There are lots of VM's that support this and a lot of people are using it.

And also they give you direct control over how many threads your "application" (which typically maps to an OS level process) spawns and when, what their priorities and affinities are, etc. I think with that in mind, implementing this API as is in Gecko will be lying to the user (because we run some threads with higher priority than worker threads, for example our chrome workers, the MediaStreamGraph thread, etc.) and it would actually be harmful in Servo where the UA tries to get its hands on as many cores as it can to do things such as running script, layout, etc.

Why would that be? Are you burning more CPU resources in servo to do the same thing? If so, that sounds like a problem. If not, the use case to scale your workload to more CPU cores is even better as similar tasks will end faster. For instance, if we have a system with 8 idle cores and we divide up a 64 second task
What Boris said.
UA overhead = 2s + 8 * 8s -> 10s
UA overhead over 2 threads = 2 * 1s + 8 * 8s -> 9s

On the second point, please see the paragraph above where I discuss that. Until these issues are addressed, I do not think we should implement or ship this feature.

FWIW these issues were already discussed in the WebKit bug.

The issues that I bring up here are the ones that I think have either not been brought up before or have not been sufficiently addressed, so I'd appreciate if you could try to address them sufficiently. It could be that I'm wrong/misinformed and I would appreciate if you would call me out on those points.

I find it odd that we don't want to give authors access to such a basic feature. Not everything needs to be solved by a complex framework.

You're asserting that navigator.hardwareConcurrency gives you a basic way of solving the use case of scaling computation over a number of worker threads. I am rejecting that assertion here. I am not arguing that we should not try to fix this problem, I'm just not convinced that the current API brings us any closer to solving it.

I'm not asserting anything. I want to give authors a hint so that they can make a semi-informed decision to balance their workload. Even if there's a more general solution later on to solve that particular problem, it will sometimes still be valuable to know the layout of the system so you can best divide up the work.
I disagree. Let me try to rephrase the issue with this. The number of available cores is not a constant number equal to the number of logical cores exposed to us by the OS. This number varies depending on everything else which is going on in the system, including the things that the UA has control over and the things that it does not. I hope the reason for my opposition is clear so far.
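As a back-of-the-envelope illustration of this point, the toy model below reuses the 64-second task split across 8 workers from earlier in this message. It is only a sketch under stated assumptions: the function name `estimateWallSeconds` is hypothetical, the slices are assumed equal, and the "rounds" scheduling is a simplification rather than how any real OS or UA schedules threads.

```javascript
// Toy model (an assumption, not a real scheduler): a job of `taskSeconds`
// is cut into `workers` equal slices, which run in rounds on however many
// cores happen to be free (`availableCores`) at that moment.
function estimateWallSeconds(taskSeconds, workers, availableCores) {
  // At least one core must be free for any progress to be made.
  const free = Math.max(1, Math.floor(availableCores));
  const sliceSeconds = taskSeconds / workers;
  // Each round runs up to `free` slices concurrently.
  const rounds = Math.ceil(workers / free);
  return rounds * sliceSeconds;
}

// Same pool size (8), different numbers of cores actually free:
const allFree = estimateWallSeconds(64, 8, 8);  // every core idle
const halfBusy = estimateWallSeconds(64, 8, 4); // half the cores taken
```

Under this model the pool size that looks ideal on an idle machine doubles the wall-clock time once half of the cores are occupied by other work, which is the sense in which a static count differs from the number of cores available.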
Cheers,
Ehsan