On 2014-05-13, 2:42 PM, Rik Cabanier wrote:
On Tue, May 13, 2014 at 10:43 AM, Ehsan Akhgari <[email protected]> wrote:
On 2014-05-13, 9:25 AM, Rik Cabanier wrote:
Web applications can already do this today. There's nothing stopping them from figuring out the CPUs and trying to use them all. Worse, I think they will likely optimize for popular platforms, which either overtax or underutilize non-popular ones.
Can you please provide some examples of actual web applications that do this, and what exactly they're trying to do with the number once they estimate it? (Eli's timing attack demos don't count. ;-)
Eli's listed some examples:
http://wiki.whatwg.org/wiki/NavigatorCores#Example_use_cases
That is a list of use cases which could use better ways of
supporting a worker pool that actually scales to how many cores you
have available at any given point in time. That is *not* what
navigator.hardwareConcurrency gives you, so I don't find those
examples very convincing.
That is not the point of this attribute. It's just a hint for the author
so he can tune his application accordingly.
Maybe the application is tuned to use fewer cores, or maybe more. It all
depends...
The problem is that the API doesn't really make it obvious that you're
not supposed to take the value that the getter returns and just spawn N
workers. IOW, the API encourages the wrong behavior by design.
(Note that I would be very eager to discuss a proposal that actually
tries to solve that problem.)
You should do that! People have brought this up in the past but no
progress has been made in the last 2 years.
However, if this simple attribute is able to stir people's emotions, can
you imagine what would happen if you propose something complex? :-)
Sorry, but I have a long list of things on my todo list, and honestly
this one is not nearly close to the top of the list, because I'm not
aware of people asking for this feature very often. I'm sure there are
some people who would like it, but there are many problems that we are
trying to solve here, and this one doesn't look very high priority.
I don't have any other cases where this is done.
That really makes me question the "positive feedback from web
developers" cited in the original post on this thread. Can you
please point us to places where that feedback is documented?
That was from the email to blink-dev where Adam Barth stated this.
I'll ask him where this came from.
Thanks!
I looked at other interpreted languages and they all seem to give you
access to the CPU count. Then I searched on GitHub to see the popularity:
Python:
multiprocessing.cpu_count()
11,295 results
https://github.com/search?q=multiprocessing.cpu_count%28%29+extension%3Apy&type=Code&ref=advsearch&l=
Perl:
use Sys::Info;
use Sys::Info::Constants qw( :device_cpu );
my $info = Sys::Info->new;
my $cpu = $info->device( CPU => %options );
7 results
https://github.com/search?q=device_cpu+extension%3Apl&type=Code&ref=searchresults
Java:
Runtime.getRuntime().availableProcessors()
23,967 results
https://github.com/search?q=availableProcessors%28%29+extension%3Ajava&type=Code&ref=searchresults
Ruby:
Facter.processorcount
115 results
https://github.com/search?q=processorcount+extension%3Arb&type=Code&ref=searchresults
C#:
Environment.ProcessorCount
5,315 results
https://github.com/search?q=Environment.ProcessorCount&type=Code&ref=searchresults
I also searched for JavaScript files that contain "cpu" and "core":
21,487 results
https://github.com/search?q=core+cpu+extension%3Ajs&type=Code&ref=searchresults
The results are mixed. Some projects seem to hard-code the number of CPU cores, while others are not about workers at all.
A search for "worker" and "cpu" gets more consistent results:
2,812 results
https://github.com/search?q=worker+cpu+extension%3Ajs&type=Code&ref=searchresults
node.js is also exposing it:
require('os').cpus()
4,851 results
https://github.com/search?q=require%28%27os%27%29.cpus%28%29+extension%3Ajs&type=Code&ref=searchresults
I don't view platform parity as a checklist of features, so I really
have no interest in "checking this checkbox" just so that the Web
platform can be listed in these kinds of lists. Honestly a list of
github hits without more information on what this value is actually used
for etc. is not really that helpful. We're not taking a vote of
popularity here. ;-)
Everyone is in agreement that that is a hard problem to fix and that there is no clear answer.
Whatever solution is picked (maybe like Grand Central or Intel TBB), most solutions will still want to know how many cores are available.
Looking at the native platform (and Adobe's applications), many query the operating system for this information to balance the workload. I don't see why this would be different for the web platform.
I don't think that the value exposed by the native platforms is particularly useful. Really, if the use case is to try to adapt the number of workers to a number that will allow you to run them all concurrently, that is not the same number as reported traditionally by the native platforms.
Why not? How is the web platform different?
Here's why I find the native platform parity argument unconvincing
here. This is not the only primitive that native platforms expose
to make it possible for you to write apps that scale to the number
of available cores. For example, OS X provides GCD. Windows
provides at least two threadpool APIs. Not sure if Linux directly
addresses this problem right now.
I'm not familiar with the success of those frameworks. Asking around at
Adobe, so far I haven't found anyone that has used them.
Tuning the application depending on the number of CPUs is done quite often.
But do you have arguments on the specific problems I brought up which
make this a bad idea? "Others do this" is just not going to convince me
here.
Another very important distinction between the Web platform and
native platforms which is relevant here is the amount of abstraction
that each platform provides on top of hardware. Native platforms
provide a much lower level of abstraction, and as a result, on such
platforms at the very least you can control how many threads your
own application spawns and keeps active. We don't even have this
level of control on the Web platform (for example, applications are typically unaware that multiple copies of them are running in different tabs.)
I'm unsure how tabs are different from different processes.
As an author, I would certainly want my web workers to run in parallel.
Why else would I use workers to do number crunching?
Again, this is a problem that already exists and we're not trying to
solve it here.
What _is_ the problem that you're trying to solve here then? I thought
that this API is supposed to give you a number of workers that the
application should start so that it can keep all of the cores busy?
Also, please note that there are use cases on native platforms which
don't really exist on the Web. For example, on a desktop OS you
might want to write a "system info" application which actually wants
to list information about the hardware installed on the system.
If you try Eli's test case in Firefox under different workloads (for example, while building Firefox, doing a disk-intensive operation, etc.), the utter inaccuracy of the results is proof of the ineffectiveness of this number in my opinion.
As Eli mentioned, you can run the algorithm for longer and get a more accurate result.
I tried <http://wg.oftn.org/projects/customized-core-estimator/demo/>, which is supposed to give you a more accurate estimate. Have you tried that page when the system is under load in Firefox?
So did you try this? :-)
> Again, if the native platform didn't support this, doing this in C++ would result in the same.
Yes, exactly. Which is why I don't really buy the argument that we
should do this because native platforms do this.
I don't follow. Yes, the algorithm is imprecise and it would be just as
imprecise in C++.
There is no difference in behavior between the web platform and native.
My point is, I think you should have some evidence indicating why this
is a good idea. So far I think the only argument has been the fact that
this is exposed by other platforms.
Also, I worry that this API is too focused on the past/present. For example, I don't think anyone sufficiently addressed Boris' concern on the whatwg thread about AMP vs SMP systems.
Can you provide a link to that?
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2014-May/296737.html
> Are there systems that expose this to the user? (AFAIK slow cores are substituted with fast ones on the fly.)
I'm not sure about the details of how these cores are controlled,
whether the control happens in hardware or in the OS, etc. This is
one aspect of this problem which needs more research before we can
decide to implement and ship this, IMO.
Does Firefox behave differently on such systems? (Is it even supported on these systems?)
If so, how are workers scheduled? In the end, even if the cores are heterogeneous, knowing the number of them will keep them ALL busy (which means more work is getting done).
I don't know the answer to any of these questions. I was hoping that
you would do the research here. :-)
This proposal also assumes that the UA itself is mostly content with using a single core, which is true for the current browser engines, but we're working on changing that assumption in Servo. It also doesn't take into account the possibility of several of these web applications running at the same time.
How is this different from the native platform?
On the first point, I hope the difference is obvious. Native apps
don't typically run in a VM which provides highly sophisticated
functionality for them.
See my long list of interpreted languages earlier in this email.
There are lots of VMs that support this, and a lot of people are using it.
And also they give you direct control over how many threads your
"application" (which typically maps to an OS level process) spawns
and when, what their priorities and affinities are, etc. I think
with that in mind, implementing this API as is in Gecko will be
lying to the user (because we run some threads with higher priority
than worker threads, for example our chrome workers, the
MediaStreamGraph thread, etc.) and it would actually be harmful in
Servo where the UA tries to get its hands on as many cores as it can
to do things such as running script, layout, etc.
Why would that be? Are you burning more CPU resources in Servo to do the same thing? If so, that sounds like a problem.
If not, the use case to scale your workload to more CPU cores is even
better as similar tasks will end faster.
For instance, if we have a system with 8 idle cores and we divide up a 64-second task:
What Boris said.
UA overhead = 2s + 8 * 8s -> 10s
UA overhead over 2 threads = 2 * 1s + 8 * 8s -> 9s
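That arithmetic can be expressed as a toy wall-clock model (the model itself is my own simplification; the 64s task, 2s of UA overhead, and 8 cores are from the example above):

```javascript
// Toy wall-clock model: the task is split evenly across all cores,
// and the UA's own overhead is split across uaThreads of them.
function wallTime(workSeconds, cores, uaOverheadSeconds, uaThreads) {
  return uaOverheadSeconds / uaThreads + workSeconds / cores;
}

console.log(wallTime(64, 8, 2, 1)); // -> 10 (all overhead on one thread)
console.log(wallTime(64, 8, 2, 2)); // -> 9  (overhead spread over two)
```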
On the second point, please see the paragraph above where I discuss
that.
Until these issues are addressed, I do not think we should implement or ship this feature.
FWIW these issues were already discussed in the WebKit bug.
The issues that I bring up here are the ones that I think either have not been brought up before or have not been sufficiently addressed, so I'd appreciate it if you could try to address them. It could be that I'm wrong or misinformed, and I would appreciate it if you would call me out on those points.
I find it odd that we don't want to give authors access to such a basic feature. Not everything needs to be solved by a complex framework.
You're asserting that navigator.hardwareConcurrency gives you a
basic way of solving the use case of scaling computation over a
number of worker threads. I am rejecting that assertion here. I am
not arguing that we should not try to fix this problem, I'm just not
convinced that the current API brings us any closer to solving it.
I'm not asserting anything. I want to give authors a hint so they can make a semi-informed decision to balance their workload.
Even if there's a more general solution later on to solve that
particular problem, it will sometimes still be valuable to know the
layout of the system so you can best divide up the work.
I disagree. Let me try to rephrase the issue with this. The number of
available cores is not a constant number equal to the number of logical
cores exposed to us by the OS. This number varies depending on
everything else which is going on in the system, including the things
that the UA has control over and the things that it does not. I hope
the reason for my opposition is clear so far.
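For what it's worth, one direction that avoids trusting a static core count is to size the pool empirically, in the spirit of Eli's estimator: grow the pool while measured throughput still improves. A rough sketch with a simulated cost model (all names and numbers here are illustrative):

```javascript
// Sketch: pick a pool size by evaluating a batch-time function at
// increasing parallelism, stopping once throughput stops improving.
// batchTime(n) returns how long a fixed batch takes with n workers
// (in a real estimator this would be a measured benchmark, not a model).
function estimatePoolSize(batchTime, maxWorkers) {
  let best = 1;
  let bestThroughput = 0;
  for (let n = 1; n <= maxWorkers; n++) {
    const throughput = n / batchTime(n);
    if (throughput <= bestThroughput) break; // adding workers stopped helping
    bestThroughput = throughput;
    best = n;
  }
  return best;
}

// e.g. a machine where only 4 cores are effectively available: each
// worker needs 8s of CPU, so n workers take 8 * ceil(n / 4) seconds.
const fourCoreModel = n => 8 * Math.ceil(n / 4);
console.log(estimatePoolSize(fourCoreModel, 16)); // -> 4
```

The point of the sketch is that the answer reflects the cores actually available at the time of measurement, not the number the OS reports.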
Cheers,
Ehsan
_______________________________________________
dev-platform mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-platform