On 2014-05-13, 9:01 PM, Rik Cabanier wrote:
On Tue, May 13, 2014 at 3:16 PM, Ehsan Akhgari <ehsan.akhg...@gmail.com> wrote:
...
That is not the point of this attribute. It's just a hint for the author so he can tune his application accordingly. Maybe the application is tuned to use fewer cores, or maybe more. It all depends...
The problem is that the API doesn't really make it obvious that you're not supposed to take the value that the getter returns and just spawn N workers. IOW, the API encourages the wrong behavior by design.
That is simply untrue.
I'm assuming that the goal of this API is to allow authors to spawn as many workers as possible so that they can exhaust all of the cores in the interest of finishing their computation faster. I have provided reasons why any higher-priority thread on the system that is busy doing work will make this number an over-approximation, I have given you two examples of higher-priority threads that we're currently shipping in Firefox (Chrome Workers and the MediaStreamGraph thread), and I have provided you with experimental evidence that running Eli's test case, which tries to exhaust as many cores as it can, fails to predict the number of cores in these situations. If you don't find any of this convincing, I'd respectfully ask us to agree to disagree on this point.
For the sake of argument, let's say you are right. How are things worse
than before?
I don't think we should necessarily try to find a solution that is just
not worse than the status quo, I'm more interested in us implementing a
good solution here (and yes, I'm aware that there is no concrete
proposal out there that is better at this point.)
(Note that I would be very eager to discuss a proposal that actually tries to solve that problem.)
You should do that! People have brought this up in the past but no
progress has been made in the last 2 years.
However, if this simple attribute is able to stir people's emotions, can you imagine what would happen if you propose something complex? :-)
Sorry, but I have a long list of things on my todo list, and
honestly this one is not nearly close to the top of the list,
because I'm not aware of people asking for this feature very often.
I'm sure there are some people who would like it, but there are
many problems that we are trying to solve here, and this one doesn't
look very high priority.
That's fine but we're coming right back to the start: there is no way
for informed authors to make a decision today.
Yes, absolutely.
The "let's build something complex that solves everything" proposal
won't be done in a long time. Meanwhile apps can make responsive UI's
and fluid games.
That, I think, is one fundamental issue we're disagreeing on. I think that apps can build responsive UIs and fluid games without this today on the Web.
I don't have any other cases where this is done.
That really makes me question the "positive feedback from web developers" cited in the original post on this thread. Can you please point us to places where that feedback is documented?
...
Python:
multiprocessing.cpu_count()
11,295 results
https://github.com/search?q=multiprocessing.cpu_count%28%29+extension%3Apy&type=Code&ref=advsearch&l=
...
Java:
Runtime.getRuntime().availableProcessors()
23,967 results
https://github.com/search?q=availableProcessors%28%29+extension%3Ajava&type=Code&ref=searchresults
...
node.js is also exposing it:
require('os').cpus()
4,851 results
https://github.com/search?q=require%28%27os%27%29.cpus%28%29+extension%3Ajs&type=Code&ref=searchresults
I don't view platform parity as a checklist of features, so I really have no interest in "checking this checkbox" just so that the Web platform can be listed in these kinds of lists. Honestly, a list of GitHub hits without more information on what this value is actually used for, etc., is not really that helpful. We're not taking a vote of popularity here. ;-)
Wait, you stated:
Native apps don't typically run in a VM which provides highly sophisticated functionality for them.
and
That really makes me question the "positive feedback from web developers" cited in the original post on this thread.
There were 24,000 hits for Java, which is on the web and in a VM, but now you say that it's not a vote of popularity?
We may have a different terminology here, but to me, "positive feedback
from web developers" should indicate a large amount of demand from the
web developer community for us to solve this problem at this point, and
also a strong positive signal from them on this specific solution with
the flaws that I have described above in mind. That simply doesn't map
to searching for API names on non-Web technologies on github. :-)
Also, FTR, I strongly disagree that we should implement all popular Java
APIs just because there is a way to run Java code on the web. ;-)
...
Why not? How is the web platform different?
Here's why I find the native platform parity argument unconvincing here. This is not the only primitive that native platforms expose to make it possible for you to write apps that scale to the number of available cores. For example, OS X provides GCD. Windows provides at least two threadpool APIs. Not sure if Linux directly addresses this problem right now.
I'm not familiar with the success of those frameworks. Asking around at Adobe, so far I haven't found anyone that has used them.
Tuning the application depending on the number of CPUs is done quite often.
But do you have arguments on the specific problems I brought up
which make this a bad idea?
Can you restate the actual problem? I reread your message but didn't
find anything that indicates this is a bad idea.
See above where I re-described why this is not a good technical solution
to achieve the goal of the API.
Also, as I've mentioned several times, this API basically ignores the fact that there are AMP systems shipping *today*, and does not take into account the fact that future Web engines may try to use as many cores as they can at a higher priority (Servo being one example.)
"Others do this" is just not going to convince me here.
What would convince you? The fact that every other framework provides
this and people use it, is not a strong indication?
It's not possible for me to find exact javascript examples that use this
feature since it doesn't exist.
I'm obviously not asking you to create evidence of usage of an API which no engine has shipped yet. You originally cited strong positive feedback from web developers on this, and given the fact that I have not seen that myself, I would like to know more about where those requests are coming from. In the absence of that, what would convince me would be good answers to all of the points that I've brought up several times in this thread (which I have summarized above.)
Please note that _if_ this were the single most requested feature that actually blocked people from building apps for the Web, I might have been inclined to go ahead with a bad solution rather than no solution at all. And if you provide evidence of that, I'm willing to reconsider my position.
...
I'm unsure how tabs are different from different processes.
As an author, I would certainly want my web workers to run in parallel. Why else would I use workers to do number crunching?
Again, this is a problem that already exists and we're not trying to
solve it here.
What _is_ the problem that you're trying to solve here then? I
thought that this API is supposed to give you a number of workers
that the application should start so that it can keep all of the
cores busy?
Make it possible for authors to make a semi-informed decision on how to
divide the work among workers.
That can already be done using the timing attacks, at the cost of wasting some CPU time. The question is whether we should do that right now.
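The timing-attack approach alluded to here (and implemented by the core-estimator polyfill) boils down to: run N copies of a fixed busy-loop concurrently and watch when the per-run time starts degrading. A sketch of just the decision step, run over made-up timing samples (the numbers and the 1.5x threshold are illustrative assumptions, not measurements):

```javascript
// Estimate the core count from average run times measured at each
// concurrency level: while runs still execute in parallel, adding a
// worker barely changes the per-run time; past the core count, the
// workers start sharing cores and times climb sharply.
function estimateCores(avgTimesMs, threshold = 1.5) {
  const baseline = avgTimesMs[0]; // time with a single worker
  let cores = 1;
  for (let i = 1; i < avgTimesMs.length; i++) {
    if (avgTimesMs[i] / baseline < threshold) cores = i + 1;
    else break;
  }
  return cores;
}

// Made-up samples for a hypothetical 4-core machine, with 1..8
// concurrent workers.
const samples = [100, 102, 105, 110, 190, 230, 280, 330];
console.log(estimateCores(samples)); // 4 with these made-up samples
```

This also illustrates why the estimate degrades under load: any other busy process inflates the timings at low concurrency, shifting the apparent knee.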
In a good number of cases the pool will be smaller than the number of cores (i.e. a game), or it might be bigger (see the WebKit bug that goes over this).
Which part of the WebKit bug are you mentioning exactly? The only
mention of "games" on the bug is
https://bugs.webkit.org/show_bug.cgi?id=132588#c10 which seems to argue
against your position. (It's not very easy to follow the discussion in
that bug...)
Also, please note that there are use cases on native platforms which don't really exist on the Web. For example, on a desktop OS you might want to write a "system info" application which actually wants to list information about the hardware installed on the system.
I don't think that's all that important.
Well, you seem to imply that the reason why those platforms expose the
number of cores is to support the use case under the discussion, and I'm
challenging that assumption.
If you try Eli's test case in Firefox under different workloads (for example, while building Firefox, doing a disk intensive operation, etc.), the utter inaccuracy of the results is proof of the ineffectiveness of this number in my opinion.
As Eli mentioned, you can run the algorithm for longer and get a more accurate result.
I tried http://wg.oftn.org/projects/customized-core-estimator/demo/ which is supposed to give you a more accurate estimate. Have you tried that page when the system is under load in Firefox?
So did you try this? :-)
I did. As expected, it drops off as the load increases. I don't see what this proves except that the polyfill is unreliable, as I posted.
It's an argument that the information, if exposed from the UA, will be
*just* as unreliable.
Again, if the native platform didn't support this, doing this in C++ would result in the same.
Yes, exactly. Which is why I don't really buy the argument that we should do this because native platforms do this.
I don't follow. Yes, the algorithm is imprecise and it would be just as imprecise in C++. There is no difference in behavior between the web platform and native.
My point is, I think you should have some evidence indicating why
this is a good idea. So far I think the only argument has been the
fact that this is exposed by other platforms.
And used successfully on other platforms.
Note that it is exposed on PNaCl in Chrome as well.
So? PNaCl is a Chrome-specific technology, so it's not any more relevant to this discussion than Python, Perl, Java, etc.
Does Firefox behave differently on such systems? (Is it even supported on these systems?)
If so, how are workers scheduled? In the end, even if the cores are heterogeneous, knowing the number of them will keep them ALL busy (which means more work is getting done).
I don't know the answer to any of these questions. I was hoping
that you would do the research here. :-)
I did a little bit of research. As usual, wikipedia is the easiest to
read: http://en.wikipedia.org/wiki/Big.LITTLE There are many other
papers [1] for more information.
In "In-kernel switcher" mode, the little CPUs are taken offline when the big ones spool up. So, in this case the number of cores is half the physical CPUs.
In "Heterogeneous multi-processing" mode, the big CPUs will help out when the system load increases. In this case, the number of cores is equal to the number of CPUs.
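The two big.LITTLE modes just described give different answers for "the number of cores", which is the crux of the question that follows. A toy illustration of the two policies (the mode names come from the description above, not from any OS API):

```javascript
// Toy model of what "number of cores" means on a big.LITTLE system.
// Assumes big cores are paired with little cores on one SoC.
function reportedCores(bigCores, littleCores, mode) {
  switch (mode) {
    case 'in-kernel-switcher':
      // Little cores go offline when the big ones spool up, so only
      // one side of each pair is usable at any given time.
      return Math.max(bigCores, littleCores);
    case 'heterogeneous-multi-processing':
      // All cores can run simultaneously under load.
      return bigCores + littleCores;
    default:
      throw new Error('unknown mode');
  }
}

console.log(reportedCores(4, 4, 'in-kernel-switcher'));             // 4
console.log(reportedCores(4, 4, 'heterogeneous-multi-processing')); // 8
```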
So which number is the one that the OS exposes to us in each case? And
is that number constant no matter how many actual hardware cores are
active at any given point in time?
This proposal also assumes that the UA itself is mostly content with using a single core, which is true for the current browser engines, but we're working on changing that assumption in Servo. It also doesn't take into account the possibility of several of these web applications running at the same time.
How is this different from the native platform?
On the first point, I hope the difference is obvious. Native apps don't typically run in a VM which provides highly sophisticated functionality for them.
...
Why would that be? Are you burning more CPU resources in Servo to do the same thing? If so, that sounds like a problem. If not, the use case to scale your workload to more CPU cores is even better as similar tasks will end faster.
For instance, if we have a system with 8 idle cores and we divide up a 64 second task
What Boris said.
He didn't refute that knowing the number of cores would still help.
I'm trying to do that here. :-)
UA overhead on one core = 2s + 8s of work -> 10s wall time
UA overhead split over 2 cores = 1s + 8s of work -> 9s wall time
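That arithmetic can be spelled out as follows (my reading of the example: a 64s task split across 8 idle cores, plus 2s of UA overhead that can itself be parallelized; wall time is bounded by the busiest core):

```javascript
// Worked version of the example above.
const totalWork = 64;                 // seconds of application work
const cores = 8;                      // idle cores available
const workPerCore = totalWork / cores; // 8s of work on each core

const uaOverhead = 2;                  // seconds of UA work

// All UA overhead lands on one core: that core runs 8s + 2s.
const wallSingle = workPerCore + uaOverhead;      // 10s

// UA overhead split over two cores: each of those runs 8s + 1s.
const wallSplit = workPerCore + uaOverhead / 2;   // 9s

console.log(wallSingle, wallSplit);
```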
On the second point, please see the paragraph above where I discuss that.
Until these issues are addressed, I do not think we should implement or ship this feature.
FWIW these issues were already discussed in the WebKit bug.
The issues that I bring up here are the ones that I think either have not been brought up before or have not been sufficiently addressed, so I'd appreciate it if you could try to address them sufficiently. It could be that I'm wrong/misinformed, and I would appreciate it if you would call me out on those points.
I find it odd that we don't want to give authors access to such a basic feature. Not everything needs to be solved by a complex framework.
You're asserting that navigator.hardwareConcurrency gives you a basic way of solving the use case of scaling computation over a number of worker threads. I am rejecting that assertion here. I am not arguing that we should not try to fix this problem, I'm just not convinced that the current API brings us any closer to solving it.
I'm not asserting anything. I want to give authors a hint so that they can make a semi-informed decision to balance their workload.
Even if there's a more general solution later on to solve that
particular problem, it will sometimes still be valuable to know the
layout of the system so you can best divide up the work.
I disagree. Let me try to rephrase the issue with this. The number
of available cores is not a constant number equal to the number of
logical cores exposed to us by the OS. This number varies depending
on everything else which is going on in the system, including the
things that the UA has control over and the things that it does not.
I hope the reason for my opposition is clear so far.
No, you failed to show why this does not apply to the web platform and
JavaScript in particular.
That is not a fair summary of everything I have said here so far.
Please see the first paragraph of my response here where I summarize why
I think this doesn't help the use case that it's trying to solve.
You're of course welcome to disagree, but that doesn't mean that I've
necessarily failed to show my side of the argument.
Your arguments apply equally to PNaCl, Java, native applications, and all the other examples listed above
Yes they do!
yet they all provide this functionality and people are using it to build successful applications.
1. PNaCl/Java/native platforms doing something doesn't make it right.
2. There is a reason why people have built more sophisticated solutions to solve this problem (GCD, Windows threadpools, etc.). So let's not just close our eyes to those solutions and pretend that the number of cores is the only way to address this use case on native platforms.
Cheers,
Ehsan
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform