Hi,
I have been reading the ongoing discussion about what to do about AMD DC
(Display Core) with great interest since I have started to put more time into
developing OpenChrome DRM for VIA Technologies Chrome IGP.
I particularly enjoyed reading what Tony Cheng wrote about what is going on
inside AMD Radeon GPUs.
As a graphics stack developer, I suppose I am still only somewhat above beginner
level, and Chrome IGP might be considered garbage graphics to some (I do not
really care what people say or think about it), but since my background is
really digital hardware design (self-taught) rather than graphics device driver
development, I would like to add my 2 cents (USD) to the discussion.
I also consider myself an amateur semiconductor industry historian, and in
particular, I have been a close watcher of Intel's business / hiring practices
for many years.
For some, what I am writing may not make sense, and it may even offend some
people (my guess is those who work at Intel), but I will not pull any punches,
and if you do not like what I write, let me know. (That does not mean I will
necessarily take back my comment even if it offended you. I typically stand
behind what I say unless it is obvious that I am wrong.)
While my understanding of DRM is still quite primitive, my simplistic
understanding is that AMD is pushing DC because of the following factors.
1) AMD is understaffed due to the precarious financial condition it is in right
now (i.e., less than $1 billion of cash on hand and the loss of 7,000 employees
since Year 2008 or so)
2) The complexity of each next-generation ASIC only gets worse with continued
process scaling, which means more transistors to deal with (i.e., TSMC 28 nm
to GF 14 nm to probably Samsung / TSMC 10 nm or GF 7 nm)
3) Based on 1 and 2, unless design productivity can be improved, AMD will be
late to market, and that could be the end of AMD as a corporation
4) Hence, in order to meet time to market (TtM) and improve engineer
productivity, AMD needs to reuse its existing pre-silicon / post-silicon
bring-up test code and share that code with the Windows side of the device
driver developers
5) In addition, power is already the biggest design challenge, and very precise
power management is crucial to the performance of the chip (i.e., it's not all
about the laptop anymore, and desktop "monster" graphics cards also need power
management for performance reasons, in order to manage heat generation)
6) AMD Radeon is really running an RTOS (Real Time Operating System) inside the
GPU card, and AMD wants to put the code that handles initialization / power
management closer to the GPU rather than on the slower-to-respond x86 (or any
other general-purpose) host processor (see the sketch after this list)
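
To make point 6 a little more concrete, here is a minimal, purely hypothetical
sketch of the kind of mailbox-style interface a host driver might use to hand
latency-sensitive power-management work to firmware running on the GPU, instead
of doing the register sequencing from the host CPU. The names, opcode, and
layout below are my own invention for illustration and do not correspond to any
real AMD register map or interface.

/*
 * Hypothetical illustration only: the host driver posts a request to a
 * firmware mailbox and lets the GPU-resident microcontroller (running
 * its RTOS) perform the timing-critical sequencing.
 */
#include <stdint.h>

#define PM_REQ_SET_DISPLAY_POWER  0x01u  /* invented opcode */

struct pm_mailbox {
	volatile uint32_t request;   /* opcode written by the host */
	volatile uint32_t argument;  /* e.g., the target power state */
	volatile uint32_t doorbell;  /* host rings it, firmware clears it */
};

/* Host side: a fire-and-forget request; the firmware sitting next to
 * the hardware handles the precise, latency-sensitive steps. */
void pm_request_display_power(struct pm_mailbox *mbox, uint32_t state)
{
	mbox->argument = state;
	mbox->request  = PM_REQ_SET_DISPLAY_POWER;
	mbox->doorbell = 1;
}

The point is simply that the host only expresses intent, while the code that
has to react quickly lives next to the hardware.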
Since I will probably need to obtain "favors" down the road when I try to get
OpenChrome DRM mainlined, I probably should not go into what I think of how
Intel works on its graphics device driver stack (I do not mean to make this
personal, but Intel is the "other" open source camp in the OSS x86 graphics
world, so I find it fair game to discuss the approach Intel takes from a
semiconductor industry perspective. I am probably going to overly generalize
what is going on, so if you want to correct me, let me know.), but based on my
understanding of how Intel works, it probably has more staffing resources than
AMD when it comes to graphics device driver stack development (and on the x86
microprocessor development side).
Based on my understanding of where Intel stands financially, I feel like Intel
is standing on very thin ice due to the following factors, and I predict that
they will eventually adopt an AMD DC-like design concept (i.e., use of a HAL).
Here is my logic.
1) PC (desktop and laptop) x86 processors are not selling very well, and my
understanding is that, from the Year 2012 peak, x86 processor shipments are
down 30% as of Year 2016 (at, I would say, around a $200 ASP)
2) Intel's margins are being propped up by its unnaturally high data center
market share (99% for x86 data center microprocessors) and a very high data
center x86 processor ASP (Average Selling Price) of $600 (up from $500 a few
years ago due to AMD screwing up the Bulldozer microarchitecture; more on this
later)
3) Intel did a significant layoff in April 2016 in which they targeted older
(read "expensive"), experienced engineers
4) Like Cisco Systems (notorious for their annual summertime layoff of roughly
5,000 people), Intel then turns around and goes on a hiring spree, recruiting
from the graduate programs of many second- and third-tier U.S. universities,
which brings down the overall experience level of the engineering departments
5) While AMD is in financially desperate shape, it will likely get one last
chance, the Zen microarchitecture, to get back into the game (Zen will be the
last chance for AMD, IMO.)
6) Since AMD is now fabless after divesting its fabs in Year 2009
(GLOBALFOUNDRIES), it no longer carries the financial burden of paying for a
fab, whereas Intel "had to" delay 10 nm process deployment to 2H'17 due to
weak demand for 14 nm process products and low utilization of the 14 nm process
(low utilization delays the amortization of the 14 nm process; Intel
historically amortized a given process technology in 2 years, but 14 nm is
starting to look like 2.5 to 3 years due to the yield issues they encountered
in 2014)
7) Inevitably, the magic of market competition will drag down Intel's ASPs
(both PC and data center), since the Zen microarchitecture is a rather
straightforward x86 microarchitectural implementation (i.e., not too far from
Skylake); hence, their low-60% gross margin will be under pressure from AMD
starting in Year 2017
8) Intel overpaid for Altera by $8 billion (a struggling FPGA vendor whose CEO
probably felt he had to sell the corporation in order to cover up the Stratix
10 FPGA development screw-up of missing the tape-out target date by 1.5 years),
and next-generation process technology keeps getting more expensive (10 nm,
7 nm, 5 nm, etc.)
9) In order to "please" Wall Street, Intel management will possibly do further
destructive layoffs every year, and, if I were to guess, will likely lay off
another 25,000 to 30,000 people over the next 3 to 4 years
10) Intel has already lost many experienced engineers through past layoffs,
replacing them with far less experienced engineers hired relatively recently,
mostly from second- and third-tier U.S. universities
11) Now, with 25,000 to 30,000 more layoffs, management will force the software
engineering side to reorganize, and Intel will be "forced" to come up with ways
to reuse their graphics stack code (i.e., sharing more code between Windows and
Linux)
12) Hence, maybe a few years from now, Intel people will have to do something
similar to AMD DC in order to improve their design productivity, since they can
no longer throw people at the problem (They have tended to overhire new college
graduates because NCGs are cheaper, which allowed them, until recently, to
throw people at the problem relatively cheaply. High x86 ASPs allowed them to
do this as well, and they got too used to it for too long. They will not be
able to do this in the future. In the meantime, their organizational experience
level is coming down from hiring too many NCGs while laying off too many
experienced people at the same time.)
I am sure there are people who are not happy reading this, but this is my
harsh, honest assessment of what Intel is going through right now, and what
will happen in the future.
I am sure I will be effectively blacklisted from working at Intel for writing
what I just wrote (that's okay, since I am not interested in working at Intel),
but I came to this conclusion based on what various people who used to work at
Intel have told me and on observing Intel's hiring practices for a number of
years.
In particular, one person who worked on the technical side of the Intel 740
project (i.e., the long-forgotten discrete AGP graphics chip from 1998) told me
that Intel is really terrible at IP (Intellectual Property) core reuse and that
it redesigns too many portions of its ASICs over and over.
Based on that, I am not too surprised to hear that Intel does Windows and Linux
graphics device driver stack development separately. (That's what I read.)
In other words, Intel is bloated from a staffing point of view. (I do not like
to see people lose their jobs, but compared to AMD and NVIDIA, Intel is really
bloated. The same person who worked on the Intel 740 project told me that
Intel's per-employee productivity is much lower than that of competitors like
AMD and NVIDIA, and that they have not been able to fix this for years.)
Despite the constant layoffs, Intel's employee count has not really gone down
(it has stayed around 100,000 for the past 4 years), but eventually Intel will
have to get rid of people in absolute numbers.
Intel also relies heavily on its "shadow" workforce of interns (from local
universities, especially the foreign master's degree students desperate to pay
off part of their high out-of-state tuition) and contractors / consultants, so
their "real" employee count is probably closer to 115,000 or 120,000.
I get unsolicited e-mails about Intel-related contractor / consultant positions
from recruiters possibly located 12 time zones away from where I reside (please
do not call me a racist for pointing this out; I just find it so weird as a
U.S. citizen) almost every weekday (M-F), and I am always surprised at the type
of work Intel wants contractors to take on.
Many of the positions are for highly specialized work (I saw a graphics device
driver contract position recently), and it has been like this for several years
already.
I no longer bother with Intel because of this, since they appear unwilling to
commit to properly employing highly technical people.
Going back to the graphics world, my take is that Intel will have to get used
to doing the same work with far fewer people, and they will need to change
their corporate culture of throwing people at the problem very soon, since
their x86 ASPs will be crashing down fairly soon and AMD will likely never
repeat the Bulldozer microarchitecture screw-up. (Intel got lucky when the
former IBM PowerPC architects AMD hired around Year 2005 screwed up Bulldozer.
A speed-demon design is a disaster in a power-constrained, post-90 nm process
node, and they tried to compensate for Bulldozer's low IPC with high clock
frequency. Intel learned that painful lesson about power with the NetBurst
microarchitecture between Year 2003 and 2005. Also, AMD's management at the
time seems to have believed in the many-core concept too seriously. AMD had to
live with the messed-up Bulldozer for 10+ years, with disastrous financial
results.)
I do understand that what I am writing is not terribly technical in nature
(it is more like the corporate strategy stuff business / marketing side people
worry about), but I feel like what AMD is doing is quite logical (i.e., using a
higher abstraction level for initialization / power management, and reusing
code).
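
As a rough illustration of what "a higher abstraction level plus code reuse"
can look like, here is a tiny sketch of an OS-agnostic core behind a small
function table, where each OS driver only supplies thin glue. This is purely my
own sketch; none of these structures or names are AMD DC's actual interfaces.

/*
 * Hypothetical sketch of an OS-agnostic display core. The shared
 * logic would be identical in the Windows and Linux builds; only the
 * callbacks differ.
 */
#include <stdbool.h>
#include <stdint.h>

/* Per-ASIC hardware operations, implemented once and validated with
 * the same pre-silicon / post-silicon bring-up code. */
struct dispcore_funcs {
	bool (*hw_init)(void *ctx);
	void (*hw_fini)(void *ctx);
	void (*set_power_state)(void *ctx, uint32_t state);
};

/* Each OS driver wraps this with its own thin glue (DRM on Linux, the
 * equivalent on Windows) and reuses everything else. */
struct dispcore {
	const struct dispcore_funcs *funcs;
	void *ctx;
};

bool dispcore_init(struct dispcore *dc)
{
	/* The same sequencing runs no matter which OS sits on top. */
	return dc->funcs->hw_init(dc->ctx);
}

The argument for this kind of structure is that the hardware-programming logic
only has to be written and validated once.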
Sorry for the off-topic assessment of Intel (i.e., the hiring practice stuff
and x86 stuff). Based on the subsequent messages, it appears that DC can be
rearchitected to satisfy Linux kernel developers, but overall, I feel there is
a lack of appreciation for the concept of design reuse in this case, even
though in the ASIC / FPGA design world it is very normal. (It has been like
this since the mid-'90s, when ASIC engineers had to start doing it regularly.)
The AMD side appears to have been trying to apply this concept to the device
driver side as well.
Considering AMD's meager staffing resources (currently approximately 9,000
employees, less than one tenth of Intel's, although Intel owns many fabs and
product lines, so the actual developer staffing disadvantage is probably more
like a 1:3 to 1:5 ratio), I am not too surprised to read that it is trying to
improve its productivity where it can, and combining some portions of the
Windows and Linux code makes sense.
I would imagine that NVIDIA is already doing something like this (but closed
source).
Again, I would almost bet that Intel will adopt an AMD DC-like concept in the
next few years.
Let me know if I was right in a few years.
Regards,
Kevin Brace
The OpenChrome Project maintainer / developer