# People keep saying "chiplets are the future" but this just doesn't make any sense.



## adolf512

The cited reason for moving to chiplets is usually "better yield", but this just doesn't make any sense at all. It's not the case that any defect will result in a monolithic die having to be thrown away. What Nvidia and Intel have been doing for a very long time is simply disabling the defective cores. Nvidia has been very efficient at salvaging defective dies; just look at all the cards they make based off the GA102 chip.

Chiplets might make sense for the server space, but they just don't make sense for things like gaming. Gaming has actually moved in the exact opposite direction, away from multi-GPU towards larger monolithic dies drawing as much as 600W. Sure, you could theoretically add another die to the GPU, but good luck cooling 1200W.

The issue with chiplets is that they add latency and increase power consumption, on top of the extra cost the interconnects bring.

We have a real-world comparison now with the 13900K(F) vs the 7950X, and here the monolithic option is not only better but also less expensive.

One perhaps more viable approach for desktop would be, for example, a separate GPU die on a cheaper process, but this doesn't really matter in terms of yield, since Intel can currently just sell chips with defects in the GPU area as KF models (for just $25 less).


----------



## Zver1

Defects are an issue. The same problem affects Samsung in making OLED screens: it's easy to find enough defect-free area for a phone-size screen but extremely hard to get a defect-free large-screen TV. Intel and Nvidia partially compensate by knocking out segments and calling it a lower model. A not-bad solution when the defect rate is low enough, but it leaves the mix of chips outside the manufacturer's control and still leaves lost chips where the defect is in a critical area.

Separately, the cost per chip climbs extremely fast with the shrink to a smaller node. I/O and cache functions can perform slightly better on a larger (much cheaper) node, so there are performance and cost benefits in mixing nodes. AMD's Infinity Fabric has been refined over multiple generations, resulting in a latency penalty that is much lower than Foveros. Intel's cost advantage is largely due to its use of a much cheaper 10nm process ("Intel 7"), which to its credit has been refined to be quite high performance for its size. The large node for the entire chip eliminates any benefit in moving I/O to a separate chiplet even if Intel had Infinity Fabric. The much older/larger node also means dealing with a relatively low defect rate, making the benefits of chiplets in reducing lost chips much lower than for AMD.

On GPUs, the need for extremely low latency has slowed down the adoption of chiplets. AMD's introduction of cache in the GPU (cache access is much faster than VRAM access) has started to change this. A mix of cheap, lower-defect 6nm cache and an expensive 5nm core graphics chiplet cuts production cost while allowing a lot of cache without the defect rate climbing. Nvidia is looking into doing this and has purchased its own high-speed interconnect technology.


----------



## adolf512

Zver1 said:


> Intel and Nvidia partially compensate by knocking out segments and calling it a lower model. A not-bad solution when the defect rate is low enough, but it leaves the mix of chips outside the manufacturer's control and still leaves lost chips where the defect is in a critical area.


Manufacturers do have the option of "artificial binning", but then they lose out on money. Nvidia would rather sell you a 3080 Ti than a 3080 12GB.

It does seem like Nvidia opted to lower the prices of the higher-end GA102 models, since too many people just wanted the base 3080, and this actually worked.


> Separately, the cost per chip climbs extremely fast with the shrink to a smaller node.


It can also make cooling more difficult due to the heat being concentrated in a smaller area. 


> I/O and cache functions can perform slightly better on a larger (much cheaper) node, so there are performance and cost benefits in mixing nodes.


This is the first time I've heard someone claim that cache is faster on larger nodes.


> AMD's Infinity Fabric has been refined over multiple generations, resulting in a latency penalty that is much lower than Foveros. Intel's cost advantage is largely due to its use of a much cheaper 10nm process ("Intel 7"), which to its credit has been refined to be quite high performance for its size. The large node for the entire chip eliminates any benefit in moving I/O to a separate chiplet even if Intel had Infinity Fabric. The much older/larger node also means dealing with a relatively low defect rate, making the benefits of chiplets in reducing lost chips much lower than for AMD.


Surprisingly, even efficiency is pretty good at a low power budget.

The main issue is that two e-cores are weaker than a single p-core for MT, and this forces Intel to deliver worse MT performance than AMD unless they push the frequencies of the p- and e-cores well above the point where they are actually efficient.



> On GPUs, the need for extremely low latency has slowed down the adoption of chiplets. AMD's introduction of cache in the GPU(speed of cache access much better than VRAM access) has started to change this. Mix of cheaper lower defect 6nm cache and expensive 5nm core graphics chiplet, cuts production cost while allowing a lot of cache without defects climbing. NVidia is looking into doing this and has purchased its own high speed interconnect technology.


Still, the total size of the AMD Navi 31 dies is significantly smaller than the AD102 die, so AMD did not use the chiplet approach to push for a larger total die size. I'm pretty sure Nvidia could have made an even larger die, but the 4090 already draws up to 450W by default, and that's not even the full die.
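For a rough sense of scale, here's the total-silicon comparison using approximate public die-size figures (the ~300 mm² GCD, ~37 mm² MCD, and ~608 mm² AD102 numbers are ballpark, so treat this as a sketch rather than exact data):

```python
# Ballpark total-silicon comparison; die sizes are approximate public figures.
gcd_mm2 = 300        # Navi 31 graphics compute die (approx.)
mcd_mm2 = 37         # each Navi 31 memory/cache die (approx.)
navi31_total = gcd_mm2 + 6 * mcd_mm2   # GCD plus six MCDs

ad102_mm2 = 608      # monolithic AD102 (approx.)

print(f"Navi 31 total silicon: ~{navi31_total} mm^2")  # ~522 mm^2
print(f"AD102:                 ~{ad102_mm2} mm^2")
```

Note that only the GCD sits on the expensive N5-class node; the MCDs are on N6, which is the cost argument made later in the thread.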


----------



## Blameless

The yield advantage is real, as is the flexibility provided by modularity. The fewer transistors that need to be on high-end nodes and the fewer die flavors that need to be made, the simpler the product chain can be.

There are trade-offs, but they aren't insurmountable, and profit margins are what count.



adolf512 said:


> We have a real-world comparison now with the 13900K(F) vs the 7950X, and here the monolithic option is not only better but also less expensive.


Not less expensive to manufacture, which is the driving force behind these changes.

Once Intel gets the kinks worked out with their high-profit enterprise chips, they will be moving to chiplets for mainstream parts. Not because it's better for you or me, but because it's better for them.



adolf512 said:


> Still, the total size of the AMD Navi 31 dies is significantly smaller than the AD102 die, so AMD did not use the chiplet approach to push for a larger total die size.


Total die area is considerably larger than it was with the prior generation.

Anyway, the MCDs are built on a cheaper node, and the exact same MCD can go into every discrete RDNA 3 GPU product.


----------



## ZealotKi11er

Blameless said:


> The yield advantage is real, as is the flexibility provided by modularity. The fewer transistors that need to be on high-end nodes and the fewer die flavors that need to be made, the simpler the product chain can be.
> 
> There are trade offs, but they aren't insurmountable and profit margins are what count.
> 
> 
> 
> Not less expensive to manufacture, which is the driving force behind these changes.
> 
> Once Intel gets the kinks worked out with their high profit enterprise chips, they will be moving to chiplets for mainstream parts. Not because it's better for you or I, but because it's better for them.
> 
> 
> 
> Total die area is considerably larger than it was with the prior generation.
> 
> Anyway, the MCDs are built on a cheaper node, and the exact same MCD can go into every discrete RDNA 3 GPU product.


Yep, the MCD can be used for lower products in RDNA 3, and they can just build 256-bit/320-bit configurations with fewer MCDs while the GCD remains the same.


----------



## CynicalUnicorn

adolf512 said:


> The cited reason for moving to chiplets is usually "better yield", but this just doesn't make any sense at all. It's not the case that any defect will result in a monolithic die having to be thrown away.


That's technically true but it's overly simplified. Remember, defects don't just occur in the cores or other bits that can be easily disabled. In addition the yields absolutely do increase. Let's say you're AMD. You're designing a new 64-core CPU using a single monolithic die that's 800mm^2, to pull a number out of my butt. You really want to be able to sell lots of fully-enabled 64-core parts. Let's say that there's a 5% chance of a defect in every 100mm^2 of the wafer, or a 95% chance of no defects. In order to get a perfect die, then, there is a 0.95^8 = 66% chance of a fully enabled die; each 100mm^2 region must be defect-free. You can make 88 dies per wafer and only 58 of them are perfect.

One of your engineers comes along and says that they can make chiplets, and instead of one 64-core die they can make eight 8-core dies instead. Each of these is 100mm^2, so there's a 95% chance of getting a perfect chiplet. You can produce 706 of these chiplets per wafer, and 670 of them are perfect. You only have 36 salvage dies now, corresponding to 4.5 CPUs! That means instead of 58 fully-enabled CPUs per wafer, you're making 83. That's a 43% increase in high-margin halo products just by switching over to chiplets.

(I'm simplifying here for the sake of an example, but this is an industry that will do everything in its power to improve yields or performance even if the gains are under 10% and my napkin math is comfortably above that.)
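For anyone who wants to poke at the napkin math above, here's a minimal sketch using the same illustrative numbers from this post (5% defect chance per 100 mm² region, 88 monolithic candidates or 706 chiplet candidates per wafer; none of these are real foundry figures):

```python
# Toy yield model using the illustrative numbers above (not real foundry data).
p_clean = 0.95                 # chance a 100 mm^2 region is defect-free

# Monolithic: one 800 mm^2 die = eight regions, all must be clean.
mono_candidates = 88           # dies per wafer (assumed)
mono_yield = p_clean ** 8      # ~0.663
mono_perfect = int(mono_candidates * mono_yield)

# Chiplets: eight 100 mm^2 dies per CPU, each yields independently,
# and small dies pack the wafer (and its edges) more efficiently.
chiplet_candidates = 706       # chiplets per wafer (assumed)
perfect_chiplets = int(chiplet_candidates * p_clean)
chiplet_cpus = perfect_chiplets // 8

print(f"monolithic: {mono_perfect} perfect CPUs/wafer")      # 58
print(f"chiplet:    {chiplet_cpus} perfect CPUs/wafer")      # 83
print(f"gain:       {chiplet_cpus / mono_perfect - 1:.0%}")  # 43%
```

The key mechanism is that a defect kills one 100 mm² chiplet instead of invalidating (or crippling) an entire 800 mm² die.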

Yields aside though one major advantage of chiplets is that manufacturers can mix and match process nodes. AMD has taken advantage of this, mixing N5 and N6 for Zen 4 and RDNA 3. Processor cores benefit from N5, but things like memory controllers and cache really don't, so AMD's use of chiplets allows them to minimize the use of the expensive bleeding-edge process for logic that doesn't really benefit from it. Go back in time to Zen 2 and it was even more extreme, where cheap, high-yielding 12nm/14nm constituted a huge portion of the silicon used for both Ryzen and Epyc while the new and expensive 7nm silicon was reserved for the cores that really needed it.

Just watch what happens with GPU prices this generation. Napkin math suggests that the 7900 XTX should be within spitting distance of a 4090, but it's $600 less at MSRP and I honestly would not be surprised if AMD's margins will be higher than Nvidia's. AMD has a lot more room to cut prices _because_ of chiplets. Nvidia's monolithic AD102 is inherently more expensive. That's why chiplets are the future: cost. There is no other reason. In a perfect world we'd see perfect monolithic dies every time, and even in such a perfect world chiplets would provide a cost advantage because you can cram more of them into the corners of a wafer.


----------



## adolf512

CynicalUnicorn said:


> That's technically true but it's overly simplified. Remember, defects don't just occur in the cores or other bits that can be easily disabled. In addition the yields absolutely do increase. Let's say you're AMD. You're designing a new 64-core CPU using a single monolithic die that's 800mm^2, to pull a number out of my butt. You really want to be able to sell lots of fully-enabled 64-core parts. Let's say that there's a 5% chance of a defect in every 100mm^2 of the wafer, or a 95% chance of no defects. In order to get a perfect die, then, there is a 0.95^8 = 66% chance of a fully enabled die; each 100mm^2 region must be defect-free. You can make 88 dies per wafer and only 58 of them are perfect.


Seems fine, since a lot of the other 34% can be used for a 48-core version that still sells for a lot of money.



> one major advantage of chiplets is that manufacturers can mix and match process nodes. AMD has taken advantage of this, mixing N5 and N6 for Zen 4 and RDNA 3. Processor cores benefit from N5, but things like memory controllers and cache really don't, so AMD's use of chiplets allows them to minimize the use of the expensive bleeding-edge process for logic that doesn't really benefit from it.


While this does make some sense, the actual I/O portion of the 13900K isn't particularly large.

The issue is that if there is a defect there, they might end up having to throw away the entire chip.



> Just watch what happens with GPU prices this generation. Napkin math suggests that the 7900 XTX should be within spitting distance of a 4090, but it's $600 less at MSRP and I honestly would not be surprised if AMD's margins will be higher than Nvidia's. AMD has a lot more room to cut prices _because_ of chiplets. Nvidia's monolithic AD102 is inherently more expensive. That's why chiplets are the future: cost. There is no other reason. In a perfect world we'd see perfect monolithic dies every time, and even in such a perfect world chiplets would provide a cost advantage because you can cram more of them into the corners of a wafer.


AMD cherry-picked games where they looked good (such as the very AMD-friendly AC Valhalla); the 7900 XTX isn't coming close to the 4090 in ray tracing or productivity.


----------



## adolf512

The reason the 4090 is $1,600 is not the cost of the die (which isn't even 20% of that); the reason it's $1,600 is that Nvidia still has 30-series cards to sell and does not want to lower the prices on those further.

4090: 608 mm² die

4080 16GB: 379 mm² die

If cost were linear with die size, the 4080 16GB would be $997.

So you actually get less die area for your dollar when you buy the cheaper 4080.

Both GPUs are cut down, with the 4090 having a bigger percentage disabled (it has only 128 of 144 SMs).

Nvidia could probably sell the 4080 16GB for $900 and still make a profit, but they are not going to unless forced by competition (and people actually switching to AMD).
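The arithmetic in this post is easy to check (die sizes as quoted above; the 4080 16GB launch MSRP of $1,199 is assumed here, since the post doesn't state it):

```python
# Checking the price-vs-die-area claim with the numbers quoted above.
price_4090, area_4090 = 1600, 608   # USD, mm^2
price_4080, area_4080 = 1199, 379   # launch MSRP assumed

# Price of a 4080 if cost scaled linearly with die area from the 4090:
linear_4080 = price_4090 * area_4080 / area_4090
print(f"area-linear 4080 price: ${linear_4080:.0f}")    # ~$997

# Dollars paid per mm^2 of die:
print(f"4090: ${price_4090 / area_4090:.2f} per mm^2")  # ~$2.63
print(f"4080: ${price_4080 / area_4080:.2f} per mm^2")  # ~$3.16
```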


----------



## Blameless

Not all yield issues are catastrophic (parametric yields are also important), but they do all scale with area. Chiplets also remove the need to distinguish between server space and consumer space for many products.

The CCD in a Ryzen 5 is the exact same chiplet that is in the highest core count EPYC. Where Intel would have a half-dozen different die flavors, AMD has one universal CCD and two different IODs (one consumer, one not).



adolf512 said:


> While this does make some sense, the actual I/O portion of the 13900K isn't particularly large.


Moving everything that isn't the cores, ring, and L3 to a separate IOD would still trim die area by 25-30%, and what was left could be chopped up even further.

The trade offs might not be worth it for Intel, but if they didn't make sense for AMD, AMD wouldn't be doing it. It's not like they can't design large die parts, or don't have access to a foundry that can make them.


----------



## EastCoast

This certainly reads as a nonsensical "marketing" thread, ignoring obvious factors and creating a mindshare bubble that doesn't exist.
We have to prove a double negative? LOL.


----------



## adolf512

Blameless said:


> If they didn't make sense for AMD, AMD wouldn't be doing it.


Corporations do dumb stuff all the time. Just look at how Microsoft ****ed up with the Xbox 360 (RROD) and the Xbox One (mandatory Kinect).


----------



## Blameless

adolf512 said:


> Corporations do dumb stuff all the time. Just look at how Microsoft ****ed up with the Xbox 360 (RROD) and the Xbox One (mandatory Kinect).


Vaguely successful corporations don't double down on the same dumb stuff for several product generations.

Much of AMD's recent success can easily be traced back to the simplification chiplets have enabled.


----------



## adolf512

Blameless said:


> Much of AMD's recent success can easily be traced back to the simplification chiplets have enabled.


No, they became successful in the CPU market due to Intel not getting their **** together.

The chiplet approach seems to have worked in the server segment, but monolithic dies are probably still the way to go for the desktop segment (where you don't benefit from having like 64 cores).


----------



## Pee-C

We are at the point right now where GPUs having heatsinks like a CPU heatsink is almost becoming a necessity. If gamers have no problem using a Noctua NH-D14 on their CPU, then it's time to accept the same fate for the GPU as well. I think we need a redesign of the PC case as well as the ATX motherboard layout: a design that supports GPUs mounted vertically, just like how a CPU is installed, with coolers that can actually dissipate the heat without needing super-loud fans.

The PCIe slot as we know it was never designed for such a massive, power-hungry device as the GPUs we have today. If it were designed today, it would probably look radically different. But GPU makers have to accommodate the way motherboards are designed, and it is becoming a hindrance. They could in theory design motherboards where the GPU is mounted flush against the board and supported by a backplate just like a CPU, but this would require us to abandon the ATX standard, where the GPU has to hang precariously and horizontally off the motherboard with monstrous power cables connected to it, instead of designing a "GPU socket" that supplies all the power it needs.

Yeah, we have PCIe riser cables that can mount GPUs vertically in those tiny mini-ITX cases, but that's not what I mean. It's time we bite the bullet and accept that the ATX design is a hindrance for GPUs, along with the power delivery limitations of the PCIe slot.


----------



## Arni90

adolf512 said:


> No, they became successful in the CPU market due to Intel not getting their **** together.
> 
> The chiplet approach seems to have worked in the server segment, but monolithic dies are probably still the way to go for the desktop segment (where you don't benefit from having like 64 cores).


Have you considered the possibility that desktop AM4/AM5 isn't AMD's biggest focus anymore?
Being able to use the scraps that don't make the cut for servers on desktop is a great way to improve margins.


----------



## Arni90

Pee-C said:


> I think we need a redesign of the PC case as well as the ATX motherboard layout: a design that supports GPUs mounted vertically, just like how a CPU is installed, with coolers that can actually dissipate the heat without needing super-loud fans.


Here's my suggestion: vertical mounting is bad. Mount the coolers to the case, then strap the PCBs of the GPU and CPU to them horizontally. Have the power delivery and memory chips on the other side of the PCB with a small heatsink and some airflow.


----------



## adolf512

Pee-C said:


> We are at the point right now where GPUs having heatsinks like a CPU heatsink is almost becoming a necessity.


A 4090 can draw up to 600W (with an increased power budget); no CPU comes even close to that. Maybe if you clock a 13900K to the max you'll get about the same. I'm pretty sure an NH-D15 wouldn't be enough for a 4090 pushed to 600W (or even the default 450W).

The default power limit of the 13900K (which a lot of motherboards will just ignore) is just 253W.

The 4090 Ti will have 11% or 12.5% more shaders than the 4090, so it will draw around 500W at the same clocks. The issue is that you might only get like 7% more performance out of the extra 14 or 16 SMs, so they would also need to push for even higher clock speeds to get a significant increase in performance.
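The SM percentages quoted above follow directly from the counts in this thread (128 enabled SMs on the 4090, 144 on a full AD102; the 4090 Ti configuration is rumor, so treat it as hypothetical):

```python
# SM arithmetic for a hypothetical fuller-die 4090 Ti (rumored figures).
sm_4090 = 128        # enabled SMs on the shipping 4090
sm_full = 144        # SMs on a full AD102 die

for extra in (14, 16):   # a 142-SM part, or the full 144-SM die
    print(f"+{extra} SMs over the 4090 = +{extra / sm_4090:.1%} shaders")
# +14 SMs -> +10.9%, +16 SMs -> +12.5%
```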


----------



## Arni90

adolf512 said:


> A 4090 can draw up to 600W (with an increased power budget); no CPU comes even close to that. Maybe if you clock a 13900K to the max you'll get about the same. I'm pretty sure an NH-D15 wouldn't be enough for a 4090 pushed to 600W (or even the default 450W).
> 
> The default power limit of the 13900K (which a lot of motherboards will just ignore) is just 253W.
> 
> The 4090 Ti will have 11% or 12.5% more shaders than the 4090, so it will draw around 500W at the same clocks. The issue is that you might only get like 7% more performance out of the extra 14 or 16 SMs, so they would also need to push for even higher clock speeds to get a significant increase in performance.


The 10980XE can easily exceed 800W without liquid nitrogen.
An NH-D15 should easily be able to handle a 4090; the heat density on GPUs is significantly lower than on CPUs.


----------



## Blameless

adolf512 said:


> The chiplet approach seems to have worked in the server segment but monolithic dies are probably still the way to do for the desktop segment (when you don't benefit from having like 64 cores).


This is a false dichotomy.

From a silicon manufacturing perspective, desktop, HEDT, server, HPC, etc., are all the same product, and that is the key advantage of chiplets. AMD can make one part (a Vermeer CCD, for example) and then bin them to cover half the products they make. Yields are great, not only because a CCD is tiny, but because there is one enormous pool of samples from which to bin, meaning there are enough parts for almost any combination of parameters that you'd want in any product. Need a single CCD that clocks extremely high for the 5800X or CCD 0 of a 5950X? You have plenty of samples to choose from. Need a CCD that clocks moderately well, but is very low leakage and sips power so you can put eight of them in a Milan-X and not have it overheat? You've got plenty of samples to choose from. And anything that doesn't make the cut for the high-margin parts can be stuffed into the mid-range consumer stuff. You only had to make one part to do this, and the economies of scale are great.

The rest of the CPU (the IOD) can be made on a trash node by GlobalFoundries, or whoever, for peanuts. So, you save most of your expensive TSMC current node wafer allocation for a single, high-yield part.

The monolithic alternative means you're dividing your resources among a dozen different die flavors to target specific market segments. You can't use different nodes to save wafer area. Your yields--both parametric and catastrophic--will be worse. And the total pool of parts to bin from will be much smaller, because it has to come from within a compatible die flavor.


----------



## ozlay

Nvidia has basically reached the limits of monolithic, just like Intel has. They might get another 10-20% out of their next GPU, but after that they are going to have to switch to a chiplet design. No one really wants a 900W GPU.


----------



## adolf512

ozlay said:


> Nvidia has basically reached the limits of monolithic, just like Intel has. They might get another 10-20% out of their next GPU, but after that they are going to have to switch to a chiplet design. No one really wants a 900W GPU.


Switching to chiplets isn't going to solve the issue of high power consumption; it does the exact opposite by adding latency and power draw from the interconnects.

The issue with GPUs is similar to that of CPUs: eventually adding more cores isn't going to add more performance/efficiency. With GPUs you also need to push up the frequency to reach the best performance, since you cannot just rely on having a lot of shaders (they only scale to a point).

The top monolithic Nvidia GPUs are already so big that if you actually load close to the full die, the chip will throttle badly, since even the minimum voltage of 0.712V is too high.


----------



## adolf512

Blameless said:


> From a silicon manufacturing perspective, desktop, HEDT, Server, HPC, etc, are all the same product, and that is the key advantage of chiplets.


You don't need to bring chiplets to the consumer market for that. You can have a monolithic 16-core version for the gaming market and then something like 4x16 for the server market.


----------



## Blameless

adolf512 said:


> You don't need to bring chiplets to the consumer market for that. You can have a monolithic 16-core version for the gaming market and then something like 4x16 for the server market.


Excepting the specifics of core count, this is what AMD did for Zen and Zen+.

Problem was that each die had a bunch of redundant stuff on it for the HEDT and enterprise parts, as well as memory locality issues that don't exist with a centralized IOD. Moving to chiplets with Zen 2 did have trade-offs, but it was evidently more profitable and ultimately allowed for better parts than would otherwise have been sold.

Sixteen cores is also more than needed for the mainstream consumer, or gaming, CPU market.


----------



## adolf512

Blameless said:


> Problem was that each die had a bunch of redundant stuff on it for the HEDT and enterprise parts, as well as memory locality issues that don't exist with a centralized IOD.


It will not necessarily be "redundant". I'm pretty sure all the memory controllers are going to be utilized in the top Sapphire Rapids CPU.

It remains to be seen what Intel can deliver, but this looks a lot better than AMD's approach.



> Sixteen cores is also more than needed for the mainstream consumer, or gaming, CPU market.


That comment isn't going to age well. 

There are already plenty of games that use more than 8 threads; this is why the 13900K often performs better with the e-cores on.


----------



## umeng2002

Well, feel free to submit your resume to AMD.


----------



## adolf512

umeng2002 said:


> Well, feel free to submit your resume to AMD.


I would rather work for Nvidia or Intel.

Nvidia is like the polar opposite of AMD; they actually push for greatness instead of "you save $100". Intel is a mixed bag, but they seem to be moving in the right direction.

AMD might change for the better if Zen 4 sales are bad enough.


----------



## th3illusiveman

Lol, this thread reminds me of that guy on Twitter trying to tell an actual astronaut how supersonic flight works. AMD is a multibillion-dollar company with a team of PhD engineers working around the clock to deliver the best performance they can within the budget and power targets they have. I'm sure they know more about the GPUs they literally spend billions developing than some random dude on OCN. It's also never good to judge a product on its first variant; Zen only really started picking up steam after further iterations.


----------



## adolf512

th3illusiveman said:


> AMD is a multibillion-dollar company with a team of PhD engineers working around the clock to deliver the best performance they can within the budget and power targets they have.


If they knew what they were doing, they wouldn't be losing badly to Nvidia and Intel.

Zen 4 was an utter disaster.

AMD is fundamentally governed by its shareholders. If the shareholders are incompetent, so will be the decisions of the company.

Corporations make dumb decisions all the time, just look at the metaverse disaster.


----------



## paulerxx1

adolf512 said:


> Seems fine, since a lot of the other 34% can be used for a 48-core version that still sells for a lot of money.
> 
> 
> While this does make some sense, the actual I/O portion of the 13900K isn't particularly large.
> 
> 
> 
> The issue is that if there is a defect there, they might end up having to throw away the entire chip.
> 
> 
> 
> AMD cherry-picked games where they looked good (such as the very AMD-friendly AC Valhalla); the 7900 XTX isn't coming close to the 4090 in ray tracing or productivity.


The 7900 XTX is $600 less than the 4090; why are you comparing them...? Smooth brains be like:


----------



## adolf512

paulerxx1 said:


> 7900XTX is $600 less than the 4090, why are you comparing them...?


Because the 7900 XTX is the best AMD will offer.


----------



## ZealotKi11er

Whatever AMD is going through right now, Nvidia and Intel will go through at a later date.


----------



## dagget3450

https://www.techradar.com/news/the-largest-cpu-in-the-world-just-got-a-massive-upgrade



> Thanks to the miniaturization possible with the 7nm process, the new gigantor of a chip offers 850,000 AI cores spread over 46,225 mm² of silicon.





> Cerebras also revealed the chip consumes the same 15kW of power as its predecessor, but provides twice the performance, again thanks mostly due to the new 7nm process.


----------



## Blameless

adolf512 said:


> It will not necessarily be "redundant". I'm pretty sure all the memory controllers are going to be utilized in the top Sapphire Rapids CPU.


You're still talking about things from a user perspective, which is why none of this seems to make any sense to you. Chiplets aren't for users, they are for manufacturers. From Intel's perspective, all of those separate function blocks would be better off as discrete chiplets.

Intel has been pushing this stuff for years, but their HC34 slides are their most up-to-date public information:








Intel Enters a New Era of Chiplets that will Change Everything (servethehome.com): At Hot Chips 34, Intel hinted at how its vision for chiplets may disrupt the largely structured industry that we have seen for decades.

Hot Chips 34 – Intel's Meteor Lake Chiplets, Compared to AMD's (chipsandcheese.com): During a presentation at Hot Chips 34, Intel detailed how their upcoming Meteor Lake processors employ chiplets.





Also, NVIDIA's COPA proposal, which explains, both in general, and for a specific market, why chiplets/disaggregation is the future:








GPU Domain Specialization via Composable On-Package Architecture, ACM Transactions on Architecture and Code Optimization (dl.acm.org): As GPUs scale their low-precision matrix math throughput to boost deep learning (DL) performance, they upset the balance between math throughput and memory system capabilities.







adolf512 said:


> It remains to be seen what Intel can deliver, but this looks a lot better than AMD's approach.


That _was_ AMD's old approach with Zen and Zen+. Intel is just a little behind.



adolf512 said:


> There are already plenty of games that use more than 8 threads; this is why the 13900K often performs better with the e-cores on.


More cores on a monolithic die can certainly help some games, have helped some games for a very long time, and will help more as time goes on, but that's not automatically a good reason to build one.

The fastest possible gaming part is probably not a very good product.



adolf512 said:


> That comment isn't going to age well.


The statement had nothing to do with how many cores is best for games, but how many cores is best to put on a gaming focused CPU, which are very different things. There was also no implication that this figure would remain static; building a monolithic CPU die to age well is not practical, but chiplets give more flexibility.



adolf512 said:


> AMD is fundamentally governed by its shareholders. If the shareholders are incompetent, so will be the decisions of the company.
> 
> Corporations make dumb decisions all the time, just look at the metaverse disaster.


Shareholders in actual corporations nominate a board of directors that nominates officers. By and large, the semiconductor design firms we are talking about are competently run...well, except maybe Samsung...because the shareholders want to make money.

Meta isn't an analogous situation. Zuckerberg controls 90% of the company's preferred shares, each of which has ten times the voting power of public shares, which ultimately gives him 54.4% of the total voting power of the company. He's also the CEO. It may as well be a sole proprietorship rather than a corporation. He can unilaterally run it into the ground and none of the other shareholders can stop him. None of the corporations we're talking about are in a similar position.



adolf512 said:


> I would rather work for nvidia or intel.
> 
> nvidia is like the polar opposite of AMD, they actually push for greatness instead of "you save 100$". Intel is a mixed bag but they seem to be moving in the right direction.


Both NVIDIA and Intel are moving away from monolithic parts toward chiplets. Three or four architectural generations from now most NVIDIA GPUs and Intel CPUs are going to be collections of small dies of different manufacturing processes.

Anyway, if your irrational bias against AMD is your hangup, take a look at Intel's Hotchips slides again.


----------



## adolf512

ZealotKi11er said:


> What ever AMD is going through right now, Nvidia and Intel will at a latter date.


No, that's not going to happen. If Nvidia or Intel does some chiplet approach, it will be a lot better executed than anything we have seen from AMD so far.

If chiplets, or more than one GPU die on a graphics card, make a comeback, it will not be for gaming. Cooling would be a problem with a 4090x2, though.


----------



## adolf512

I have looked at Meteor Lake and it does seem like a downgrade in some sense. According to rumors, they will reduce the number of P-cores to just 6, which is movement in the wrong direction.

But unlike AMD, they will not try to split cores across multiple chiplets, so the L3 cache will remain unified.

Having a separate IO die will probably add latency when reading or writing to RAM. Not ideal, but hopefully better than the Zen 4 garbage (no longer 1:1 with the fabric clock).


----------



## Imglidinhere

adolf512 said:


> I would rather work for nvidia or intel.
> 
> nvidia is like the polar opposite of AMD, they actually push for greatness instead of "you save 100$". Intel is a mixed bag but they seem to be moving in the right direction.
> 
> AMD might change for the better if sales are bad enough for zen4.


Man... you have it out for team red hardcore...

Also lol, you think Nvidia is "pushing for greatness"? Really? With their insane production costs and stupid high pricing margins? Reference 4090s go for $1600. AIB models start at $2200.

With AMD, they can probably price at $1200 for their top SKUs and still pull profit without half the investment going into things. Cost savings matter a lot. Chiplets are the future, and everyone laughed at AMD's Infinity Fabric when they first showed it off, but don't understand that it's an ever-evolving technology. It's like laughing at someone starting to learn an instrument and assuming that because they suck right now, they'll always suck and never improve.


----------



## Shenhua

Chiplet design is the next logical step. Yes, it adds latency, but its advantages far outweigh its disadvantages.
It helps with the problem of thermal density as you go down in node, which translates to lower heat and lower voltage, despite wasting more power on interconnects. It adds versatility, since you can combine different nodes and architectures, and it allows manufacturers to scale size. It is also a lot cheaper, which sadly does not translate to the consumer. AMD could sell those Ryzens for half the price and still make a ton of profit.


----------



## umeng2002

adolf512 said:


> If they knew what they were doing they wouldn't be losing badly to nvidia and intel.
> 
> Zen4 was an utter disaster.
> 
> AMD is fundamentally governed by their shareholders. If the shareholders are incompetent so will the decisions of the company.
> 
> Corporations make dumb decisions all the time, just look at the metaverse disaster.


Why do you think Zen4 is a disaster? Because Microcenter told some Youtubers that it wasn't selling well? The mistake, again, is that AMD launched AM5 with only the expensive chipset first. They also priced the CPUs just a little too high based on the new competition from Intel.

The other odd thing too is that a lot of PC enthusiasts seem to be hung up on cache and memory latency. Literally, who cares if the performance is what the performance is? If Zen4 was beating Intel right now with a trillion millisecond memory latency, people here would still complain about the latency.

It's like complaining that an nVidia GPU is a disaster because it gets 500 fps at 1500 MHz instead of 500 fps at 2000 MHz.


----------



## adolf512

Blameless said:


> You're still talking about things from a user perspective, which is why none of this seems to make any sense to you. Chiplets aren't for users, they are for manufacturers. From Intel's perspective, all of those separate function blocks would be better off as discrete chiplets.


I've got some news for you: there is something called competition. If AMD makes bad products, people are going to look toward Nvidia and Intel, which is exactly what is happening now.

People wanted more powerful single-GPU cards, and that's what we got, thanks to Nvidia understanding the market. Sure, you could say "SLI makes more sense for Nvidia", since it's a lot easier to manufacture two less powerful cards and combine them than to engineer a card to handle something like 600W, but look where we are today: SLI and CrossFire are dead.



Shenhua said:


> .
> It solves the problem with thermal density as you go down in node


We didn't exactly see that with Zen 4.



umeng2002 said:


> Why do you think Zen4 is a disaster?


If you look at which CPUs people are actually buying in stores, you will see that in terms of popularity Zen 3 > Raptor Lake >> Zen 4.


----------



## adolf512

Imglidinhere said:


> Also lol, you think Nvidia is "pushing for greatness"? Really?
> 
> With their insane production costs and stupid high pricing margins?


AMD doesn't come close when it comes to raytracing or productivity performance, and this isn't going to change with RDNA3. In addition, AMD has a terrible reputation when it comes to basics like drivers.



> Reference 4090s go for $1600. AIB models start at $2200.


That's outright false. AIB 4090 cards also start at $1600.

Sure, it's hard to find a 4090 for $1600, but that's because so many people are trying to buy them.


----------



## Blameless

adolf512 said:


> According to rumors they will reduce the number of p-cores to just 6 which is a movement in the wrong direction.


E-cores provide more performance for a given die area than P-cores. P-cores are mostly only needed for serial tasks.

Intel will go back to eight P-cores after Meteor Lake, for a while at least, but the long-term trend, for everyone, for most products, will be fewer P-cores and a lot more E-cores.



adolf512 said:


> I got some news for you, there is something called competition.


And everyone will need to move toward disaggregation to remain competitive in the long term.

Competition is driving convergent evolution. Three or four generations from now, your Intel CPUs will be based on chiplet tiles and your AMD CPUs will have four times as many E-cores as P-cores, because this is how you get the most flexible and powerful products (in that order), with the fastest turn around time and highest margins, with the resources available.



adolf512 said:


> People wanted more powerfun single-GPU cards and this is what we got thanks to nvidia understanding the market. Sure you could say "SLI makes more sense for nvidia" since it's a lot easier to manufacture 2 less powerful cards and then combine them then having to engineer a card to handle like 600W but look where we are today, sli and crossfire is now dead.


Not sure what those assertions have to do with anything.

A chiplet based GPU, even one with multiple logic dies, is still a single GPU and will still function like a single GPU.


----------



## adolf512

Blameless said:


> A chiplet based GPU, even one with multiple logic dies, is still a single GPU and will still function like a single GPU.


Of course, but Nvidia has good reasons not to go that route, especially not for the gaming market.



> The interviewer then asked where the crossover point is with the industry moving down to 7nm and then onto 5nm… where is the crossover point for GPU chiplets to actually become worthwhile? To which Alben replied, “We haven’t hit it yet.”
> 
> But when it comes to gaming GPUs I’m not convinced we ever will. With CPUs it’s a lot easier to combine multiple chips together to work for a common processor-y goal on their specific workloads. And for GPUs simply chewing through large datasets or deep learning the hell out of something it’s a doddle too, but when your GeForce graphics card is trying to spit out multiple game frames rendered on multiple chiplets it’s a whole lot tougher.











Nvidia has "de-risked" multiple chiplet GPU designs: "now it's a tool in the toolbox." If it "becomes economically the right thing to do," Nvidia will go chiplet, but it's not there yet. (www.pcgamesn.com)
The chiplet-based 7900 XTX will be a lot less competitive than the 6900 XT was. You may have forgotten, but AMD actually beat the 3090 in many games (mostly at lower resolutions). Of course, the 3090 was still the better card due to way better productivity and RT performance, but with the 7900 XTX the only thing AMD will really have going for them is DisplayPort 2.1.


----------



## adolf512

For gaming, it's pretty clear that monolithic is the way to go to get the best performance. The reason AMD doesn't do that is that they prioritize competing in the server space, which is why gamers get sub-par CPUs when they buy from AMD.

Making a 16 P-core monolithic CPU would be relatively easy; Intel could have done that by ditching the E-cores and iGPU.



dagget3450 said:


> https://www.techradar.com/news/the-largest-cpu-in-the-world-just-got-a-massive-upgrade


This is a very interesting example of delivering a very large monolithic die with high yield (supposedly 100%).

If you make sure no part of the CPU is vital, you can use chips that have defects (disabling cores, disabling memory channels, etc.).
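The salvage argument can be made concrete with a toy yield model. This is a rough sketch under a standard Poisson defect assumption; the core/uncore areas and defect density below are illustrative numbers for the sake of the arithmetic, not figures for any real chip:

```python
import math

def sellable_fraction(n_cores, core_mm2, uncore_mm2, d0_per_cm2, max_dead):
    """P(die is sellable) when up to `max_dead` defective cores can be fused
    off, but any defect in the non-redundant (uncore) area kills the die.
    Assumes killer defects follow a Poisson distribution with density d0."""
    p_core_ok = math.exp(-d0_per_cm2 * core_mm2 / 100)    # one core defect-free
    p_uncore_ok = math.exp(-d0_per_cm2 * uncore_mm2 / 100)
    # probability that at most `max_dead` of the n cores are defective
    p_cores = sum(
        math.comb(n_cores, k) * (1 - p_core_ok) ** k * p_core_ok ** (n_cores - k)
        for k in range(max_dead + 1)
    )
    return p_uncore_ok * p_cores

# Illustrative: 8 cores of 10 mm^2 each, 60 mm^2 of uncore, 0.2 defects/cm^2
print(sellable_fraction(8, 10, 60, 0.2, max_dead=0))  # ~0.76: die must be intact
print(sellable_fraction(8, 10, 60, 0.2, max_dead=2))  # ~0.89: salvage helps
```

The gap between the two numbers is exactly what Nvidia and Intel exploit by selling cut-down SKUs; note that defects landing in the uncore still kill the die, which is the usual counter-argument.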


----------



## adolf512

Blameless said:


> E-cores provide more performance for a given die area than P-cores.


The difference isn't particularly large with Alder/Raptor Lake. An E-core takes up around 1/3 of the space a P-core does while delivering around 40% of its MT performance.










I think it would have been better to ditch the iGPU and E-cores and do a model with 16 P-cores and AVX-512. Maybe Intel will do something like that next year; we will see.

If Intel got AVX-512 to work with the E-cores it would be a different situation, but that hasn't happened yet and might not any time soon. Maybe Intel could do some energy-efficient AVX-512 implementation for the E-cores, similar to what AMD did with Zen 4, but that would require years of engineering, and it's unclear if Intel will ever bother.


----------



## Blameless

adolf512 said:


> Of course but nvidia has good reasons not to go that route, especially not for the gaming market.


They will have to go that route soon. They are likely near the reticle limit on current and foreseeable EUV lithography.

Intel is facing similar limitations:








Monolithic Sapphire Rapids: Absolute Reticle Limit (www.angstronomics.com)







adolf512 said:


> The chiplet 7900XTX will be a lot less competitive than what 6900xt was.


That depends on what one means by competitive.

Again, you're stuck in the consumer perspective. Navi 31 is less expensive to make than any monolithic chip of equal or better performance would be. However competitive the RX 7900 XTX is, it would likely be less competitive if it were monolithic. Citing its performance segment as a ding against chiplets is a false correlation.



adolf512 said:


> For gaming it's pretty clear that monolithic is the way to go to get the best performance.


For a given number of transistors, monolithic will always be faster, and not just for gaming, for everything. Lower latency, fewer transistors and less power spent on interconnects, etc...those are not trivial things.

But it's soon not going to be economical for much of anything other than APUs and small low-power parts with high profit margins. The advantages of chiplets increasingly outweigh their downsides.

Intel and NVIDIA will probably start at the top, using it as a way to bypass reticle limits and hedge against poor yields for high-margin parts. But as they gain experience and streamline operations dealing with chiplets, they will move down the product stack and most of their processors will become disaggregated.



adolf512 said:


> The difference isn't particularly large with alder/raptor lake. An e-core takes up around 1/3 of the space a p-cores takes up while delivering around 40% of the MT performance.


In Raptor Lake, the Gracemont E-cores deliver 50-100% more performance per square mm than the Raptor Cove P-cores and have even better performance per watt. That's huge. It was a little less advantageous with Alder Lake, but still more than enough to justify their inclusion. The gap will skew more heavily toward E-cores as time goes on. We are way into diminishing returns on performance per transistor, which makes smaller cores make sense.
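The disagreement over E-core density in this thread reduces to which ratios you plug in. A quick sketch using the figures claimed by the posters (forum estimates, not measured die areas):

```python
def perf_per_area_ratio(e_area_frac, e_perf_frac):
    """Relative MT performance per unit area of an E-core vs. a P-core,
    where both arguments are fractions of the P-core's area/performance."""
    return e_perf_frac / e_area_frac

# adolf512's figures: ~1/3 the area, ~40% of the MT throughput
print(perf_per_area_ratio(1 / 3, 0.40))  # 1.2 -> only ~20% more perf per mm^2

# A ratio in Blameless's claimed 1.5-2.0x range requires e.g. ~1/4 the area
print(perf_per_area_ratio(1 / 4, 0.40))  # 1.6 -> ~60% more perf per mm^2
```

Either way the ratio favors E-cores; the argument is only about by how much.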



adolf512 said:


> I think it would have been better to ditch the iGPU and e-cores and do a model with 16-p cores and avx-512, maybe intel will do something like that next year, we will see.


There are no plans for any such part in the consumer space.

16 P-cores is impractical on a ring bus (the 14 ring stops on Raptor Lake are already on the edge, which is why Meteor Lake will drop two P-cores until Intel can cluster more E-cores together), and a mesh adds a lot of latency. Regardless, there is not going to be another mainstream Intel chip with more than 8 P-cores in the foreseeable future, if ever.

There are and will continue to be some Xeon SKUs that are all P-core (and others that are all E-core), but there is no profitable consumer market for such parts, unless they are low-end salvaged parts that just have their E-cores disabled for market segmentation.

What would even be the point of 16 P-cores, especially if you dump the instruction set that is the only hard capability they have over an E-core? If you're playing lightly threaded games, four P-cores would suffice. If you're playing well threaded games, four P-cores would still suffice for the main game loop and synchronization threads, plus maybe the main render thread...everything else could be done faster with the rest of the space devoted to E-cores. Only huge batches of serial tasks justify piles of P-cores if E-cores are an option. Such uses are extremely rare in the consumer space.


----------



## ZealotKi11er

Did someone hurt @adolf512?

I don't know how a stupid chiplet 7600X with fewer cores is keeping up with the 13600K.


----------



## adolf512

ZealotKi11er said:


> I don't know how a stupid chiplet 7600X with fewer cores is keeping up with the 13600K.
> 
> View attachment 2581246


The issue with the 13600K is the low base clock, in addition to only having 6 P-cores.

The 7600X and 7700X have all cores on the same die, so you get less of a performance hit from the chiplet approach there. It's the 7950X and 7900X that lose the most gaming performance from not being monolithic and not having a properly unified L3.


----------



## Shenhua

adolf512 said:


> We didn't exactly see that with zen4.


Zen 3 and Zen 2 have the same problem, to a lesser extent each generation. However, that's because they're pushing the clocks way past the efficiency wall.
With a 7900X/7950X tuned, you can shave off something like half the power for the same multicore score. And no, I'm not joking, nor am I delusional.
Drop the power limit to 130W, cap the max clock 300MHz lower, slide the CO to -30 and you're done. 7 out of 10 chips are going to be stable with those settings, without further adjustment to the CO.


----------



## ZealotKi11er

adolf512 said:


> The issue with the 13600K is the low base clock in addition to only having 6 p-cores.
> 
> The 7600x and 7700x have all cores on the same die so you get less of a performance hit from the chiplet approach there. It's the 7950x and 7900x that lose the most gaming performance from not being monolithic and not having a properly unified L3.


Lol. The 7600X is also 6 cores and also has an IO-die chiplet.


----------



## adolf512

The Hardware Unboxed result where the 7600X slightly beats the stock 13600K is an outlier; if you look at reviews overall, the 13600K at stock is still faster, and it's also much better for productivity.



ZealotKi11er said:


> Lol. 7600x is also 6 core and also has iod chiplet.


Yes, it has an IO-die chiplet, which adds latency when reading or writing to RAM. The cores, however, are not split across more than one die.

With the 7950X, one of the chiplets is basically useless due to the latency penalty that comes from actually utilizing more than 8 cores. The 13900K also scales poorly past 8 threads, but that's due to only having 8 P-cores. The bigger issue with the 7950X is that, due to the Infinity Fabric latency, half the cache is basically useless: while it does have 2x the cache of the 7700X, it doesn't get any performance benefit from that extra cache, since it takes too long to access it (to the point where it might as well just read/write to RAM).


----------



## umeng2002

Now that Intel finally managed to beat AMD, chiplets are suddenly bad? Yet they were great the previous generation?

Unless we know the margins on each CPU product, which Intel and AMD don't share, no one can really say if chiplets are useless.

We do know that Intel has nearly infinite money to absorb costs in pursuit of market share while AMD needs to watch their margins more closely, yet AMD keeps moving to chiplets.

Large, monolithic dies are a thing of the past, imho.


----------



## adolf512

umeng2002 said:


> Now that Intel finally managed to beat AMD, chiplets are suddenly bad? Yet they were great the previous generation?


AMD hasn't beaten Intel in gaming since the old Athlon FX CPUs.

Alder Lake, when tuned, beat the 7950X.

The chiplet approach is arguably to blame for AMD never taking back the gaming crown (even while they were great for productivity at times).

I bought a 3600 because it was cheap and was hoping to upgrade to a much better Zen 3 CPU later, but Zen 3 was meh and I didn't really need a better CPU much in that period anyway. Zen 3 was also very expensive until it became obsolete.


----------



## adolf512

umeng2002 said:


> Large, monolithic dies are a thing of the past, imho.


The 4090 says otherwise; AMD isn't even coming close to beating that.

AMD is losing across the board with their chiplets, which makes RDNA2 look good in comparison.


----------



## Imglidinhere

adolf512 said:


> *That's outright false. Reference 4090 cards also start at 1600$*


That's literally what I said. Thank you for... correcting what I said with a quote of what I said?


----------



## adolf512

Imglidinhere said:


> That's literally what I said. Thank you for... correcting what I said with a quote of what I said?


Several AIB models of the 4090 have been sold for $1600.


----------



## Imglidinhere

adolf512 said:


> Several AIB models of 4090 have been sold for 1600$


Link.


----------



## adolf512

Imglidinhere said:


> Link.





Unavailable


----------



## Blameless

Blaming AMD's perceived failings on their use of chiplets is like blaming atmospheric CO2 concentrations on a decline in high-seas piracy. You can observe loose correlation, but it's coincidental, and clearly attributable to other factors. Suggesting causation is batshit insanity.



Imglidinhere said:


> Link.


Most every AIB has a $1600 model.

US examples:






- NVIDIA GeForce RTX 4090 GPUs / Video Graphics Cards (www.bestbuy.com)
- GeForce RTX 4090 GPUs / Video Graphics Cards, Shipped by Newegg (www.newegg.com)
- Graphics Cards for PC & Gaming (GPU) | RTX Graphics Cards (www.bhphotovideo.com)


----------



## doom26464

I just keep looking at yield data and cost to build; chiplet tech has those as its strong points.

AMD might be able to make the 7900 XTX and 7900 XT in good volumes. Once they bin up a few of their golden samples they can do a 7950.

Nvidia on a big die, with cutting-edge 4nm node yields and costs, has me like hrmmmm.


----------



## betam4x

th3illusiveman said:


> Lol, this thread reminds me of that guy on twitter trying to tell an actual astronaut how supersonic flight works. AMD is a multibillion dollar company with a team on PHD engineers working around the clock to deliver the best performance they can with the budget and power targets they have - in sure they know more about what they are doing with the GPUs they literally spend billions developing then some random dude on OCN. Its also never good to judge a product on it's first variant, Zen only really started picking up steam after further iterations.


Yes, that tends to happen on the internet.


adolf512 said:


> The issue with the 13600K is the low base clock in addition to only having 6 p-cores.
> 
> The 7600x and 7700x have all cores on the same die so you get less of a performance hit from the chiplet approach there. It's the 7950x and 7900x that take loses the most gaming performance from not being monolithic and having properly unified L3.


Which is a non-issue IMO. The 7950X is a great gaming and productivity chip. I know because I have one. It is much faster than my 5950X.

Any perceived lack of performance (a debatable claim at best) isn’t due to the chiplet based design. SOME applications may be sensitive to the design, but most are not.

If you are a gamer, you absolutely should avoid both the Intel and AMD parts, because AMD will have V-Cache parts dropping next year.

Moving on: Excluding Radeon for a moment, chiplets are for “cost” and margin. 1 chiplet, several different market segments. The IO die costs 30-40% less by being on N6 vs N5. AMD can mix and match parts like legos. They are extending this to the laptop market as well. Because AMD only had to design 1 chip, R&D costs are much lower. They can also easily reallocate capacity where needed when supply constrained.

The comments about Zen 4 being a disaster are comical to me. Just because it isn’t a chart topper doesn’t mean AMD isn’t selling chips.


----------



## Imglidinhere

Blameless said:


> Most every AIB has a $1600 model.
> 
> US examples:
> 
> 
> 
> 
> 
> 
> www.bestbuy.com, www.newegg.com, www.bhphotovideo.com


Fair, point taken.

Still, the only 4090s in stock are at $2000+ pricing. I wonder why.


----------



## Marios145

A complete lack of understanding of engineering.

AMD creates one chiplet, uses it from desktop to server to HPC to supercomputers to custom designs.

Manufacturing design masks cost at least $70-100M each, or used to in the past; who knows how much they cost now.
By having the exact same chiplet used in 5-10 different markets, you just saved at least half a billion.

All R&D goes into the same design - much higher potential for future technologies.

During manufacturing, defects become predictable and usually they are very tiny.
The manufacturer (TSMC, Intel, etc.) knows roughly where they will be.

When you have 8 cores with just logic and minimal IO, you get around a 100-150mm2 die.
When your dies are 32 cores with 100 PCIe lanes and 8-channel RAM, that results in 700mm2, and if you have 3 defects randomly around the wafer, you will always lose 3 full chips; a defect might also hit the PCIe lanes or the memory controllers.
An 8-channel IMC and 100 PCIe lanes can easily take 30-50% of the die, so that would be 250-350mm2.

A new node comes out: logic still benefits, and without any other architectural changes your 32-core area will drop from, let's say, 350-400mm2 to 200mm2.
Meanwhile those 150mm2 chiplets will drop to 70-90mm2.

But advances to a smaller node no longer provide scaling for the IMC, PCIe and other IO, which means that from an IO area of around 250-350mm2 you will save 50mm2 if you're lucky.
Best case, a process advantage will give you a final monolithic die size of 400-500mm2.

This might force longer on-chip paths between the many cores, algorithms to keep the latency consistent within the chip, design around signal areas, and many other drawbacks during design to accommodate signal integrity for your chip logic.

Then there are defects that kill the chip, and there are "defects" like areas of the wafer that produce lower-quality silicon that cannot hit target clock speeds.

If you have a huge monolithic die, that might result in 5 of 32 cores working without being able to reach the target clock speed at the target voltage.
This will force you to drop clocks for all cores.

Meanwhile the chiplet approach will generally give equal 8-core dies, and you can bin the best for 32 cores.

Wafer size is limited; wafer capacity is limited. Production takes 3 months from start to finish.

If you need to allocate more wafers to a different mask to react to the market, it will take at least 3 months.

The number of monolithic dies you get from a wafer will be 75; a process node advantage will produce 110.
The number of chiplets you get from a wafer will be 410; a process node advantage will produce 750.

If all the dies were perfect, this would give you 75 or 110 monolithic 32-core CPUs vs 102 or 187 chiplet CPUs.

With 10 defects on the wafer, you lose 13% of your monolithic production, but only 2.4% of your chiplets.
On the new node that becomes 9% vs 1.3%. Notice the scaling?
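The loss figures above follow from simple area math. Here is a hedged sketch under a Poisson defect model; the usable wafer area, die sizes and defect density are illustrative assumptions in the spirit of the post, not its exact figures, and it ignores the core-salvage trick discussed earlier in the thread:

```python
import math

def dies_and_yield(die_mm2, wafer_mm2=70_000, d0_per_cm2=0.1):
    """Die candidates per wafer (ignoring edge loss) and the fraction
    expected to be defect-free under a Poisson defect model."""
    candidates = wafer_mm2 // die_mm2
    good_fraction = math.exp(-d0_per_cm2 * die_mm2 / 100)
    return candidates, good_fraction

for name, area in [("monolithic 32-core", 700), ("8-core chiplet", 150)]:
    n, y = dies_and_yield(area)
    print(f"{name}: {n} candidates, {y:.0%} defect-free, ~{int(n * y)} good dies")
```

The small die wins twice: more candidates fit on the wafer, and each one is far more likely to be defect-free; in practice both sides claw back some of the losses by fusing off defective cores.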


This is only scratching the surface of the economics behind these choices, because it also opens up hybrid mixes of chiplets for custom designs.

I bet that at this moment there are multiple teams at both Nvidia and Intel working their asses off on a chiplet design and the interconnects.
EMIB by Intel has already shown how much better and more scalable it can become.


----------



## adolf512

betam4x said:


> If you are a gamer, you absolutely should avoid both the Intel and AMD parts because AMD will have V-Cache parts drop next year.


Given the gaming performance difference between the 13900K and 7950X, it does not seem like adding 3D V-Cache is going to allow AMD to surpass Intel in gaming.

A lot of tests are GPU-bottlenecked to a very large extent, causing the performance difference in gaming to seem smaller than it really is. At best the 7800X3D will beat the stock 13900K; it will not beat a 13900KF that has been heavily overclocked.


> The comments about Zen 4 being a disaster are comical to me. Just because it isn’t a chart topper doesn’t mean AMD isn’t selling chips.


Zen 4 sales are awful, so yes, it's a disaster.

Given the price AMD is charging for Zen 4 CPUs now, they very much needed to top the charts.


----------



## adolf512

Marios145 said:


> Complete lack of understanding engineering.
> 
> AMD creates one chiplet, uses it from desktop to server to HPC to supercomputers to custom designs.
> 
> Manufacturing design masks cost at least 70-100M $ each, or used to in the past, who knows how much they cost now.
> By having the exact same chiplet used in 5-10 different markets, you just saved at least half a billion.
> 
> All R&D goes into the same design - much higher potential for future technologies.


Sure, the chiplet approach might make R&D cheaper, but you can also save money by not improving IPC much; it seems like AMD did both with Zen 4.



> During manufacturing, defects become predictable and usually they are very tiny
> Manufacturer (TSMC, Intel etc..) knows where they will be.
> 
> When you have 8 cores with just logic, and minimal IO you get around a 100-150mm2 die.
> 
> When your dies are 32 cores with 100PCI-E lanes and 8 channel RAM that results into 700mm2, and you have 3 defects randomly around the wafer, you will always lose 3 full chips, or it might hit the PCIE lanes, or the memory controllers.
> 
> 8 channel IMC and 100 PCIE lanes can easily take 30-50% of the die, so that would be 250-350mm2.
> 
> New node is out, logic still benefits and without any other architectural changes, your 32 core area will drop from let's say 350-400mm2 to 200mm2.
> Meanwhile those 150mm2 chiplets, will drop down to 70-90mm2
> 
> But...advances to a smaller node are no longer providing scaling for IMC, PCIE and other IO, which means that from your IO area which would be around 250-350mm2 you will save 50mm2 if you're lucky.
> 
> Best case, a process advantage will give you a final monolithic die size of 400-500mm2.


The chiplet approach does make more sense for the data-center segment, since it isn't as sensitive to latency and has a much larger IO die area.



> Then, there are defects that kill the chip and there are "defects" like areas on the wafer that produce lower quality of silicon, that cannot hit target clocks speeds.
> 
> If you have a huge monolithic die, that might result in 5/32 cores working without being able to reach target clockspeed at target voltage.
> This will force you to drop clocks for all cores.
> 
> Meanwhile the chiplet approach, will generally give equal 8core dies and you can bin the best for 32 cores.


There really isn't much of a point in having all 32 cores able to reach something like 6GHz.

I have seen talk about cherry-picking chiplets for better performance, but we really haven't seen anything good come from that.



> I bet that at this moment there are multiple teams in both nvidia and intel working their asses off to get a chiplet design and the interconnects.
> EMIB by Intel already showed how much better and scalable it can become.


Yes, it might be the case that they are able to come up with something much better than the garbage AMD implementation.

With Meteor Lake, Intel will still keep all cores on the same die, but they will have a separate IO die, which might add latency to RAM reads and writes.


----------



## Blameless

Imglidinhere said:


> Still, the only 4090s in stock are for $2000+ pricing. I wonder why.


Because all the cheap ones are going to sell out until demand abates or supply increases.

The RTX 4090 is selling quite well though, given its price segment.


----------



## dagget3450

adolf512 said:


> For gaming it's pretty clear that monolithic is the way to go to get the best performance. The reason AMD doesn't do that is that they prioritize competing in the server-space which is why gamers get sub-par CPUs when they buy from AMD.
> 
> Making a 16 p-core monolithic CPU would be relatively easy, intel could have done that by ditching the e-cores and iGPU.
> 
> 
> This is a very interesting example of delivering a very large monolithic die with high yield (supposedly 100%).
> 
> If you make sure no part of the CPU is vital you can use chips that has defects (disabling cores, disabling memory channels, etc).


You glossed over my point, though...

This was an example of the insanity of large monolithic dies:

insane power requirements, and an insane price, as that Cerebras CPU is a cool couple million bucks.


----------



## adolf512

dagget3450 said:


> You glossed over my point though...
> 
> This was an example of the insanity of large monolithic dies..
> 
> Insane power requirements, and insane price as that Cerebras CPU is a cool couple million bucks.


The power requirements wouldn't have been any lower with a chiplet approach; the more total die area, the higher the power consumption.

Even if Nvidia could stitch two 4090 dies together without latency, there would still be the issue of cooling. To get double the compute TFLOPS you would also need to double the power budget to 900 W, and from that you would maybe get 50% better performance due to poor scaling (difficulty actually utilizing all the shaders).


----------



## Imglidinhere

adolf512 said:


> The power-requirements wouldn't have been any less with a chiplet approach. The more total die-area the larger power-consumption.
> 
> Even if nvidia could stitch 2 4090 dies together without latency there would still be the issue of cooling. To get double the compute TFLOPs you would also need to double the power-budget to 900W and from that you would maybe get 50% better performance due to poor scaling (difficulty actually utilizing all the shaders).


You talk like someone who has zero knowledge of how anything works.


----------



## ThatGuyJD

adolf512 said:


> Making a 16 p-core monolithic CPU would be relatively easy, intel could have done that by ditching the e-cores and iGPU.





adolf512 said:


> I think it would have been better to ditch the iGPU and e-cores and do a model with 16-p cores and avx-512, maybe intel will do something like that next year, we will see.


And then have it pull 600 W at stock, or what? With an absolutely uncoolable heat density, it would be unusable for consumers; you would need below-ambient cooling and amazing thermal transfer to keep it from constantly throttling.


----------



## adolf512

Imglidinhere said:


> You talk like someone who has zero knowledge of how anything works.


It's basic physics. Chiplets have higher power-draw and worse performance for the same frequency.


----------



## adolf512

ThatGuyJD said:


> And then have it pull 600W at stock or what, with absolutely uncoolable heat density that means it's unusable in consumer and you're going to need below ambient cooling and amazing thermal transfer cooling to keep it from just always throttling?


The Raptor Lake p-cores are just fine in terms of efficiency when you don't insist on running them at something like 6 GHz.

Adding more cores generally makes MT more efficient, since you then don't need to push frequencies as high for the same performance.
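As a rough sketch of why wider-and-slower tends to win on efficiency: dynamic power scales with roughly the cube of frequency once voltage has to rise with clocks. All numbers below are illustrative assumptions, not measured silicon data:

```python
# Rough DVFS sketch. Dynamic power ~ C*V^2*f, and in the upper DVFS range
# voltage must rise roughly linearly with frequency, so power ~ f^3.
# base_freq and base_core_w are assumed illustrative values.

def power(cores, freq_ghz, base_freq=4.0, base_core_w=10.0):
    """Total dynamic power, with per-core power cubed in frequency."""
    return cores * base_core_w * (freq_ghz / base_freq) ** 3

def throughput(cores, freq_ghz):
    """Idealized MT throughput: linear in cores and frequency."""
    return cores * freq_ghz

# 8 cores at 5.5 GHz vs 16 cores at 2.75 GHz: identical idealized
# throughput, but very different power under this model.
p_narrow = power(8, 5.5)     # ~208 W
p_wide = power(16, 2.75)     # ~52 W
print(f"8c @ 5.5 GHz: {p_narrow:.0f} W, 16c @ 2.75 GHz: {p_wide:.0f} W")
```

Real chips don't scale this cleanly (voltage floors, uncore power, imperfect MT scaling), but the cube law is why doubling cores at lower clocks can cut package power so sharply.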


----------



## Mr.N00bLaR

adolf512 said:


> The raptor lake p-cores are just fine in terms of efficiency when you dont insist on running them at like 6ghz.
> 
> Adding more cores generally makes MT more efficient since you then do not need to push frequencies as high for the same performance.



At this point you are clearly just trolling. You have denied clear explanations. Your argument only amounts to you not liking AMD's products or AMD as a company. While nobody else here seems willing to say you are clearly trolling, I will: you are trolling.

You have posted your opinion as fact several times and are only interested in arguing. This thread should be locked so future readers can find the good bits of info linked, and the trolling can stop.


----------



## 99belle99

And his username Aldof. End of story.


----------



## Arni90

adolf512 said:


> The difference isn't particularly large with alder/raptor lake. An e-core takes up around 1/3 of the space a p-cores takes up while delivering around 40% of the MT performance.


It's more like 33% of the space with 60% of the MT performance in most cases on HWBot, which is a far better tradeoff than you're implying. x265 encoding sees double performance with E-cores enabled on a 13900K.

As for your suggestion of removing the iGPU and adding 8 more P-cores, what's stopping Intel from removing the iGPU, adding another 8 E-cores, and using the smaller area for cost savings?
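For what it's worth, taking the rough area and throughput figures thrown around in this exchange at face value (they are ballpark claims from the thread, not measurements), the perf-per-area math works out like this:

```python
# Perf-per-area comparison using the thread's rough figures: an E-core at
# ~1/3 the area of a P-core, delivering 40-60% of P-core MT throughput.
# These are assumed illustrative numbers, not die-shot measurements.
e_core_area = 1 / 3  # relative to a P-core

for e_core_perf in (0.4, 0.5, 0.6):  # low/mid/high estimates from the thread
    ratio = e_core_perf / e_core_area
    print(f"E-core at {e_core_perf:.0%} of P-core MT: "
          f"{ratio:.1f}x the throughput per unit area")
```

Even the most pessimistic 40% estimate gives the E-core a per-area advantage, which is the whole argument for spending freed-up iGPU area on E-cores rather than P-cores.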


----------



## adolf512

Arni90 said:


> It's more like 33% of the space with 60% of the MT performance in most cases on HWBot, which is a far better tradeoff than you're implying. x265 encoding sees double performance with E-cores enabled on a 13900K.


7-Zip seems to do better with the e-cores than Cinebench (around 50% of the p-core MT, assuming linear performance scaling with frequency); with wPrime I got 41%.

Not sure where you got 60% from. Did you forget that the p-cores have HT?

The main issue with the current e-core implementation is that it resulted in the loss of AVX-512 support.

I wonder if the e-cores at stock beat the R5 3600 at gaming; I need to do some benchmarks before I upgrade.


----------



## ThatGuyJD

adolf512 said:


> The raptor lake p-cores are just fine in terms of efficiency when you dont insist on running them at like 6ghz.
> 
> Adding more cores generally makes MT more efficient since you then do not need to push frequencies as high for the same performance.


I'm not paying for 16 P cores to run them at half speeds. F that.
I don't care about power efficiency when I'm pushing OCs. My argument wasn't about the 600 W; it was about being unable to cool it at all.


----------



## adolf512

ThatGuyJD said:


> I'm not paying for 16 P cores to run them at half speeds. F that.


You don't actually need to reduce the frequency of the p-cores that much to drastically cut the power-consumption. 



> I don't care about power efficiency when I'm pushing OCs. Wasn't my argument for the 600W, was being unable to cool it at all.


Part of the issue seems to be poor IHS design; Zen 4 has a similar issue, and I'm not sure which is worse in that respect.

With an improved IHS or delidding, 6 GHz on 16 p-cores should be achievable on water. If you go custom loop there isn't really any limit to how much heat you can dissipate from the water (you can just add more radiators if needed); the limitations are the waterblock and the IHS.


----------



## CynicalUnicorn

adolf512 said:


> Seems fine since a lot of the other 34% can be used for a 48-core version and still sell for a lot of money.


They'll sell for _less money_, is the thing. The question AMD is asking is not "can we make money?" but "how can we make the most money possible?" The answer to the first question is yes or no, but the answer to the second question is a number.



> While this does make some sense the actual IO portion of the 13900K isn't particularly large.
> 
> The issue is that if there is an defect there they might end up having to throw away the entire chip.


I didn't mention the 13900K or Intel, but your second point is a very good reason to use chiplets: if enough of the system agent doesn't work, then the cores and graphics have to get trashed too.




> AMD cherrypicked games where they looked good (such as the very AMD friendly AC valhalla), the 7900xtx isn't coming close to the 4090 in raytracing or productivity.


Yeah I'm factoring that in. If you look at how AMD and Nvidia compare in those particular titles then you can extrapolate and estimate the difference in all titles, which looks to be less than 10%.




adolf512 said:


> The reason the 4090 is 1600$ is not because of the cost of the die (which isn't even 20% of that) the reason it's 1600$ is that nvidia has 30-series to sell and they do not want to lower the price on these further.
> 
> 4090: 608 mm² die
> 
> 4080 16GB: 379 mm² die
> 
> If cost were linear to die size, the 4080 16GB would be $997


Then I'm not sure you understand yields at all because there shouldn't be a linear relationship between those quantities.


----------



## adolf512

CynicalUnicorn said:


> They'll sell for _less money_, is the thing. The question AMD is asking is not "can we make money?" but "how can we make the most money possible?" The answer to the first question is yes or no, but the answer to the second question is a number.


Nvidia solved that issue by having 5 different consumer ga102 variants so they could squeeze out every $ from their defective monolithic ga102 dies.



> I didn't mention the 13900K or Intel, but your second point is a very good reason to use chiplets: if enough of the system agent doesn't work, then the cores and graphics have to get trashed too.


It does seem like the chiplet approach has fewer drawbacks and more benefits when it's used only for the IO section.


> Then I'm not sure you understand yields at all because there shouldn't be a linear relationship between those quantities.


Of course the percentage of dies with defects goes up when the chip is larger, but if we look at the pricing of Nvidia GPUs we actually see that you get fewer enabled transistors per dollar when you buy one of their cheaper versions. This is an indication that Nvidia could have made a version with a significantly larger die if they wanted to, but decided against it for various reasons.

The worst-value 30-series cards right now are the 3050 and 3070 Ti; the 3070 Ti uses the full die, which limits the supply (so Nvidia cannot sell it too cheaply).

We don't really see the supposed issues with large monolithic dies when it comes to the 4090. It's not worse value than variants with smaller dies (it's actually better value), and cooling isn't a problem either; the main problems were the 16-pin power connector melting and getting less performance with weaker PSUs.


----------



## Marios145

This guy is a troll


----------



## 99belle99

Doubters watch this.


----------



## 99belle99

He reckons AMD have a better card coming out next year.


----------



## adolf512

99belle99 said:


> Doubters watch this.


These AMD fanboy channels are desperate to spin AMD's latest chiplet failure into some victory.

RT performance is worse than the 3090's in one of the games cherrypicked by AMD.

1% lows will be even worse.


----------



## CynicalUnicorn

I wonder if Adored still thinks about me. He got really mad at me on Twitter for saying that the obviously wrong things he was saying were obviously wrong, and then he blocked me, and more than a year later he referenced something I said which means he probably still gets mad thinking about me to this day. Funny as **** lmao.

I didn't realize he was back to making computer videos. After everybody cyberbullied him for mispredicting Zen 2 that badly, he pivoted to the sorts of rational skeptic videos that went out of vogue ten years ago after YouTube atheists got bored of debunking creationism. So I guess he's back to computers...? He does still have the website and afaik there's only one guy writing for it who actually knows what he's talking about. Adored paid one of his writers like 10% the industry rate, which is hilarious.

I used to have a folder of funny screenshots related to him but I deleted it so I could pretend that I'm above that sort of stupid drama. Oh well. tl;dr don't trust his videos.


----------



## adolf512

CynicalUnicorn said:


> I wonder if Adored still thinks about me.
> 
> tl;dr don't trust his videos.


This goes for fanboy channels in general.

LAwLz did a compilation on "Moore's Law Is Dead" and his track record is terrible.

That's another fanboy channel to avoid.


----------



## Blameless

adolf512 said:


> Of course the percentage of dies with defects goes up when the chip is larger, but if we look at the pricing of Nvidia GPUs we actually see that you get fewer enabled transistors per dollar when you buy one of their cheaper versions. This is an indication that Nvidia could have made a version with a significantly larger die if they wanted to, but decided against it for various reasons.


End user pricing of the GPUs doesn't indicate much of anything about yields, because the GPU die itself is a small fraction of total board cost and retail prices are set by marketing departments relative to the competition. Manufacturing costs are a much smaller influence. Even a -50% or +100% die cost would probably result in the same retail price, with NVIDIA absorbing or pocketing the difference.

If an entire RTX 4090 cost five dollars to make, the retail price would still be 1600 bucks. Because it has no competition they could probably raise prices if they needed to, but if it did have competition, it would cost 1600 bucks, or less, even if they had to sell it at a loss.

Hopper GH100 is just about the largest die that could be physically patterned. Yields can't be good, but since these things are going into enormously expensive parts, that's justifiable. AD102 is quite a bit smaller, and while yields are probably not spectacular, they're almost certainly good enough for what NVIDIA is trying to do.



adolf512 said:


> We don't really see the supposed issues with large monolithic dies when it comes to the 4090.


NVIDIA certainly does.

You've underestimated the loss from catastrophic defects, completely ignored parametric defects, and haven't taken into account the lost yields from simply having giant squarish dies on a circular wafer. Even if NVIDIA was magically seeing a defect rate of zero, ~24% of a 300mm wafer would still be lost because an entire AD102 cannot physically be placed in an area that is less than ~26*24mm (Die-Per-Wafer Estimator).

There is no way around the fundamental fact that larger dies are more expensive for any given number of working transistors than smaller ones. No matter how many salvaged SKUs one has of a given die flavor, yields are still exponentially worse as die size increases.
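That exponential relationship can be sketched with the simple Poisson yield model often used for first-order estimates. The defect density below is an assumed illustrative value, not a foundry figure:

```python
import math

# Poisson yield model: the fraction of fully defect-free dies falls off
# exponentially with die area. Salvaged SKUs recover many partially
# defective dies, but the clean-die fraction still drops sharply.
def poisson_yield(die_area_mm2, defects_per_cm2=0.1):
    """Fraction of dies with zero defects, assuming Poisson-distributed
    defects at the given (assumed) density."""
    defects_per_die = defects_per_cm2 * die_area_mm2 / 100.0
    return math.exp(-defects_per_die)

# AD102-sized die (~608 mm^2) vs a Zen-chiplet-sized die (~70 mm^2):
big = poisson_yield(608)   # roughly half the dies come out fully clean
small = poisson_yield(70)  # the vast majority come out fully clean
print(f"608 mm2: {big:.1%} defect-free; 70 mm2: {small:.1%} defect-free")
```

The asymmetry only grows as defect density rises on newer nodes, which is exactly the cost-benefit shift described above.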

There are costs to chiplets that go down as practical experience with them increases and benefits that go up as wafer costs increase. When the cost benefit ratio is favorable, NVIDIA will switch over, because that is what will make them the most/cost them the least money. AMD is already there.



adolf512 said:


> the main problem was the 16-pin power-connector melting and you getting less performance with weaker PSUs.


Neither of those are major problems. The recommended PSU for the RTX 4090 is overkill, and there have been maybe a few dozen credible reports of defective adapters and a handful of non-adapter users with burned 12VHPWR connectors, out of ~100k 4090 units. This probably isn't much higher than the failure rate of eight-pin connectors. I still recommend hedging one's bets with a $20 non-NVIDIA adapter/cable as insurance against the small risk of harming one's $1600+ card, but that's just playing it safe. The problem has been vastly overstated.


----------



## adolf512

Blameless said:


> Even if NVIDIA was magically seeing a defect rate of zero, ~24% of a 300mm wafer would still be lost because an entire AD102 cannot physically be placed in an area that is less than ~26*24mm (Die-Per-Wafer Estimator).


24×26 is 624 mm², which is bigger than the AD102 die size of 608 mm².

I calculated the edge loss and got 16.6% when assuming no edge clearance.
I tried finding the exact dimensions of AD102 but I couldn't find them anywhere.
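For what it's worth, plugging the assumed ~24×26 mm footprint into a common die-per-wafer approximation (which ignores scribe lines and any edge exclusion zone) lands close to the ~24% figure quoted earlier:

```python
import math

# Common die-per-wafer approximation: wafer area over die area, minus a
# correction term for rectangular dies that cannot tile the curved edge.
# The 24x26 mm footprint for AD102 is an assumption from this discussion,
# not an official figure.
def dies_per_wafer(die_w_mm, die_h_mm, wafer_d_mm=300):
    area = die_w_mm * die_h_mm
    return int(math.pi * (wafer_d_mm / 2) ** 2 / area
               - math.pi * wafer_d_mm / math.sqrt(2 * area))

whole_dies = dies_per_wafer(24, 26)
wafer_area = math.pi * 150 ** 2
# Fraction of wafer area not covered by whole dies
edge_loss = 1 - whole_dies * 24 * 26 / wafer_area
print(f"{whole_dies} whole dies per 300 mm wafer, "
      f"~{edge_loss:.1%} of wafer area lost")
```

The gap versus a naive 16.6% estimate comes from the correction term: dividing wafer area by die area overcounts, because rectangular dies cannot use the curved edge efficiently.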
> End user pricing of the GPUs doesn't indicate much of anything about yields, because the GPU die itself is a small fraction of total board cost and retail prices


That's actually a case against the chiplet approach since it isn't going to save much money anyway while making performance worse.


----------



## Imglidinhere

adolf512 said:


> It's basic physics. Chiplets have higher power-draw and worse performance for the same frequency.


Yup, troll. No one makes statements like this anymore. Shoo troll, don't bother me.


----------



## Imglidinhere

adolf512 said:


> You don't actually need to reduce the frequency of the p-cores that much to drastically cut the power-consumption.
> 
> 
> Part of the issue seems to be poor IHS design, zen4 has a similar issue, not sure which is worse in that respect.
> 
> With improved IHS or delidding 6 ghz on 16 p-cores should be achievable on water. If you go custom loop there isn't really any limit to how much heat you can cool off from the water (you can just add more radiators if needed), the limitation is the waterblock and IHS.


GN already did power testing on the 7950X. At 158w draw, the CPU runs 30C cooler and loses 5% total performance. That's it. It's not the IHS design that's the issue, it's the fact that you're pushing 250 watts through an area the size of a thumbnail. You talk about basic physics, there's some basic physics for ya.


----------



## ENTERPRISE

Locked. I think this thread has run its course.


----------

