Hardware Analysis
      
/ Forums / ST Micro's Kyro II & Hercules' 3D Prophet 4500
 

  Another great article 
 
Noite Escura Jun 18, 2001, 11:11am EDT
Replies: 72 - Views: 7116
I've just read the article about the Kyro II, and I want to say that I find your articles and reviews the most interesting, clear, and well written that I've found these days. They explain things well for non-expert users and provide good, to-the-point info for knowledgeable people. I really enjoy reading your articles. Keep up the good work, people.
As for the Kyro, I have to say it is the ideal choice for the low/medium gamer who doesn't have tons of money to spend on the 'latest trick'. It would surely be my next acquisition if I could find one where I live.


Dan Mepham Jun 18, 2001, 12:02pm EDT
Glad you enjoyed it. :)

Dan Mepham

Editor in Chief, Hardware Analysis
Email : dmepham@hardwareanalysis.com
Visit us at : http://www.hardwareanalysis.com

Dan Mepham Jun 18, 2001, 11:18pm EDT
And by the way, I'm sure you could find a vendor in the US that would ship internationally.

Try http://www.pricewatch.com. It's an online database of a bunch of vendors, mostly in the US, but many of them ship internationally.

Dan Mepham

Joe Birney Jun 22, 2001, 04:17pm EDT
>> Nice!
Yes, very nice article. Can you post the color depth and resolution that your 3DMark tests were run in? Just wondering..

I remember seeing a RivaStation review of the K2 where, using a faster CPU, the K2 came out on top in the 32bit color, higher res case.....

http://www.rivastation.com/review/3dprophet_4500/p4_k2_3dmark2k.GIF

Not saying your or their numbers are right/wrong, just saying that a faster processor really, really helps the K2 when it has to do TnL.


Also don't forget about the NAOMI, which was PowerVR's arcade system. I don't remember the specs, but it was a tile based solution that had a TnL unit capable of around 25 million triangles/sec. They do have experience with TnL and tilers. That system dates back before the original GeForce card. Because of NAOMI, I am not buying some of the web hype that there is any issue with TnL and tilers. They did it before. Hopefully they can do it again. I would provide links to the B3D forum where I first heard them talking about TnL, NAOMI and tilers, but their site is still down.

Still, great article you did!

Darren Teasdale Jun 22, 2001, 05:07pm EDT
I just read the review and I really liked it. I have some info you might want to add, though.

First, the low 3dmark2001 score. The problem here is not all down to the lack of HW T&L; the problem is a bug in DX8. When DX8 was first being made, Kyro 1 wasn't even out (it was made but not released), and since it was so low profile MS didn't have a Kyro 1 board. Because of this they actually disabled features for Kyro 1 that they didn't know for sure it had. What they disabled was HW rendering into textures, which is a very common feature and is supported in HW by most cards. This feature is used for effects such as dynamic shadows and more, and Kyro 1 and II do support this feature in HW. But MS weren't sure, so it was disabled, and because of this, when any game uses rendering into textures on a machine with a Kyro II and DX8 installed, the effect must be done in software by the system CPU. 3dmark2001 uses rendering into textures for dynamic shadows in the high detail game tests, and because of the DX8 bug the system processor is being used for those shadows, which makes the Kyro II highly CPU limited in that benchmark.

DX8.1 beta build 620 finally fixed this bug for Kyro 1 and II cards. The only problem is that it's still beta, and it causes another problem with Kyro and 3dmark2001: it causes the Kyro 1 and II to render everything in the high detail lobby test (no other game tests in 3dmark2001, or any other benchmarks or games, are affected by this problem AFAICS). So obviously, when drawing everything, the Kyro's advantage of HSR becomes non-existent and it gets a worse score in that test than it used to get in the version of DX8 with the rendering into textures bug in it. However, this graphical bug will be fixed in the final release of DX8.1, and on the positive side, with DX8.1 beta build 620 Kyro 1 and II boards are a lot less CPU limited in 3dmark2001 (and other games that use rendering into textures), and Kyro 1 and II boards can now also run the 3dmark2001 demo, which requires HW rendering into textures.

I tested the high detail tests of 3dmark2001 with both the older version of DX8 (the one with the rendering into textures bug) and build 620. The tests were done at 640x480, as this is a CPU limited resolution (which is especially necessary considering the fillrate problem in the high detail lobby test, because of the bug that forces Kyro to draw everything), and these are my results on my Duron 950 system:

High detail car chase test:

Old DX8 = 5fps
DX8.1 build 620 = 13fps

High detail lobby test:

Old DX8 = 15fps
DX8.1 build 620 = 23fps

High detail dragothic test:

Old DX8 = 15fps
DX8.1 build 620 = 16fps

So as you can see, apart from the dragothic high detail test (which doesn't use much rendering into textures anyway), all the framerates rise substantially with the new fixed DX8.1 (roughly 1.5x in the lobby test and 2.6x in the car chase test). My overall score at 640x480 went up from 2000 to 2350, and once the final DX8.1 is out and the 3dmark2001 high detail lobby test is fixed, I expect default 1024x768 tests to get a similar speed increase.
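
As a quick check of those numbers, here is a minimal Python sketch that computes the speedup for each test from the frame rates and overall scores listed above. The dictionary keys and the `speedup` helper are just illustrative names; only the fps and score values come from the post.

```python
# Frame rates quoted in the post (640x480, Duron 950): (old DX8, DX8.1 build 620).
results = {
    "car chase (high detail)": (5, 13),
    "lobby (high detail)": (15, 23),
    "dragothic (high detail)": (15, 16),
}


def speedup(old_value, new_value):
    """Return new_value / old_value as a plain ratio."""
    return new_value / old_value


for test, (old_fps, new_fps) in results.items():
    print(f"{test}: {old_fps} -> {new_fps} fps, {speedup(old_fps, new_fps):.2f}x")

# Overall 3dmark2001 score quoted in the post: 2000 -> 2350.
print(f"overall score: {speedup(2000, 2350):.3f}x")
```

By this calculation the car chase test gains about 2.6x, the lobby test about 1.5x, and the dragothic test about 1.07x, while the overall score rises by 17.5%.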

Also, the Kyro 1 wasn't the first tile based chip from Imagination Technologies. The first was called PCX1 and was out at the same time as the Voodoo1; then there was PCX2, Neon250, the graphics chips for the Sega Dreamcast and the Sega Naomi 1 and 2 arcade systems, and then Kyro 1 and II. The problem with the other chips, especially the early PCX1, was a lack of compatibility. The Neon250 was a good chip and faster than anything else around when it was made; however, Imagination Tech and their then partner NEC concentrated on the Dreamcast graphics chip and the Sega arcade machine, which both used a very similar variant of Neon250 called PowerVRDC.

Also, when you say that it's hard to make a HW T&L engine with a tile based rendering card, you might like to know that Imagination Tech has had a tile based rendering system with HW T&L out for years. As I've already said, the Sega Dreamcast console and Sega Naomi 1 and 2 arcade machines had tile based rendering chips from Imagination Tech. The Naomi 2 arcade system used 2 PowerVRDC (Dreamcast) chips along with an Imagination Tech HW T&L unit called Elan that was capable of 10 million polys per second with 6 HW lights. If they can do it in arcade then they can do it in Kyro III.

Dan Mepham Jun 22, 2001, 07:10pm EDT
Hey guys .. thanks for all the comments.

Regarding the 3DMark tests, both 2000 and 2001 were run in the default mode (Default benchmark). That means 1024x768x16 in 3DMark2000, and 1024x768x32 in 3DMark 2001. (Unless specified otherwise, if you see a 3DMark benchmark on HA, assume it was in default mode..)

Regarding the performance - thanks, I wasn't aware of this issue. I'll be sure to check out the DX8.1 beta and see what sort of numbers I get. If the results are noteworthy, I'll update the review when DX8.1 officially launches (but not until then - I don't want to post benchmarks on a Beta DX that causes other issues)

The comment about it being difficult to implement a T&L unit similar to nVidia's on a Tile renderer came from software designers and engineers. To be honest, I'm not in a position to validate that one way or another - I don't have enough knowledge of the design of either the Kyro or a T&L unit to know how difficult that would be. Was merely repeating the opinions of those in the know. :-) However, like I said, STMicro & Imagination say they CAN, so I say... best of luck, I hope they're right.

Take care!
Dan Mepham

Darren Teasdale Jun 22, 2001, 07:37pm EDT
<<<However, like I said, STMicro & Imagination say they CAN, so I say... best of luck, I hope they're right.>>>

Yeah, I sort of guessed you were just repeating what you'd heard, as you seem to like TBR, and it's only the people that hate it, like Tim Sweeney, that like to make up unsubstantiated crap about problems with TBR and HW T&L AFAICS. I say made up because as of yet I've never actually heard any specific argument from anyone on why TBR wouldn't work with HW T&L. After all, all the HW T&L unit does is replace the CPU in doing the T&L, so how could that cause any problem with TBR? BTW, where did you hear a software designer mention difficulty between HW T&L and TBR? And did they give any real argument why it was hard to do? Or did you just read something Tim Sweeney said? In which case he's a strange, strange man and I personally wouldn't use his *opinion* on TBR to wipe my arse :)

Also, as I've already said, it's not a matter of whether they can do HW T&L, because they already have, as proved by Naomi 2. I think since you mentioned rumours of possible difficulties with HW T&L and TBR in your review, you should also mention the fact that IMGTEC have already made a TBR chip with HW T&L in the very famous Sega arcade system, just to level it out a bit.

Ben Skywalker Jun 23, 2001, 08:44am EDT
>> TBR+T&L
"I say made up because as of yet I've never actually heard any specific argument from anyone on why TBR wouldn't work with HW T&L. After all, all the HW T&L unit does is replace the CPU in doing the T&L, so how could that cause any problem with TBR?"

It isn't that it can't work, it is simply more difficult to do.

With TBRs you need to bin all your geometry data after it has been transformed. Here is why this causes problems:

Vertex data is transmitted over the AGP bus to the TBR.

The T&L unit then needs to process the data and write it back to the "bin". This creates two potential problems. One is that with a powerful T&L unit being exploited you need additional RAM for binning purposes (though that is simply due to larger amounts of geometric data and would be the case with or without hard T&L). The other is that you are using twice the local memory bandwidth for vertex data that an IMR requires.

Then the geometry data needs to be read back from the bin for a visibility check before final rasterization (again, chewing up more bandwidth).

The situation becomes more problematic if you want to cache vertex data on board, eliminating the AGP bottleneck, as you need to handle the vertex data three times before final rasterization (an IMR only needs to handle it once, caching locally or not).

This is the reality of the situation, and it is definitely more difficult to add hardware T&L to a TBR than it is to an IMR. It is certainly possible; in highly specialized settings such as arcade units it has already been done for some time now.

Darren Teasdale Jun 23, 2001, 09:17am EDT
Ben Skywalker, this to me doesn't sound incredibly difficult to overcome, and as you said it would occur with or without a HW T&L unit, as a CPU can still start to push more polys than can be held in the small space set aside for poly binning, especially on a 32mb board. I was thinking that they were talking about some problem that was really specific to HW T&L and TBR that would mean they couldn't work together, which is why I said it was made up, as HW T&L and TBR will obviously work together, if only to do around 10mpps like the Kyro II 64mb can already safely store, just to take that transform and lighting weight off the CPU (although I'm not saying for a minute that Kyro III will only do 10mpps, because it will do a hell of a lot more). If this is the only problem people are talking about then they need not worry, because IMGTEC have it sorted.

Ben Skywalker Jun 23, 2001, 09:46pm EDT
"Ben Skywalker, this to me doesn't sound incredibly difficult to overcome, and as you said it would occur with or without a HW T&L unit, as a CPU can still start to push more polys than can be held in the small space set aside for poly binning, especially on a 32mb board."

The problem is that TBRs' main advantage over traditional renderers is the significantly lower bandwidth needed; with extremely large poly loads this edge starts to vanish.

"I was thinking that they were talking about some problem that was really specific to HW T&L and TBR that would mean they couldn't work together, which is why I said it was made up, as HW T&L and TBR will obviously work together, if only to do around 10mpps like the Kyro II 64mb can already safely store, just to take that transform and lighting weight off the CPU (although I'm not saying for a minute that Kyro III will only do 10mpps, because it will do a hell of a lot more). If this is the only problem people are talking about then they need not worry, because IMGTEC have it sorted."

At default settings, the Kyro2 can't handle 10MPps. There was a thread over at Beyond3D where we discussed this at great length, and as of now you need to modify registry settings to be able to get above ~5MPps (enlarge the bin from default settings). That chews into on board RAM, and when combined with the greater bandwidth that higher vertex loads will demand, there will need to be a greater amount of high speed RAM, eliminating a great deal of the cost saving properties that current TBRs have. This is the main problem: with on board hardware T&L you are doubling (at least) the bandwidth needs of handling vertex data, which isn't the case if you simply pair a TBR with a high power CPU.

Of course it is still possible, but high poly loads will create a situation for TBRs somewhat like the one excessive overdraw causes for IMRs at the moment. IMRs simply handle higher poly loads better at this point in time and for the foreseeable future. It isn't unthinkable that solutions could be developed to assist TBRs in early vertex rejection to aid in reducing the poly strain, much as the GF3 is capable of doing now to help eliminate overdraw (when rendering front to back at least), but it is still an additional hurdle for TBRs.

Joe Birney Jun 23, 2001, 10:29pm EDT
Hey Ben,

How have you been? I hardly get a chance to argue... I mean discuss things with you over at Anand and B3D :-)


Agree that TnL will present some challenges. But I don't buy into the high poly gloom and doom yet. The poly counts Ben is talking about will cause all current cards to choke. And choke well. Someday, sure, high polys will be the thing. But there is a huge base of cards out there that will not be able to handle this. Game developers know this, and that's one of the reasons that TnL has never really taken off yet. Game developers will not put high poly counts into their games until they know that everybody can handle them. If they don't wait, they are gonna have poor sales and a less than popular game. For a while the things that will hold you back from high res play are memory bandwidth and pixel fill rate....

Joe Birney Jun 23, 2001, 10:33pm EDT
Doh sorry for the typos folks. I was making some UT maps and did not think to proof read before I hit the "post" button...doh!

Ben Skywalker Jun 24, 2001, 08:18am EDT
"How have you been? I hardly get a chance to argue... I mean discuss things with you over at Anand and B3D"

Been doing good, still lurk over at Anand's, though I won't post until we see the high tech forum come on line. Dave said B3D will be back up soon for good <crosses fingers>

The poly levels I'm talking about will be close to a reality in Unreal2, and Tim has already stated multiple times that the reference rasterizer is a GTS. To push 60FPS you'll need to be cranking out 9Mil to 12Mil polys a second based on the information we have now. Five million per second without hard T&L would get you 25FPS if there was no game code, simply not enough. I'm sure they will offer a geometric complexity slider, but that will leave you with less than optimal graphics on a non hard T&L board (or one without enough power for that matter).
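
The arithmetic behind those Unreal2 figures can be sketched in a few lines of Python. It assumes, as the post does, that game code costs nothing and that poly throughput is the only limit; the function names are made up for illustration.

```python
def polys_per_frame(polys_per_second, fps):
    """Polygons available per frame at a given throughput and frame rate."""
    return polys_per_second / fps


def max_fps(polys_per_second, polys_in_frame):
    """Upper bound on frame rate if poly throughput were the only limit."""
    return polys_per_second / polys_in_frame


# 9-12 million polys/sec at 60FPS, as quoted in the post.
low = polys_per_frame(9_000_000, 60)
high = polys_per_frame(12_000_000, 60)
print(f"polys per frame: {low:,.0f} to {high:,.0f}")

# A board limited to 5 million polys/sec on the same scenes.
print(f"fps at 5M polys/sec: {max_fps(5_000_000, high):.1f} to {max_fps(5_000_000, low):.1f}")
```

So 9 to 12 million polys/sec at 60FPS implies 150K to 200K polys per frame, and a 5 million polys/sec ceiling gives 25FPS at the high end of that range (about 33FPS at the low end).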

The argument over higher res versus geometric complexity is another matter entirely. If given the choice, as Unreal2 is likely to offer for the hard T&L boards at least, it will be interesting to see what people choose. When titles that truly exploit hard T&L, such as Unreal2, will be commonplace is still very much up in the air, and is another factor that should be looked at when discussing its importance, of course.

Current boards' hard T&L units have yet to be tapped by any game, not even close. At what levels they will choke is still very much up in the air. We know now that the nV based offerings have a decisive edge over ATi's in raw power, and that ATi's can ~match today's most powerful CPUs when they are dedicated solely to geometry (which is ~25% of the actual strain in game).

Darren Teasdale Jun 24, 2001, 09:00am EDT
<<<The problem is that TBRs' main advantage over traditional renderers is the significantly lower bandwidth needed; with extremely large poly loads this edge starts to vanish.>>>

And HSR, which is part of the bandwidth saving but also brings massive savings in fillrate. No amount of poly storage is going to get rid of that advantage.

<<<At default settings, the Kyro2 can't handle 10MPps. There was a thread over at Beyond3D where we discussed this at great length, and as of now you need to modify registry settings to be able to get above ~5MPps (enlarge the bin from default settings). That chews into on board RAM, and when combined with the greater bandwidth that higher vertex loads will demand, there will need to be a greater amount of high speed RAM, eliminating a great deal of the cost saving properties that current TBRs have. This is the main problem: with on board hardware T&L you are doubling (at least) the bandwidth needs of handling vertex data, which isn't the case if you simply pair a TBR with a high power CPU.>>>

I never said it could do it at the default settings, and whether it can or not really is irrelevant, as we're not talking about the default settings of a Kyro II; we're talking about Kyro III and the possibilities of TBR with high poly counts. I was in that discussion, and if you remember I was the one that did the tests showing different poly binning sizes. You don't need to do anything to be able to get over 5mpps; if I remember right, my first test with unchanged binning space got around 7.5mpps, and with binning space upped to 5mb I got 9.2mpps. The 64mb Kyro II can easily use 5 or even 6 or more MB for poly binning, because it has easily enough RAM to do that as it saves so much elsewhere; that's why I said that the Kyro II 64mb could safely store 10mpps.

<<<Of course it is still possible, but high poly loads will create a situation for TBRs somewhat like the one excessive overdraw causes for IMRs at the moment. IMRs simply handle higher poly loads better at this point in time and for the foreseeable future. It isn't unthinkable that solutions could be developed to assist TBRs in early vertex rejection to aid in reducing the poly strain, much as the GF3 is capable of doing now to help eliminate overdraw (when rendering front to back at least), but it is still an additional hurdle for TBRs.>>>

Not really, it's a completely different situation. Overdraw cuts into fillrate, which high poly counts don't do. It's very simple to sort out poly storage on a TBR: you could use poly compression and/or you could use a separate bus for poly binning, which could be very cheap, as all it needs to support is the poly storage and nothing more; therefore a 64bit bus and RAM could be used, say 32mb or even 64mb. There are also other ways, I believe.

Dan Mepham Jun 24, 2001, 10:15am EDT
>> Excellent
Excellent thread guys, this is totally the kind of discussion we want around here. :-)

Someone mentioned a thread at Beyond3D - could someone point me to that thread? I'd love to take a look. Thanks!

Dan Mepham

Ben Skywalker Jun 24, 2001, 11:32am EDT
<<<Someone mentioned a thread at Beyond3D - could someone point me to that thread?>>>

Site is down at the moment, and when it comes back up it will be iffy whether the thread is still there or not (issues with the host). If it is still there when the site comes back up I'll gladly post a link :)

<<<And HSR, which is part of the bandwidth saving but also brings massive savings in fillrate. No amount of poly storage is going to get rid of that advantage.>>>

Fillrate is rather moot at the moment; none of the current IMRs can come close to their theoretical peaks because of bandwidth, although the bandwidth savings are definitely of significant importance.

<<<I never said it could do it at the default settings, and whether it can or not really is irrelevant, as we're not talking about the default settings of a Kyro II; we're talking about Kyro III and the possibilities of TBR with high poly counts. I was in that discussion, and if you remember I was the one that did the tests showing different poly binning sizes. You don't need to do anything to be able to get over 5mpps; if I remember right, my first test with unchanged binning space got around 7.5mpps, and with binning space upped to 5mb I got 9.2mpps.>>>

That's why I put the "~" in front of 5MP; I didn't recall exactly what it was, just that it was well below 10MPps.

<<<The 64mb Kyro II can easily use 5 or even 6 or more MB for poly binning, because it has easily enough RAM to do that as it saves so much elsewhere; that's why I said that the Kyro II 64mb could safely store 10mpps.>>>

Wait a second there, it depends greatly on how much vertex data you are dealing with. If you have 20K verts per scene you may well be able to push 500FPS with the tweaks you used, but it is highly unlikely you will be able to push 250K verts per scene @40FPS. This can be worked around, but the fact that it needs to be worked around is what I thought this discussion was about :)
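
To see why those two cases differ even though the throughput is the same, here is a rough Python sketch. The 32 bytes per binned vertex is purely an assumed figure for illustration, not a Kyro II specification, and real hardware bins transformed triangle data per tile rather than a flat vertex list.

```python
BYTES_PER_VERTEX = 32  # assumption for illustration only, not a Kyro II figure


def verts_per_second(verts_per_scene, fps):
    """Total vertex throughput for a given scene size and frame rate."""
    return verts_per_scene * fps


def bin_bytes_per_scene(verts_per_scene, bytes_per_vertex=BYTES_PER_VERTEX):
    """Bin storage needed to hold one whole scene before rasterization."""
    return verts_per_scene * bytes_per_vertex


# The two cases named in the post: same throughput, very different scene size.
for verts, fps in [(20_000, 500), (250_000, 40)]:
    rate = verts_per_second(verts, fps)
    megabytes = bin_bytes_per_scene(verts) / 1_000_000
    print(f"{verts:,} verts @ {fps}FPS: {rate / 1e6:.0f}M verts/sec, ~{megabytes:.2f} MB per scene")
```

Both cases come to 10 million verts/sec, but under that assumption the second needs roughly 8 MB of bin space per scene versus about 0.64 MB for the first, which is the storage pressure being argued about here.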

<<<Not really, it's a completely different situation. Overdraw cuts into fillrate, which high poly counts don't do. It's very simple to sort out poly storage on a TBR>>>

I don't see how it is any different from OD with an IMR. The GeForce3 has shown how effectively it can deal with OD when rendering front to back; to date we have yet to see a TBR that has sorted out the poly issue.

<<<you could use poly compression and/or you could use a separate bus for poly binning, which could be very cheap, as all it needs to support is the poly storage and nothing more; therefore a 64bit bus and RAM could be used, say 32mb or even 64mb. There are also other ways, I believe.>>>

At what cost? Having an entirely separate second bus will be incredibly expensive, in relative terms. Using fast DDR by itself would solve a great deal of the problem, although again there is the cost factor. IMRs have been dealing with their bandwidth issues for some time now. High poly counts are going to create a problem for TBRs, one that must be dealt with in the design process, that IMRs simply don't need to deal with. In that respect I see it as very similar to IMRs: it is going to cost money to work around, and make the end product more expensive for the end user. More expensive than a comparable IMR? Time will tell.

It will definitely be interesting this fall to see Glaze, GF3U, Radeon2, and Kyro3 "slugging it out". We will just have to wait and see how Doom3, Unreal2 and others like them perform on each board before we will be able to make firm conclusions about how much importance this will have for end users and the respective companies' design philosophies.

Darren Teasdale Jun 24, 2001, 01:11pm EDT
<<<Fillrate is rather moot at the moment; none of the current IMRs can come close to their theoretical peaks because of bandwidth, although the bandwidth savings are definitely of significant importance.>>>

You said that bandwidth savings were the main advantage of TBR, as in it can hit its peak fillrate because it's not bandwidth limited, but what I'm saying is that the fillrate it can hit is used more efficiently than on an IMR, irrespective of bandwidth. If you have a TBR with an 800mpixel/s fillrate and an IMR with an 800mpixel/s fillrate, and they can both only actually hit 500mpixels/s because of bandwidth limitations, the TBR will still have much better pixel pushing power, as it won't use that 500mpixels/s to render unseen pixels.
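
That argument can be put into numbers with a small Python sketch. It assumes an idealized case: the IMR draws every covered pixel (no early Z rejection), the TBR's HSR removes all hidden opaque pixels, and the average overdraw of 3 is a hypothetical value, not a measurement.

```python
def visible_fill_mpixels(achieved_fill_mpixels, overdraw):
    """Mpixels/s that end up visible when each visible pixel is drawn
    `overdraw` times on average (overdraw of 1.0 means nothing is wasted)."""
    if overdraw < 1.0:
        raise ValueError("overdraw cannot be below 1.0")
    return achieved_fill_mpixels / overdraw


ACHIEVED = 500.0  # Mpixels/s both chips actually reach in the post's example
OVERDRAW = 3.0    # hypothetical average scene overdraw

imr_visible = visible_fill_mpixels(ACHIEVED, OVERDRAW)  # IMR pays for hidden pixels
tbr_visible = visible_fill_mpixels(ACHIEVED, 1.0)       # idealized HSR: no hidden pixels drawn

print(f"IMR visible fill: {imr_visible:.0f} Mpixels/s")
print(f"TBR visible fill: {tbr_visible:.0f} Mpixels/s")
```

Under these assumptions the same 500mpixels/s yields about 167mpixels/s of visible output on the IMR and the full 500mpixels/s on the TBR; the gap shrinks as overdraw approaches 1 or as the IMR gains its own early rejection.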

<<<That's why I put the "~" in front of 5MP; I didn't recall exactly what it was, just that it was well below 10MPps.>>>

It wasn't really well under 10mpps though, and even if it was at default settings that's not the point, as we're talking about Kyro III and TBR, not specifically Kyro II. I know I was the one who mentioned Kyro II, but my point is that default settings aren't important, as all they are is reg entries and could be changed by IMGTEC at any time. There's no HW limitation that's stopping a 64mb (or even a 32mb for that matter) Kyro II from pushing 10mpps.

<<<Wait a second there, it depends greatly on how much vertex data you are dealing with. If you have 20K verts per scene you may well be able to push 500FPS with the tweaks you used, but it is highly unlikely you will be able to push 250K verts per scene @40FPS. This can be worked around, but the fact that it needs to be worked around is what I thought this discussion was about>>>

AFAIK this discussion is about some people claiming that HW T&L and TBR is "extremely difficult" to implement. Do you agree that it's extremely difficult to implement considering the options I already mentioned? Because I don't. The "extremely difficult" quote is from this review, and so is this quote: "Nevertheless, Imagination & ST Micro insist it can and will be done". That quote is questioning whether it's actually possible to do, which I can't see is a justified question considering the situation we've already discussed. Adding a separate bus and RAM for binning isn't difficult at all.

<<<At what cost? Having an entirely separate second bus will be incredibly expensive, in relative terms. Using fast DDR by itself would solve a great deal of the problem, although again there is the cost factor. IMRs have been dealing with their bandwidth issues for some time now. High poly counts are going to create a problem for TBRs, one that must be dealt with in the design process, that IMRs simply don't need to deal with. In that respect I see it as very similar to IMRs: it is going to cost money to work around, and make the end product more expensive for the end user. More expensive than a comparable IMR? Time will tell.>>>

Why would it bring a high cost? Adding some low clocked 32mb 64bit SDR/DDR RAM to a card would barely cost anything and would provide the storage space for large poly counts. Using DDR RAM wouldn't solve the storage problem, because you'd need a lot of 250mhz 128bit DDR RAM (that's what will be used for Kyro III) for binning with a high poly count. Say it was already going to be 64mb 250mhz 128bit DDR (which is very likely): why double that to 128mb at huge expense just to add the extra 64mb space for binning, when you can have a separate 64bit SDR/DDR bus and RAM dedicated entirely to binning? It could be 64mb at a low cost. Using 64mb 250mhz 128bit DDR RAM as video RAM and 64mb or even 32mb 64bit SDR/DDR RAM for binning is a lot cheaper than using 128mb 250mhz 128bit DDR RAM, and provides just as much binning space and enough bandwidth.

Ben Skywalker Jun 24, 2001, 09:03pm EDT
<<<You said that bandwidth savings were the main advantage of TBR, as in it can hit its peak fillrate because it's not bandwidth limited, but what I'm saying is that the fillrate it can hit is used more efficiently than on an IMR, irrespective of bandwidth. If you have a TBR with an 800mpixel/s fillrate and an IMR with an 800mpixel/s fillrate, and they can both only actually hit 500mpixels/s because of bandwidth limitations, the TBR will still have much better pixel pushing power, as it won't use that 500mpixels/s to render unseen pixels.>>>

The highest MPixel rate on a TBR currently is 350MPixels, 1GPixel for an IMR. Fillrate isn't an issue right now for IMRs, only bandwidth.

<<<AFAIK this discussion is about some people claiming that HW T&L and TBR is "extremely difficult" to implement. Do you agree that it's extremely difficult to implement considering the options I already mentioned?>>>

Yes, it is extremely difficult. At least, to do it cost effectively.

<<<Why would it bring a high cost? Adding some low clocked 32mb 64bit SDR/DDR RAM to a card would barely cost anything and would provide the storage space for large poly counts. Using DDR RAM wouldn't solve the storage problem, because you'd need a lot of 250mhz 128bit DDR RAM (that's what will be used for Kyro III) for binning with a high poly count. Say it was already going to be 64mb 250mhz 128bit DDR (which is very likely): why double that to 128mb at huge expense just to add the extra 64mb space for binning, when you can have a separate 64bit SDR/DDR bus and RAM dedicated entirely to binning? It could be 64mb at a low cost. Using 64mb 250mhz 128bit DDR RAM as video RAM and 64mb or even 32mb 64bit SDR/DDR RAM for binning is a lot cheaper than using 128mb 250mhz 128bit DDR RAM, and provides just as much binning space and enough bandwidth.>>>

First, you are talking about requiring two completely separate buses for one chip, something never implemented on a consumer gfx card that I can remember. This would be very costly on two fronts: first off you have to design a memory controller capable of handling it, then you have to design and produce the PCB to handle it. Using 64bit would make it less costly to implement on the PCB front, but still quite a bit more costly than having a single bus would be. Not only that, but you still would only have ~2GB/sec using 250MHZ 64bit DDR (for the vertex RAM at least).

Also, you are requiring RAM amounts above and beyond that which an IMR would require. Will these factors combined push the cost higher than an IMR using high speed DDR? If they do, then TBR has lost its main advantage over IMRs. Using 64bit DDR will save you almost nothing in terms of the RAM cost, though it will definitely save you in PCB complexity. It will be interesting to see if they come up with a close to IMR comparable hard T&L unit (20Mil-30Mil/sec) cost effectively (i.e. keep their price advantage).

Dave Baumann Jun 25, 2001, 09:22am EDT
Ben,

"First, you are talking about requiring two completely separate buses for one chip, something never implemented on a consumer gfx card that I can remember. This would be very costly on two fronts: first off you have to design a memory controller capable of handling it, then you have to design and produce the PCB to handle it."

If you've designed one bus, then there is little issue with adding another. What Darren is proposing would be two independent buses, so there's not really much to worry about.

"Also, you are requiring RAM amounts above and beyond that which an IMR would require."

You appear to be assuming that they will do nothing to circumvent these issues, which I would say is a little short-sighted. I'm sure PowerVR know their business, and I'm sure they know their limitations / future design concerns.

BTW - who says you always need to buffer a whole frame? That would be the very max; why can't memory be reallocated once a tile is finished?

Darren Teasdale Jun 25, 2001, 12:29pm EDT
<<<First, you are talking about requiring two completely separate buses for one chip, something never implemented on a consumer gfx card that I can remember. This would be very costly on two fronts: first off you have to design a memory controller capable of handling it, then you have to design and produce the PCB to handle it. Using 64-bit would make it less costly to implement on the PCB front, but still quite a bit more costly than having a single bus would be. Not only that, but you would still only have ~2GB/sec using 250MHz 64-bit DDR (for the vertex RAM at least).>>>

You must have forgotten to factor in the DDR part, because 250MHz 64-bit DDR RAM gives 4GB/sec of memory bandwidth.

<<<Also, you are requiring RAM amounts above and beyond that which an IMR would require. Will these factors combined push the cost higher than an IMR using high-speed DDR? If they do, then TBR has lost its main advantage over IMRs. Using 64-bit DDR will save you almost nothing in terms of RAM cost, though it will definitely save you in PCB complexity. It will be interesting to see if they can come up with a hardware T&L unit close to comparable with an IMR's (20-30 million/sec) cost-effectively (i.e., while keeping their price advantage).>>>

Add 250MHz 32MB 64-bit RAM to a Kyro II and you're adding a lot of extra cost, because it only costs $109 to begin with. But add the same RAM to a TBR that already has the raw specs of a GeForce 3 (probably minus a lot of transistors) and it's like a drop in the ocean. Say adding that RAM puts $30 onto the price of a card. Add that to the Kyro II and it's up from $109 to $139, which is a significant jump and makes it almost the price of a GeForce 2 Pro in the U.S. Now do the same to a card at the price of a GeForce 3 and it would go up from $330 to $360. That's nowhere near as significant a jump because the price was already high. Of course, this is all forgetting that a TBR with the raw specs of a GeForce 3 would easily be twice the speed of the GeForce 3, so the extra $30 is well worth it :)
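
To put numbers on the "drop in the ocean" point, here is a rough Python sketch of the relative price jump in each case. It takes the $30 figure and the two card prices from the post as given (the $30 is itself only a guess), and the function name is made up for illustration.

```python
def relative_increase_pct(base_price: float, added_cost: float) -> float:
    """Added cost as a percentage of the card's original price."""
    return added_cost / base_price * 100


# Prices and the $30 adder are the figures quoted in the post, not measured costs.
kyro_ii_jump = relative_increase_pct(109, 30)    # $109 -> $139
geforce3_jump = relative_increase_pct(330, 30)   # $330 -> $360

print(f"Kyro II class card:   +{kyro_ii_jump:.1f}%")
print(f"GeForce 3 class card: +{geforce3_jump:.1f}%")
```

On those assumed numbers the same $30 is roughly a 28% increase on the cheaper card and roughly 9% on the more expensive one.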

Ben Skywalker Jun 25, 2001, 09:56pm EDT
<<<You must have forgotten to factor in the DDR part, because 250MHz 64-bit DDR RAM gives 4GB/sec of memory bandwidth.>>>

Actually, I was figuring on 250MHz effective. The cost of 4ns DDR is extremely high; using 64-bit only saves you on PCB cost, not on the actual RAM.
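
The 2GB/sec and 4GB/sec figures in this exchange both come from the same formula, just with different readings of "250MHz". A minimal Python sketch, assuming peak theoretical bandwidth is clock x transfers per clock x bus width in bytes, and 1GB = 10^9 bytes; the function and variable names are made up for illustration.

```python
def peak_bandwidth_gb_s(clock_mhz: float, bus_bits: int,
                        transfers_per_clock: int = 1) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# Reading 1: 250MHz is the real clock (4ns parts), DDR transfers twice per clock.
real_clock_250 = peak_bandwidth_gb_s(250, 64, transfers_per_clock=2)

# Reading 2: 250MHz is the effective rate, i.e. a 125MHz clock doubled by DDR.
effective_250 = peak_bandwidth_gb_s(125, 64, transfers_per_clock=2)

print(f"250MHz real clock, 64-bit DDR: {real_clock_250:.1f} GB/s")
print(f"250MHz effective, 64-bit DDR:  {effective_250:.1f} GB/s")
```

The first reading gives 4GB/sec and the second gives 2GB/sec, so the two posters agree on the arithmetic and differ only on which clock they mean.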

<<<Add 250MHz 32MB 64-bit RAM to a Kyro II and you're adding a lot of extra cost, because it only costs $109 to begin with. But add the same RAM to a TBR that already has the raw specs of a GeForce 3 (probably minus a lot of transistors) and it's like a drop in the ocean.>>>

You also need to factor in two buses on the PCB and also the memory controller for the board. The memory controller in particular would take a significant amount of R&D to pull off; the only reason nV managed it was that MS paid for the development.

<<<Say adding that RAM puts $30 onto the price of a card. Add that to the Kyro II and it's up from $109 to $139, which is a significant jump and makes it almost the price of a GeForce 2 Pro in the U.S.>>>

Right now, it would be more than an additional $30 just for the memory, let alone the fact that you would need a significantly more costly PCB design and the memory controller.

<<<Now do the same to a card at the price of a GeForce 3 and it would go up from $330 to $360. That's nowhere near as significant a jump because the price was already high. Of course, this is all forgetting that a TBR with the raw specs of a GeForce 3 would easily be twice the speed of the GeForce 3, so the extra $30 is well worth it>>>

If this setup only cost an additional $30, what would stop nVidia from doing the same thing on the GeForce3U and eliminating its bandwidth shortcomings? With dual memory channels the GF3U could see 10GB/sec throughput easily. Combine that with its OD removal, which is being exploited by developers for the next generation of games, and IMRs could well be back on top of the performance heap.

Of course, that isn't realistic, as the cost is too prohibitive. This is why I see the problem as very comparable to the one IMRs face. Bandwidth is the main concern, and the same things that could be done to aid TBRs with vertex bandwidth could be used on IMRs to help eliminate their existing bandwidth issues.

