Amd64


illogical
27th July 2007, 11:33 PM
all the new computers are 64bit, dual core, etc.

hyperthreading appears to do a little extra work, but the application has to be aware. dual core and dual proc are nice, but again that assumes that one is running threads (?) that can be dumped on both. i know there are certain tasks that split up nicely, like searching for Mersenne primes.

so if buying a new PC, should i just get a dual core AMD64? the heaviest apps i run are MESS, MAME, and Bochs, which should run on a single computer. but my math apps would do best on a cluster.
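
A rough illustration of why the Mersenne prime search splits up so nicely: each candidate exponent can be tested on its own, with no communication between tests. The sketch below uses the Lucas-Lehmer test on small exponents only; the function name and the candidate list are made up for illustration and are not taken from any particular search program.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for a prime exponent p."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Each exponent is an independent job, so this list could be split
# across cores or machines without any coordination between them.
candidates = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
mersenne_exponents = [p for p in candidates if lucas_lehmer(p)]
print(mersenne_exponents)
```

Because no test depends on another, handing half the list to each core of a dual core is about as easy as parallel work gets.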

rockoon
28th July 2007, 02:44 AM
Get the dual core.

Remember that CPU speed isn't everything. The main bottlenecks in applications are I/O related.

Even if you could get a single core processor with a clock rate twice as fast as a dual core option, it won't really be twice as fast as one core of that dual core. This is because the majority of the time, a core is waiting for something external such as memory to be moved into its caches, or a disk drive to respond to a request.

With a dual core, you have a second core that isn't waiting just because the first one is, and can thus actually do something that might even be productive.

You are correct that only applications with multiple threads are designed to take advantage of dual cores. However, several applications running simultaneously are by definition several threads and hence you yourself can still take advantage by running more than one single-threaded application simultaneously.

And those apps that ARE designed to take advantage, will in many cases really see nearly (but not quite) double performance.
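
The "nearly (but not quite) double" figure can be sketched with Amdahl's Law. This is a simplified model, not a measurement: it assumes a fixed fraction of the work can be spread over the cores and ignores memory contention and scheduling overhead. The fractions below are illustrative guesses.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup when only part of the work can use extra cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# How much a second core helps for different amounts of parallel work.
for fraction in (0.5, 0.9, 0.95):
    print(fraction, round(amdahl_speedup(fraction, 2), 2))
```

Only a program whose work is entirely parallel reaches exactly 2x on two cores; anything with a serial part falls short of it.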

I would suggest getting a dual or quad core with the fastest memory I/O interface affordable. AMD calls its I/O interface "HyperTransport" and Intel I believe still calls theirs "Front Side Bus". Make sure that your chosen motherboard also supports the speed.

It's hard to go wrong with this approach to buying a motherboard and processor.

PixyMisa
28th July 2007, 03:14 AM
All but the very cheapest CPUs are now dual core. Even my notebook (a budget Compaq model, six months old) is a dual core AMD64. It's only 1.6GHz, but it still zips along quite nicely. All the AMD desktop chips are 2GHz or faster, so they'll be even better.

Jekyll
28th July 2007, 09:23 AM
all the new computers are 64bit, dual core, etc.

hyperthreading appears to do a little extra work, but the application has to be aware. dual core and dual proc are nice, but again that assumes that one is running threads (?) that can be dumped on both. i know there are certain tasks that split up nicely, like searching for Mersenne primes.


The other nice thing about using a dual core is that you can run intensive work on a single core and keep using the computer as normal, writing or surfing the internet at the same time without much of a slowdown.

illogical
28th July 2007, 08:16 PM
thanks guys, gals.

i tried motherboard settings on a Systemax with 2.4 ghz Celeron. i got it well past 3.0 ghz or 4:3 memory. it appeared to be stable at 2.9, as the room temp was very low. apparently the Celerons are not as good as the Celeron D's.

i guess i'll get an inexpensive motherboard/CPU and PS and add parts as money permits.

Kaylee
30th July 2007, 12:55 PM
so if buying a new PC, should i just get a dual core AMD64?


Might want to consider an Intel Core 2 Duo (which is also a 64 bit chip like the AMD64 and can also run either a 32 bit or 64 bit OS). I replaced my laptop last month and opted for Intel over AMD because it is supposed to run cooler and therefore have a longer battery life.

It's not bad, but I really don't like Vista. When time permits I will install Ubuntu.

PixyMisa
30th July 2007, 04:33 PM
Core 2 Duo is also great. If you're running on an old Celeron, then any of the current dual-core chips will be a huge improvement.

Intel also have a cheap version of the Core 2 Duo called the Pentium Dual Core. If you're into overclocking, this is the chip for you.

illogical
30th July 2007, 07:33 PM
thanks PixyMisa.

The other nice thing about using a dual core is that you can run intensive work on a single core and keep using the computer as normal, writing or surfing the internet at the same time without much of a slowdown.

#4, finally i can look for Mersenne primes and surf erotica sites, at the same time. i love progress.

Kilgore Trout
31st July 2007, 12:30 PM
MAME (and I'd certainly think MESS) won't benefit much at all from a dual core. There may be a slight increase because of other background tasks running on the other core and I think they did something that does run a little bit of code in a second thread (don't quote me on that one, though I can try to look it up if you'd like -- however the bottom line is that it wasn't much if anything).

Some non-devs have proposed that, because MAME has to often emulate several CPUs it could split those tasks up but the devs are insistent that it won't work -- the CPUs have to be in sync so perfectly that code would have to be added to achieve that and negate any gain in speed or even slow it down.

The only way to speed up MAME is raw brute force computing power.

PixyMisa
1st August 2007, 04:19 AM
Some non-devs have proposed that, because MAME has to often emulate several CPUs it could split those tasks up but the devs are insistent that it won't work -- the CPUs have to be in sync so perfectly that code would have to be added to achieve that and negate any gain in speed or even slow it down.
You could probably do this successfully on dedicated hardware, either by message passing or spin locks, but on a general purpose OS, yeah, the inter-thread synchronisation calls would kill you.
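
To make the synchronisation point concrete, here is a toy sketch of two emulated CPUs advanced in lockstep on a single host thread. It is not MAME's scheduler; the class, the CPU names and the shared log are all hypothetical stand-ins.

```python
class ToyCpu:
    """A stand-in for an emulated CPU; step() runs one emulated cycle."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.cycles = 0

    def step(self):
        self.cycles += 1
        # Record which CPU touched shared memory, and on which of its cycles.
        self.shared.append((self.name, self.cycles))


def run_lockstep(cpus, total_cycles):
    """Advance every CPU by one cycle in turn, all on one host thread."""
    for _ in range(total_cycles):
        for cpu in cpus:
            cpu.step()


shared_log = []
run_lockstep([ToyCpu("main", shared_log), ToyCpu("sound", shared_log)], 3)
print(shared_log)
```

On one thread the ordering of accesses to shared memory comes for free. Put each ToyCpu on its own host thread and the same ordering would have to be enforced with a lock or barrier on every emulated cycle, which is the cost being discussed here.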

Stimpson J. Cat
1st August 2007, 06:08 AM
Actually I have found that MAME benefits quite a bit from having either a dual core system or hyperthreading. This is because MAME can easily become a CPU hog. If you're only running MAME, then this is no big deal. But I often have file sharing software running in the background.

On my old single core system I had the problem that when I ran MAME, my file sharing software would end up not getting enough CPU time, and I would lose some of my connections. My previous computer had been slightly slower, but with hyperthreading, and had not had this problem. Likewise my new dual core system does not have this problem.

In my view this is the main advantage to having a dual core system for most users. It's not that any of your programs will run faster (they won't unless they are multithreaded), but running multiple applications works much better. Not only can you run two CPU intensive processes as fast as a single core would run one, but when you are running one CPU intensive process with lots of less intensive processes in the background, you don't have to worry about the lesser processes getting strangled by the intensive one.

Dr. Stupid

PixyMisa
1st August 2007, 06:12 AM
I'm tempted to go for a quad-core. One core for Windows, one for iTunes, one for what I'm actually doing, and one for all the other crap.

(Whenever iTunes starts updating my podcasts, or I sync my iPod, my machine pretty much grinds to a halt.)

Reclaimer
1st August 2007, 09:58 AM
I wish I could remember exactly how I remember this... but I recall reading somewhere that MAME runs best with lots of RAM and a videocard that has a lot of onboard RAM as well. I'm wracking my brain trying to figure out exactly why, but I just recall that fact alone.
Either way, some MAME roms run like crap on my laptop, which is an AMD Clawhammer 1.8GHz with integrated ATI 200 video and a gig of RAM, but run like the wind on my desktop that I built with an AMD X2 4400, 2 gigs of RAM and an Nvidia 7800GT.

But for having a dual core? It's really nice. I make a lot of music using Cakewalk's Sonar Pro 6 and it can be really CPU intensive recording track by track and using digital effects and so forth. It used to choke my old computer to death, an AMD Barton core 2500. But now, the CPU usage gauge barely even registers activity when mixing 12 tracks of music with over 6 digital effects per track down to MP3, then burning to CD.

I definitely cast my vote for getting a dual core anything for a CPU nowadays.

Kilgore Trout
1st August 2007, 12:21 PM
Actually I have found that MAME benefits quite a bit from having either a dual core system or hyperthreading. This is because MAME can easily become a CPU hog. If you're only running MAME, then this is no big deal. But I often have file sharing software running in the background.
[...]
But that's not MAME itself benefiting from the dual core. Like I said, a dual core will help with background processes, but MAME is very much a single threaded program and has to be.

A 2 ghz single core CPU will run MAME much faster than a 1.5 dual core. And it only makes sense that you'd turn off things like filesharing, though if you must, there are many settings to keep MAME from using 100% of your CPU.

I definitely cast my vote for getting a dual core anything for a CPU nowadays.
I'm not trying to sell single core CPUs or something. But if someone is looking for something specific, it might make sense. For example, they want something cheap and typically run applications that either aren't CPU intensive or the only thing going (like MAME, mentioned in the OP).

I wish I could remember exactly how I remember this... but I recall reading somewhere that MAME runs best with lots of RAM and a videocard that has a lot of onboard RAM as well. I'm wracking my brain trying to figure out exactly why, but I just recall that fact alone.
I wouldn't call that a fact. If you can look it up, please do, but unless we're talking about a very old system, neither RAM nor a video card will help MAME.

System RAM will only figure in if MAME (and even then, only the base code and game driver, not all of MAME) and the ROM are more than available RAM. 500 megs is plenty and you can probably get by with 250. As for the video card, unless it's very old it, too, won't matter. It does no processing at all as far as the emulation goes, and only draws the screen. Video RAM will figure in with high resolutions, palettes, or something like that which can be set lower, if need be. 128 megs of video RAM should be plenty.

Either way, some MAME roms run like crap on my laptop, which is an AMD Clawhammer 1.8GHz with integrated ATI 200 video and a gig of RAM, but run like the wind on my desktop that I built with an AMD X2 4400, 2 gigs of RAM and an Nvidia 7800GT.

First, check your RDTSC timing option in MAME and turn it off for the laptop. It's an option specifically for laptops. Then I'd check if you have MAME set to sync with your monitor refresh. Finally, I'd check your laptop itself to see if there's some odd setting or other with respect to throttling CPU speed because MAME might be trying to max it out completely. This is a bit of a derail; if you'd like we can continue this elsewhere but it should run fine.

A 1 ghz CPU will run most ROMs just fine even Neo Geo and CPS-2 (just nothing 3D). And, keep in mind, MAME also does emulate some things that no PC made yet can run well. MAME's purpose isn't to play games; it's more of a side-effect.

Stimpson J. Cat
1st August 2007, 01:58 PM
Kilgore Trout,

But that's not MAME itself benefiting from the dual core. Like I said, a dual core will help with background processes, but MAME is very much a single threaded program and has to be.
My point was that dual core systems not only help with multithreaded applications and multiple CPU hungry apps, but also when using a single CPU hungry app (like MAME), with other apps in the background. Whether you consider this to qualify as MAME benefiting from the dual core is a matter of semantics. The point is that the dual core system handles running MAME better under these types of conditions than a single core system does.

A 2 ghz single core CPU will run MAME much faster than a 1.5 dual core.
I know. And it will strangle anything else you have running in the background, which was my whole point.

And it only makes sense that you'd turn off things like filesharing,
Why does that make sense? With a dual-core system I don't have to turn off filesharing. Again, that was my whole point. Dual core systems give you options you don't have with single core systems. I don't have to shut off my filesharing if I want to fire up MAME, or even worse, a modern game like Half-Life 2.

though if you must, there are many settings to keep MAME from using 100% of your CPU.
Sure, but then MAME ends up running a lot slower. I ended up doing exactly that on my single core system, and the result was that a bunch of the games that otherwise worked fine became unplayable. I ended up with the choice of either playing the games with filesharing off, or not being able to play the games. Like I said before, the point is that this is a benefit of a dual-core system.

Naturally if your goal is to get the most bang for your buck specifically for running exactly one process with nothing else in the background, then dual core is not a selling point. But I for one like to be able to fire up a quick game without first having to shut down everything I have going in the background. For most people who aren't going to be running CPU intensive multithreaded applications, this is the real selling point of a dual core system. It allows you to do cool things like play a game while something is running in the background, or watch a DVD on a second display while continuing to work smoothly on your main display, or encode a video in the background while continuing to use your PC as though nothing unusual was going on.


Dr. Stupid

Kilgore Trout
1st August 2007, 03:41 PM
I was taking all this in the context of the original post that specifically mentioned emulators. I understand your point; I even mentioned the benefit of a dual core with respect to programs running in the background with my first reply. Your problems with filesharing exemplify the benefits of a dual core CPU, yes, but that wasn't brought up by the original poster. I think you took this a bit more personally than intended. I'm not concerned with which applications you're trying to run, just the ones illogical asked about.

The reason I brought this up at all is the common misconception that a dual core CPU will allow MAME to run faster, often because it is known that MAME is emulating games with more than one CPU. To point specifically to MAME as benefiting from a dual core CPU needs the footnote I thought I was providing. Had you only mentioned that all programs benefit, if indirectly, from a dual core that would have ended it.

rockoon
2nd August 2007, 12:09 AM
MAME, and all other emulators, will greatly benefit from the fastest I/O bus you can afford.

Your CPU speed really isn't an issue with these things (your CPU is a hundred times faster or more than the machine being emulated).

The speed at which the code being emulated can be fetched from ram is the biggest issue.

All the special caching tricks Intel and AMD are doing for program execution don't really help emulators. Intel's trace cache for instance is nearly worthless. AMD's code cache, which is 50% of cache memory on AMD's, will be wasted space.

danielk
2nd August 2007, 12:36 AM
Get the dual core.

I agree.

This is because the majority of the time, a core is waiting for something external such as memory to be moved into its caches, or a disk drive to respond to a request.

With a dual core, you have a second core that isn't waiting just because the first one is, and can thus actually do something that might even be productive.

I think this is misleading. Unless you're still using DOS, even a single core will most certainly not sit around idle when waiting for the disk. The operating system simply schedules another task. With regards to moving RAM into cache you have a point, although this is probably something that is also solved by hyper-threading within a single core.

PixyMisa
2nd August 2007, 01:26 AM
MAME, and all other emulators, will greatly benefit from the fastest I/O bus you can afford.

Your CPU speed really isn't an issue with these things (your CPU is a hundred times faster or more than the machine being emulated)
There's no direct connection between those two statements. It might well require a 3GHz CPU to emulate, in software, game hardware running at 4MHz. It depends.

The speed at which the code being emulated can be fetched from ram is the biggest issue.
Not if that code is in cache.

All the special caching tricks Intel and AMD are doing for program execution don't really help emulators. Intel's trace cache for instance is nearly worthless. AMD's code cache, which is 50% of cache memory on AMD's, will be wasted space.
No. That's completely wrong.

If there were no cache at all, every instruction in the emulator itself would have to be fetched from main memory, reducing your 2GHz CPU to an effective speed of around 20MHz (the random access time for RAM these days is on the order of 50ns).
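
The arithmetic behind that 20MHz figure, as a small sketch. It assumes the simplest possible model, in which every instruction fetch costs exactly one RAM access and nothing overlaps; the function name is made up for illustration.

```python
def effective_mhz(access_time_ns):
    """Fetches per microsecond if every fetch costs one RAM access."""
    return 1000.0 / access_time_ns

uncached_mhz = effective_mhz(50)        # 50 ns random access time
slowdown = 2000.0 / uncached_mhz        # compared with a 2GHz (2000MHz) clock
print(uncached_mhz, "MHz effective,", slowdown, "times slower")
```

One fetch every 50ns is 20 million fetches per second, so a 2GHz part with no cache at all would behave roughly like a 20MHz one under this model.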

The higher-end Athlon 64 X2's have 1MB L2 cache per core; the Core 2 Duo has 4MB shared L2 cache. That's easily enough to hold the working set of these games, and very often the entire ROM.

MAME performance might well depend on memory bandwidth; I don't know for sure. But if it does, it's certainly not for that reason.

rockoon
2nd August 2007, 04:11 AM
Not if that code is in cache.


Emulators use jump tables, virtual functions, and large switch blocks. They do not execute the emulated machine's code directly. (Download MAME's source if you do not believe me.)
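
For readers who have not seen one, this is roughly what a jump-table interpreter loop looks like. It is a toy sketch, not MAME code: the three opcodes, the handler names and the little countdown program belong to a made-up machine.

```python
def op_load(state, operand):
    state["acc"] = operand

def op_add(state, operand):
    state["acc"] += operand

def op_jump_if_nonzero(state, operand):
    if state["acc"] != 0:
        state["pc"] = operand

# Hypothetical opcodes; the dict plays the role of a jump table.
DISPATCH = {0x01: op_load, 0x02: op_add, 0x03: op_jump_if_nonzero}

def run(program, max_steps=1000):
    state = {"acc": 0, "pc": 0}
    steps = 0
    while state["pc"] < len(program) and steps < max_steps:
        opcode, operand = program[state["pc"]]
        state["pc"] += 1
        DISPATCH[opcode](state, operand)  # indirect call chosen by emulated data
        steps += 1
    return state["acc"], steps

# Count down from 3: load 3, then add -1 and loop back while non-zero.
program = [(0x01, 3), (0x02, -1), (0x03, 1)]
print(run(program))
```

The handler that runs next is decided by the emulated program's bytes rather than by the host code's own structure, which is why the host CPU's branch predictor has a hard time with loops of this shape.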

I guess I need to explain why this is important.

The Core2 pipeline is at least 14 stages (possibly 15; Intel isn't saying). The minimum branch misprediction penalty is 15 cycles.

AMD64's pipeline is at least 11 stages (possibly 12; AMD isn't saying). The minimum branch misprediction penalty is 12 cycles.

The caching of code is irrelevant. Branch misprediction penalties always exceed the latency of the L2 cache, and by as much as 50% in the Core2 case (100% in the case of a P4 with its whopping 20 stage pipeline).


No. That's completely wrong.


Not to toot my own horn, but when you need an expert on the subject of processor-specific efficiency issues, you would come to me. I'm not Agner Fog, but I'm possibly the next best thing.

The runtime is dominated first by the cost of an L2 cache miss, and second by the cost of a branch misprediction. If you don't believe me, run MAME under AMD's code-analyst and count the L2 misses and branch mispredictions.


If there were no cache at all, every instruction in the emulator itself would have to be fetched from main memory, reducing your 2GHz CPU to an effective speed of around 20MHz (the random access time for RAM these days is on the order of 50ns).

The higher-end Athlon 64 X2's have 1MB L2 cache per core


..and it is split into 512K code and 512K data.

If you are going to quote cache numbers in a refutation about data access, at least get the numbers right. Thanks.

An emulator is going to have a much larger footprint than this, and footprint is everything. MAME, and other old arcade game emulators, emulate everything right down to how long it's been since a piece of RAM was written to so that it can properly emulate memory not being refreshed.

Also, the random access latency for memory these days is between 50 CPU cycles and 200 CPU cycles on the available 2GHz machines, and is almost completely dependent on the very thing I am advocating: the speed of the I/O interface, which could range in performance from 25ns latency up to 100ns latency. Memory timings (mainly RAS, CAS, and wait states) also play a factor, but not a significant one in comparison.
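
The cycle counts quoted here follow directly from the latencies, as this small conversion sketch shows. The function name is hypothetical; the 25ns and 100ns figures and the 2GHz clock are the ones given in the post.

```python
def latency_in_cycles(latency_ns, clock_ghz):
    """CPU cycles that elapse during a memory access of latency_ns."""
    return latency_ns * clock_ghz

for ns in (25, 100):
    print(ns, "ns ->", latency_in_cycles(ns, 2.0), "cycles at 2GHz")
```

A 2GHz clock ticks twice per nanosecond, so 25ns is 50 cycles and 100ns is 200 cycles.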


; the Core 2 Duo has 4MB shared L2 cache. That's easily enough to hold the working set of these games, and very often the entire ROM.


As I already noted, the working set is much larger than the combined RAM and ROM of the machine being emulated, especially for the machines for which multiple threads are not advantageous.

Note that MAME isn't going to have a problem running about half of the available roms on a 1GHz machine (it says so right in their FAQ). It is the other half of the roms that are important.

The roms you seem to be dismissing here are in fact the only important ones.


MAME performance might well depend on memory bandwidth; I don't know for sure. But if it does, it's certainly not for that reason.

The minor admission that you really don't know is thanks enough. I really do know.

PixyMisa
2nd August 2007, 09:02 AM
Emulators use jump tables, virtual functions, and large switch blocks. They do not execute the emulated machine's code directly. (Download MAME's source if you do not believe me.)
I know this. I've written a virtual machine; the same techniques are used. I wrote it in C, but I also crawled through the ASM output for multiple variations of the main loop and individual instruction handlers so that I could know precisely what was going on.

The caching of code is irrelevant. Branch misprediction penalties always exceed the latency of the L2 cache, and by as much as 50% in the Core2 case (100% in the case of a P4 with its whopping 20 stage pipeline)
That's still far less than main memory latency, which is what you are claiming matters. That's on the order of 100 cycles.

Not to toot my own horn, but when you need an expert on the subject of processor-specific efficiency issues, you would come to me. I'm not Agner Fog, but I'm possibly the next best thing.
You are obviously well-informed on the subject, but you've also made some statements that aren't true, or don't follow earlier statements.

The runtime is dominated first by the cost of an L2 cache miss, and second by the cost of a branch misprediction. If you don't believe me, run MAME under AMD's code-analyst and count the L2 misses and branch mispredictions.
Okay, I can accept that. But that doesn't mean that performance wouldn't be far worse if it wasn't for the cache, which is what you said earlier:
AMD's code cache, which is 50% of cache memory on AMD's, will be wasted space.
You make a lot more sense in this post, but I wasn't responding to this post.

..and it is split into 512K code and 512K data.
No. Check the data sheet. (http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/33425.pdf) 1MB unified cache per core. See also this (http://developer.amd.com/articles.jsp?id=90&num=18), which is an AMD document on measuring performance of "the unified L2 cache".

(And the Core 2 Duo has 4MB unified L2 cache shared by two cores.)

If you are going to quote cache numbers in a refutation about data access, at least get the numbers right. Thanks.
Words to live by.

An emulator is going to have a much larger footprint than this, and footprint is everything. MAME, and other old arcade game emulators, emulate everything right down to how long it's been since a piece of ram was written to so that it can properly emulate memory not being refreshed.
Okay, that's reasonable. The question still remains what effect this has on the cache. You're emulating the video memory, for example, and you're going to have a pretty high turnover on that. But enough to flush the emulated code from a 16-way associative LRU cache?
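
One way to get a feel for that question is a toy cache model. The sketch below is not a model of any real CPU: it assumes strict LRU replacement, looks at a single 16-way set, and uses a made-up access pattern in which one "code" line is re-read after a burst of streaming data that maps to the same set.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy LRU cache model: num_sets sets, each holding `ways` lines."""

    def __init__(self, num_sets, ways, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line = address // self.line_size
        lines = self.sets[line % self.num_sets]
        if line in lines:
            lines.move_to_end(line)        # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(lines) >= self.ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[line] = True

def code_hits_while_streaming(ways, stream_lines_per_pass, passes=4):
    """Re-read one code line after each burst of streamed data lines."""
    cache = SetAssociativeCache(num_sets=1, ways=ways)
    code_address = 0
    cache.access(code_address)             # first touch is always a miss
    code_hits = 0
    next_stream_line = 1
    for _ in range(passes):
        for _ in range(stream_lines_per_pass):
            cache.access(next_stream_line * cache.line_size)
            next_stream_line += 1
        hits_before = cache.hits
        cache.access(code_address)
        code_hits += cache.hits - hits_before
    return code_hits

print(code_hits_while_streaming(ways=16, stream_lines_per_pass=15))
print(code_hits_while_streaming(ways=16, stream_lines_per_pass=16))
```

In this model the code line survives as long as fewer than 16 other lines land in its set between two uses, and is evicted every time once 16 or more do. Whether a real emulator's data turnover crosses that threshold per set is exactly what would need measuring.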

As I already noted, the working set is much larger than the combined RAM and ROM of the machine being emulated, especially for the machines for which multiple threads are not advantageous.
That's quite possible. But that's not what you said. You said something rather more specific:
The speed at which the code being emulated can be fetched from ram is the biggest issue.
That's a claim not only that the working set is larger than cache - very plausible - but that the primary performance issue is the fetching of the emulated code from RAM. That would require both that the code is constantly being evicted from L2 cache and that it outweighs the other factors. I will admit that this is not impossible, but even that doesn't make cache useless.

So that statement I consider rather dubious, and this one:
AMD's code cache, which is 50% of cache memory on AMD's, will be wasted space.
Is flat-out wrong.

L2 cache misses tend to dominate something like this not because the L1 or L2 caches aren't greatly improving performance, but because - as you note - L2 cache misses are enormously expensive. Amdahl's Law, only backwards. Turn off the cache and every instruction is an L2 miss. Even if every instruction was already a branch mispredict (impossible in real code), you're still talking about a factor of 10 slowdown.
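
A back-of-envelope version of that factor of 10, under stated assumptions: the 12-cycle mispredict penalty is the AMD64 figure quoted earlier in the thread, the 100-cycle memory latency is the rough figure given above, and the cost model (every instruction pays one mispredict, plus one full memory access when there is no cache) is a deliberate oversimplification.

```python
def slowdown_without_cache(mispredict_cycles, memory_cycles):
    """Ratio of per-instruction cost without any cache to cost with it.

    Worst case from the post: every instruction is already a branch
    mispredict; with no cache each one also pays a full memory access.
    """
    with_cache = mispredict_cycles
    without_cache = mispredict_cycles + memory_cycles
    return without_cache / with_cache

print(round(slowdown_without_cache(12, 100), 1))
```

With those numbers the ratio comes out a little over 9, which is in line with "a factor of 10".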

Now, it might be that you were saying specifically that AMD's dedicated L2 code cache is useless, because (say) the emulator (as opposed to emulated) code fits effectively in the L1 code cache. That would make sense, except that there's no such thing. I assumed you were talking about L1 code cache for that reason, which rendered the statement nonsensical.

PixyMisa
2nd August 2007, 09:10 AM
AMD64's pipeline is at least 11 stages (possibly 12; AMD isn't saying). The minimum branch misprediction penalty is 12 cycles.

The caching of code is irrelevant. Branch misprediction penalties always exceed the latency of the L2 cache
Just on that - I believe that current AMD 64 cores have a 14-cycle L2 latency. Minor detail, but there's no reason that branch mispredicts have to be longer than L2 cache latency.

Reclaimer
2nd August 2007, 08:59 PM
But that's not MAME itself benefiting from the dual core. Like I said, a dual core will help with background processes, but MAME is very much a single threaded program and has to be.

A 2 ghz single core CPU will run MAME much faster than a 1.5 dual core. And it only makes sense that you'd turn off things like filesharing, though if you must, there are many settings to keep MAME from using 100% of your CPU.


I'm not trying to sell single core CPUs or something. But if someone is looking for something specific, it might make sense. For example, they want something cheap and typically run applications that either aren't CPU intensive or the only thing going (like MAME, mentioned in the OP).


I wouldn't call that a fact. If you can look it up, please do, but unless we're talking about a very old system, neither RAM nor a video card will help MAME.

System RAM will only figure in if MAME (and even then, only the base code and game driver, not all of MAME) and the ROM are more than available RAM. 500 megs is plenty and you can probably get by with 250. As for the video card, unless it's very old it, too, won't matter. It does no processing at all as far as the emulation goes, and only draws the screen. Video RAM will figure in with high resolutions, palettes, or something like that which can be set lower, if need be. 128 megs of video RAM should be plenty.



First, check your RDTSC timing option in MAME and turn it off for the laptop. It's an option specifically for laptops. Then I'd check if you have MAME set to sync with your monitor refresh. Finally, I'd check your laptop itself to see if there's some odd setting or other with respect to throttling CPU speed because MAME might be trying to max it out completely. This is a bit of a derail; if you'd like we can continue this elsewhere but it should run fine.

A 1 ghz CPU will run most ROMs just fine even Neo Geo and CPS-2 (just nothing 3D). And, keep in mind, MAME also does emulate some things that no PC made yet can run well. MAME's purpose isn't to play games; it's more of a side-effect.

I completely agree with you that you don't necessarily need a dual core CPU, for instance if you're just building a box to surf the web or do minor things, for instance a kid's computer or something for mere entertainment, by all means a single good CPU will do you just fine.
But for your only computer in your household, or if you want to be able to do a lot of different things, dual core or better is the way to go. It's good to be versatile.

As for the MAME thing, everything you said makes perfect sense. My line of thought comes from the fact I used to work with the guy who did all the front end graphics for MAMEoX, the Xbox MAME emulator. He was telling me that the Xbox can run a lot of roms well, but it is limited by its 700MHz Intel Celeron and whatever GPU it had. I think it was an Nvidia with 64 megs or something to that effect. While he worked with the dev team, it came to light that serious RAM and a serious graphics card do better than just a good CPU alone. But that was then, this is now. This was around 2001 or so.

egslim
4th August 2007, 03:20 AM
But for your only computer in your household, or if you want to be able to do a lot of different things, dual core or better is the way to go. It's good to be versatile.
I played with single, dual and quad core systems. One thing I found is that multi-core does not always equate to better I/O performance, I actually encountered situations where they performed worse.
The main advantage of multi-core is if you either run multi-threaded software (no-one ever disputed that) or if you run a CPU intensive thread with high priority (poor scheduling).

For consumers I advise dual core, since they are often clocked higher than single core models, but quad core is a waste. You have two unused cores sucking power and producing heat.

illogical
4th August 2007, 04:41 AM
He was telling me that the Xbox can run a lot of roms well, it is limited with its 700MHZ Intel celeron and whatever GPU it had.

MAME started using Direct 3D a few years ago. so the graphics card matters more than it used to. there are around 7 ways to do graphics on MAME, in my experience DirectX on win worked best.

it *might* benefit from the extra cpu. i say that because trimming down the linux or windows install increased the fps. i'll have to look at the benchmarks.