Nth re-posting of CISC vs. RISC (or what is RISC, really)


Newsgroups: comp.arch
From: mash@mash.engr.sgi.com (John R. Mashey)
Subject: Re: Exception handling in PowerPC [no: RISC-vs-CISC, one more time]
Message-ID: <D4qAvo.EIz@odin.corp.sgi.com>
Date: Tue, 28 Feb 1995 21:11:47 GMT

In article <3ibmb6$so1@newsbf02.news.aol.com>, danhicks@aol.com
(DanHicks) writes:

|> >>>
|> Oh?  Consider the problem of designing exception handling for the P6.
|> It's pipelining, out-of-order and speculative execution that make
|> exception handling difficult, and both the CISC and RISC folks are doing
|> as much of this as they can.  It has little to do with the instruction
|> set.
|> <<<
|> 
|> True, but, as you point out, the distinction between RISC and CISC is
|> disappearing, to the point where the main difference is psychological. 
|> And that is the real problem:  The RISC designers cannot be convinced of
|> the importance of designing an architecture suitable to something other
|> than benchmarks, whereas the CISC designers realize that accommodating
|> operating system requirements is a critical part of their job.

1) Attached is the Nth repost of the discussion of RISC-vs-CISC
non-convergence, i.e., architecture != implementation, one more time.
If you've followed this newsgroup for a while, you've seen this before.
I have included some followon discussions, and done minor editing.

2) The comments about RISC designers above are simply wrong;
"All generalizations are false", but especially this one.

3) re: Exception-handling: as I've noted for years in public talks (the
"Car" talk especially), tricky exception-handling is the price for
machines with increased parallelism and overlap.  Errata sheets are
customarily filled with bugs around exception cases.  Oddly enough,
speculative-execution, o-o-o machines may well fare better than
you'd think, given that they already need mechanisms for undoing
(more-or-less) completed instructions.  From history, recall that the
360/91, circa 1967, had some imprecise exceptions, as did the 360/67
(early 360 with virtual memory), so interesting exception handling has
been with us for a while.

====REPOST====
Article 22850 of comp.arch:
Path: mips!mash
Subject: Nth re-posting of CISC vs RISC (or what is RISC, really)
Message-ID: <2419@spim.mips.COM>

Most of you have seen most of this several times before; there is a
little editing, nothing major.  Some followup comments have been
added.


PART I - ARCHITECTURE, IMPLEMENTATION, DIFFERENCES
PART II - ADDRESSING MODES
PART III - MORE ON TERMINOLOGY; WOULD YOU CALL THE CDC 6600 A RISC?
PART IV - RISC, VLIW, STACKS


PART I - ARCHITECTURE, IMPLEMENTATION, DIFFERENCES

WARNING: you may want to print this one to read it...
(from preceding discussion):
>Anyway, it's not a fair comparison.  Not by a long stretch.  Let's see
>how the Nth generation SPARC, MIPS, and 88K's do (assuming they last)
>compared to some new design from scratch.

Well, there is baggage and there is BAGGAGE.
One must be careful to distinguish between ARCHITECTURE and IMPLEMENTATION:
	a) Architectures persist longer than implementations, especially
	user-level Instruction-Set Architecture.
	b) The first member of an architecture family is usually designed
	with the current implementation constraints in mind, and if you're
	lucky, software people had some input.
	c) If you're really lucky, you anticipate 5-10 years of technology
	trends, and that modifies your idea of the ISA you commit to.
	d) It's pretty hard to delete anything from an ISA, except where:
		1) You can find that NO ONE uses a feature
			(the 68020->68030 deletions mentioned by someone
			else).
		2) You believe that you can trap and emulate the feature
		"fast enough".
			i.e., microVAX support for decimal ops,
			68040 support for transcendentals.
Now, one might claim that the i486 and 68040 are RISC implementations
of CISC architectures .... and I think there is some truth to this,
but I also think that it can confuse things badly:

Anyone who has studied the history of computer design knows that
high-performance designs have used many of the same techniques for years,
for all of the natural reasons, that is:
	a) They use as much pipelining as they can; in some cases, if this
	means a high gate-count, then so be it.
	b) They use caches (separate I & D if convenient).
	c) They use hardware, not micro-code, for the simpler operations.
(For instance, look at the evolution of the S/360 products.
Recall that the 360/85 used caches, back around 1969, and within a few
years, so did any mainframe or supermini.)

So, what difference is there among machines if similar implementation
ideas are used?
A: there is a very specific set of characteristics shared by most
machines labeled RISCs, most of which are not shared by most CISCs.
The RISC characteristics:
	a) Are aimed at more performance from current compiler technology
	(i.e., enough registers).
OR
	b) Are aimed at fast pipelining
		in a virtual-memory environment
		with the ability to still survive exceptions
		without inextricably increasing the number of gate delays
		(notice that I say gate delays, NOT just how many gates).

Even though various RISCs have made various decisions, most of them have
been very careful to omit those things that CPU designers have found
difficult and/or expensive to implement, and especially, things that
are painful, for relatively little gain.

I would claim that even as RISCs evolve, they may have certain baggage
that they'd wish weren't there .... but not very much.
In particular, there are a bunch of objective characteristics shared by
RISC ARCHITECTURES that clearly distinguish them from CISC architectures.

I'll give a few examples, followed by the detailed analysis:

MOST RISCs:
	3a) Have 1 size of instruction in an instruction stream
	3b) And that size is 4 bytes
	3c) Have a handful (1-4) of addressing modes (it's VERY
	hard to count these things; will discuss later).
	3d) Have NO indirect addressing in any form (i.e., where you need
	one memory access to get the address of another operand in memory)
	4a) Have NO operations that combine load/store with arithmetic,
	i.e., like add from memory, or add to memory.
	(note: this means especially avoiding operations that use the
	value of a load as input to an ALU operation, especially when
	that operation can cause an exception.  Loads/stores with
	address modification can often be OK as they don't have some of
	the bad effects)
	4b) Have no more than 1 memory-addressed operand per instruction
	5a) Do NOT support arbitrary alignment of data for loads/stores
	5b) Use an MMU for a data address no more than once per instruction
	6a) Have >=5 bits per integer register specifier
	6b) Have >= 4 bits per FP register specifier
These rules provide a rather distinct dividing line among architectures,
and I think there are rather strong technical reasons for this, such
that there is one more interesting attribute: almost every architecture
whose first instance appeared on the market from 1986 onward obeys the
rules above .....
	Note that I didn't say anything about counting the number of
	instructions....
So, here's a table:
Age: number of years since first implementation sold in this family
(or first thing with which this is binary compatible).
Note: this table was first done in 1991, so year = 1991-(age in table).
3a: # instruction sizes
3b: maximum instruction size in bytes
3c: number of distinct addressing modes for accessing data (not jumps).
I didn't count register or
literal, but only ones that referenced memory, and I counted different
formats with different offset sizes separately.  This was hard work...
Also, even when a machine had different modes for register-relative and
PC_relative addressing, I counted them only once.
3d: indirect addressing: 0: no, 1: yes
4a: load/store combined with arithmetic: 0: no, 1: yes
4b: maximum number of memory operands
5a: unaligned addressing of memory references allowed in load/store,
	without specific instructions
	0: no never (MIPS, SPARC, etc)
	1: sometimes (as in RS/6000)
	2: just about any time
5b: maximum number of MMU uses for data operands in an instruction
6a: number of bits for integer register specifier
6b: number of bits for 64-bit or more FP register specifier,
	distinct from integer registers
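
As a small worked example of the age arithmetic in the note above
(year = 1991 - age), here is a sketch in Python.  The function name is
made up for illustration; the three ages are the ones listed for rows
A1, L4 and P3 in the table.

```python
TABLE_YEAR = 1991  # the table was first compiled in 1991


def first_year(age_in_table):
    """Year the first implementation of a family was sold."""
    return TABLE_YEAR - age_in_table


# Ages taken from three of the coded rows in the table.
for code, age in [("A1", 4), ("L4", 26), ("P3", 13)]:
    print(code, first_year(age))
```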

Note that all of these are ARCHITECTURE issues, and it is usually quite
difficult to either delete a feature (3a-5b) or increase the number
of real registers (6a-6b) given an initial instruction set design.
(yes, register renaming can help, but...)

Now: items 3a, 3b, and 3c are an indication of the decode complexity
	3d-5b hint at the ease or difficulty of pipelining, especially
	in the presence of virtual-memory requirements, and need to go
	fast while still taking exceptions sanely
	items 6a and 6b are more related to ability to take good advantage
	of current compilers.
	There are some other attributes that can be useful, but I couldn't
	imagine how to create metrics for them without being very subjective;
	for example "degree of sequential decode", "number of writebacks
	that you might want to do in the middle of an instruction, but can't,
	because you have to wait to make sure you see the whole instruction
	before committing any state, because the last part might cause a
	page fault,"  or "irregularity/asymmetricness of register use",
	or "irregularity/complexity of instruction formats".  I'd love to
	use those, but just don't know how to measure them.
	Also, I'd be happy to hear corrections for some of these.

So, here's a table of 12 implementations of various architectures,
one per architecture, with the attributes above.  Just for fun, I'm
going to leave the architectures coded at first, although I'll identify
them later.  I'm going to draw a line between H1 and L4 (obviously,
the RISC-CISC Line), and also, at the head of each column, I'm going
to put a rule, which, in that column, most of the RISCs obey.
Any RISC that does not obey it is marked with a +; any CISC that DOES
obey it is marked with a *.  So...
	1991
CPU	Age	3a 3b 3c 3d	4a 4b 5a 5b	6a 6b	# ODD
RULE	<6	=1 =4 <5 =0	=0 =1 <2 =1	>4 >3
-------------------------------------------------------------------------
A1	4	 1  4  1  0	 0  1  0  1	 8  3+	1
B1	5	 1  4  1  0	 0  1  0  1	 5  4	-
C1	2	 1  4  2  0	 0  1  0  1	 5  4	-
D1	2	 1  4  3  0	 0  1  0  1	 5  0+	1
E1	5	 1  4 10+ 0	 0  1  0  1	 5  4	1
F1	5	 2+ 4  1  0	 0  1  0  1	 4+ 3+	3
G1	1	 1  4  4  0	 0  1  1  1	 5  5   -
H1	2	 1  4  4  0	 0  1  0  1	 5  4	-	RISC
---------------------------------------------------------------
L4	26	 4  8  2* 0*	 1  2  2  4	 4  2	2	CISC
M2	12	12 12 15  0*	 1  2  2  4	 3  3	1
N1	10	21 21 23  1	 1  2  2  4	 3  3	-
O3	11	11 22 44  1	 1  2  2  8	 4  3	-
P3	13	56 56 22  1	 1  6  2 24	 4  0	-

An interesting exercise is to analyze the ODD cases.
First, notice that of 12 architectures, in only 2 cases does an
architecture have an attribute that puts it on the wrong side of the line.
Of the RISCs:
-A1 is slightly unusual in having more integer registers, and less FP
than usual.  [Actually, slightly out of date, 29050 is different,
using integer register bank instead, I hear.]
-D1 is unusual in sharing integer and FP registers (that's what the
D1:6b == 0).
-E1 seems odd in having a large number of address modes.  I think most of this
is an artifact of the way that I counted, as this architecture really only
has a fundamentally small number of ways to create addresses, but has several
different-sized offsets and combinations, but all within 1 4-byte instruction;
I believe that its addressing mechanisms are fundamentally MUCH simpler
than, for example, M2, or especially N1, O3, or P3, but the specific number
doesn't capture it very well.
-F1 .... is not sold any more.
-H1 one might argue that this processor has 2 sizes of instructions,
but I'd observe that at any point in the instruction stream, the instructions
are either 4-bytes long, or 8-bytes long, with the setting done by a mode bit,
i.e., not dynamically encoded in every instruction.

Of the processors called CISCs:
-L4 happens to be one in which you can tell the length of the instruction
from the first few bits, has a fairly regular instruction decode,
has relatively few addressing modes, no indirect addressing.
In fact, a big subset of its instructions are actually fairly RISC-like,
although another subset is very CISCy.
-M2 has a myriad of instruction formats, but luckily avoided
indirect addressing, and actually, MOST of instructions only have 1
address, except for a small set of string operations with 2.
I.e., in this case, the decode complexity may be high, but most instructions
cannot turn into multiple-memory-address-with-side-effects things.
-N1, O3, and P3 are actually fairly clean, orthogonal architectures, in
which most operations can consistently have operands in either memory or
registers, and there are relatively few weirdnesses of special-cased uses
of registers.  Unfortunately, they also have indirect addressing,
instruction formats whose very orthogonality almost guarantees sequential
decoding, where it's hard to even know how long an instruction is until
you parse each piece, and that may have side-effects where you'd like to
do a register write-back early, but either:
	must wait until you see all of the instruction until you commit state
or
	must have "undo" shadow-registers
or
	must use instruction-continuation with fairly tricky exception
	handling to restore the state of the machine
It is also interesting to note that the original member of the family to
which O3 belongs was rather simpler in some of the critical areas,
with only 5 instruction sizes, of maximum size 10 bytes, and no indirect
addressing, and requiring alignment (i.e., it was a much more RISC-like
design, and it would be a fascinating speculation to know if that
extra complexity was useful in practice).
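
To make the "undo" shadow-register option listed above a bit more
concrete, here is a minimal sketch in Python.  It is a toy model, not a
description of any real machine: the register file is a dict, an
instruction is a list of hypothetical micro-operations, and PageFault
stands in for any exception raised in mid-instruction.

```python
class PageFault(Exception):
    """Stands in for any exception raised in the middle of an instruction."""


def execute_with_shadow(regs, micro_ops):
    """Run micro_ops against regs; undo register side effects on a fault.

    Returns True if the instruction completed, False if it must be restarted.
    """
    shadow = dict(regs)          # "undo" copy taken before any write-back
    try:
        for op in micro_ops:
            op(regs)
    except PageFault:
        regs.clear()
        regs.update(shadow)      # roll back the early write-backs
        return False
    return True


def autoincrement_r1(regs):
    regs["r1"] += 4              # early register write-back


def faulting_access(regs):
    raise PageFault()            # a later operand faults


regs = {"r1": 100, "r2": 200}
done = execute_with_shadow(regs, [autoincrement_r1, faulting_access])
print(done, regs)
```

After the fault the registers are back to their starting values, so the
instruction can be restarted from scratch.  The cost is the extra copy of
every register the instruction might write.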
Now, here's the table again, with the labels:
	1991
CPU	Age	3a 3b 3c 3d	4a 4b 5a 5b	6a 6b	# ODD
RULE	<6	=1 =4 <5 =0	=0 =1 <2 =1	>4 >3
-------------------------------------------------------------------------
A1	4	 1  4  1  0	 0  1  0  1	 8  3+	1	AMD 29K
B1	5	 1  4  1  0	 0  1  0  1	 5  4	-	R2000
C1	2	 1  4  2  0	 0  1  0  1	 5  4	-	SPARC
D1	2	 1  4  3  0	 0  1  0  1	 5  0+	1	MC88000
E1	5	 1  4 10+ 0	 0  1  0  1	 5  4	1	HP PA
F1	5	 2+ 4  1  0	 0  1  0  1	 4+ 3+	3	IBM RT/PC
G1	1	 1  4  4  0	 0  1  1  1	 5  5   -	IBM RS/6000
H1	2	 1  4  4  0	 0  1  0  1	 5  4	-	Intel i860
---------------------------------------------------------------
L4	26	 4  8  2* 0*	 1  2  2  4	 4  2	2	IBM 3090
M2	12	12 12 15  0*	 1  2  2  4	 3  3	1	Intel i486
N1	10	21 21 23  1	 1  2  2  4	 3  3	-	NSC 32016
O3	11	11 22 44  1	 1  2  2  8	 4  3	-	MC 68040
P3	13	56 56 22  1	 1  6  2 24	 4  0	-	VAX
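
The "# ODD" column can be recomputed mechanically from the RULE row.  The
sketch below is in Python; the names are invented for illustration, and
the IBM 3090 and i486 rows are entered as I read them (3d = 0 and 4a = 1
for both), which is an assumption that agrees with their ODD counts.
For a RISC, ODD counts the rules broken; for a CISC, the rules obeyed.

```python
# Column order: 3a 3b 3c 3d 4a 4b 5a 5b 6a 6b
RULES = [
    lambda v: v == 1,   # 3a: one instruction size
    lambda v: v == 4,   # 3b: and that size is 4 bytes
    lambda v: v < 5,    # 3c: a handful of addressing modes
    lambda v: v == 0,   # 3d: no indirect addressing
    lambda v: v == 0,   # 4a: no load/store combined with arithmetic
    lambda v: v == 1,   # 4b: one memory operand
    lambda v: v < 2,    # 5a: no arbitrary alignment
    lambda v: v == 1,   # 5b: one MMU use per instruction
    lambda v: v > 4,    # 6a: at least 5 bits of integer register specifier
    lambda v: v > 3,    # 6b: at least 4 bits of FP register specifier
]

RISC = {
    "AMD 29K":     [1, 4, 1, 0, 0, 1, 0, 1, 8, 3],
    "R2000":       [1, 4, 1, 0, 0, 1, 0, 1, 5, 4],
    "SPARC":       [1, 4, 2, 0, 0, 1, 0, 1, 5, 4],
    "MC88000":     [1, 4, 3, 0, 0, 1, 0, 1, 5, 0],
    "HP PA":       [1, 4, 10, 0, 0, 1, 0, 1, 5, 4],
    "IBM RT/PC":   [2, 4, 1, 0, 0, 1, 0, 1, 4, 3],
    "IBM RS/6000": [1, 4, 4, 0, 0, 1, 1, 1, 5, 5],
    "Intel i860":  [1, 4, 4, 0, 0, 1, 0, 1, 5, 4],
}

CISC = {
    "IBM 3090":   [4, 8, 2, 0, 1, 2, 2, 4, 4, 2],
    "Intel i486": [12, 12, 15, 0, 1, 2, 2, 4, 3, 3],
    "NSC 32016":  [21, 21, 23, 1, 1, 2, 2, 4, 3, 3],
    "MC 68040":   [11, 22, 44, 1, 1, 2, 2, 8, 4, 3],
    "VAX":        [56, 56, 22, 1, 1, 6, 2, 24, 4, 0],
}


def rules_obeyed(row):
    return sum(1 for rule, value in zip(RULES, row) if rule(value))


def odd_count(row, is_risc):
    """Attributes that put a machine on the 'wrong' side of the line."""
    obeyed = rules_obeyed(row)
    return len(RULES) - obeyed if is_risc else obeyed


for name, row in RISC.items():
    print(name, odd_count(row, is_risc=True))
for name, row in CISC.items():
    print(name, odd_count(row, is_risc=False))
```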

General comment: this may sound weird, but in the long term, it might
be easier to deal with a really complicated bunch of instruction
formats, than with a complex set of addressing modes, because at least
the former is more amenable to pre-decoding into a cache of
decoded instructions that can be pipelined reasonably, whereas the pipeline
on the latter can get very tricky (examples to follow).  This can lead to
the funny effect that a relatively "clean", orthogonal architecture may actually
be harder to make run fast than one that is less clean.  Obviously, every
weirdness has its penalties....  But consider the fundamental problem
of pipelining something like (on a VAX):
	ADDL	@(R1)+,@(R1)+,@(R2)+

(I.e., something that might theoretically arise from:
	register **r1, **r2;
	**r2++ = **r1++ + **r1++;)
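
For readers who don't speak VAX, the ADDL above can be mimicked with a
toy emulation in Python.  This is a sketch under loud assumptions: memory
is a dict of longwords keyed by address, operands are evaluated left to
right, and alignment, the MMU, caches and arithmetic overflow are all
ignored.  The helper names are invented for illustration.

```python
def autoinc_deferred(regs, mem, reg):
    """@(Rn)+ : Rn points at a longword that holds the operand's address."""
    operand_addr = mem[regs[reg]]   # first memory access: fetch the pointer
    regs[reg] += 4                  # side effect: step Rn past the pointer
    return operand_addr


def addl3(regs, mem, src1, src2, dst):
    a = mem[autoinc_deferred(regs, mem, src1)]   # second access: the data
    b = mem[autoinc_deferred(regs, mem, src2)]
    mem[autoinc_deferred(regs, mem, dst)] = a + b


regs = {"R1": 100, "R2": 200}
mem = {100: 1000, 104: 1004, 200: 2000,   # pointer words
       1000: 7, 1004: 35, 2000: 0}        # data words
addl3(regs, mem, "R1", "R1", "R2")
print(mem[2000], regs)
```

Even in this toy version, one instruction performs six memory accesses
and three register updates.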

Now, consider what the VAX has to do:
1) Decode the opcode (ADD)
2) Fetch first operand specifier from I-stream and work on it.
	a) Compute the memory address from (r1)
		If aligned
			run through MMU
				if MMU miss, fixup
			access cache
				if cache miss, do write-back/refill
		Elseif unaligned
			run through MMU for first part of data
				if MMU miss, fixup
			access cache for that part of data
				if cache miss, do write-back/refill
			run through MMU for second part of data
				if MMU miss, fixup
			access cache for second part of data
				if cache miss, do write-back/refill
		Now, in either case, we now have a longword that has the
		address of the actual data.
	b) Increment r1  [well, this is where you'd LIKE to do it, or
	in parallel with step 2a).]  However, see later why not...
	c) Now, fetch the actual data from memory, using the address just
	obtained, doing everything in step 2a) again, yielding the
	actual data, which we need to stick in a temporary buffer, since it
	doesn't actually go in a register.
3) Now, decode the second operand specifier, which goes thru everything
that we did in step 2, only again, and leaves the results in a second
temporary buffer. Note that we'd like to be starting this before we get
done with all of 2 (and I THINK the VAX9000 probably does that??) but
you have to be careful to bypass/interlock on potential side-effects to
registers .... actually, you may well have to keep shadow copies of
every register that might get written in the instruction, since every
operand can use auto-increment/decrement. You'd probably want badly to
try to compute the address of the second argument and do the MMU
access interleaved with the memory access of the first, although the
ability of any operand to need 2-4 MMU accesses probably makes this
tricky.  [Recall that any MMU access may well cause a page fault....]

4) Now, do the add. [could cause exception]

5) Now, do the third specifier .... only, it might be a little different,
depending on the nature of the cache, that is, you cannot modify cache or
memory, unless you know it will complete.  (Why? well, suppose that
the location you are storing into overlaps with one of the indirect-addressing
words pointed to by r1 or 4(r1), and suppose that the store was unaligned,
and suppose that the last byte of the store crossed a page boundary and
caused a page fault, and that you'd already written the first 3 bytes.
If you did this straightforwardly, and then tried to restart the
instruction, it wouldn't do the same thing the second time.)

6) When you're sure all is well, and the store is on its way, then you
can safely update the two registers, but you'd better wait until the end,
or else, keep copies of any modified registers until you're sure it's safe.
(I think both have been done ??)

7) You may say that this code is unlikely, but it is legal, so the CPU must
do it.  This style has the following effects:
	a) You have to worry about unlikely cases.
	b) You'd like to do the work, with predictable uses of functional
	units, but instead, they can make unpredictable demands.
	c) You'd like to minimize the amount of buffering and state,
	but it costs you in both to go fast.
	d) Simple pipelining is very, very tough: for example, it is
	pretty hard to do much about the next instruction following the
	ADDL, (except some early decode, perhaps), without a lot of gates
	for special-casing.
	(I've always been amazed that CVAX chips are fast as they are,
	and VAX 9000s are REALLY impressive...)
	e) EVERY memory operand can potentially cause 4 MMU uses,
	and hence 4 MMU faults that might actually be page faults...
	f) AND there are even worse cases, like the addp6 instruction, that
	can require *40* pages to be resident to complete... 
8) Consider how "lazy" RISC designers can be:
	a) Every load/store uses exactly 1 MMU access.
	b) The compilers are often free to re-arrange the order, even across
	what would have been the next instruction on a CISC.
	This gets rid of some stalls that the CISC may be stuck with
	(especially memory accesses).
	c) The alignment requirement avoids especially the problem with
	sending the first part of a store on the way before you're SURE
	that the second part of it is safe to do.
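
The MMU arithmetic in steps 2), 7e) and 8a) can be written down as a
short sketch in Python.  The function names are hypothetical, and the
model is deliberately crude: an unaligned item may need two translations,
and an indirect operand needs one access for the pointer and one for the
data.

```python
def mmu_uses_per_operand(indirect, unaligned):
    """Worst-case MMU translations for one memory operand."""
    per_access = 2 if unaligned else 1   # an unaligned item may span two pages
    accesses = 2 if indirect else 1      # pointer fetch, then the data itself
    return per_access * accesses


def mmu_uses_per_instruction(operands, indirect, unaligned):
    return operands * mmu_uses_per_operand(indirect, unaligned)


vax_operand = mmu_uses_per_operand(indirect=True, unaligned=True)
vax_addl = mmu_uses_per_instruction(3, indirect=True, unaligned=True)
risc_load = mmu_uses_per_instruction(1, indirect=False, unaligned=False)
print(vax_operand, vax_addl, risc_load)
```

With six operands the same product gives 24, which is the 5b figure
listed for the VAX in the table.  I would not expect this crude product
to reproduce every other row.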

Finally, to be fair, let me add the two cases that I knew of that were more
on the borderline: i960 and Clipper:
CPU	Age	3a 3b 3c 3d	4a 4b 5a 5b	6a 6b	# ODD
RULE	<6	=1 =4 <5 =0	=0 =1 <2 =1	>4 >3
-------------------------------------------------------------------------
J1	5	 4+ 8+ 9+ 0      0  1  0  2      4+ 3+	5	Clipper
K1	3	 2+ 8+ 9+ 0	 0  1  2+ -      5  3+	5	Intel 960KB

(I think an ARM would be in this area as well; I think somebody once
sent me an ARM-entry, but I can't find it again; sorry.)


Note: slight modification (I'll integrate this sometime):


From jfc@MIT.EDU  Mon Nov 29 12:59:55 1993
Subject: Re: Why are Motorola's slower than Intel's ? [really what's a RISC]
Newsgroups: comp.arch
Organization: Massachusetts Institute of Technology

Since you made your table IBM has released a couple chips that support
unaligned accesses in hardware even across cache line boundaries and
may store part of an unaligned object before taking a page fault on
the second half, if the object crosses a page boundary.

These are the RSC (single chip POWER) and PPC 601 (based on RSC core).
    John Carr (jfc@mit.edu)

(Back to me; jfc's comments are correct; if I had time, I'd add another
line to do PPC ... which, in some sense replays the S/360 -> S/370
history of relaxing alignment restrictions somewhat.  I conjecture that
at least some of this was done to help Apple s/w migration.)


SUMMARY:
	1) RISCs share certain architectural characteristics, although there
	are differences, and some of those differences matter a lot.
	2) However, the RISCs, as a group, are much more alike than the
	CISCs as a group.
	3) At least some of these architectural characteristics have fairly
	serious consequences on the pipelinability of the ISA, especially
	in a virtual-memory, cached environment.
	4) Counting instructions turns out to be fairly irrelevant:
		a) It's HARD to actually count instructions in a meaningful
		way... (if you disagree, I'll claim that the VAX is RISCier
		than any RISC, at least for part of its instruction set :-)
		Why: VAX has a MOV opcode, whereas RISCs usually have
		a whole set of opcodes for {LOAD/STORE} {BYTE, HALF, WORD}
		b) More instructions aren't what REALLY hurts you, anywhere
		near as much as features that are hard to pipeline:
		c) RISCs can perfectly well have string-support, or decimal
		arithmetic support, or graphics transforms ... or lots of
		strange register-register transforms, and it won't cause
		problems .....  but compare that with the consequence of
		adding a single instruction that has 2-3 memory operands,
		each of which can go indirect, with auto-increments,
		and unaligned data...

PART II - ADDRESSING MODES

Article: 30346 of comp.arch
Path: odin!mash.wpd.sgi.com!mash
Subject: Updated addressing mode table
Message-ID: 
Nntp-Posting-Host: mash.wpd.sgi.com

I promised to repost this with fixes, and people have been asking for it,
so here it is again: if you saw it before, all that's really different
is some fixes in the table, and a few clarified explanations:

THE GIANT  ADDRESSING MODE TABLE (Corrections happily accepted)
This table goes with the higher-level table of general architecture
characteristics.

Address mode summary
r	register
r+	autoincrement (post)	[by size of data object]
-r	autodecrement (pre)	[by size,...and this was the one I meant]
>r	modify base register	[generally, effective address -> base]
				NOTE: sometimes this subsumes r+, -r, etc,
				and is more general, so I categorize it
				as a separate case.

d	displacement		d1 & d2 if 2 different displacements
x	index register
s	scaled index
a	absolute	[as a separate mode, as opposed to displacement+(0)]
I	Indirect

Shown below are 22 distinct addressing modes [you can argue whether
these are right categories].  In the table are the *number* of different
encodings/variations [and this is a little fuzzy; you can especially
argue about the 4 in the HP PA row, I'm not even sure that's
right].  For example, I counted as different variants on a mode the
case where the structure was the same, but there were different-sized
displacements that had to be decoded.  Note that meaningfully counting
addressing modes is *at least* as bad as meaningfully counting opcodes;
I did the best I could, and I spent a lot of hours looking at manuals
for the chips I hadn't programmed much, and in some cases, even after
hours, it was hard for me to figure out meaningful numbers... *Most* of
these architectures are used in general-purpose systems and *most* have
at least one version that uses caches: those are important because many
of the difficulties in thinking about addressing modes come from their
interactions with MMUs and caches... 


	1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20  21  22
							             r   r
						           r  r  r   +d1 +d1
	            r  r  r |              |   r  r |   r  r+ +d +d1 I   +s
	   r  r  r  +d +x +s|         s+ s+|s+ +d +d|r+ +d I  I  I   +s  I  
	r  +d +x +s >r >r >r|r+ -r a  a  r+|-r +x +s|I  I  +s +s +d2 +d2 +d2
	-- -- -- -- -- -- --|-- -- -- -- --|-- -- --|-- -- -- -- --- --- ---
AMD 29K	 1		    |		   |	    |   
Rxxx	    1		    |		   |	    |   
SPARC	    1  1  	    |		   |	    |   
88K         1  1  1	    |		   |	    |   
HP PA       2  1  1  4  1  1|		   |	    |   
ROMP     1  2		    |		   |	    |    
POWER       1  1     1  1   |		   |	    |    
i860        1  1     1  1   |		   |	    |    
Swrdfish 1  1  1	    |       1	   |	    |    
ARM      2  2     2  1     1| 1  1
Clipper  1  3  1            | 1  1  2      |	    |    
i960KB   1  1  1  1  	    |       2  2   |    1   |    

S/360       1  		    |		        1   |    
i486     1  3  1  1	    | 1  1  2      |    2  3|   
NSC32K      3		    | 1  1  3  3   |   	   3|    	  9 	  
MC68000  1  1		    | 1  1  2	   |	2   |
MC68020  1  1		    | 1  1  2	   |	2  4|  	      	      16  16
VAX	 1  3     1	    | 1  1  1  1  1| 1     3| 1  3  1  3  


COLUMN NOTES:

1) Columns 1-7 are addressing modes used by many machines, but very few,
if any clearly-RISC architectures use anything else.  They are all
characterized by what they don't have:
	2 adds needed before generating the address
	indirect addressing
	variable-sized decoding

2) Columns 13-15 include fairly simple-looking addressing modes, which however,
*may* require 2 back-to-back adds before the address is available.  [*may*
because some of them use index-register=0 or something to avoid
indexing, and usually in such machines, you'll see variable timing figures,
depending on use of indexing.]

3) Columns 16-22 use indirect addressing.
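
The three column groups differ in how much work is needed before the
operand's address is known.  Here is a rough Python sketch that counts
the adds and the extra memory reads for a few modes.  Everything in it is
an illustrative assumption: the function and parameter names are made up,
scaling is treated as a free shift, and a zero displacement is treated as
absent.

```python
def effective_address(regs, mem, base, disp=0, index=None, scale=1,
                      indirect=False, disp2=0):
    """Return (address, adds, memory_reads) for one operand specifier."""
    adds = reads = 0
    addr = regs[base]
    if disp:
        addr += disp
        adds += 1
    if index is not None:
        addr += regs[index] * scale   # scaling counted as a shift, not an add
        adds += 1
    if indirect:
        addr = mem[addr]              # extra memory access just to find the operand
        reads += 1
        if disp2:
            addr += disp2
            adds += 1
    return addr, adds, reads


regs = {"r1": 0x1000, "r2": 3}
mem = {0x1010: 0x2000}
simple = effective_address(regs, mem, "r1", disp=16)                    # r+d
two_adds = effective_address(regs, mem, "r1", disp=4, index="r2", scale=4)  # r+d+s
deferred = effective_address(regs, mem, "r1", disp=16,
                             indirect=True, disp2=8)                    # r+d1, I, +d2
print(simple, two_adds, deferred)
```

The first case is the column 1-7 style (at most one add, no memory read),
the second needs two back-to-back adds as in columns 13-15, and the third
cannot even start its final add until a memory read has completed.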

ROW NOTES
1) Clipper & i960, of current chips, are more on the RISC-CISC border,
or are sort of "modern CISCs".  ARM is also characterized (by ARM people,
Hot Chips IV: "ARM is not a "pure RISC").

2) ROMP has a number of characteristics different from the rest of the RISCs,
you might call it "early RISC", and it is of course no longer made.

3) You might consider HP PA a little odd, as it appears to have more
addressing modes, in the same way that CISCs do, but I don't think this
is the case: it's a matter of whether you call something several modes
or one mode with a modifier, just as there is trouble counting opcodes
(with & without modifiers).  From my view, neither PA nor POWER have
truly "CISCy" addressing modes.

4) Notice difference between 68000 and 68020 (and later 68Ks): a bunch of
incredibly-general & complex modes got added...

5) Note that the addressing on the S/360 is actually pretty simple,
mostly base+displacement, although RX-addressing does take 2 regs+offset.

6) A dimension *not* shown on this particular chart, but also highly
relevant, is that this chart shows the different *types* of modes, *not*
how many addresses can be found in each instruction.  That may be worth
noting also:
	AMD : i960	1	one address per instruction
	S/360 - MC68020	2	up to 2 addresses
	VAX		6	up to 6

By taking a scrutinize at alignment, indirect addressing,  and taking a scrutinize simplest at those
chips that savor MMUs,
take into tale  the number of cases an MMU *could wellbe primitive per instruction for
facts address translations:
	AMD - Clipper	2		[Swordfish & i960KB: no TLB]
	S/360 - NSC32K	4
	MC68Ks (all)	8
	VAX		24
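
One way to see how figures like these could arise is to multiply the
addresses per instruction by 2 when an operand may be unaligned and straddle
a page boundary, and by 2 again when an address may be indirect (the pointer
itself has to be fetched).  The sketch below is only that reading: the
function name and the per-group flags are assumptions chosen to reproduce
the table, not something stated in the posting.

```python
# Hypothetical model: worst-case data translations per instruction.
# The flags per group are assumptions, picked to match the table above.

def max_translations(addresses, unaligned, indirect):
    """Upper bound on MMU data translations for one instruction."""
    count = addresses
    if unaligned:
        count *= 2   # operand may straddle two pages
    if indirect:
        count *= 2   # pointer fetch plus the operand fetch
    return count

groups = {
    "AMD - Clipper": max_translations(1, unaligned=True, indirect=False),
    "S/360 - NSC32K": max_translations(2, unaligned=True, indirect=False),
    "MC68Ks (all)": max_translations(2, unaligned=True, indirect=True),
    "VAX": max_translations(6, unaligned=True, indirect=True),
}

for name, bound in groups.items():
    print(f"{name:16s} {bound}")
```

Under these assumptions the bounds come out as 2, 4, 8 and 24, the same
figures as in the table.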

When RS/6000 does unaligned, it must be in the same cache line
(and thus also in the same MMU page), and traps to software otherwise, thus
avoiding numerous ugly cases.

Note: in some sense, S/360s & VAXen can use an arbitrary number of translations
per instruction, with MOVE CHARACTER LONG, or similar operations & I don't
count them as more, because they're defined to be interruptable/restartable,
saving state in general-purpose registers, rather than hidden internal state.

SUMMARY:
1) Computer design styles mostly changed from machines with:
	2-6 addresses per instruction, with variable sized encoding
	address specifiers were usually "orthogonal", so that any could go
		anywhere in an instruction
	sometimes indirect addressing
	sometimes need 2 adds *before* effective address is available
	sometimes with many potential MMU accesses (and possible exceptions)
		per instruction, often buried in the middle of the instruction,
		and often *after* you'd normally want to commit state because
		of auto-increment or other side effects.
to machines with:
	1 address per instruction
	address specifiers encoded in small # of bits in 32-bit instruction
	no indirect addressing
	never need 2 adds before address available
	use MMU once per data access

and we usually call the latter group RISCs.  I say "changed" because
if you put this table together with the earlier one, which has the
age in years, the older ones were one way, and the newer ones are different.
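
To make the "adds before the address is available" distinction concrete,
here is a small sketch that counts the adds and memory reads needed to form
one operand address.  The function, its parameters, and the register and
memory contents are illustrative assumptions, not any particular machine's
addressing logic.

```python
def effective_address(regs, memory, base, disp=0, index=None, indirect=False):
    """Return (address, adds, memory_reads) for one operand specifier."""
    adds = 0
    reads = 0
    addr = regs[base]
    if disp:
        addr += disp            # base + displacement: one add
        adds += 1
    if index is not None:
        addr += regs[index]     # indexing on top: a second, dependent add
        adds += 1
    if indirect:
        addr = memory[addr]     # fetch the real address from memory first
        reads += 1
    return addr, adds, reads

regs = {"r1": 100, "r2": 8}
memory = {112: 500}

simple = effective_address(regs, memory, "r1", disp=4)
two_adds = effective_address(regs, memory, "r1", disp=4, index="r2")
via_pointer = effective_address(regs, memory, "r1", disp=4, index="r2",
                                indirect=True)
print(simple, two_adds, via_pointer)
```

The first form is the one-add case; the other two show where the extra
dependent add and the extra memory access (with its own possible
translation and exception) come from.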

2)  Now, ignoring any other features, but looking at this single attribute
(architectural addressing features and implementation effects thereof),
it should be clear that the machines in
the first part of the table are doing something *technically* different
from those in the second part of the table.  Thus, people may sometimes
call something RISC that isn't, for marketing reasons, but the people
calling the first batch RISC really did have some serious technical issues at
heart.

3) One more time: this is *not* to say that RISC is better than CISC,
or that the few in the middle are bad, or anything like that ... but
that there are clear technical characteristics...
  
PART III - MORE ON TERMINOLOGY; WOULD YOU CALL THE CDC 6600 A RISC?

Article: 39495 of comp.arch
Newsgroups: comp.arch
From: mash@mash.engr.sgi.com (John R. Mashey)
Subject: Re: Why CISC is bad (was P6 and Beyond)
Organization: Silicon Graphics, Inc.
Date: Wed, 6 Apr 94 18:35:01 PDT

In article <2nii0d$kkn@crl2.crl.com>, dbennett@crl.com (Andrea Chen) writes:

|> You may be correct on the creation of the term,  but RISC does
|> refer to a school of computer design that dates back to the
|> early seventies.

This is all getting fairly fuzzy and subjective, but it seems very confusing
to label RISC as a school of thought that dates back to the early 1970s.

1) One can say that RISC is a school of thought that got popular in the
early-to-mid 80's, and got widespread commercial use then.

2) One can say that there were a few people (like John Cocke & co at IBM)
who were doing RISC-style research projects in the mid-70s.

3) But if you want to go back, as has been discussed in this newsgroup often,
a lot of people go back to the CDC 6600, whose design started in 1960,
and which was delivered in 4Q 1964.  Now, while this wouldn't exactly fit the
exact parameters of current RISCs, a great deal of the RISC-style approach
was there in the central processor ISA:
	a) Load/store architecture.
	b) 3-address register-register instructions
	c) Simply-decoded instruction set
	d) Early use of instruction scheduling by compiler, expectation
	   that you'd usually program in high-level language and not often
	   resort to assembler, as you'd expect compiler to do well.
	e) More registers than common at the time
	f) ISA designed to make decode/issue easy

Note that the 360/91 (1967) offered a good example of building a
CISC-architecture into a high-performance machine, and was a fascinating
comparison to the 6600.

4) Maybe there is some way to claim that RISC goes back to the 1950s,
but in general, most machines of the 1950s and 1960s don't feel very
RISCy (to me).  Consider Burroughs B5000s; IBM 709x, 707x, 1401s; Univac 110x;
GE 6xx, etc, and of course, S/360s.   Simple load/store architectures
were hard to find; there were often exciting instruction decodings required;
indirect addressing was popular; machines often had very few accumulators.

5) If you want to try sticking this in the matrix I've published before,
as best as I recall, the 6600 ISA generally looked like:

CPU		3a 3b 3c 3d	4a 4b 5a 5b	6a 6b	# ODD
RULE		=1 =4 <5 =0	=0 =1 <2 =1	>4 >3
-------------------------------------------------------------------------
CDC 6600	 2  *  1  0	 0  1  0  1	 3  3	4 (but  ~1 if fair)

That is:
2: it has 2 instruction sizes (not 1), 15 & 30 bits (however, they were packed
into 60-bit words, so if you had 15, 30, 30, the second 30-bitter would not
cross word boundaries, but would start in the second word.)

*: 15-and-30 bit instructions, not 32-bit.

1: 1 addressing mode  [Note: Tim McCaffrey emailed me that one might consider
	there to be more, i.e., you could set address register to combinations
	of the others to give autoincrement/decrement/Index+offset, etc.]
	In any case, you compute an address as a simple combination of 1-2
	registers, and then use the address, without further side-effects.

0: no indirect addressing

1: have one memory operand per instruction

0: do NOT support arbitrary alignment of operands in memory
	(well, it was a word-addressed machine :-)

1: use an MMU for data translation no more than once per instruction
	(MMU used loosely here)

3,3: had 3-bit fields for addressing registers, both index and FP
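
The exception count in the "# ODD" column can be reproduced mechanically.
The sketch below is a minimal illustration: the rule predicates are
transcribed from the RULE row, the 6600 values from the row and notes above,
and the data structure itself is an assumption.  Instruction size (3b) is
represented as None, since 15-and-30-bit instructions have no whole-byte
size to compare against the rule.

```python
# Each rule: (column, predicate taken from the RULE row above).
RULES = [
    ("3a", lambda v: v == 1),
    ("3b", lambda v: v == 4),   # None (15/30-bit instructions) fails this
    ("3c", lambda v: v < 5),
    ("3d", lambda v: v == 0),
    ("4a", lambda v: v == 0),
    ("4b", lambda v: v == 1),
    ("5a", lambda v: v < 2),
    ("5b", lambda v: v == 1),
    ("6a", lambda v: v > 4),
    ("6b", lambda v: v > 3),
]

# Values as given for the CDC 6600 in the row and notes above.
CDC_6600 = {"3a": 2, "3b": None, "3c": 1, "3d": 0, "4a": 0,
            "4b": 1, "5a": 0, "5b": 1, "6a": 3, "6b": 3}

odd = [column for column, obeys in RULES if not obeys(CDC_6600[column])]
print(odd, len(odd), len(RULES) - len(odd))
```

The columns that come out odd are 3a, 3b, 6a and 6b: two about instruction
size and two about register-field width, which is why the count drops to
about 1 if one is being fair about the constraints of the day.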

Now, of the 10 ISA attributes I'd proposed for identifying typical RISCs,
the CDC 6600 obeys 6.  It varies in having 2 instruction formats, and in
having only 3 bits for register fields, but it had simple packing of the
instructions into fixed-size words, and register/accumulators were
pretty expensive in those days (some popular machines only had one
accumulator and a few index registers, so 8 of each was a lot).  Put
another way: it had about as many registers as you'd conveniently build
in a high-speed machine, and while they packed 2-4 operations into a
60-bit word, the decode was pretty straightforward.  Anyway, given the
caveats, I'd claim that the 6600 would fit much better in the RISC part
of the original table...
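
The "simple packing" rule described above (15- and 30-bit instructions in
60-bit words, never crossing a word boundary) can be sketched as follows.
This is a simplified model: the function name is hypothetical, and the
assumption that leftover bits in a word are simply skipped (in practice,
padded with no-ops) is mine.

```python
WORD_BITS = 60

def pack(sizes):
    """Return (word_index, bit_offset) for each instruction, in order."""
    placements = []
    word, offset = 0, 0
    for size in sizes:
        if size not in (15, 30):
            raise ValueError("only 15- and 30-bit instructions are modeled")
        if offset + size > WORD_BITS:
            # Would cross a word boundary: skip the leftover bits
            # and start this instruction in the next word.
            word += 1
            offset = 0
        placements.append((word, offset))
        offset += size
        if offset == WORD_BITS:
            word += 1
            offset = 0
    return placements

example = pack([15, 30, 30])
four_short = pack([15, 15, 15, 15, 15])
print(example, four_short)
```

For the 15, 30, 30 case from the notes above, the second 30-bit instruction
lands at the start of word 1 rather than straddling words 0 and 1.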


PART IV - RISC, VLIW, STACKS

Article: 43173 of comp.arch
Newsgroups: comp.sys.amiga.advocacy,comp.arch
From: mash@mash.engr.sgi.com (John R. Mashey)
Subject: Re: PG: RISC vs. CISC was: Re: MARC N. BARR
Date: Thu, 15 Sep 94 18:33:14 PDT

In article <35a1a3$mlb@doc.armltd.co.uk>, Clive.Jones@armltd.co.uk writes:

|> Really? The Venerable John Mashey's table appears to contain as many
|> exceptions to the rule about number of GP registers as most others.
|> I'm sure if one were to look at the various less conventional
|> processors, there would be some clearly RISC processors that didn't
|> have a load-store architecture - stack and VLIW processors spring to
|> mind.

I'm not sure I understand the point.  One can believe any of several
things:
	a) One can believe RISC is some marketing term without technical
	   meaning whatsoever.  OR
	
	b) One can believe that RISC is some collection of implementation
	   ideas.  This is the most common confusion.
	
	c) One can believe that RISC has some ISA meaning (such as RISC ==
	   small number of opcodes) ... but have a different idea of RISC
	   than do most chip architects who build them.  If you want to pay
	   words extra money every Friday to mean something different than
	   what they mean to practitioners ... then you are free to do so,
	   but you will have difficulty communicating with practitioners
	   if you do so.
	   EX: I'm not sure how stack architectures are "clearly RISC" (?)
	   Maybe CRISP, sort of.  Burroughs B5000 or Tandem's original
	   ISA: if those are defined as RISC, the term has been rendered
	   meaningless.
	   EX: VLIWs: I don't know any reason why I'd call VLIWs, in
	   general, either clearly RISC or clearly not.  VLIW is a technique
	   for issuing instructions to more functional units than you
	   have the die space/cycle time to decode more dynamically.
	   There gets to be a fuzzy line between:
		a) A VLIW, especially if it compresses instructions in
		   memory, then expands them out when brought into the cache.
		b) A superscalar RISC, which does some predecoding on the
		   way from memory->cache, adding "hint" bits or rearranging
		   what it keeps there, speeding up cache->decode->issue.
	   At least some VLIWs are load/store architectures, and the operations
	   they do usually look like typical RISC operations.
	OR, you can believe that:

	d) RISC is a term used to characterize a class of relatively-similar
	ISAs mostly developed in the 1980s.  Thus, if a knowledgeable
	person looks at ISAs, they will tend to cluster various ISAs
	as:
		1) Obvious RISC, fits the typical rules with few exceptions.
		2) Obviously not-RISC, fits the inverse of the RISC
		   rules with relatively few exceptions.  Sometimes
		   people call this CISC ... but whereas RISCs, as a group,
		   have relatively similar ISAs, the CISC label is sometimes
		   applied to a widely varying set of ISAs.
		3) Hybrid / in-the-middle cases, that either look like
		   CISCy RISCs, or RISCy CISCs.  There are a few of these.
	Cases 1-3 apply to reasonably contemporaneous
		processors, and make some sense.  And then 4)
		4) CPUs for which RISC/CISC is probably not a very relevant
		   classification.  I.e., one can apply the set of rules
		   I've suggested, and get an exception-count, but it
		   may not mean much in practice, especially when
		   applied to older CPUs created with vastly different
		   constraints than current ones, or embedded
		   processors, or specialized ones.  Sometimes an older
		   CPU might have been designed with some similar
		   philosophies (i.e., like CDC 6600 & RISC, sort of)
		   whether or not it happened to fit the rules.
		   Sometimes, die-space constraints may have led to
		   "simple" chips, without making them fit the suggested
		   criteria either.  Personally, torturous arguments
		   about whether a 6502, or a PDP-8, or a 360/44 or an
		   XDS Sigma 7, etc, are RISC or CISC ... do not
		   usually lead to great insight.  After a while such
		   arguments are counting angels dancing on pinheads
		   ("Ahh, only 10 angels, must be RISC" :-).
	In this belief space, one tends to follow Hennessy & Patterson's
	comment in E.9 that "In the history of computing, there has never
	been such widespread agreement on computer architecture."
	None of this is pejorative of earlier architectures, just the observation
	that the ISAs newly-developed in the 1980s were far more similar
	than the earlier groups of ISAs.  [I recall a 2-year period in
	which I used IBM 1401, IBM 7074, IBM 7090, Univac 1108, and
	S/360, of which only the 7090 and 1108 bore even the remotest
	resemblance to each other, i.e., at least they both had 36-bit words.]

Summary: RISC is a label most commonly used for a set of ISA characteristics
chosen to ease the use of aggressive implementation techniques found in
high-performance processors (regardless of RISC, CISC, or irrelevant).
This is a convenient shorthand, but that's all, although it probably makes
sense to use the term the way it's usually meant by people who do chips for
a living.
 
-john mashey    DISCLAIMER: 
UUCP:    mash@sgi.com 
DDD:    415-390-3090	FAX: 415-967-8496
USPS:   Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311

From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: Who started RISC? [really:RISC & CISC are different kinds of 
	words]
Date: 23 Jun 1995 20:38:19 GMT
Organization: Silicon Graphics, Inc.

In article <3sabb4$m14@cronkite.cisco.com>, jahlstrom@cisco.com (John
Ahlstrom) writes:

|> : ifetch/decode/execute phases, and then Cray & Thornton perfected in the
|> : CDC 6400 & 6600 what we now call superpipelined RISC machines.  This

|> They were certainly RISCy (who coined THAT wonderful term?) were they
|> RISC?  Let's ask Mashey.

1) This has come up before & I discussed it in PART III of the usual
giant "What's RISC" posting that I repost occasionally.  My answer was that
while it didn't exactly fit the architectural parameters of typical RISC,
it would generally have fit more in the RISC part of the table.

2) However, some of this whole thread has gotten a little crazy:

	a) The term RISC, used well, seems to be a useful description
	for a set of ISAs that:
		- Were designed at roughly the same time
		- Shared at least some assumptions in software and hardware
			technology.
		- Have enough similarities that they form a fairly tight
		  "cluster" in the space of all ISAs.  (The giant posting
		   described many of these attributes).

	b) Put another way, it has been the case that if someone said they
	   had designed a RISC CPU, and knew what they were talking about,
	   and you knew nothing else about it, you could guess that it would have
	   32-bit instructions, load/store architecture, simple addressing
	   modes, 32-bit integer registers (or the obvious 64-bit extensions),
	   separate integer and FP registers, etc ...
	   and you would probably mostly guess right.

	   At some point, trying to argue whether or not a 30-year-old
	   CPU was RISC is counting angels dancing on pinheads,
	   that is, it is not useful.  It may be useful to consider the
	   various specific attributes of ISAs and see what was there.

	   Just because a CPU has a low gate-count does not make it RISC ...
	   but it is also not clear that calling it CISC tells you much, since:

	c) CISC is *not* a term like RISC, i.e., I think it was invented
	   to mean "not-RISC", and people often use it that way, thus
	   including the entire architectural space *excluding* that small
	   cluster well labeled RISC.  [Note: this is not a pejorative
	   comment on the term CISC, just a note that people often use
	   a RISC-vs-CISC style argument as though the two terms had the
	   same nature ... and they don't].  For example, suppose somebody
	   tells you they've worked on a CISC.  What would you know about
	   the nature of that ISA?
		Would it have a variety of instruction sizes?
			Maybe ... but there were plenty of ISAs with
			a single instruction size that would be very hard
			to call RISCs.
		Would it have indirect addressing, and if so, 1-level,
			or multi-level?
			Maybe, maybe not [S/360's don't].
		Would it have separate integer and FP registers?
			Maybe, maybe not. [VAX's don't]
		Would it be a General Register Machine, a stack machine,
			or some kind of memory-memory machine?
			Totally unclear.
	   Thus, there are a bunch of fairly similar ISAs called RISCs;
	   there are a much larger bunch called CISCs, that often have little
	   in common.  There are a few that lie on the border, in some
	   multi-dimensional space of ISA attributes.

	d) As a result, it is rather difficult (for me) to make much sense of
	arguments about which machines are RISCier or CISCier - I don't
	know of any natural linear scale of such things [recall that the
	charts I've posted explicitly avoided computing a single RISC/CISC
	number].

Anyway, it certainly seemed to me that Seymour Cray had some of the same
approach as later RISC designers, but that the current RISC wave really
started with John Cocke & co at T. J. Watson.
		 


-john mashey    DISCLAIMER: 
UUCP:    mash@sgi.com 
DDD:    415-390-3090	FAX: 415-967-8496
USPS:   Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311
