[Arm-netbook] [review] SoC proposal

lkcl luke luke.leighton at gmail.com
Wed Feb 8 22:37:57 GMT 2012


On Wed, Feb 8, 2012 at 10:09 PM, Alec Ari <neotheuser at ymail.com> wrote:

> You're asking for a ridiculously, and unrealistically optimized open source software engine,

 no, i'm asking for opinions on how it can be achieved, not for
opinions on how it will fail.  lots of people know how to fail: i'm
looking for people who know how it can be *done*.


> and to have 1080p playback.... Current x86 CPUs, take the AMD Phenom II for example isn't even that great for Mesa's softpipe (100% software driver for Gallium) and you're talking about a low power ARM processor,

 ah no, i'm not.  i didn't say ARM, did i?  i specifically avoided and
made no mention of the RISC CPU core selected.  i mentioned ARC, and
the reason i mentioned ARC is because the lead developer at Synopsys
for the ARC RISC core also worked on the Video Decode Engine as well.

 i _didn't_ mention that the CPU RISC core i'm looking at has VLIW
support that roughly trebles its perceived clock-rate (instruction
cycle execution rate).

 anyway.

 the ARC Video decode engine, we had a meeting with Synopsys and went
over it.  there are over 1,000 video instruction extensions, and the
Video DSP core in which those (SIMD) instructions are implemented has
its own separate 128-bit-wide memory bus that the main CPU does *not*
have access to.  it's a bit odd.

 we evaluated the possibility of coping with 1080p30 video decode, and
worked out that after one of the cores has been forced to deal with
CABAC decode all on its own, the cores could then carry out the
remaining parts of 1080p30 decode in parallel: four cores, at about
1GHz each.
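
 to put a rough number on that, here is a back-of-envelope cycle
budget per 16x16 macroblock.  it is only a sketch: the function names
are made up for illustration, and it assumes perfect scaling across
the worker cores.  reading "quantity 4" as four cores in total, with
one of them given over to CABAC, is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 16x16 macroblocks in one frame. Dimensions are rounded up
 * to a multiple of 16, as H.264 coding does (1080 lines -> 1088). */
static uint32_t macroblocks_per_frame(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    return ((width + 15) / 16) * ((height + 15) / 16);
}

/* Clock cycles available per macroblock, summed over all worker cores.
 * ASSUMPTION: work divides perfectly across worker_cores, with no
 * stalls on memory or on the core doing CABAC. Real figures will be
 * lower. */
static uint64_t cycles_per_macroblock(uint32_t width, uint32_t height,
                                      uint32_t fps,
                                      uint32_t worker_cores,
                                      uint64_t clock_hz)
{
    uint64_t mb_per_sec =
        (uint64_t)macroblocks_per_frame(width, height) * fps;
    assert(mb_per_sec > 0);
    return (worker_cores * clock_hz) / mb_per_sec;
}
```

 for 1920x1080 at 30fps that is 8160 macroblocks per frame, and with
three worker cores at 1GHz, cycles_per_macroblock(1920, 1080, 30, 3,
1000000000) comes out at roughly 12,000 cycles per macroblock for
everything except CABAC.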

 so, technically, i am confident that the instruction sets can and do
exist: what i *don't* know is how much memory bandwidth is involved,
what the cache architecture will need to look like, whether the
assembly code for the critical inner loops will fit in 1st level
caches, etc. etc.
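
 as a starting point for the memory bandwidth question, a lower-bound
estimate can be sketched like this.  it assumes 8-bit YUV 4:2:0
frames (1.5 bytes per pixel), and the reference-read multiplier is a
pure guess that stands in for motion-compensation traffic; it is not
a figure from the Synopsys evaluation.  the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one 8-bit YUV 4:2:0 frame: a full-resolution luma plane
 * plus two quarter-resolution chroma planes = 1.5 bytes per pixel. */
static uint64_t yuv420_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return (uint64_t)width * height * 3 / 2;
}

/* Rough frame-buffer traffic in bytes per second.
 * Each decoded frame is written once. ref_reads_per_pixel is an
 * ASSUMED average number of reference-frame reads per output pixel
 * (0 = writes only; 2 might model bi-prediction with no cache reuse).
 * Bitstream reads and display scan-out are not counted. */
static uint64_t decode_bytes_per_sec(uint32_t width, uint32_t height,
                                     uint32_t fps,
                                     uint32_t ref_reads_per_pixel)
{
    uint64_t frame = yuv420_frame_bytes(width, height);
    return frame * fps * (1 + ref_reads_per_pixel);
}
```

 writing the decoded frames alone is about 93 MB/sec at 1080p30; with
an assumed two reference reads per pixel it is about 280 MB/sec.  how
much of that the 1st level caches absorb is exactly the open question.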


 regarding 3D graphics: i've taken a quick peek at this:
http://www.mips.com/products/architectures/mips-3d-ase/

 in that document, it describes how if you provide SIMD reduction-add,
reduction-multiply, reduced-accuracy 1/x, reduced-accuracy 1/sqrt(x),
reduced-accuracy FP-to-int conversion, SIMD "fp-zero" tests and a few
others, you can actually get decent 3D performance.
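
 to make a couple of those primitives concrete, here is a scalar C
sketch of what reduction-add and a reduced-accuracy reciprocal do in
a typical 3D inner loop (dot product, then perspective divide).  none
of this is the MIPS-3D instruction set itself: the function names are
invented, and the bit-trick seed with its magic constant is just one
crude stand-in for a hardware estimate instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduction-add: sum the four lanes of a vector into one scalar. */
static float reduce_add4(const float v[4])
{
    return (v[0] + v[1]) + (v[2] + v[3]);
}

/* Dot product = lane-wise multiply followed by a reduction-add. */
static float dot4(const float a[4], const float b[4])
{
    float prod[4];
    for (int i = 0; i < 4; i++)
        prod[i] = a[i] * b[i];
    return reduce_add4(prod);
}

/* Low-accuracy 1/x seed for positive normal floats.
 * ASSUMPTION: the integer-subtraction trick and the constant below
 * are illustrative only; real hardware would use its own estimate. */
static float recip_estimate(float x)
{
    uint32_t bits;
    assert(x > 0.0f);
    memcpy(&bits, &x, sizeof bits);
    bits = 0x7EF311C7u - bits;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* One Newton-Raphson step, r' = r * (2 - x*r), which roughly squares
 * the relative error of the seed. */
static float recip_refine(float x, float r)
{
    return r * (2.0f - x * r);
}

/* Perspective divide: scale x, y, z by an approximate 1/w. */
static void perspective_divide(float v[4])
{
    float inv_w = recip_refine(v[3], recip_estimate(v[3]));
    v[0] *= inv_w;
    v[1] *= inv_w;
    v[2] *= inv_w;
}
```

 the point of the split is that the cheap estimate is often good
enough on its own, and one refinement step can be added only where
more accuracy is needed.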

 how to plug that into Gallium3D and LLVMpipe in an *easy* fashion,
i.e. with gcc built-in support (similar to altivec) or whatever it
takes, i don't know, and that's what i'm asking.
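
 for the gcc side of that question only, one possible shape is the
generic vector extension that gcc (and clang) already provide.  the
sketch below is an assumption about how such support might be
exposed, not a description of how Gallium3D or LLVMpipe work; the
wrapper names are hypothetical, and the reduction is written out by
hand where a target-specific built-in would go.

```c
#include <assert.h>

/* GCC/Clang vector extension: four packed floats in one 16-byte
 * value. Arithmetic operators apply lane by lane. */
typedef float v4f __attribute__((vector_size(16)));

/* Hypothetical wrapper: on a CPU with a reduction-add instruction,
 * this is where a target built-in would be called. Here the lanes
 * are simply extracted and summed. */
static float v4f_reduce_add(v4f v)
{
    return (v[0] + v[1]) + (v[2] + v[3]);
}

/* Dot product: one lane-wise multiply, then one reduction. */
static float v4f_dot(v4f a, v4f b)
{
    return v4f_reduce_add(a * b);
}

/* Transform a point by a 4x4 matrix stored as four row vectors. */
static v4f mat4_transform(const v4f rows[4], v4f p)
{
    v4f out = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 4; i++)
        out[i] = v4f_dot(rows[i], p);
    return out;
}
```

 code written this way stays portable C, and the compiler maps the
lane-wise operations onto whatever SIMD instructions the target has.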

 i sure as shit don't want a CPU designed that fails, do i??  i want a
CPU that succeeds and achieves the required goal, so that it doesn't
waste $20 to $30 million of potential investors' money.

 so yes, thank you for pointing out those failed projects: _why_ did
they fail, what was missing, and what, if they had near-carte-blanche
to _make_ them succeed, would they need _to_ succeed?

l.


