[Arm-netbook] [review] SoC proposal

Iliya Georgiev ikgeorgiev at gmail.com
Sat Feb 11 17:47:12 GMT 2012


2012/2/11 lkcl luke <luke.leighton at gmail.com>

> On Sat, Feb 11, 2012 at 8:51 AM, Iliya Georgiev <ikgeorgiev at gmail.com>
> wrote:
>
> > Hi,
> > According to my very rough estimation 100 Giga-MACs performance is at least
> > equivalent to the performance of AMD Radeon HD 6250 GPU.
>
>  but... but... that's just one processor, and i was naively thinking
> it'd be great to put down 8!  it can't _possibly_ be right that the
> power consumption is 6mW to do 100 Giga-MACs in 28nm @ 1ghz, and only
> 0.02 sq.mm surely??
>
> > That is the GPU
> > used in AMD G-Series SoC - the one you have chosen for future EOMA-68 SoCs.
> > (I suggest you have to ask Xtensa about what stays behind "100 Giga-MACs
> > performance" - 1-, 2-, 4- or 8-core configuration?)
>
>  yeah the list of questions is getting considerable :)
>
>  thanks iliya.
>
>  l.

Luke,
I have sat through many presentations where the numbers were used to show
whatever the presenter wanted. Even the worst financial results can be
presented as a success.
So, as you correctly pointed out, we have to find the right measure. Back to
the die size measurement: most of the die area in CPUs is occupied by caches.
In the Xtensa LX4 specifications data caches are optional, so the reported die
size of 0.02 mm^2 for the 28 nm part (0.05 mm^2 for the 45 nm part) should be
checked against the features it includes, such as caches. For certain
workloads such as graphics acceleration, bigger caches are not necessary,
whereas high-speed memory access is. But for general-purpose CPUs it is better
to have a bigger cache.
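
As a rough sanity check of the figures quoted in this thread (0.05 mm^2 at
45 nm, 0.02 mm^2 at 28 nm, 6 mW, 100 Giga-MACs, 1 GHz), here is a small Python
sketch. It assumes ideal area scaling with the square of the feature size,
which real processes only approximate, and it assumes that the power and
Giga-MAC figures describe the same configuration, which is exactly what still
has to be confirmed. The function names are mine, for illustration only.

```python
# Back-of-the-envelope checks on the numbers quoted in this thread.
# Assumption: die area shrinks ideally with (feature size)^2.
# Assumption: 6 mW and 100 Giga-MACs refer to the same configuration.


def scale_area(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Scale a die area between process nodes, assuming an ideal shrink."""
    return area_mm2 * (to_nm / from_nm) ** 2


def energy_per_mac_pj(power_w: float, macs_per_s: float) -> float:
    """Energy spent per multiply-accumulate, in picojoules."""
    return power_w / macs_per_s * 1e12


def macs_per_cycle(macs_per_s: float, clock_hz: float) -> float:
    """MAC operations that must complete in every clock cycle."""
    return macs_per_s / clock_hz


if __name__ == "__main__":
    area_28 = scale_area(0.05, from_nm=45, to_nm=28)
    print(f"0.05 mm^2 at 45 nm -> {area_28:.4f} mm^2 at 28 nm (ideal shrink)")
    print(f"energy per MAC: {energy_per_mac_pj(6e-3, 100e9):.2f} pJ")
    print(f"MACs per cycle at 1 GHz: {macs_per_cycle(100e9, 1e9):.0f}")
```

Under these assumptions the two area figures are consistent with a plain
shrink of one and the same configuration (about 0.019 mm^2), while 100 MACs
per cycle at about 0.06 pJ each looks like a lot for such a small core. That
may mean the Giga-MAC figure belongs to a wider or multi-core configuration,
but only Tensilica can confirm this.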

A. Here are the questions that I would ask Tensilica application engineers
if I had to deal with them. First of all, I would need an idea of the
workloads that I need. For example:

1. For general processing - a workload equivalent to Tegra 2 CPU
performance. How does Xtensa compare to the Tegra 2 CPU? Which Xtensa
configuration reaches that level of performance?
2. For video decoding - a workload equivalent to a fixed-function video
decoding unit like the one in the Allwinner A10. Which Xtensa configuration
reaches that level of performance?
3. For 3D graphics - a workload equivalent to the Radeon HD 6250 GPU. Which
Xtensa configuration reaches that level of performance?
4. For baseband...
Then, after a lot of discussions, tests and simulations, I suppose I would
have to decide on the right balance of features: number of cores, core
clocks, caches, memory access speed, etc.

B. Regarding Tensilica's key selling point, programmability, I would have the
following groups of questions:

1. Can all of the optional pre-defined execution units coexist in one
configuration? What is the price of the optional pre-defined execution
units?

The optional pre-defined execution units listed on page 2 of the Xtensa LX4
specifications are:
- 32-bit multiplier and/or 16-bit multiplier and MAC
- Single-precision floating point unit
- Double-precision floating point acceleration
- 3-way 64-bit VLIW (VLIW3)
- Pre-defined 32-bit GPIO and FIFO-like Queue interfaces

2. Can all of the optional execution units that require additional licensing
coexist with the optional pre-defined execution units in one configuration?
Or must they replace one or more of the pre-defined units?
For example, if only one additionally licensed unit can exist per
configuration and it occupies the 3-way 64-bit VLIW (VLIW3) unit, then if we
want audio using VLIW we will have difficulty implementing 3D acceleration
in the same core at the same time.
And what is the price of the licenses?

The optional execution units that require additional licensing, according to
the Xtensa LX4 specifications, are:
- ConnX Vectra LX DSP engine
- ConnX Vectra VMB for baseband acceleration
- ConnX D2 DSP engine
- ConnX BBE16 Baseband engines
- HiFi 2 and HiFi EP Audio engines
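
The "for example" scenario in question 2 can be written down as a tiny
conflict check. This is only a hypothetical model: the idea that each unit
claims named shared resources, and that both the audio engine and a 3D
acceleration unit would need the VLIW3 slot, are my assumptions for
illustration and are not taken from Tensilica documentation.

```python
# Hypothetical model of the coexistence question: each execution unit
# claims zero or more shared resources, and two units conflict when they
# claim the same one. Resource names and claims below are assumptions.
from itertools import combinations


def find_conflicts(config: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of units that claims at least one common resource."""
    conflicts = []
    for (unit_a, res_a), (unit_b, res_b) in combinations(config.items(), 2):
        shared = res_a & res_b
        if shared:
            conflicts.append((unit_a, unit_b, shared))
    return conflicts


# Assumed scenario: audio and 3D acceleration both want the VLIW3 slot.
scenario = {
    "HiFi 2 audio engine": {"VLIW3"},
    "custom 3D acceleration": {"VLIW3"},
    "single-precision FPU": set(),
}

if __name__ == "__main__":
    for unit_a, unit_b, shared in find_conflicts(scenario):
        print(f"{unit_a} and {unit_b} both need: {', '.join(sorted(shared))}")
```

If Tensilica's answer is that licensed units do not compete for such slots,
the table of claims is simply empty and no conflict is reported.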

3. Are the software/hardware development kits for programming and
configuration of Xtensa freely available for developers? If not, what is
the price?

4. What is the difference between Xtensa and the Xilinx Zynq/Altera Cyclone
parts that combine a 28nm dual-core Cortex-A9 with an on-board FPGA? Both
solutions have a fixed part and a programmable part. Both solutions use
SystemC as a description language. Tensilica claims that their solution is
closer to software programmability than the pure hardware programmability of
FPGAs. But is this true?


Iliya