ok so, the anonymous benefactor (independent of the shakti team) who sponsored me with the zc706 FPGA developer board has just had an interesting meeting with MOSIS, and has confirmed that they have an LPDDR3 PHY. in combination with the DDR3 *controller* that was developed and released by someone working at CERN, this would be the last major piece of the "interfaces" side of the puzzle that, until now, blocked progress.
so he asked, also, what would it take to get things ready within a year, to hand over the ASIC-based design to MOSIS, for them to turn it into an ASIC, and i said "a team of engineers would need to be paid for". he asked - and please note the question very carefully - "would USD 250,000 be enough?" to which i replied (genuinely) yes... if done carefully. software also has to be taken care of.
please note: *that's as far as the conversation has gone so far*.
it is still exciting, and the next phase would be to get a strong commitment. then i can start finding people to do software bring-up and also develop the VLSI / VHDL which will put all this together - mostly that's glue logic for the interfaces (putting them onto a "Tile" interface if using the rocket-chip or BOOM architecture) but also designing a multiplexer GPIO bank.
i'm also talking to jeff from nyuzi, he designed a software-driven "compute engine that happens to be reasonably good at 3D", the interesting bit being that he's focussed on working out which areas need performance improvements. this is something that's almost completely lacking in the published academic world... *because nobody in the academic world has designed and published a GPU!*
we worked out that nyuzi is approximately 1/16th the speed of MALI400. roughly. area-for-area it's quite hard to tell whether that's a fair assessment, because you can't *get* die areas for MALI400 (anyone know anything better than these estimates? https://forum.beyond3d.com/posts/1176110/ ) and it's the performance / mm^2 / watt that's critical. we worked out that if you put in 2 nyuzi cores and managed to halve the number of instructions / pixel by replacing critical path blocks with hardware-rendering ones, you'd end up at about 25% of the performance of MALI400 (1/16 x 2 cores x 2 from halving the instruction count = 1/4), and that i feel would be "good enough" for a first version. i'd be interested to hear what people think, here.
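for anyone who wants to play with the numbers, here's a rough back-of-envelope sketch of that estimate in python. the function name and the scaling model are my own assumptions, not anything measured: it assumes performance scales perfectly linearly with the number of cores, and that halving the instructions / pixel doubles the throughput. neither is likely to hold exactly in real hardware, so treat the result as an upper-ish bound.

```python
# back-of-envelope estimate only: the 1/16 figure is the rough one from the
# discussion above, and the scaling model below is an assumption.
NYUZI_VS_MALI400 = 1 / 16  # one nyuzi core relative to MALI400 (rough)


def relative_performance(cores, instruction_reduction):
    """estimated performance relative to MALI400 (1.0 == parity).

    assumes (hypothetically) perfect linear scaling with core count, and
    throughput inversely proportional to instructions per pixel.
    instruction_reduction=2.0 means "half the instructions per pixel".
    """
    return NYUZI_VS_MALI400 * cores * instruction_reduction


estimate = relative_performance(cores=2, instruction_reduction=2.0)
print(f"{estimate:.0%} of MALI400")  # prints "25% of MALI400"
```

changing `cores` or `instruction_reduction` shows quickly how far off parity a given configuration would be under these (optimistic) assumptions. it says nothing about area or power, which is the bit that actually matters.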
the *general-purpose compute* performance of nyuzi on the other hand is really good.
anyway lots of planning to do.
l.
--- crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68