December 2008 - Hardwarebug

The NEON coprocessor found in the Cortex-A8 operates asynchronously from the ARM pipeline, receiving its instructions from the ARM execution unit through a 16-entry FIFO. Furthermore, the NEON unit has its own load/store unit. This suggests that some mechanism exists to resolve data hazards between the ARM and NEON units such that memory operations appear as if the instructions were executed entirely in order.

Although clearly important with a view to code optimisation, the Cortex-A8 Technical Reference Manual unfortunately does not mention any details about these hazards. In fact, it does not mention them at all.

To sched some light on the situation, I ran a simple benchmark to determine two important parameters of ARM-NEON memory hazard resolution: granularity and latency.

Continue reading →

Monthly Archives: December 2008

ARM-NEON memory hazards

Recent Posts

Recent Comments

Categories

Archives

Meta