In this paper we present how Intel's Single-Chip-Cloud processor behaves for parallel macro pipeline applications. Subsets of the SCC's available cores can be arranged as a pipeline where each core processes one stage of the overall workload. Each of the independent cores processes a small part of a larger task and feeds the following core with new data after it finishes its work. Our case-study is a parallel rendering system which renders successive images and applies different filters on them. On normal graphics adapters this is usually done in multiple cycles, we do this in a single pipeline pass. We show that we can achieve a significant speedup by using multiple parallel pipelines on the SCC. We show that we can further improve performance by using SCC's controlling PC in conjunction with the SCC. We also identify aspects of the SCC that hinder the overall performance, mainly the lack of local memory banks for each core on the SCC. The results presented in this paper are not limited to only image processing, but users could expect similar experiences where macro pipelining is used in other applications on the SCC.

Original document

The different versions of the original document can be found in:

Back to Top

Document information

Published on 31/12/12
Accepted on 31/12/12
Submitted on 31/12/12

Volume 2013, 2013
DOI: 10.1109/ipdpsw.2013.136
Licence: CC BY-NC-SA license

Document Score


Views 1
Recommendations 0

Share this document

claim authorship

Are you one of the authors of this document?