{"id":2037,"date":"2016-10-31T07:39:22","date_gmt":"2016-10-31T07:39:22","guid":{"rendered":"http:\/\/support.plunify.com\/en\/?p=1794"},"modified":"2016-10-31T07:39:22","modified_gmt":"2016-10-31T07:39:22","slug":"opencl-living-heterogeneously","status":"publish","type":"post","link":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/","title":{"rendered":"OpenCL: Living heterogeneously"},"content":{"rendered":"<p>Of all the good things that OpenCL promises, the most attractive proposition is how different processors and cores in a multi-compute-core system can be utilised and maximised with a single programming framework. The ability to combine processing modules of different capabilities to perform particular tasks is, of course, the heterogeneous computing concept that has been talked about over the years.<\/p>\n<p style=\"text-align: center\"><img loading=\"lazy\" class=\"aligncenter wp-image-1797 size-medium\" src=\"http:\/\/support.plunify.com\/en\/wp-content\/uploads\/sites\/5\/2016\/10\/Screen-Shot-2016-10-29-at-12.38.13-AM-300x105.png\" alt=\"screen-shot-2016-10-29-at-12-38-13-am\" width=\"300\" height=\"105\" \/>Host: \"Doesn't matter what device runs my C code,<br \/>\nas long as the most effective one does.\"<\/p>\n<p>Being in FPGA land most of the time, naturally we at Plunify dream about how software programmers can easily port relevant functions in C, C++, etc. to FPGAs, smoothly getting performance gains of 10x or more. As the usage of FPGAs keeps expanding into new areas, we also want C-to-RTL tools to work really well so that good FPGA designs can propagate and show their value as efficient and fast processing engines. 
However, because of the formidable difficulties in producing good FPGA-ready RTL for every possible software function, there have been many false dawns over the years in terms of \"Is this finally the go-to C-to-RTL tool?\"<\/p>\n<p>Certainly there have been glowing assessments of the <a href=\"http:\/\/www.nallatech.com\/wp-content\/uploads\/imperial_college_white_paper.pdf\">business-readiness of OpenCL for FPGA flows<\/a> (in this particular instance, for financial computation). But without going into what the best OpenCL for FPGA or High-Level Synthesis (HLS) tool is, let's take a look at one of the common issues facing a software developer when attempting a C-to-RTL flow, an issue that arguably impairs the ease of use and adoption of OpenCL tools. This type of porting effort requires both software and hardware knowledge (in the traditional meaning of these words).<\/p>\n<p><strong>Loop unrolling<\/strong><\/p>\n<p>In EE and CS classes, we've been exposed to loop unrolling, a technique that generally trades hardware resources for faster execution times. Loop unrolling requires an understanding of data dependencies and, it seems, of the underlying hardware resources. 
The latter is where confusion and some amount of hair-pulling can occur.<\/p>\n<p>Take, for instance, this (dated but still relevant) post:<br \/>\n<a href=\"https:\/\/forums.xilinx.com\/t5\/High-Level-Synthesis-HLS\/Why-I-can-only-Unroll-2-times\/m-p\/461728\/highlight\/true#M1789\" target=\"_blank\">https:\/\/forums.xilinx.com\/t5\/High-Level-Synthesis-HLS\/Why-I-can-only-Unroll-2-times\/m-p\/461728\/highlight\/true#M1789<\/a><\/p>\n<p><strong>Scenario:<\/strong><\/p>\n<p>Vivado HLS was used to convert a software function into RTL:<\/p>\n<p dir=\"ltr\"><span style=\"font-family: monospace, monospace\">void matpro(const float v[15],float d[15]){<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">char i,i111,i112;<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">float b_StateVectors[15];<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">\u00a0 for (i = 0; i &lt; 15; i++) {<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">\u00a0 \u00a0 d[i]=v[i]*b_StateVectors[i];<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">\u00a0 \u00a0}<br \/>\n<\/span><span style=\"font-family: monospace, monospace\">}<\/span><\/p>\n<p>There are no dependencies between loop iterations in this computation, and the original poster (OP) felt that the tool should be able to use FPGA resources to unroll the loop and reduce the number of loop iterations from 15 to N, where N can be as low as 1 if there are enough resources to perform all 15 multiplications in parallel. However, the tool's resulting analysis showed that the loop was unrolled by only 2x, plus some pipelining.<\/p>\n<p>Why only 2x?<\/p>\n<p>With advice from another user, the OP found that setting an ARRAY_PARTITION directive to use registers instead of 2-port memories was the answer. 
The reason is that a 2-port memory allows at most two accesses per clock cycle, so keeping the arrays in such memories severely limits how many data transfers, and therefore how many multiplications, can happen in parallel.<\/p>\n<p><strong>Implications<\/strong><\/p>\n<p>One takeaway is that user input is required as part of the C-to-RTL conversion in order to achieve an optimal design. The tools are continuously improving, so over time, parameters such as the one mentioned above should be set automatically as the software gets better at inferring them. On the user's side of things, deeper knowledge of the FPGA devices and design tools will help bridge the gap, assuming the software engineer is interested in those details.<\/p>\n<p>More to come as we explore this area of optimisation to make <a href=\"http:\/\/www.plunify.com\/en\/product.php\">InTime<\/a> tweak C-to-RTL flows.<\/p>\n<p><strong>References<\/strong><\/p>\n<ul>\n<li>\"Is Altera\u2019s OpenCL SDK ready for business?\", 2014, Gordon Inggs, Shane Fleming, David Thomas and Wayne Luk, Imperial College London<\/li>\n<li>\"Altera SDK for OpenCL\", https:\/\/www.altera.com\/products\/design-software\/embedded-software-developers\/opencl\/overview.html<\/li>\n<li>\"Xilinx SDAccel Development Environment\", https:\/\/www.xilinx.com\/products\/design-tools\/software-zone\/sdaccel.html<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Of all the good things that OpenCL promises, the most attractive proposition is how different processors and cores in a multi-compute-core system can be utilised and maximised with a single programming framework. 
The ability to combine processing modules of different capabilities to perform particular tasks is, of course, the heterogenous computing concept that has been [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":1797,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_links_to":"","_links_to_target":""},"categories":[99],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v17.0 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>OpenCL: Living heterogeneously - Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"OpenCL: Living heterogeneously - Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af\" \/>\n<meta property=\"og:description\" content=\"Of all the good things that OpenCL promises, the most attractive proposition is how different processors and cores in a multi-compute-core system can be utilised and maximised with a single programming framework. 
The ability to combine processing modules of different capabilities to perform particular tasks is, of course, the heterogenous computing concept that has been [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/\" \/>\n<meta property=\"og:site_name\" content=\"Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af\" \/>\n<meta property=\"article:published_time\" content=\"2016-10-31T07:39:22+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"plunify\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/support.plunify.com\/jp\/#website\",\"url\":\"https:\/\/support.plunify.com\/jp\/\",\"name\":\"Plunify \\u65e5\\u672c\\u8a9e\\u30d8\\u30eb\\u30d7\\u30c7\\u30b9\\u30af\",\"description\":\"Plunify \\u65e5\\u672c\\u8a9e\\u30b5\\u30dd\\u30fc\\u30c8\\u30b5\\u30a4\\u30c8\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/support.plunify.com\/jp\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"\",\"contentUrl\":\"\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#webpage\",\"url\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/\",\"name\":\"OpenCL: Living heterogeneously - Plunify 
\\u65e5\\u672c\\u8a9e\\u30d8\\u30eb\\u30d7\\u30c7\\u30b9\\u30af\",\"isPartOf\":{\"@id\":\"https:\/\/support.plunify.com\/jp\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#primaryimage\"},\"datePublished\":\"2016-10-31T07:39:22+00:00\",\"dateModified\":\"2016-10-31T07:39:22+00:00\",\"author\":{\"@id\":\"https:\/\/support.plunify.com\/jp\/#\/schema\/person\/0702317d75b841ce991ca9936b72f8b0\"},\"breadcrumb\":{\"@id\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/support.plunify.com\/jp\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"OpenCL: Living heterogeneously\"}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/support.plunify.com\/jp\/#\/schema\/person\/0702317d75b841ce991ca9936b72f8b0\",\"name\":\"plunify\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/support.plunify.com\/jp\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/79e7edc12624b682db2df4112ff7210b?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/79e7edc12624b682db2df4112ff7210b?s=96&d=mm&r=g\",\"caption\":\"plunify\"},\"url\":\"https:\/\/support.plunify.com\/jp\/author\/plunify\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"OpenCL: Living heterogeneously - Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/","og_locale":"en_US","og_type":"article","og_title":"OpenCL: Living heterogeneously - Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af","og_description":"Of all the good things that OpenCL promises, the most attractive proposition is how different processors and cores in a multi-compute-core system can be utilised and maximised with a single programming framework. The ability to combine processing modules of different capabilities to perform particular tasks is, of course, the heterogenous computing concept that has been [&hellip;]","og_url":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/","og_site_name":"Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af","article_published_time":"2016-10-31T07:39:22+00:00","twitter_card":"summary_large_image","twitter_misc":{"Written by":"plunify","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebSite","@id":"https:\/\/support.plunify.com\/jp\/#website","url":"https:\/\/support.plunify.com\/jp\/","name":"Plunify \u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af","description":"Plunify \u65e5\u672c\u8a9e\u30b5\u30dd\u30fc\u30c8\u30b5\u30a4\u30c8","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/support.plunify.com\/jp\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#primaryimage","inLanguage":"en-US","url":"","contentUrl":""},{"@type":"WebPage","@id":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#webpage","url":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/","name":"OpenCL: Living heterogeneously - Plunify 
\u65e5\u672c\u8a9e\u30d8\u30eb\u30d7\u30c7\u30b9\u30af","isPartOf":{"@id":"https:\/\/support.plunify.com\/jp\/#website"},"primaryImageOfPage":{"@id":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#primaryimage"},"datePublished":"2016-10-31T07:39:22+00:00","dateModified":"2016-10-31T07:39:22+00:00","author":{"@id":"https:\/\/support.plunify.com\/jp\/#\/schema\/person\/0702317d75b841ce991ca9936b72f8b0"},"breadcrumb":{"@id":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/support.plunify.com\/jp\/2016\/10\/31\/opencl-living-heterogeneously\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/support.plunify.com\/jp\/"},{"@type":"ListItem","position":2,"name":"OpenCL: Living 
heterogeneously"}]},{"@type":"Person","@id":"https:\/\/support.plunify.com\/jp\/#\/schema\/person\/0702317d75b841ce991ca9936b72f8b0","name":"plunify","image":{"@type":"ImageObject","@id":"https:\/\/support.plunify.com\/jp\/#personlogo","inLanguage":"en-US","url":"https:\/\/secure.gravatar.com\/avatar\/79e7edc12624b682db2df4112ff7210b?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/79e7edc12624b682db2df4112ff7210b?s=96&d=mm&r=g","caption":"plunify"},"url":"https:\/\/support.plunify.com\/jp\/author\/plunify\/"}]}},"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/posts\/2037"}],"collection":[{"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/comments?post=2037"}],"version-history":[{"count":0,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/posts\/2037\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/media\/1797"}],"wp:attachment":[{"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/media?parent=2037"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/categories?post=2037"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/support.plunify.com\/jp\/wp-json\/wp\/v2\/tags?post=2037"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}