Other Parts Discussed in Thread: FFTLIB, SYSBIOS
Tool/software: TI-RTOS
First I wanted to give an update to one of my previous threads: https://e2e.ti.com/support/processors/f/791/t/754731
I was able to get the FFTLIB working properly on the AM5728 DSP core using the direct access and EDMA FFT's . Now I'm trying to implement the OpenMP version which only uses EDMA and I'm running into some issues. I'm able to successfully build the project, but at line 224 of the main.c file I get the following error on the second DSP core:
[C66xx_DSP2] ti.sysbios.heaps.HeapMem: line 221: out of memory: handle=0x9a625f00, size=12
xdc.runtime.Error.raise: terminating execution
This only occurs if I press the play button and debug at full speed. If I step through the code on both cores at the same time it does not throw the error. Then when it reaches the next parallel edma command to allocate the memory it fails with the generic message "EdmaMgr_alloc() failed " from the fft_omp_assign_edma_resources function.
I've attempted to modify the heap size in my .cfg file and no matter what size I assign, it gives the above error at the same location and size. Looking at the HeapMem and HeapMemMP in ROV did not provide me much information either:
Attached are the files where I think the problem lies, but I haven't been able to pin point it yet. First is the fft_c6678_config.c file which is where I had to make changes to work with the am5728 which has only 1 EDMA instance verses the c6678 which has 3, as well as the change of register locations and interrupt numbers. Lines edited were 43, 242-275, and 309-328.
In the omp_config.cfg file, I merged the examples from the FFTLIB project fft_opm_sp_1d_r2c_k1_66_LE_ELF example with the config mentioned in http://downloads.ti.com/mctools/esd/docs/openmp-dsp/building_openmp_app.html#running-applications-within-ccs. I'm using the rtsc platform ti.runtime.openmp.platforms.am57x found in the openmp_dsp_am57xx_2_06_02_01 library. I modified the DDR locations slightly as shown here:
I feel like I may not have configured the EDMA entirely correctly since it runs on one core, but not two. Also, if I run the commands linearly then both cores succeed. Maybe that when they're doing the parallel execution they're trying to access the same memory? That is my hypothesis at this point anyway. I'm hoping that someone might be able to help figure out why the edmamgr library functions from framework_components_3_40_02_07 do not seem to be working on multiple cores with OpenMP.
I can export and upload my entire project if that would help as well.
/* ======================================================================= */ /* TEXAS INSTRUMENTS, INC. */ /* */ /* FFTLIB FFT Library */ /* */ /* Copyright (C) 2013 Texas Instruments Incorporated - http://www.ti.com/ */ /* */ /* */ /* Redistribution and use in source and binary forms, with or without */ /* modification, are permitted provided that the following conditions */ /* are met: */ /* */ /* Redistributions of source code must retain the above copyright */ /* notice, this list of conditions and the following disclaimer. */ /* */ /* Redistributions in binary form must reproduce the above copyright */ /* notice, this list of conditions and the following disclaimer in the */ /* documentation and/or other materials provided with the */ /* distribution. */ /* */ /* Neither the name of Texas Instruments Incorporated nor the names of */ /* its contributors may be used to endorse or promote products derived */ /* from this software without specific prior written permission. */ /* */ /* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS */ /* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT */ /* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR */ /* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT */ /* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, */ /* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT */ /* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, */ /* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY */ /* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT */ /* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE */ /* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* */ /* ======================================================================= */ #include <xdc/std.h> #include <ti/sdo/edma3/rm/edma3_rm.h> #include <ti/sdo/fc/edma3/edma3_config.h> #define EDMA_MGR_NUM_EDMA_INSTANCES 1 //3 /* In the arrays below, each bit of a 32-bit word corresponds to a single */ /* PaRAMSet/EDMAChannel/QDMAChannel/TCC owned by the corresponding region, */ /* i.e., can be used for general purpose EDMA tranfers, or reserved for */ /* EDMA transfers by hardware peripherals (cannot be used for general */ /* purpose EDMA tranfers) */ #define DMA_CHANNEL_TO_EVENT_MAPPING_0 (0x00000000u) #define DMA_CHANNEL_TO_EVENT_MAPPING_1 (0x00000000u) /* EDMA3_InstanceInitConfig sample0 with region neither owning nor */ /* reserving any EDMA resources */ #define regionSample0 \ { \ /* Resources owned by Region */ \ /* ownPaRAMSets */ \ {0xFFFFFFFFu, 0xFFFFFFFFu, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* ownDmaChannels */ \ {0xFFFFFFFFu, 0x00000000u}, \ \ /* ownQdmaChannels */ \ {0x000000FFu}, \ \ /* ownTccs */ \ {0xFFFFFFFFu, 0x00000000u}, \ \ /* Resources reserved by Region */ \ /* resvdPaRAMSets */ \ {0xFFFFFFFFu, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* resvdDmaChannels */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1}, \ \ /* resvdQdmaChannels */ \ {0x00000000u}, \ \ /* resvdTccs */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1} \ } /* EDMA3_InstanceInitConfig sample1 with region owning PaRAM sets 64-105, */ /* and EDMA channel 0-7, but not reserving any EDMA resources */ /* Note that the first N PaRAM sets (N=number of EDMA channels available */ /* on an EDMA instance) are reserved in EDMA3 LLD ). */ #define regionSample1 \ { \ /* Resources owned by Region */ \ /* ownPaRAMSets */ \ {0x00000000u, 0x00000000u, 0xFFFFFFFFu, 0xFFFFFFFFu, \ 0xFFFFFFFFu, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* ownDmaChannels */ \ {0x0000FFFFu, 0x00000000u}, \ \ /* ownQdmaChannels */ \ {0x00000000u}, \ \ /* ownTccs */ \ {0x0000FFFFu, 0x00000000u}, \ \ /* Resources reserved by Region */ \ /* resvdPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* resvdDmaChannels */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1}, \ \ /* resvdQdmaChannels */ \ {0x00000000u}, \ \ /* resvdTccs */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1} \ } /* EDMA3_InstanceInitConfig sample2 with region owning PaRAM sets 106-147, */ /* and EDMA channel 8-15, but not reserving any EDMA resources */ #define regionSample2 \ { \ /* Resources owned by Region */ \ /* ownPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* ownDmaChannels */ \ {0xFFFF0000u, 0x00000000u}, \ \ /* ownQdmaChannels */ \ {0x00000000u}, \ \ /* ownTccs */ \ {0xFFFF0000u, 0x00000000u}, \ \ /* Resources reserved by Region */ \ /* resvdPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* resvdDmaChannels */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1}, \ \ /* resvdQdmaChannels */ \ {0x00000000u}, \ \ /* resvdTccs */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1} \ } /* EDMA3_InstanceInitConfig sample3 with region owning PaRAM sets 148-189, */ /* and EDMA channel 16-23, but not reserving any EDMA resources */ #define regionSample3 \ { \ /* Resources owned by Region */ \ /* ownPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* ownDmaChannels */ \ {0x00000000u, 0x0000FFFFu}, \ \ /* ownQdmaChannels */ \ {0x00000000u}, \ \ /* ownTccs */ \ {0x00000000u, 0x0000FFFFu}, \ \ /* Resources reserved by Region */ \ /* resvdPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* resvdDmaChannels */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1}, \ \ /* resvdQdmaChannels */ \ {0x00000000u}, \ \ /* resvdTccs */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1} \ } /* EDMA3_InstanceInitConfig sample4 with region owning PaRAM sets 190-231, */ /* and EDMA channel 24-31, but not reserving any EDMA resources */ #define regionSample4 \ { \ /* Resources owned by Region */ \ /* ownPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0xFFFFFFFFu, \ 0xFFFFFFFFu, 0xFFFFFFFFu, 0x00000000u, 0x00000000u}, \ \ /* ownDmaChannels */ \ {0x00000000u, 0xFFFF0000u}, \ \ /* ownQdmaChannels */ \ {0x00000000u}, \ \ /* ownTccs */ \ {0x00000000u, 0xFFFF0000u}, \ \ /* Resources reserved by Region */ \ /* resvdPaRAMSets */ \ {0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u, \ 0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u}, \ \ /* resvdDmaChannels */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1}, \ \ /* resvdQdmaChannels */ \ {0x00000000u}, \ \ /* resvdTccs */ \ {DMA_CHANNEL_TO_EVENT_MAPPING_0, DMA_CHANNEL_TO_EVENT_MAPPING_1} \ } #define NUM_EDMA_INSTANCES 1 //3 const EDMA3_InstanceInitConfig C6678_config[NUM_EDMA_INSTANCES][EDMA3_MAX_REGIONS] = { /* EDMA3 INSTANCE# 0 */ { regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0 } /* EDMA3 INSTANCE# 1 { regionSample1, regionSample2, regionSample3, regionSample4, regionSample0, regionSample0, regionSample0, regionSample0 }, */ /* EDMA3 INSTANCE# 2 { regionSample0, regionSample0, regionSample0, regionSample0, regionSample1, regionSample2, regionSample3, regionSample4 } */ }; const EDMA3_InstanceInitConfig edmaMgrInstanceInitConfig[EDMA_MGR_NUM_EDMA_INSTANCES][EDMA3_MAX_REGIONS] = { /* EDMA3 INSTANCE# 0 */ { regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0, regionSample0 } /* EDMA3 INSTANCE# 1 { regionSample1, regionSample2, regionSample3, regionSample4, regionSample0, regionSample0, regionSample0, regionSample0 }, */ /* EDMA3 INSTANCE# 2 { regionSample0, regionSample0, regionSample0, regionSample0, regionSample1, regionSample2, regionSample3, regionSample4 } */ }; int32_t edmaMgrRegion2Instance[EDMA3_MAX_REGIONS] = {0,0,0,0,0,0,0,0}; //{1,1,1,1,2,2,2,2}; /* Driver Object Initialization Configuration */ EDMA3_GblConfigParams edmaMgrGblConfigParams [EDMA_MGR_NUM_EDMA_INSTANCES] = { { /* EDMA3 INSTANCE# 0 */ /** Total number of DMA Channels supported by the EDMA3 Controller */ 16u, /** Total number of QDMA Channels supported by the EDMA3 Controller */ 8u, /** Total number of TCCs supported by the EDMA3 Controller */ 16u, /** Total number of PaRAM Sets supported by the EDMA3 Controller */ 128u, /** Total number of Event Queues in the EDMA3 Controller */ 2u, /** Total number of Transfer Controllers (TCs) in the EDMA3 Controller */ 2u, /** Number of Regions on this EDMA3 controller */ 8u, /** * \brief Channel mapping existence * A value of 0 (No channel mapping) implies that there is fixed association * for a channel number to a parameter entry number or, in other words, * PaRAM entry n corresponds to channel n. */ 1u, /** Existence of memory protection feature */ 1u, /** Global Register Region of CC Registers */ (void *)0x01D10000u, //0x40D10000u, //0x02700000u, /** Transfer Controller (TC) Registers */ { (void *)0x01D05000u, //0x40D05000u, //0x02760000u, (void *)0x01D06000u, //0x40D06000u, //0x02768000u, (void *)NULL, (void *)NULL, (void *)NULL, (void *)NULL, (void *)NULL, (void *)NULL }, /** Interrupt no. for Transfer Completion */ 16u, //38u, /** Interrupt no. for CC Error */ 27u, //32u, /** Interrupt no. for TCs Error */ { 28u, //34u, 29u, //35u, 0u, 0u, 0u, 0u, 0u, 0u, }, /** * \brief EDMA3 TC priority setting * * User can program the priority of the Event Queues * at a system-wide level. This means that the user can set the * priority of an IO initiated by either of the TCs (Transfer Controllers) * relative to IO initiated by the other bus masters on the * device (ARM, DSP, USB, etc) */ { 0u, 1u, 0u, 0u, 0u, 0u, 0u, 0u }, /** * \brief To Configure the Threshold level of number of events * that can be queued up in the Event queues. EDMA3CC error register * (CCERR) will indicate whether or not at any instant of time the * number of events queued up in any of the event queues exceeds * or equals the threshold/watermark value that is set * in the queue watermark threshold register (QWMTHRA). */ { 16u, 16u, 0u, 0u, 0u, 0u, 0u, 0u }, /** * \brief To Configure the Default Burst Size (DBS) of TCs. * An optimally-sized command is defined by the transfer controller * default burst size (DBS). Different TCs can have different * DBS values. It is defined in Bytes. */ { 128u, 128u, 0u, 0u, 0u, 0u, 0u, 0u }, /** * \brief Mapping from each DMA channel to a Parameter RAM set, * if it exists, otherwise of no use. */ { EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, EDMA3_RM_CH_NO_PARAM_MAP, /* DMA channels 16-63 DOES NOT exist */ EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS, EDMA3_MAX_PARAM_SETS }, /** * \brief Mapping from each DMA channel to a TCC. This specific * TCC code will be returned when the transfer is completed * on the mapped channel. */ { EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, EDMA3_RM_CH_NO_TCC_MAP, /* DMA channels 16-63 DOES NOT exist */ EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC, EDMA3_MAX_TCC }, /** * \brief Mapping of DMA channels to Hardware Events from * various peripherals, which use EDMA for data transfer. * All channels need not be mapped, some can be free also. */ { 0x00000000u, 0x00000000u } }, }; int32_t *ti_sdo_fc_edmamgr_region2Instance = (int32_t*)&edmaMgrRegion2Instance[0]; EDMA3_GblConfigParams *ti_sdo_fc_edmamgr_edma3GblConfigParams = (EDMA3_GblConfigParams*)&edmaMgrGblConfigParams[0]; EDMA3_InstanceInitConfig *ti_sdo_fc_edmamgr_edma3RegionConfig = (EDMA3_InstanceInitConfig*)&edmaMgrInstanceInitConfig[0][0];
/****************************************************************************** * FILE: omp_hello.c * DESCRIPTION: * OpenMP Example - Hello World - C/C++ Version * In this simple example, the master thread forks a parallel region. * All threads in the team obtain their unique thread number and print it. * The master thread only prints the total number of threads. Two OpenMP * library routines are used to obtain the number of threads and each * thread's number. * AUTHOR: Blaise Barney 5/99 * LAST REVISED: 04/06/05 ******************************************************************************/ #include <ti/omp/omp.h> //#include <omp.h> #include <stdio.h> #include <xdc/std.h> #include <time.h> #include <stdlib.h> #include <limits.h> #include <math.h> #include <c6x.h> //#include "omp/omp_config.h" #include <fft_omp_sp_1d_r2c/fft_omp_sp_1d_r2c.h> #include "fft_edma.h" #include <ti/dsplib/src/DSPF_sp_fftSPxSP/DSPF_sp_fftSPxSP.h> //#include <ti/runtime/openmp/omp.h> extern cregister volatile unsigned int DNUM; #define ACTIVE_THREAD_COUNT (2) #define FFT_EDMA_STATE_INIT (0) #define FFT_EDMA_STATE_ALLOCATED (1) #define OMP_MAX_NUM_CORES (2) int fftEdmaState[ACTIVE_THREAD_COUNT]; FFT_EDMA_Struct gEdmaState[OMP_MAX_NUM_CORES]; #pragma DATA_SECTION (gEdmaState, ".msmc_mem") /* ======================================================================== */ /* Kernel-specific alignments */ /* ======================================================================== */ #pragma DATA_SECTION(x_i, ".ddr_mem"); #pragma DATA_SECTION(y_i, ".ddr_mem"); #pragma DATA_SECTION(w_i, ".ddr_mem"); #pragma DATA_SECTION(x_cn, ".ddr_mem"); #pragma DATA_SECTION(y_cn, ".ddr_mem"); #pragma DATA_SECTION(w_cn, ".ddr_mem"); #pragma DATA_ALIGN(x_i, 8); #pragma DATA_ALIGN(x_cn, 8); #pragma DATA_ALIGN(w_cn, 8); #pragma DATA_ALIGN(y_i, 8); #pragma DATA_ALIGN(y_cn, 8); #pragma DATA_SECTION(x_i_work, ".ll2_mem"); #pragma DATA_SECTION(y_i_work, ".ll2_mem"); #pragma DATA_SECTION(y_i_temp, ".ll2_mem"); #pragma DATA_SECTION(w_i_work, ".ll2_mem"); #pragma DATA_ALIGN(w_i, 8); #pragma DATA_ALIGN(w_i_work, 8); #pragma DATA_ALIGN(x_i_work, 64); #pragma DATA_ALIGN(y_i_work, 64); /* ======================================================================== */ /* Parameters of fixed dataset. */ /* ======================================================================== */ #define MAXN (2024*2048) #define M (2*MAXN) #define M_i (4*2048) #define PAD (0) /* ======================================================================== */ /* Initialized arrays with fixed test data. */ /* ======================================================================== */ float x_i [M + 2 * PAD]; float x_cn[M + 2 * PAD]; float y_i [M + 2 * PAD]; float y_cn[M + 2 * PAD]; float x_i_work [M_i*2*NUMOFLINEBUFS + 2 * PAD]; float y_i_work [M_i*2*NUMOFLINEBUFS + 2 * PAD]; float y_i_temp [M_i*NUMOFLINEBUFS + 2 * PAD]; float w_i_work [2 + 2048/2 + 2*2048 + 2 * PAD]; float w_i [2 + 2048/2 + 2*2048 + 2 * PAD]; float w_cn[M + 2 * PAD]; float magDiv = 3.0/8.0; /* ======================================================================== */ /* Generate pointers to skip beyond array padding */ /* ======================================================================== */ float *const ptr_x_i = x_i + PAD; float *const ptr_x_cn = x_cn + PAD; float *const ptr_w_i = w_i + PAD; float *const ptr_w_cn = w_cn + PAD; float *const ptr_y_i = y_i + PAD; float *const ptr_y_cn = y_cn + PAD; float *const ptr_y_i_temp = y_i_temp + PAD; float *const ptr_y_i_work = y_i_work + PAD; float *const ptr_x_i_work = x_i_work + PAD; float *const ptr_w_i_work = w_i_work + PAD; void fft_assert(int statement, int node_id, const char *error) { volatile int dbg_halt = 1; if(!statement) { printf("%s (%d)\n",error,node_id); while(dbg_halt); } } void fft_memory_request (int nbufs, FFTmemBuffer_t *bufs) { int i; printf ("FFT memory buffers:\n"); printf (" Buffer Size(bytes) Alignment\n"); for (i = 0; i < nbufs; i++) { printf (" %3d %8d %4d \n", i, (int)bufs[i].size, (int)bufs[i].log2align); } bufs[0].base = ptr_x_i; bufs[1].base = ptr_y_i; bufs[2].base = ptr_w_i; bufs[3].base = ptr_x_i_work; bufs[4].base = ptr_y_i_work; bufs[5].base = ptr_w_i_work; bufs[6].base = ptr_y_i_temp; } /* fft_memory_request */ void *fft_omp_assign_edma_resources(void) { /* * The edmaInstances are indexes into the C6678_config[] array defined in * fft_c6678_config, which is used to specify how EDMA resources are * divided between cores. */ void *ret = (void *) (&gEdmaState[0]); #pragma omp parallel { if ( fftEdmaState[DNUM] != FFT_EDMA_STATE_ALLOCATED ) { gEdmaState[DNUM].num_channels = 0; while ( gEdmaState[DNUM].num_channels < FFT_NUM_EDMA_CH ) { fft_assert( ((gEdmaState[DNUM].channel[gEdmaState[DNUM].num_channels]) = EdmaMgr_alloc(FFT_MAX_EDMA_LINKS)) != NULL , DNUM, "EdmaMgr_alloc() failed "); gEdmaState[DNUM].num_channels++; } } fftEdmaState[DNUM] = FFT_EDMA_STATE_ALLOCATED; } return ret; } void fft_omp_free_edma_resources(void *edma) { /* * The edmaInstances are indexes into the C6678_config[] array defined in * fft_c6678_config, which is used to specify how EDMA resources are * divided between cores. */ int ret_val; #pragma omp parallel { if ( fftEdmaState[DNUM] == FFT_EDMA_STATE_ALLOCATED ) { while ( gEdmaState[DNUM].num_channels > 0 ) { gEdmaState[DNUM].num_channels--; ret_val = EdmaMgr_free(gEdmaState[DNUM].channel[gEdmaState[DNUM].num_channels]); fft_assert( ret_val == EdmaMgr_SUCCESS, DNUM, "EDMA free failed!"); } } fftEdmaState[DNUM] = FFT_EDMA_STATE_INIT; } } void fft_memory_release (int nbufs, FFTmemBuffer_t *bufs) { /* do nothing for now */ } /* fft_memory_request */ int main (int argc, char *argv[]) { int i, j, N, k = 0; clock_t t_start, t_stop, t_overhead, t_opt; float diff, max_diff = 0, absReal, absImg, max, min; fft_plan_t p; fft_callout_t plan_fxns; N = MAXN; //initialize hardware timers TSCL=0;TSCH=0; // initialize callout functions plan_fxns.memoryRequest = fft_memory_request; plan_fxns.memoryRelease = fft_memory_release; plan_fxns.ecpyRequest = fft_omp_assign_edma_resources; plan_fxns.ecpyRelease = fft_omp_free_edma_resources; // initialize ECPY omp_set_num_threads (ACTIVE_THREAD_COUNT); #pragma omp parallel { fft_assert( (EdmaMgr_init(DNUM, NULL) == EdmaMgr_SUCCESS), DNUM, "EdmaMgr_init() return error!"); fftEdmaState[DNUM] = FFT_EDMA_STATE_INIT; } //Force uninitialized arrays to fixed values memset (x_i, 0x55, sizeof (x_i) ); memset (x_cn, 0x55, sizeof (x_cn)); memset (y_i, 0xA5, sizeof (y_i) ); memset (y_cn, 0xA5, sizeof (y_cn)); // Initialize input vector temporarily printf("Initializing input vectors\n"); for (j = 0; j < N; j++) { x_i[j] = sin (2 * 3.1415 * 1000 * j / (double) N); } // Create fft plan printf("Creating plan\n"); p = fft_omp_sp_plan_1d_r2c (N, FFT_ECPY, plan_fxns); //Compute the overhead of allocating and freeing EDMA t_start = _itoll(TSCH, TSCL); p.edmaState = fft_omp_assign_edma_resources(); fft_omp_free_edma_resources(p.edmaState); t_stop = _itoll(TSCH, TSCL); t_overhead = t_stop - t_start; // ecpy fft printf("FFT executing\n"); t_start = _itoll(TSCH, TSCL); fft_execute (p); t_stop = _itoll(TSCH, TSCL); t_opt = (t_stop - t_start) - t_overhead; // calculate clock cycles printf("Clock cycles for execute: %d\n", t_opt); // Calculate magnitude printf("Starting FFT magnitude (MAX-MIN)\n"); t_start = _itoll(TSCH, TSCL); // start counter for (i = 0; i < N; i+=2) { absReal = _fabsf(y_i[i]); absImg = _fabsf(y_i[i+1]); if(absReal > absImg) { max = absReal; min = absImg; } else { max = absImg; min = absReal; } y_cn[k] = (max + (min * magDiv)); k++; } t_stop = _itoll(TSCH, TSCL); // stop counter t_opt = (t_stop - t_start) - t_overhead; // calculate clock cycles printf("Clock cycles for magnitude (MAX-MIN): %d\n", t_opt); fft_destroy_plan (p); //compute difference and track max difference diff = 0; max_diff = 0; for(i=0; i<2*N; i++) { diff = _fabs(ptr_y_cn[i] - ptr_x_i[i]); if (diff > max_diff) max_diff = diff; } printf("fft_omp_sp_1d_r2c_ecpy\tsize= %d\n", N); printf("max_diff = %f", max_diff); return 0; }
Thanks!
John