Some questions about the Arm ML Embedded Evaluation Kit

I have some questions about the Arm ML Embedded Evaluation Kit (review.mlplatform.org/.../documentation.md).

1. The "Memory considerations" section of this document mentions three memory modes that can be set. Why can the Dedicated_Sram mode only be used on Ethos-U65?

Is it because of a hardware limitation?

2. Why does ACTIVATION_BUF_SZ in usecase.cmake refer to different memory under Shared_Sram and Dedicated_Sram? Under Shared_Sram it refers to the size of SRAM, and under Dedicated_Sram it refers to the size of DRAM?

As I understand it, the activation buffer holds the tensor arena and should be placed in the cache, so I am not sure why this value refers to different things in different modes.

(Dedicated_Sram mode) (Shared_Sram mode)

3. I took a model (FSRCNN, github.com/.../FSRCNN_Tensorflow), converted it to tflite, compiled it with the Vela compiler into an optimized model, and ran it on the Ethos-U65 FVP. The simulation failed with:
"tensor allocation failed!"

I ran into this before because the value of ACTIVATION_BUF_SZ in usecase.cmake was set too small. In that case, the fix is to check the memory usage in the Vela compiler's report and adjust the size accordingly.

I am using Dedicated_Sram mode this time, so I adjust the value based on the "DRAM used" figure under "Memory used" in the report.
But this time the situation is different: the value I set already exceeds the required value.

Tensor allocation still fails.
Is there any reason why it fails?

(vela report)

(ACTIVATION_BUF_SZ in usecase.cmake)

Thanks in advance for your answers

Top replies

Sandeep Singh over 2 years ago +1 suggested

1. Dedicated SRAM is a memory mode where the tensor arena and the model live in DRAM; SRAM is only used as a cache. The U55 is not designed to use DRAM, as its memory interface will give a lower bandwidth than...

Sandeep Singh over 2 years ago in reply to Danter

The issue seems to be that your cmake changes are not getting built. Are you building a specific use case?
For example, if you build the inference runner, make the changes in <path of eval kit>/source/use_case/inference_runner/usecase.cmake:
<snip>

USER_OPTION(${use_case}_ACTIVATION_BUF_SZ "Activation buffer size for the chosen model"
    0x01700000
    STRING)

<snip>
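To sanity-check a value like the one above against the Vela report, it can help to convert the hex size into bytes and compare it with the reported usage. Below is a minimal Python sketch of that arithmetic. It assumes the report gives the figure in KiB. The `required_kib` number is a made-up placeholder, not a value from this thread, and `parse_buf_size` and `buffer_is_large_enough` are hypothetical helpers, not part of the eval kit:

```python
def parse_buf_size(value: str) -> int:
    """Parse a size such as '0x01700000' (hex) or '24117248' (decimal) into bytes."""
    return int(value, 0)


def buffer_is_large_enough(buf_sz: str, required_kib: float) -> bool:
    """Return True if the buffer covers a requirement given in KiB."""
    return parse_buf_size(buf_sz) >= required_kib * 1024


buf_bytes = parse_buf_size("0x01700000")
# 0x01700000 is 24117248 bytes, which is 23 MiB.
print(buf_bytes, buf_bytes / (1024 * 1024))

# Hypothetical requirement, for illustration only.
print(buffer_is_large_enough("0x01700000", required_kib=20000.0))
```

A check like this only confirms that the number in usecase.cmake is big enough on paper; it does not confirm that the build actually picked the value up, which is what the steps below are for.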

Now, in the eval kit, follow these steps:

1. mkdir build;cd build

2. cmake .. -DUSE_CASE_BUILD=inference_runner -DETHOS_U_NPU_ID=U65 -DCMAKE_TOOLCHAIN_FILE=./scripts/cmake/toolchains/bare-metal-gcc.cmake -DETHOS_U_NPU_MEMORY_MODE=Dedicated_Sram -DCPU_PROFILE_ENABLED=1 -DLOG_LEVEL=LOG_LEVEL_TRACE -Dinference_runner_MODEL_TFLITE_PATH=./fscrn/fsrcnn_720p_vela.tflite

You must confirm that your ACTIVATION_BUF_SZ change shows up in the cmake logs:
<snip>
-- ETHOS_U_NPU_CACHE_SIZE=393216
-- ETHOS_U_NPU_MEMORY_MODE=Dedicated_Sram
-- ETHOS_U_NPU_CONFIG_ID=Y256
-- ETHOS_U_NPU_TIMING_ADAPTER_ENABLED=ON
-- TA_CONFIG_FILE=./cmake/timing_adapter/ta_config_u65_high_end.cmake
-- inference_runner_ACTIVATION_BUF_SZ=0x01700000
-- inference_runner_DYNAMIC_MEM_LOAD_ENABLED=OFF
-- inference_runner_MODEL_TFLITE_PATH=./fscrn/fsrcnn_720p_vela.tflite

<snip>

3. make // your application will be built.

You have to do this in a similar way if you are using a different use case. Refer to: review.mlplatform.org/.../building.md
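If you save the cmake configure output to a file, the log check from step 2 can also be scripted. This is only a rough Python sketch: `find_activation_buf_sz` is a hypothetical helper, and `sample_log` just reuses a few of the log lines shown above, so adapt it to however you capture your own log:

```python
import re


def find_activation_buf_sz(log_text: str, use_case: str):
    """Return the ACTIVATION_BUF_SZ value logged for use_case, or None if absent."""
    pattern = rf"^--\s*{re.escape(use_case)}_ACTIVATION_BUF_SZ=(\S+)"
    match = re.search(pattern, log_text, flags=re.MULTILINE)
    return match.group(1) if match else None


# A few lines copied from the configure log shown earlier in this reply.
sample_log = """\
-- ETHOS_U_NPU_MEMORY_MODE=Dedicated_Sram
-- inference_runner_ACTIVATION_BUF_SZ=0x01700000
-- inference_runner_DYNAMIC_MEM_LOAD_ENABLED=OFF
"""

value = find_activation_buf_sz(sample_log, "inference_runner")
expected = 0x01700000  # the size set in usecase.cmake

if value is None:
    print("ACTIVATION_BUF_SZ not found in the log")
elif int(value, 0) != expected:
    print(f"log shows {value}, expected {hex(expected)}")
else:
    print(f"log shows the expected size: {value}")
```

If the logged value does not match what you set in usecase.cmake, the edit was probably made for a different use case than the one passed to -DUSE_CASE_BUILD.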