Reuse of already instantiated class templates



Hello out there,

I'm currently trying to save quite a few KB of memory and came across something strange. Large parts of my codebase are written as C++ templates, which by itself is nothing bad, but if I use the same template again in another corner of my software, the resulting binary size suggests that the template is instantiated again, even though the same template with the same arguments was already used somewhere else. From this I conclude that instantiated templates are not reused by default but are instantiated in place again. Any tips on how to avoid this behaviour? Maybe template specialization is the right way to obtain an object file with the specialized template contents; my thought is to take work off the linker by doing it that way. Maybe someone can shed some light on the internals of the compiler/linker regarding this issue.

Kind Regards Tobias

  • Hello Tobias,

    Can you specify which compiler and version you are using?

    Thanks

    ki

  • The latest TI compiler for C2000.

  • Thanks. I will bring this thread to the attention of the compiler experts. Please note that it is a local holiday today, hence you should hear a response tomorrow.

  • Do you build with --abi=eabi?  This affects some behind the scenes details on how templates are instantiated.

    I presume we are mostly talking about template functions, which means you see an increase in code size.

    Do you see multiple copies of the same function?  Those could only be static functions, and not global functions.  

    Or, it might be the case that the compiler inlines calls to these template functions. To defeat that, use --opt_level=4 --opt_for_speed=0.  Please search for both of those options in the C28x compiler manual.  The option --opt_level=4 must be used when compiling and linking.

    Thanks and regards,

    -George

  • Yes, I'm compiling with --abi=eabi. It concerns both template functions and classes. Although the templated functions are rather small, they appear in large numbers in many translation units. From my observations, the following happens: when a template lives purely in a header, it is completely inlined, and class instantiations are not reused outside the translation unit. When I specialize the template and put the definition in a source file, the binary size drops significantly for the template class with those arguments, for example.
    What I'm looking for is an equivalent to a template cache. I've seen that some compilers have one to achieve exactly this: pure header templates whose class definitions are not inlined but reused.
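
    The manual approach described above can also be written with explicit instantiation rather than explicit specialization, which avoids repeating the function body. The following is a minimal sketch, assuming the compiler accepts C++11 explicit instantiation declarations (extern template); nothing in this thread confirms that the TI C2000 compiler does. The file names in the comments and the caller clamp_duty are hypothetical, and everything is shown in one file only so that it compiles on its own.

```cpp
// limit.h (hypothetical header name): the template stays in the header.
template <typename T>
T limit(T value, T max, T min)
{
    value = value > max ? max : value;
    value = value < min ? min : value;
    return value;
}

// Explicit instantiation declarations (C++11). They ask the compiler not to
// instantiate these two functions implicitly in every translation unit that
// includes the header. An optimizer may still inline small bodies.
extern template float limit<float>(float, float, float);
extern template double limit<double>(double, double, double);

// limit.cpp (hypothetical source file): explicit instantiation definitions.
// Exactly one translation unit emits the code for these two instances.
template float limit<float>(float, float, float);
template double limit<double>(double, double, double);

// user.cpp (hypothetical caller): uses the shared float instance.
float clamp_duty(float duty)
{
    return limit<float>(duty, 1.0f, 0.0f);
}
```

    Compared with writing a specialization per type, only one line per instance is needed in the source file, and the body of limit exists in a single place.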

  • Please pick one source file where the code size reduces a lot when you ...

    specialize the template and put the definition in a source file

    For that source file, please follow the directions in the article How to Submit a Compiler Test Case two times.  Once when you do not specialize any templates, and once when you do.  I'll build them, and try to explain the difference in code size.

    Thanks and regards,

    -George

  • Hello George,

    I will do it tonight or sometime tomorrow. Thanks for your effort.

  • Hello George,
    I'm currently gathering some files for the test case, and from what I see the answer may be too obvious. In detail, I'm seeing the expected behaviour for how templates that live purely in a header, versus in a source file, are distributed and linked across translation units. The header ones are inlined, and multiple separate but identical instances are compiled into each translation unit in which they are used. The specialized ones that reside in a separate source file, on the other hand, are linked into wherever they are used. So far nothing special. But I'm looking for a mechanism similar to GCC's template cache in combination with a noinline flag, IIRC. This would let a template that is instantiated at one point in the program be used automatically somewhere else, i.e. the compiler does the specialization and then links against only that one instance behind the scenes. I would like to achieve the same, because manual instantiation works so far but is a bit tedious.
    Kind regards, Tobias

  • Please provide the requested test case.  That is the only way to move this conversation from vague description to specific details.

    Thanks and regards,

    -George

  • The compiler version used is 21.6.0LTS. This is only the non-sensitive part of the firmware; there are actually a lot more templates. Hopefully this is enough to reproduce. I've used a mixture: some templates are specialized and some aren't, in different translation units.
    Kind Regards

    
    
    
     
    
    
    
    
    
    
     
    
    
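    // Primary template; it must be declared before its explicit specializations.
    template <typename T>
    T limit(T value, T max, T min)
    {
        value = value > max ? max : value;
        value = value < min ? min : value;
    
        return value;
    }
    
    // Explicit specializations, defined in another source file.
    template<>
    float limit(float arg, float max, float min);
    
    template<>
    double limit(double arg, double max, double min);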
    
    int calc(void);
    
    
    int main(void)
    {
        float a = limit<float>(5, 2, 0);
        int b = limit<int>(a, 1, 0);
    
        calc();
    
    	return 0;
    }
    
    
    
    
    
     
    
    
    
    
    
    
     
    
    
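    // Primary template; it must be declared before its explicit specializations.
    template <typename T>
    T limit(T value, T max, T min)
    {
        value = value > max ? max : value;
        value = value < min ? min : value;
    
        return value;
    }
    
    // Explicit specializations, defined further down in this file.
    template<>
    float limit(float arg, float max, float min);
    
    template<>
    double limit(double arg, double max, double min);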
    
    int calc(void);
    
    
    
    template <>
    float limit(float value, float max, float min)
    {
        value = value > max ? max : value;
        value = value < min ? min : value;
    
        return value;
    }
    
    template <>
    double limit(double value, double max, double min)
    {
        value = value > max ? max : value;
        value = value < min ? min : value;
    
        return value;
    }
    
    int calc(void)
    {
        int a = limit<int>(4,1,2);
        return a;
    }
    

  • Please add the build option --gen_func_subsections and let me know if that reduces code size.  I think it will.  If that is correct, then I'll explain it.

    Thanks and regards,

    -George

  • Hello George,

    This was spot on. It saved a lot of memory. Now I'm very interested to hear the explanation, and whether I still need template specialization to save memory (I haven't tried reverting it, as that is not easy to do without an expected benefit).

    kind regards Tobias

  • First, you need to know the terms input section and output section.  They are related to the linker.  An explanation is in the first part of the article Linker Command File Primer.

    A template specialization like ...

    template <>
    double limit(double value, double max, double min)
    {
        value = value > max ? max : value;
        value = value < min ? min : value;
    
        return value;
    }

    ... causes that template function to be instantiated.  That is, the function is created from the limit template with type double and code is generated for it.  By default, that code is in the same .text input section with all the other functions generated in the same source file.  The linker, when forming an output section from input sections like this one, can decide to use the entire input section, or none of it.  Since some of the functions in the input section are called, it is used.  

    By using --gen_func_subsections, each function is in a separate input section.  When forming an output section from this set of input sections, the linker leaves out the ones that are not used.  Thus the input section that contains the function instantiated from the limit template with type double, which is never called, is left out.
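
    To make the effect concrete, here is a minimal sketch of one source file modelled on the test case above. The comments restate the explanation and are not output from the tools; exact section names are deliberately left out because they are not given in this thread.

```cpp
// Primary template, as in the test case.
template <typename T>
T limit(T value, T max, T min)
{
    value = value > max ? max : value;
    value = value < min ? min : value;
    return value;
}

// Explicit specialization: code is generated for it in this source file
// whether or not anything calls it.
template <>
double limit(double value, double max, double min)
{
    value = value > max ? max : value;
    value = value < min ? min : value;
    return value;
}

// calc is called from another file, so the linker must keep it.
int calc(void)
{
    return limit<int>(4, 1, 2);
}

// Default build: calc and limit<double> share this file's .text input
// section. The linker takes the whole section because calc is used, so the
// uncalled limit<double> is kept as well.
//
// With --gen_func_subsections: each function is in its own input section,
// so the linker can leave out the one holding limit<double>.
```

    With the arguments used here, calc returns 2: the value 4 is first clamped to max (1) and then raised to min (2).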

    Thanks and regards,

    -George