This thread has been locked.
Compiler/TMS570LC4357: reproducible build
Tool/software: TI C/C++ Compiler
I need the cl6x compiler to produce bitwise-reproducible output (see also https://reproducible-builds.org), i.e. multiple compilations of the same source base (done by different users, in their own directories) should give exactly the same binary. I am using CGT 7.3.23.
Two issues found:
I've found that random bytes change in the binary. The compiler creates a temporary file, which it then compiles, and the name of that temporary file is included in the binary's .symtab section:
$ readelf -s myobject.obj
Symbol table '.symtab' contains 999 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  HIDDEN   ABS 07894VfUHkc[...]
How can I get rid of this? The entry is meaningless, since the temporary file is removed after compilation anyway. I did some reverse engineering, and it seems that the compiler uses some sort of gen_tempname function (https://github.molgen.mpg.de/git-mirror/glibc/blob/master/sysdeps/posix/tempname.c), as I can see getpid and gettimeofday syscalls when I run the compiler under strace. But I am unable to use LD_PRELOAD, as the compiler is statically linked...
The build path is included in the binary's debugging symbols. I would like to be able to map that variable string to some arbitrary one. GCC provides a similar option: -ffile-prefix-map (see the description here: https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html).
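To check the first issue without reading the whole readelf dump, the FILE entries of .symtab can be listed directly and compared between two builds. The following is only a rough sketch, assuming a 32-bit ELF object (which the readelf output above suggests); the function name file_symbols is made up for illustration and is not part of any TI tool.

```python
import struct

SHT_SYMTAB = 2  # section type of a static symbol table
STT_FILE = 4    # symbol type that records a source file name


def file_symbols(data: bytes) -> list[str]:
    """Return the names of all STT_FILE symbols in a 32-bit ELF image."""
    if data[:4] != b"\x7fELF" or data[4] != 1:
        raise ValueError("not a 32-bit ELF file")
    # EI_DATA byte: 1 = little-endian, 2 = big-endian
    end = "<" if data[5] == 1 else ">"
    # Section header table location and geometry from the ELF header
    e_shoff = struct.unpack_from(end + "I", data, 32)[0]
    e_shentsize, e_shnum = struct.unpack_from(end + "HH", data, 46)
    sections = [
        struct.unpack_from(end + "10I", data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    names = []
    for sh in sections:
        sh_type, sh_offset, sh_size = sh[1], sh[4], sh[5]
        sh_link, sh_entsize = sh[6], sh[9]
        if sh_type != SHT_SYMTAB or sh_entsize == 0:
            continue
        # sh_link of a symbol table is the index of its string table
        str_off = sections[sh_link][4]
        for pos in range(sh_offset, sh_offset + sh_size, sh_entsize):
            st_name, _, _, st_info, _, _ = struct.unpack_from(
                end + "IIIBBH", data, pos
            )
            if st_info & 0xF == STT_FILE:
                start = str_off + st_name
                stop = data.index(b"\0", start)
                names.append(data[start:stop].decode("latin-1"))
    return names
```

Calling file_symbols(open("myobject.obj", "rb").read()) on objects from two builds and comparing the results should show whether the FILE entry is the only thing that differs.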
I have a similar problem in https://e2e.ti.com/support/microcontrollers/hercules/f/312/t/743189. I found the following workaround: I set "Keep generated assembly language (.asm) file (--keep_asm, -k)" in Assembler Options. This results in deterministic names.
In reply to Lukas Sk:
I saw that request, but did not see any solution, and the topic was locked, so I created my own (and my issue is with cl6x, not the ARM compiler). Marvelous! That works around the first issue I mentioned. It is not production-ready, though... Those assembly files will significantly increase the build directory size (about 4 times the object file size). With thousands of objects (yeah, quite a big project), that adds up to gigabytes... In the worst case, I'll just implement yet another cl6x wrapper (alongside the "line buffering" wrapper and similar ones) that removes those files right afterwards. Or use --asm_directory=$(BUILDDIR)/trash and remove it after the build. Need to rethink... Thanks anyway, that is a good initial approach until we get something production-ready from the TI experts.
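The wrapper idea could look roughly like the sketch below. It only uses the two options named in this thread (--keep_asm and --asm_directory); the function name run_with_asm_cleanup and the choice to append the options at the end of the command line are assumptions made for illustration. A fixed directory is used on purpose, since a random temporary directory would reintroduce a varying path.

```python
import shutil
import subprocess
from pathlib import Path


def run_with_asm_cleanup(compiler_cmd, asm_dir):
    """Run the compiler with --keep_asm, then delete the kept .asm files.

    compiler_cmd: the full compiler command line as a list of strings.
    asm_dir: a fixed directory (e.g. <builddir>/trash) for the .asm files.
    Returns the compiler's exit code.
    """
    asm_path = Path(asm_dir)
    asm_path.mkdir(parents=True, exist_ok=True)
    try:
        # Assumption: the compiler accepts these options after the inputs.
        cmd = list(compiler_cmd) + ["--keep_asm", f"--asm_directory={asm_path}"]
        return subprocess.run(cmd).returncode
    finally:
        # Remove the assembly files whether or not the compile succeeded.
        shutil.rmtree(asm_path, ignore_errors=True)
```

If several compiler processes run in parallel, each would need its own fixed subdirectory, otherwise one process could delete files another is still using.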
In reply to Bartlomiej Kucharczyk:
Now, when using --keep_asm, I also get some numbers that match the last-modification timestamp of the compiled *.asm file. Looking at a hexdump of the binary, I see strings like the following:
/path/to/asm/file.asm:$C$L6:1546604415
Looking at file.asm:
$ stat /path/to/asm/file.asm -c "%Y"
1546604415
So it turns out that setting --keep_asm does not solve my issue...
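A quick way to see how many such strings a binary contains is to scan it for the pattern shown above. This is a sketch only: the regular expression is inferred from the single example string in this post (path ending in .asm, a $C$L label, then a decimal number), so other label forms may exist and would be missed. The name find_asm_timestamps is hypothetical.

```python
import re

# Pattern inferred from one example: <path>.asm:$C$L<n>:<decimal timestamp>
_ASM_STAMP = re.compile(rb"([\x21-\x7e]+\.asm):(\$C\$L\d+):(\d+)")


def find_asm_timestamps(data: bytes) -> list[tuple[str, str, int]]:
    """Return (asm_path, label, timestamp) for each matching string in data."""
    return [
        (path.decode("ascii"), label.decode("ascii"), int(stamp))
        for path, label, stamp in _ASM_STAMP.findall(data)
    ]
```

Comparing the reported timestamps with the output of stat for the corresponding .asm file would confirm whether they are modification times.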
A good summary on this topic is in this forum thread.
Consider using the utility objdiff from the cg_xml package. By default, it ignores the debug information and the symbols. This reduces the constraints imposed on the build.
Thanks and regards,
TI C/C++ Compiler Forum Moderator
Track an issue with SDOWP. Enter your bug id in the Search box.
In reply to George Mock:
Hello George, thanks for the answer. It sheds some light on the topic. I agree that some aspects of the build process are not the compiler's or linker's responsibility (e.g. maintaining the order of inputs), but some others are, and I think that my request addresses such things. When I execute the same command, on the same host, in the same directory, I'd expect exactly the same result (i.e. md5sum/sha256sum should match in both). Or I'd expect at least some easy method to fake the build environment, so that the compiler gives predictable results...
In that case, I can keep only a fingerprint (e.g. an MD5 hash) of an executable plus an environment description (a few kilobytes), and compare rebuilt binaries against it to make sure I got exactly the same content (using tools that are available on any Linux box). I cannot imagine how to achieve this efficiently with objdiff... The argument that "we don't test something, thus we don't deliver it" does not seem relevant to this discussion. It's not a matter of testing or not, but of willingness to support this kind of use case and to actually start doing something about it. And, based on the number of questions similar to mine, it seems there are some people who would be interested in bitwise-identical binaries. So, maybe the question should be: will you add tests (and support) for this?
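The fingerprint workflow described here needs nothing beyond a standard hash. As a minimal sketch (SHA-256 is used instead of MD5 as an illustrative choice, and the function names are made up), a rebuilt binary could be checked against a stored baseline like this:

```python
import hashlib


def fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_baseline(path: str, expected_hex: str) -> bool:
    """True if the file's fingerprint equals the stored baseline digest."""
    return fingerprint(path) == expected_hex.strip().lower()
```

This only works if the build is bitwise reproducible in the first place, which is exactly what this thread is asking for; with varying temporary names or timestamps in the output, the digests will differ on every build.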
The solution currently provided by TI compilers does not work this way ...
Bartlomiej Kucharczyk wrote: "When I execute the same command, on the same host, in the same directory, I'd expect exactly the same result (i.e. md5sum/sha256sum should match in both)."
Instead, some executable or library is established as the baseline, and then objdiff is used to test whether subsequent builds are the same.
Bartlomiej Kucharczyk wrote: "maybe the question should be: will you add tests (and support) for this?"
Unfortunately, that is not on our roadmap.
Hmm... that's sad news.
Can anything be done so that you add this topic to your roadmap?
Anyway, how could I compute a fingerprint (e.g. an MD5 hash) of an executable or library that could be used later to compare against a newly built executable or library?
It would also be acceptable for me to have some way to strip those debugging symbols (the strip6x tool did not work for me: some build paths were still in the objects).
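To see where build paths remain after stripping, the object can simply be searched for the build directory as a byte string. The sketch below assumes the path is stored as plain, uncompressed text; the name find_path_offsets is hypothetical.

```python
def find_path_offsets(data: bytes, build_dir: str) -> list[int]:
    """Return every byte offset in data where build_dir occurs as plain text."""
    needle = build_dir.encode()
    if not needle:
        raise ValueError("build_dir must not be empty")
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

The offsets can then be matched against the section layout of the object to find out which sections still carry the path.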
Bartlomiej Kucharczyk wrote: "Can anything be done so that you add this topic to your roadmap?"
I filed CODEGEN-5738 in the SDOWP system. This does not report a bug, but requests support in the compiler for reproducible builds. You are welcome to follow it with the SDOWP link below in my signature. (However, it seems SDOWP is having problems today. It should be resolved soon.)