The memset() implementation from newlib that is used during .bss initialisation is very inefficient.
As a result, with a .bss larger than roughly 3 kBytes in RAM, .bss initialisation takes so long that the watchdog, which initially runs at SMCLK=MCLK speed, times out on my MSP430F5XXX series MCU. This is consistent with memset() needing 9 CPU cycles (!!) per byte in its core loop.
For now I have moved some variables to the .noinit section, which works fine for my use case, but this is only a workaround and not a fix.
memset() itself really shouldn't touch the watchdog configuration (that is up to the application). The CRT should therefore clear .bss without calling memset(), and periodically reset the watchdog timer while doing so if the section being cleared is large.
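To illustrate the idea, here is a minimal sketch of a chunked .bss clear that restarts the watchdog between chunks. It is not the actual CRT code: `clear_bss`, `kick_watchdog` and `BSS_CLEAR_CHUNK` are hypothetical names, the chunk size is an arbitrary assumption, and the watchdog hook is a stub that only counts calls so the code can be run on a host. On the real target the hook would restart the WDT counter through WDTCTL without changing the other configuration bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook. On the target this would restart the watchdog counter
 * via WDTCTL; here it only counts calls so the sketch runs on a host. */
static unsigned watchdog_kicks;

static void kick_watchdog(void)
{
    watchdog_kicks++;
}

/* Assumed chunk size: must be small enough that clearing one chunk
 * always finishes well inside the watchdog interval. */
#define BSS_CLEAR_CHUNK 1024u

/* Zero the range [start, end) in chunks, restarting the watchdog after
 * each chunk. In a real CRT, start and end would come from linker symbols. */
static void clear_bss(uint8_t *start, uint8_t *end)
{
    while (start < end) {
        size_t n = (size_t)(end - start);
        if (n > BSS_CLEAR_CHUNK) {
            n = BSS_CLEAR_CHUNK;
        }
        for (size_t i = 0; i < n; i++) {
            start[i] = 0;
        }
        start += n;
        kick_watchdog();
    }
}
```

Note that an optimising compiler may turn the inner loop back into a memset() call, so a real CRT would more likely do this in assembly or build the startup code with that transformation disabled.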
Also, a proper memset() implementation for MSP430 would only take 3 cycles per byte.
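One common way to cut the per-byte cost is to fill with 16-bit stores instead of single bytes. The following is only a sketch of that technique, not newlib's code and not a cycle-counted implementation: `memset_words` and `word_alias_t` are hypothetical names, the `may_alias` attribute is a GCC/Clang extension, and I have not measured whether this reaches the 3 cycles per byte mentioned above (a hand-written assembly loop probably would be needed for that).

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit type that is allowed to alias other types (GCC/Clang extension). */
typedef uint16_t __attribute__((may_alias)) word_alias_t;

/* Fill len bytes at dst with the low 8 bits of value, using 16-bit stores
 * for the aligned middle part. Returns dst, like memset(). */
static void *memset_words(void *dst, int value, size_t len)
{
    uint8_t *p = dst;
    uint8_t byte = (uint8_t)value;

    /* Store one leading byte if needed so that p becomes 2-byte aligned. */
    if (len > 0 && ((uintptr_t)p & 1u)) {
        *p++ = byte;
        len--;
    }

    /* Replicate the byte into both halves of a word and store word-wise. */
    uint16_t word = (uint16_t)(((uint16_t)byte << 8) | byte);
    word_alias_t *w = (word_alias_t *)p;
    for (size_t n = len / 2; n > 0; n--) {
        *w++ = word;
    }

    /* Store the trailing byte if the remaining length was odd. */
    if (len & 1u) {
        *(uint8_t *)w = byte;
    }
    return dst;
}
```

The loop overhead is now paid once per two bytes instead of once per byte, which is where most of the saving comes from.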