"Page Program" timing problem with Linux SPI flash driver

Hello,

I am experiencing a problem when writing to Winbond flash (W25Q64) from TI's MontaVista Linux distribution.  Allow me to present some background information.  I have partitioned the SPI flash into 5 MTD partitions.  Using the mtd_debug utility, I am able to successfully erase, read, and write raw data to any of these partitions.  So far, so good.  However, I found some strange behavior when writing more than 256 bytes to any of the MTD partitions using mtd_debug write (or dd, for that matter).  The first 256-byte page always writes successfully, but the following one or two pages are not written -- the memory remains erased (all bits high).  The following page is written successfully, and the pattern repeats.

Considering that entire 256-byte pages were either being written successfully or not, I figured that I was likely seeing some sort of intermittent problem with the Winbond flash "Page Program" instruction (0x02).  Perhaps some sort of timing problem was causing certain "Page Program" instructions to be ignored, thus leaving the page-size "holes" in the flash.  I noticed the following documentation concerning Page Program in the W25Q64 manual:

As with the write and erase instructions, the /CS pin must be driven high after the eighth bit of the last byte has been latched. If this is not done the Page Program instruction will not be executed. After /CS is driven high, the self-timed Page Program instruction will commence for a time duration of tpp (See AC Characteristics).  While the Page Program cycle is in progress, the Read Status Register instruction may still be accessed for checking the status of the BUSY bit.  The BUSY bit is a 1 during the Page Program cycle and becomes a 0 when the cycle is finished and the device is ready to accept other instructions again.

I put a logic analyzer on the SPI signals, and sure enough I found that "Page Program" instructions are being issued back-to-back, that is, without any "Read Status Register" (0x05) instructions in between to check the BUSY bit.  A page program cycle takes 0.7 to 3 milliseconds (this is the tpp AC characteristic referred to in the datasheet excerpt above).  The back-to-back "Page Program" instructions were being issued as little as 150 microseconds apart, which is much too short for the program operation to complete.  As a result, writing the first page succeeded, but writing the second and third failed because the previous "Page Program" operation was still in progress on the flash chip when the Linux SPI flash driver sent these instructions.
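To illustrate the failure pattern, here is a minimal user-space sketch of a flash chip that ignores "Page Program" while BUSY. Everything in it is an assumption for illustration: the sim_* names are hypothetical, the program time is fixed at the 0.7 ms figure, and the 300 microsecond interval between instructions is chosen only because it reproduces the "one page written, two pages skipped" pattern. It is not taken from the driver or the datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_NUM_PAGES 8
#define SIM_TPP_US    700u  /* assumed program time per page, in microseconds */

/* Hypothetical model of the flash: which pages got programmed, and the
 * time until which the chip reports BUSY. */
struct sim_flash {
    bool programmed[SIM_NUM_PAGES];
    uint32_t busy_until_us;
};

/* What Read Status Register (0x05) would report for BUSY at time now_us. */
bool sim_busy(const struct sim_flash *f, uint32_t now_us)
{
    return now_us < f->busy_until_us;
}

/* Page Program (0x02) issued at time now_us. The instruction is ignored
 * while the chip is BUSY. Returns true if the page was programmed. */
bool sim_page_program(struct sim_flash *f, unsigned page, uint32_t now_us)
{
    assert(page < SIM_NUM_PAGES);
    if (sim_busy(f, now_us))
        return false;
    f->programmed[page] = true;
    f->busy_until_us = now_us + SIM_TPP_US;
    return true;
}

/* Issue Page Program for pages 0..n-1, one every gap_us microseconds.
 * With poll_busy set, time is advanced until BUSY clears before each
 * instruction. Returns the number of pages actually programmed. */
unsigned sim_write_pages(struct sim_flash *f, unsigned n,
                         uint32_t gap_us, bool poll_busy)
{
    uint32_t now_us = 0;
    unsigned page, written = 0;

    for (page = 0; page < n; page++) {
        if (poll_busy && sim_busy(f, now_us))
            now_us = f->busy_until_us;  /* wait for the previous program */
        if (sim_page_program(f, page, now_us))
            written++;
        now_us += gap_us;
    }
    return written;
}
```

With these assumed numbers, calling sim_write_pages() on a zero-initialized struct sim_flash with n = 8, gap_us = 300 and poll_busy = false programs only pages 0, 3 and 6, while the same call with poll_busy = true programs all eight.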

What I haven't been able to figure out is how to fix the Linux source to properly check the BUSY bit before returning from a page program instruction.  There is some code in drivers/mtd/devices/spi_flash.c that seems to address this issue, but it must not be working properly:

static int spi_flash_write(struct mtd_info *mtd, loff_t to,
      size_t count, size_t *retlen,
      const u_char *buf)
{
 char *ptr;
 int status;
 int tx_cnt;
 int size_limit;
 unsigned int addr;
 struct spi_transfer xfer;
 struct spi_message msg;
 struct mtd_spi_flash_info *priv_dat = mtd->priv;

 addr = (u32) (to);
 *retlen = 0;
 memset(&xfer, 0, sizeof xfer);

 if ((count <= 0) || ((addr + count) > mtd->size))
  return -EINVAL;

 /* take the smaller of buffer size and page size */
 /* Want to make buffer size > than page size for better performance */
 if (priv_dat->page_size <= SPI_FLASH_BUFFER_SIZE)
  size_limit = priv_dat->page_size;
 else
  size_limit = SPI_FLASH_BUFFER_SIZE;

 mutex_lock(&priv_dat->lock);
 while (count > 0) {
  spi_flash_write_enable(mtd);

  spi_message_init(&msg);
  xfer.tx_buf = ptr = priv_dat->tx_buffer;

  msg.complete = spi_flash_wait_complete;
  msg.context = priv_dat->spi;

  /* set the write command */
  ptr[0] = MTD_SPI_FLASH_WRITE;
  ptr[1] = (addr >> 16) & 0xFF;
  ptr[2] = (addr >> 8) & 0xFF;
  ptr[3] = (addr & 0xFF);

  /* figure out the max data able to transfer */
  tx_cnt = size_limit - (addr & (priv_dat->page_size - 1));
  if (count < tx_cnt)
   tx_cnt = count;

  /* copy over the write data */
  ptr = &ptr[SPI_FLASH_CMD_SIZE];
  memcpy(ptr, buf, tx_cnt);
  xfer.len = SPI_FLASH_CMD_SIZE + tx_cnt;

  spi_message_add_tail(&xfer, &msg);
  status = spi_sync(priv_dat->spi, &msg);

  count -= tx_cnt;
  buf += tx_cnt;
  addr += tx_cnt;
  *retlen += tx_cnt;
 }
 mutex_unlock(&priv_dat->lock);

 return (0);
}

static void spi_flash_wait_complete(void *context)
{
 int i;
 struct spi_device *spi = context;

 for (i = 0; i < 5000; i++) {
  if ((spi_flash_read_status(spi) & SPI_FLASH_STAT_BUSY) == 0)
   return;
 }
 printk(KERN_WARNING "SPI FLASH operation timeout\n");
}
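For comparison, this is roughly the behavior I would expect after each page program: keep reading the status register until BUSY clears, and report a timeout if it never does. The sketch below is user-space C, not kernel code. The names wait_while_busy, read_status_fn, fake_chip and fake_read_status are hypothetical, and the fake reader only stands in for spi_flash_read_status() so the loop can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_BUSY 0x01u  /* BUSY is bit 0 of the W25Q64 status register */

/* Hypothetical callback type: returns the current status register value. */
typedef uint8_t (*read_status_fn)(void *ctx);

/* Poll the status register until BUSY clears or max_polls reads were made.
 * Returns 0 when the chip is ready, -1 on timeout. */
int wait_while_busy(read_status_fn read_status, void *ctx, unsigned max_polls)
{
    unsigned i;

    assert(read_status != NULL);
    for (i = 0; i < max_polls; i++) {
        if ((read_status(ctx) & STAT_BUSY) == 0)
            return 0;
    }
    return -1;
}

/* Hypothetical stand-in for the chip: reports BUSY for a fixed number of
 * status reads, then reports ready. */
struct fake_chip {
    unsigned busy_reads_left;
};

uint8_t fake_read_status(void *ctx)
{
    struct fake_chip *chip = ctx;

    if (chip->busy_reads_left > 0) {
        chip->busy_reads_left--;
        return STAT_BUSY;
    }
    return 0;
}
```

In the driver, the equivalent wait would presumably go right after spi_sync() returns inside the while loop of spi_flash_write(), using spi_flash_read_status(priv_dat->spi), with the return value checked so that a timeout is reported to the caller. This placement rests on an assumption about the kernel in use: as far as I can tell from mainline 2.6 sources, spi_sync() overwrites msg.complete and msg.context with its own completion handler before queuing the message, in which case spi_flash_wait_complete would never be called.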

I never see the KERN_WARNING message "SPI FLASH operation timeout" in the output of dmesg or in /var/log/messages, unless I am looking in the wrong place.  Anyway, I'm not much of a kernel hacker, so I'm hoping that someone out there has some idea of how I should tackle this problem.  Thanks in advance!