AM335x ADC lock-up with kernel v4.4.12

A year ago I posted "Lock-up reading AM335x ADC on 3.14.35 kernel".

Now I'm updating to the 4.4.12 kernel using the latest linux-ti-staging in the meta-ti layer for Yocto builds. I'm again finding problems with ADC lock-ups.

I have two drivers that regularly read ADCs, plus a third driver that does just one read at start-up. Either of the two drivers running by itself is okay, unless I manually read /sys/bus/iio/devices/iio:device0/in_voltageX_input, after which my driver locks up. The two drivers running together also lock up, and I am left with a couple of kworker processes stuck in the D state.

I wonder if there may be a similar root cause. But the patch that fixed it in the 3.14.x kernel doesn't seem applicable in the 4.4.x kernel.

I've found it's not so easy to demonstrate the lock-up from the command line alone on the 4.4.x kernel. Reads through /sys/bus/iio/devices/iio:device0/in_voltageX_input seem to work okay (unlike in the 3.14.x kernel case), but kernel driver reads through the iio_read_channel_processed() function lock up. That function takes a mutex, which the sysfs attribute read doesn't.

However, when reading from /sys/bus/iio/devices/iio:device0/in_voltageX_input I am seeing occasional "Resource temporarily unavailable" error messages (errno EAGAIN), which I think is a bug and may be related.

  • I will ask the software team to comment. Just want to note that v4.4.12 is not an official TI release, at least at the moment. The latest official release is v4.1.18: www.ti.com/.../PROCESSOR-SDK-AM335X
  • Craig,

    Can you help me replicate your issue by sending me one of the modules (or a dumbed down version of the module) so that when I do a read of in_voltageX_input I can lock up the driver on my setup?

    Once I get this I should be able to go back and forth with the kernel development team to try and root cause this issue.

    Jason Reeder
  • Sure, that should be possible, though I'd prefer to send as a private e-mail. I'll see if I can do that.

  • I've sent some files by e-mail.

    Meanwhile (or alternatively), can you confirm if you get occasional "Resource temporarily unavailable" error messages when doing the following:

    while true; do cat /sys/bus/iio/devices/iio\:device0/in_voltage?_raw; done
  • Craig,

    Looks like I'm missing a header file that is stopping me from being able to rebuild your module.

    I do not seem to get the 'Resource temporarily unavailable' error message when running your 'while true...' command. I used your provided dtsi file to enable the ADCs on my BeagleBone black and I am using TI's upcoming 4.4.12 kernel. All I see are ADC values scrolling past.

    Jason Reeder
  • Sorry, I've sent the header file through.

    Could you try running that "while true ..." command in two SSH consoles at the same time, and then see if you get the "Resource temporarily unavailable" messages?
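
To get counts rather than scrolling output, the same experiment could be driven from a small C helper. This is only a sketch: `count_read_results` and `struct read_stats` are hypothetical names, not part of any TI or IIO tooling. The helper opens, reads and closes the attribute repeatedly, much as `cat` in a loop would, and tallies how many reads failed with EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Tally of read outcomes (hypothetical helper type for this sketch). */
struct read_stats {
    int ok;      /* reads that returned data */
    int eagain;  /* reads that failed with EAGAIN */
    int other;   /* reads that failed with any other errno */
};

/*
 * Open, read and close `path` `iterations` times and classify each result.
 * Returns 0 on completion, or -errno if the file cannot be opened at all.
 */
int count_read_results(const char *path, int iterations, struct read_stats *st)
{
    char buf[64];

    assert(path != NULL && st != NULL);
    st->ok = st->eagain = st->other = 0;

    for (int i = 0; i < iterations; i++) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -errno;          /* attribute missing or unreadable */

        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        int err = errno;            /* save before close() can change it */
        close(fd);

        if (n >= 0)
            st->ok++;
        else if (err == EAGAIN)
            st->eagain++;
        else
            st->other++;
    }
    return 0;
}
```

Calling it on, for example, /sys/bus/iio/devices/iio:device0/in_voltage0_raw from two or three processes at the same time and comparing the `eagain` counts would give a rough measure of how often concurrent sysfs reads collide.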

  • Craig,

    When I ran three consoles simultaneously with your while loop I did see the 'resource temporarily unavailable' message.

    Now that I have the missing header file I can also build your module and recreate your lockup issue as well. I'll let you know what I find.

    Jason Reeder

  • Craig,

    As you mentioned, it doesn't appear that the sysfs attribute read wraps its ADC accesses in a mutex.

    Can you try using a mutex in the iio_read_channel_info function in the drivers/iio/industrialio-core.c file as shown at the bottom of this post? You'll have to rebuild the kernel after doing so.

    After doing this on my setup I was able to perform a while loop on both of your drivers as well as a while loop on the sysfs interface with nothing locking up.

    Jason Reeder

    static ssize_t iio_read_channel_info(struct device *dev,
                                         struct device_attribute *attr,
                                         char *buf)
    {
            struct iio_dev *indio_dev = dev_to_iio_dev(dev);
            struct iio_dev_attr *this_attr = to_iio_dev_attr(attr);
            int vals[INDIO_MAX_RAW_ELEMENTS];
            int ret;
            int val_len = 2;
    
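            /*
             * Serialize sysfs reads against in-kernel consumers
             * (iio_read_channel_raw() etc.), which take this same lock.
             */
            mutex_lock(&indio_dev->info_exist_lock);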
    
            if (indio_dev->info->read_raw_multi)
                    ret = indio_dev->info->read_raw_multi(indio_dev, this_attr->c,
                                                            INDIO_MAX_RAW_ELEMENTS,
                                                            vals, &val_len,
                                                            this_attr->address);
            else
                    ret = indio_dev->info->read_raw(indio_dev, this_attr->c,
                                        &vals[0], &vals[1], this_attr->address);
    
            mutex_unlock(&indio_dev->info_exist_lock);
    
            if (ret < 0)
                    return ret;
    
            return iio_format_value(buf, ret, val_len, vals);
    }
  • Thanks, I have tried that. It does prevent lock-ups from occurring.

    I think this should be considered a work-around rather than a fix though, because this isn't the primary purpose of that mutex. The ADC driver should work correctly even without this mutex being added.

    I'm still finding that occasionally, ADC reads via iio_read_channel_raw() etc return -EAGAIN. This seems to happen more often during the early stages of boot, but it can also happen randomly at other times during run-time. So it seems there's still some bug that needs fixing.

  • Craig McQueen said:

    I'm still finding that occasionally, ADC reads via iio_read_channel_raw() etc return -EAGAIN. This seems to happen more often during the early stages of boot, but it can also happen randomly at other times during run-time. So it seems there's still some bug that needs fixing.


    I found that another driver I'm using was doing ADC reads in a different way (calling the source ADC .read_raw() function directly), which bypassed the mutex taken by iio_read_channel_raw() and the related calls. I've changed it to use the proper IIO ADC API functions, and that seems to have fixed the -EAGAIN errors.

    Still, the TI ADC driver should be able to function correctly without this mutex having to be added.
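
The point about where the locking belongs can be illustrated with a userspace toy model. This is not the TI driver code: `mock_adc`, `mock_adc_convert`, `mock_adc_read` and `count_eagain` are invented for illustration, and the rule that a conversion attempted while another is in flight fails with -EAGAIN is a simplifying assumption. The sketch only shows that a lock placed inside the driver's own read path covers every caller, whereas a lock held in an outer layer can be bypassed by anyone who calls the inner function directly.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_READERS 16

/* Toy stand-in for an ADC that handles one conversion at a time. */
struct mock_adc {
    atomic_flag converting;  /* set while a conversion is in flight */
    pthread_mutex_t lock;    /* lock owned by the "driver" itself */
};

/* Unserialized path: refuses with -EAGAIN if a conversion is in flight. */
static int mock_adc_convert(struct mock_adc *adc, int *val)
{
    if (atomic_flag_test_and_set(&adc->converting))
        return -EAGAIN;
    sched_yield();           /* widen the window, as a slow conversion would */
    *val = 42;               /* placeholder sample value */
    atomic_flag_clear(&adc->converting);
    return 0;
}

/* Serialized path: the lock sits inside the driver, so no caller can skip it. */
static int mock_adc_read(struct mock_adc *adc, int *val)
{
    pthread_mutex_lock(&adc->lock);
    int ret = mock_adc_convert(adc, val);
    pthread_mutex_unlock(&adc->lock);
    return ret;
}

struct reader_args {
    struct mock_adc *adc;
    int iterations;
    int use_lock;
    int eagain;              /* per-thread count, summed after join */
};

static void *reader(void *p)
{
    struct reader_args *a = p;
    int val;

    for (int i = 0; i < a->iterations; i++) {
        int ret = a->use_lock ? mock_adc_read(a->adc, &val)
                              : mock_adc_convert(a->adc, &val);
        if (ret == -EAGAIN)
            a->eagain++;
    }
    return NULL;
}

/* Run nthreads concurrent readers and return how many reads got -EAGAIN. */
int count_eagain(int nthreads, int iterations, int use_lock)
{
    struct mock_adc adc = { .converting = ATOMIC_FLAG_INIT };
    pthread_t tids[MAX_READERS];
    struct reader_args args[MAX_READERS];
    int started = 0, total = 0;

    assert(nthreads >= 1 && nthreads <= MAX_READERS);
    pthread_mutex_init(&adc.lock, NULL);

    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct reader_args){ &adc, iterations, use_lock, 0 };
        if (pthread_create(&tids[i], NULL, reader, &args[i]) != 0)
            break;           /* only the threads already started are counted */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].eagain;
    }
    pthread_mutex_destroy(&adc.lock);
    return total;
}
```

With `use_lock` set, `count_eagain()` returns 0 however many readers run, because every reader passes through the same mutex. With `use_lock` clear and more than one reader, some reads may fail with -EAGAIN depending on scheduling, which is the toy equivalent of a caller that goes around the lock.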

  • Craig,

    I agree, the fix should be in the ti_am335x_*** drivers. After my last post I notified our kernel development team about the issue in the logic around the UNINTERRUPTIBLE sleep in the drivers/mfd/ti_am335x_tscadc.c driver. They've acknowledged it and are looking into it.

    I'll keep you updated on what they come back with.

    Jason Reeder

  • Craig,

    Check out these two commits in the ti-linux-4.4.y branch of the ti-linux-kernel repo (git.ti.com/.../ti-linux-4.4.y):

    http://git.ti.com/ti-linux-kernel/ti-linux-kernel/commit/8fd4c435aee44f5bd8e8234c33054814e4f69bdc

    http://git.ti.com/ti-linux-kernel/ti-linux-kernel/commit/e369b7fe8a11da8de8e832d72d7f091352d7829a

    The first commit addresses your issue where you occasionally got a return value of EBUSY, and the second commit fixes the lockup issue that you were seeing by adding a mutex to the TI ADC driver.

    These fixes will be included in the next release of the Linux Processor SDK.

    Jason Reeder

  • I tried the latest revision d811ff93 of the meta-ti layer for Yocto, which points to kernel revision 429279fc. But for some reason the kernel didn't boot at all. I'll have to look into it. In the meantime, I suppose I could try just those two specific commits.

  • I tried cherry-picking the two specific commits, and removing my own work-around for the lock-up. The commits look good to me. I don't get any lock-ups, and I also don't get EBUSY on reads. Thanks!

  • Craig,

    That's good to hear! Thanks for coming back and confirming this fixed the issue.

    Jason Reeder