Paper Title


Blind Users Accessing Their Training Images in Teachable Object Recognizers

Authors

Jonggi Hong, Jaina Gandhi, Ernest Essuah Mensah, Farnaz Zamiri Zeraati, Ebrima Haddy Jarjue, Kyungjun Lee, Hernisa Kacorri

Abstract


Iterating on training and evaluating a machine learning model is an important process for improving its performance. However, while teachable interfaces enable blind users to train and test an object recognizer with photos taken in their distinctive environments, the accessibility of the training iteration and evaluation steps has received little attention. Iteration assumes visual inspection of the training photos, which is inaccessible for blind users. We explore this challenge through MyCam, a mobile app that incorporates automatically estimated descriptors for non-visual access to the photos in the users' training sets. We explore how blind participants (N=12) interact with MyCam and the descriptors through an evaluation study in their homes. We demonstrate that the real-time photo-level descriptors enabled blind users to reduce photos with cropped objects, and that participants could add more variation by iterating through and assessing the quality of their training sets. Participants also found the app simple to use, indicating that they could effectively train it and that the descriptors were useful. However, subjective responses were not reflected in the performance of their models, partially due to little variation in training and cluttered backgrounds.
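The abstract mentions real-time photo-level descriptors that helped blind users avoid photos with cropped objects. The paper does not specify the implementation here, but a minimal illustrative sketch of one such heuristic, assuming a hypothetical upstream object detector supplies a bounding box, could look like this:

```python
def describe_photo(bbox, img_w, img_h, margin=5):
    """Produce a simple spoken-style descriptor for one training photo.

    Hypothetical heuristic (not the authors' implementation):
    bbox is (x, y, w, h) of the detected object in pixels; a box that
    touches or nearly touches the frame edge suggests the object is
    cropped, and a very small box suggests the object is too far away.
    """
    x, y, w, h = bbox
    cropped = (
        x <= margin
        or y <= margin
        or x + w >= img_w - margin
        or y + h >= img_h - margin
    )
    if cropped:
        return "Object may be cropped; move the camera back."
    # Fraction of the frame the object occupies.
    frac = (w * h) / (img_w * img_h)
    if frac < 0.05:
        return "Object is small in the frame; move closer."
    return "Object is fully in the frame."
```

For example, a box starting at the left edge, `describe_photo((0, 100, 200, 200), 640, 480)`, would yield the "may be cropped" message, which a screen reader could announce as the photo is taken.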
