Reading
**Excavating AI: The Politics of Images in Machine Learning Training Sets** by Kate Crawford and Trevor Paglen
- “Automated interpretation of images is an inherently social and political project, rather than a purely technical one”
- relationship between image + meaning is very nuanced and complex
- 3 layers (ex. The Japanese Female Facial Expression (JAFFE) Database):
    - overall taxonomy (ex. facial expressions depicting the emotions of Japanese women)
    - individual classes (ex. happiness, sadness, surprise, disgust, fear, anger, neutral, etc.)
    - individually labeled images (ex. the content of a single image, such as a woman looking surprised)
- ImageNet: aims to “map out the entire world of objects”
    - synsets, each representing a distinct concept, organized into a nested hierarchy
        - “Chair” → artifact > furnishing > furniture > seat > chair (see the WordNet sketch after these notes)
    - restricted to nouns
- Assumptions underlying visual AI systems:
    - concepts are fixed, universal, and consistent
    - fixed and universal correspondences between images and concepts
    - uncomplicated and measurable ties between images, referents, and labels
    - all concrete nouns are created equal, and abstract nouns also express themselves concretely/visually (ex. anti-Semitism)
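
Since ImageNet’s hierarchy comes straight from WordNet’s noun synsets, the “chair” chain above can be reproduced programmatically. A minimal sketch using NLTK’s WordNet interface (my choice of tool for illustration, not something from the essay):

```python
# A minimal sketch of the synset hierarchy behind ImageNet, using NLTK's
# WordNet interface (an illustration of mine, not code from the essay).
# One-time setup: import nltk; nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# "chair" alone is ambiguous: WordNet stores several distinct noun senses
for synset in wn.synsets('chair', pos=wn.NOUN):
    print(synset.name(), '-', synset.definition())

# Walk the hypernym chain for the furniture sense, root concept first:
# entity > ... > artifact > furnishing > furniture > seat > chair
chair = wn.synset('chair.n.01')
for path in chair.hypernym_paths():
    print(' > '.join(s.name().split('.')[0] for s in path))
```

Printing every sense of “chair” (the furniture, the professorship, the chairperson) also shows exactly the word-to-concept ambiguity the essay is pointing at.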
Reflect on the relationship between labels and images in a machine learning image classification dataset. Who has the power to label images, and how do those labels and the machine learning models trained on them impact society?
- Magritte’s “Ceci n’est pas une pipe” emphasizes that labels do not always reflect the truths and meanings of images
- Some images are mimicked/performed rather than genuine (an image of a woman who is actually angry is not the same as an image of a woman mimicking an angry expression)
- Researchers, developers, and corporations hold the power to label images and shape the meaning of visual data
- These labels carry biases that are baked into ML models → they shape how AI classifies/perceives the world, and those systems are then integrated into hiring, education, healthcare, etc.
Making
This week, I was inspired by the “this or that” filters on Instagram, Snapchat, TikTok, etc. These filters present users with two options, and users choose their preference by performing specific gestures or actions, such as tilting their head or pointing.
I decided on a dog-themed game where users narrow down their favorite dog breed from four options using hand gestures, detected and classified by Teachable Machine.
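
Before any model gets involved, the narrowing itself is just a small tournament bracket. A rough sketch of that logic (the breed names and function names are my own placeholders, not taken from the actual project):

```python
# A rough sketch of the narrowing logic: 4 breeds -> 2 head-to-head
# rounds -> a final round -> one favorite. All names are placeholders.

def play_bracket(options, choose):
    """Pit adjacent pairs against each other until one option remains.
    `choose(left, right)` returns 'left' or 'right'; in the game this
    comes from the player's hand gesture. Assumes an even field size."""
    while len(options) > 1:
        winners = []
        for i in range(0, len(options), 2):
            left, right = options[i], options[i + 1]
            winners.append(left if choose(left, right) == 'left' else right)
        options = winners
    return options[0]

# Usage example with typed input standing in for gestures:
if __name__ == '__main__':
    breeds = ['corgi', 'husky', 'poodle', 'shiba inu']
    pick = lambda l, r: input(f"{l} [left] or {r} [right]? ").strip()
    print('Your favorite:', play_bracket(breeds, pick))
```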
How it works:
- Users show an open hand to select the right option and a closed fist to select the left option.
- Users hold their gesture and click the “next round” button to lock in their choice and advance to the next round (the gesture check is sketched below).
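
Under the hood, each choice is one image-classification call. The game itself would most likely use Teachable Machine’s in-browser TensorFlow.js export; to stay in one language here, this sketch uses Teachable Machine’s Keras export instead. The file name, class order, and OpenCV capture are all assumptions on my part, not details from the actual project:

```python
# A sketch of the gesture check, assuming the model was exported from
# Teachable Machine in Keras format (the browser game itself would more
# likely use the TensorFlow.js export).
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('keras_model.h5', compile=False)  # assumed export name
LABELS = ['closed_fist', 'open_hand', 'neutral']     # assumed class order

def classify_gesture(frame):
    """Map one BGR webcam frame to a LABELS entry. Teachable Machine
    image models expect 224x224 RGB pixels scaled to [-1, 1]."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    x = cv2.resize(rgb, (224, 224)).astype(np.float32) / 127.5 - 1.0
    probs = model.predict(x[np.newaxis, ...], verbose=0)[0]
    return LABELS[int(np.argmax(probs))]

def gesture_choice(cap):
    """Poll frames until a decisive gesture appears: closed fist picks
    the left option, open hand the right. (The real game instead locks
    the held gesture in when the "next round" button is clicked.)"""
    while True:
        ok, frame = cap.read()
        if ok:
            gesture = classify_gesture(frame)
            if gesture == 'closed_fist':
                return 'left'
            if gesture == 'open_hand':
                return 'right'

# cap = cv2.VideoCapture(0)
# play_bracket(breeds, lambda l, r: gesture_choice(cap))  # bracket sketch above
```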