Jotham Kingston▸ Organising Data (Task 4)

(Define and decompose real-world problems…ACTDIP027)

Here's a concept for teaching data cleaning – probably best suited to an advanced group of Year 7/8 students.


Tell students that a hypothetical app is being developed by a pet clothing company. The concept is that people take a photo of their dog or dogs, and the app automatically places an order for a dog jacket to be delivered to their address.

The pet clothing company wants its app to be intelligent, so that it can determine from the photo whether the client needs a SMALL, MEDIUM or LARGE jacket, in either BLACK or RED, to suit the dog's colouring. Is this actually possible?


Give each team a paper printout of about 50 photos of dogs, preferably cut out into individual photos. (These are easy to source from Google Images.)
Using nothing but the photos, students have to SORT them in order to:
determine the jacket size and colour for each client
determine if some clients need more than one jacket (multiple dogs)
determine if some photos do not give conclusive data


1. What details were we looking at to come to our decisions?
— How did we determine size?
— How did we choose colour?
2. Could we give the user extra instructions about taking the photo to make sorting data easier?
3. Try to write simple instructions to train another person to sort the photos. (If we can do this, we're on our way to writing instructions a computer can follow.)
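
As a rough illustration of where question 3 leads, here is a minimal Python sketch of what "instructions a computer can follow" might look like once students have agreed on their sorting rules. Everything in it is an assumption made for illustration: the `PhotoNotes` fields, the `sort_photo` function, and especially the colour rule (red for dark coats, black for light coats) are hypothetical. The sketch also assumes a person has already looked at the photo and written down what they saw; getting a computer to do that part is the hard bit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhotoNotes:
    """What a human sorter wrote down about one photo (hypothetical fields)."""
    dogs_visible: int           # how many dogs appear in the photo
    size_guess: Optional[str]   # "small", "medium", "large", or None if unclear
    coat: Optional[str]         # "dark", "light", or None if unclear


def sort_photo(notes: PhotoNotes) -> str:
    """Apply the sorting rules in a fixed order and return one decision."""
    if notes.dogs_visible == 0:
        return "INCONCLUSIVE: no dog in photo"
    if notes.dogs_visible > 1:
        # More than one dog means more than one jacket; sort each dog separately.
        return f"MULTIPLE DOGS: {notes.dogs_visible} jackets needed"
    if notes.size_guess is None or notes.coat is None:
        return "INCONCLUSIVE: cannot judge size or coat"
    # Assumed rule: choose the jacket colour that contrasts with the coat.
    colour = "RED" if notes.coat == "dark" else "BLACK"
    return f"{notes.size_guess.upper()} {colour}"


print(sort_photo(PhotoNotes(dogs_visible=1, size_guess="small", coat="dark")))
print(sort_photo(PhotoNotes(dogs_visible=2, size_guess="large", coat="light")))
print(sort_photo(PhotoNotes(dogs_visible=1, size_guess=None, coat="dark")))
```

Students could compare these rules with their own written instructions and argue about which checks should come first.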


Cleaning text data is quite easy. There are simple PHP functions, for example, that will find and replace unwanted characters.
It’s cleaning other data, like images, that is more difficult!
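
To show how little work text cleaning can take, here is a small sketch. It is written in Python rather than PHP; PHP's `str_replace` and `preg_replace` do the same kind of job. The `clean_name` function and the choice of which characters count as "unwanted" are assumptions made for this example only.

```python
import re


def clean_name(raw: str) -> str:
    """Tidy a text field that a customer typed in."""
    text = raw.replace("\t", " ")                 # plain find-and-replace: tabs become spaces
    text = re.sub(r"[^A-Za-z0-9 '\-]", "", text)  # drop everything except letters, digits, spaces, ' and -
    text = re.sub(r"\s+", " ", text)              # collapse runs of whitespace into one space
    return text.strip()                           # trim leading and trailing spaces


print(clean_name("  Rex!!\t the   #1 dog "))  # prints: Rex the 1 dog
```

Three short lines do the cleaning here. There is no comparably simple set of rules for deciding whether a photo shows one small dark dog or two large light ones.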

G+ Comments


  • Claudia Szabo: That puppy! 🙂
  • Jotham Kingston: Thanks Claudia. Trying to put a bit of effort in with the tasks in the mooc 😉
  • Claudia Szabo: You’re doing great!