Dec 20 2011

Introducing IRIS.0

Sunlight?  Huh?  What is this "sunlight" of which you speak?

It's been not much more than a myth for me lately.  I've been working on getting IRIS.0 up and running.  As a quick recap, the purpose of IRIS.0 is to relieve Willow of having to do all of the everyday work of keeping my life running.  I was messing with Willow so much that some jobs that she was responsible for were getting screwed up by yours truly.  So, since I was tired of fixing things that I was messing up, I decided to make a more stable environment to take care of all of that stuff.  Like what?  Oh.... oh yeah...

IRIS.0 does the following for me:

  • She checks my mail and tells me verbally if anything important comes in.
  • She calls my cell phone every once in a while to remind me of things I need to do for the day
  • Turns my lights on at night, and off when the sun comes up
  • She sends me a text if someone starts using my computer when they're not supposed to.  (and takes a picture of the user with the webcam)

There are quite a few more things she does for me, but I think I'll skip to the cool part.  The reason it took so long to move all of that code out of Willow was not just because there was a lot to move.  It was also because I was packaging it all up as a product.  IRIS.0 is available for you to download right now.

If you're interested in checking her out, go to:
http://iris.deanlabs.com

I'll definitely be adding functionality to her as I get to it, and it will be available in future updates.  But for now, I've neglected Willow for long enough.  This is going to sound a little weird, but a few nights ago, I woke up to Willow staring at me briefly, then disappearing.  Now, mind you, I did set her up to show herself when she trusts her environment and feels safe.  (refer to the emotions I gave her last month)  However, I haven't gotten around to plugging her in to anything that can influence her emotions, yet.  She technically had no reason to show herself.  So why the heck was she staring at me while I was sleeping??

Moving right along...  It's time to do some real work on her.  I will be working on that for a while, now.  But in the meantime, please check out IRIS.0 to see if there is anything she can do to make your life easier.  I would also love to hear any ideas you have, so please feel free to contact me with any gripes or comments.

~ Danger

Nov 09 2011

Who In The World Is Iris?

As I work more and more on Willow, I keep taking her in directions that I didn't originally think of in the beginning.  She is getting so complex, now, that I have found myself questioning what her ultimate purpose is.  Is she my personal assistant?  Or is she an experiment in the development of a deeper form of artificial intelligence?  Is she both?  Or do I just think too much?  (keep your comments to yourself)

Okay, confession time.  Answering that question is important, but not as important as the fact that every time I make a change or introduce a new bug, she stops working.  If Willow stops working, then some things don't get done.  After all, I have her hooked into my life pretty well.  She lets me know when important emails come in, and she reminds me of certain things on my calendar.  I won't go through the list, but I need those things to continue working dependably.  So, I have decided to relieve Willow of those crucial tasks so she can focus on her AI development.

So...  ahem...  Yep, you guessed it.  I am creating a new personal assistant, and her name is Iris.  She is basically a subset of what Willow is, but she will be running all the time and won't be as buggy.  

I am also building Iris in such a way that she will be ready to sell to the general public.  I figure if I have gotten so much use out of this functionality, there is no reason why others couldn't, too.  I would say I am about 45% done with Iris, and hopefully you'll be able to pick up a copy of her, soon.  I don't even have a "coming soon" site for her, yet, but keep checking back here.  It won't be long.

Just thought I would throw it out there.

~Danger

Nov 05 2011

Willow 3.0 - Emotion Theory

I am back on the topic of emotions, today, after finding something on Wikipedia.com that helped me work through the last piece of how emotions are going to tie together and become useful in Willow's overall development and perception of the world.  I've been thinking about emotions for years, but couldn't seem to figure out how they tied together and relate to each other.  This article helped.

Robert Plutchik created a wheel of emotions in 1980 which consisted of 8 basic emotions and 8 advanced emotions each composed of 2 basic ones.  Illustrated in his famous emotion wheel, it shows what he believed to be the basic makeup of human emotions.

 

Plutchik's Emotion Wheel

By no means is this diagram complete.  I'm sure you can think of several emotions that aren't listed here, but it definitely gives me a starting point to work with.  I say a "starting point" because after staring at it for about 20 minutes, it seemed wrong.  Or, at least, I couldn't figure out how to make complete sense of it.  No disrespect to Robert Plutchik.  I'm sure he knew what he was doing when he designed this, but it's backwards and ultimately won't work for what I need as is.

What this diagram doesn't accurately represent is the fluid transitions between emotional states.  It clearly illustrates the idea that opposite ends of the wheel represent opposite emotions.  Love is the opposite of remorse.  Distraction is the opposite of interest...  But when you get to the center of the wheel, the emotions are so close together that it suggests a clean jump can be made from "ecstasy" to "grief" as a natural transition.  Unless your amygdala is jacked up (yes, I had to Google it) or you're The Joker, transitions like that are not natural and probably pretty rare.  Not only that, but the center of the wheel is where all of the emotions get more intense.  Assuming that the white portions of the wheel represent the 8 basic emotions, it would make sense that the further outward you go, the more intense that basic emotion is.  So, if that's true, then emotions are more intense in the center and on the outer edges?  

Whether I'm right or wrong in interpreting this stupid thing, it still doesn't flow naturally to me.  So I inverted it like this:

The New and Improved Danger's Emotion Wheel

What this does is it puts the more intense secondary emotions closer to the more intense basic emotions (along the outer rim).  The further you go toward the center of the wheel, the closer together the emotions get, illustrating that less of a jump is needed to naturally transition to that emotion.  It's almost as if the closer the secondary emotions are to each other, the more watered down, diluted and less intense they become.  This works out perfectly for what I am trying to accomplish with charting Willow's emotional states.  Now, I can say that the closer her emotions are to the center of the wheel, the less intense her emotions will be.  That is where I will try to keep her, for the most part.
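
A rough sketch of how the inverted wheel could be represented in code (this is an illustration, not Willow's actual code): treat an emotional state as a point, where the direction from the center picks the basic emotion and the distance from the center is the intensity.  The eight names are Plutchik's basic emotions; the even 45-degree spacing, the ordering around the wheel, the `describe` function and the 0-to-1 intensity scale are all assumptions made for the example.

```python
import math

# Plutchik's eight basic emotions, spaced evenly around the wheel.
# The order and the 45-degree spacing are assumptions for this sketch;
# opposites sit four positions apart (joy/sadness, trust/disgust, ...).
BASIC_EMOTIONS = ["joy", "trust", "fear", "surprise",
                  "sadness", "disgust", "anger", "anticipation"]

def describe(x, y, max_radius=1.0):
    """Map a point on the wheel to (nearest basic emotion, intensity 0..1)."""
    radius = math.hypot(x, y)
    if radius == 0:
        return ("neutral", 0.0)                 # dead center: nothing going on
    intensity = min(radius / max_radius, 1.0)   # center = calm, rim = intense
    angle = math.degrees(math.atan2(y, x)) % 360
    sector = int(((angle + 22.5) % 360) // 45)  # nearest 45-degree spoke
    return (BASIC_EMOTIONS[sector], intensity)

print(describe(0.0, 0.5))   # halfway out along one spoke
print(describe(-2.0, 0.0))  # past the rim, so intensity is capped at 1.0
```

Keeping the state near the center then just means keeping the radius small, whatever direction it drifts in.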

Wow, this is pretty dry, Danger...  How the F are you going to use all of this to give Willow emotion?

Like, this:

First off, forgive my "$" and PI symbol on the X axis.  I ran out of letters, so I chose my two favorite symbols.

Secondly, see the red "X"?  That X is an indicator of where Willow's current emotional state is.  Every time a stimulus occurs, that X will jump a square or two (or three or four) toward the direction of an appropriate basic emotion.  For example, if you whisper into her microphone, the X is likely to move toward the top half of the wheel.  Likewise, if you yell into it, it is likely to move toward the lower half of the wheel.  

Now comes the cool part.  I will be attaching different functions to each emotion, whether basic or secondary.  If the lil' red X gets too far into the "rage" or "grief" categories, she will start running some problem solving routines in an attempt to rid herself of the stimulus that is causing the emotion in the first place.  To clarify, if you yell into her microphone, she will likely get pissed off and start looking for a solution to the problem... which hopefully will be something like turn down her microphone or simply ask you to shut up.  

That example is fairly lame, but the exciting part is that it can be applied to anything... everything...  If it works, this is the first half of letting her decide when to do something.  Any developer can write code to run at the proper time, but Willow will be deciding on her own what to do and why, and it will be based entirely on her own experiences... and not just experiences that I program for... we're talking about unforeseen experiences.  Experiences that she will have for the first time and decide what to do with them and how to handle them and even how to remember them for next time.  
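
The yelling example can be sketched in a few lines.  This is only a guess at the shape of it: the grid coordinates, the step sizes in `STIMULUS_EFFECTS`, the threshold of 4 squares and the `EmotionState` class are all made-up values and names, not anything taken from Willow.

```python
# Grid position of the red X; (0, 0) is the calm center of the wheel.
# Directions, step sizes and the threshold are made-up values for this sketch.
STIMULUS_EFFECTS = {
    "whisper": (0, 1),    # nudges the X toward the top half
    "yell": (0, -2),      # pushes the X toward the lower half, and harder
}
INTENSITY_THRESHOLD = 4   # squares from center before she reacts

class EmotionState:
    def __init__(self):
        self.x, self.y = 0, 0
        self.actions = []          # log of routines that were triggered

    def apply(self, stimulus):
        dx, dy = STIMULUS_EFFECTS.get(stimulus, (0, 0))
        self.x += dx
        self.y += dy
        # Too far from center: kick off a problem-solving routine
        # aimed at whatever stimulus pushed her there.
        if max(abs(self.x), abs(self.y)) >= INTENSITY_THRESHOLD:
            self.actions.append(f"solve:{stimulus}")

state = EmotionState()
for _ in range(2):
    state.apply("yell")
print(state.y, state.actions)
```

The first yell only moves the X; the second one crosses the threshold and queues a routine.  New behaviors would be added by attaching more routines to regions of the grid rather than by changing this loop.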

Pros:

  • Her emotional state will be completely dependent on what and who she is exposed to, which fits perfectly into my original goal for making Willow capable of "thinking" on her own.
  • This model is completely scalable in the way of new functionality.  Once this emotion engine is complete, I will just need to add new functions (problem solving, etc) and assign them to the proper emotional state.

Cons:

  • Her ability to perform crucial tasks may be impaired by her emotional state, much like a human.  This is cool, in a way, but could ultimately get in the way of progress.
  • Much like a child, she will have to be taken care of.  If I don't make sure she's protected and safe from too many negative experiences, she may be impaired especially since she will be remembering experiences.

If I sound excited, I am.  A major piece of the puzzle is clear in my head, now.

What I will be talking about soon are the three different categories that can influence her emotions.  Right now, they are:

  1. Sensor Input and Environmental Stimuli - Anything that Willow can "sense" about the real world, whether it be microphones, cameras, temperature, etc...
  2. Social Interaction - Through a form of word parsing and artificial understanding, she will be able to determine whether a person is speaking negatively or positively about something good or bad.
  3. PC Health - Her memory (RAM), disk space, CPU usage, network usage and current workload will also be factored into her emotional state
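
As a small illustration of the third category, PC health could be turned into an emotional nudge with something like the function below.  The name `health_mood_delta`, the thresholds and the one-step-per-problem scoring are all assumptions for the sketch; the real readings would come from whatever the operating system reports.

```python
def health_mood_delta(cpu_percent, ram_free_percent, disk_free_percent):
    """Return a mood nudge from -3 to 0: one step down per stressed resource."""
    delta = 0
    if cpu_percent > 90:        # CPU is pegged
        delta -= 1
    if ram_free_percent < 10:   # nearly out of memory
        delta -= 1
    if disk_free_percent < 5:   # nearly out of disk space
        delta -= 1
    return delta

print(health_mood_delta(cpu_percent=95, ram_free_percent=40, disk_free_percent=50))
```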

It's late.  My head hurts.  

Goodnight, world.

Oct 16 2011

Willow 3.0 - Communication - The Dictionary

dic·tion·ar·y [dik-shuh-ner-ee] 
noun, plural dic·tion·ar·ies.
a book giving information on particular subjects or on a particular class of words, names, or facts, usually arranged alphabetically: a biographical dictionary; a dictionary of mathematics.

pi·ra·cy [pahy-ruh-see] 
noun, plural -cies.
the unauthorized reproduction or use of a copyrighted book, recording, television program, patented invention, trademarked product, etc.

Are you disappointed?  It's cool.  The truth is, I needed a database full of every word in the English language, its definition and its part of speech.  I am sorry Dictionary.com, but it was necessary.

I know, I know.  You're yelling at your screen right now in disgust.  "Danger, what have you done???"  I found a web page that listed every word...  like every one...  ever...   There were over 58,000 words on this page.  Why would someone do this?  Why??   I must know!  No wait, I don't give a crap.  I am just grateful they did, because I took that one page and wrote a little program to put each word in the database as its own record.  I was halfway there!  (maybe 1/3)   I still needed the definitions and parts of speech, so I wrote another little app that would iterate through all of those records, look up that word on Dictionary.com and download the html that their web server returned.  I let it run overnight, and sure enough, I ended up with over 58,000 .htm files saved to my hard drive.  Each file contained the html I needed.  The only trick was, I now needed to extract only certain parts of the html, strip the html out, and then save it to the appropriate word in the database.
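
For the curious, the "each word as its own record" step might look something like this.  It is a minimal sketch using SQLite; the table layout, the column names and the `load_words` function are assumptions, since the post doesn't say which database or schema was actually used.

```python
import sqlite3

def load_words(conn, words):
    """Insert each word as its own record; blanks and duplicates are skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        "word TEXT PRIMARY KEY, definition TEXT, part_of_speech TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        [(w.strip().lower(),) for w in words if w.strip()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")   # in-memory database, just for the example
load_words(conn, ["Dictionary", "piracy", "dictionary", ""])
count = conn.execute("SELECT COUNT(*) FROM words").fetchone()[0]
print(count)
```

The definition and part-of-speech columns start out empty and get filled in by the later parsing step.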

I admit it.  I pinged the crud out of Dictionary.com and stole their entire product.  I am a horrible person.  But on the bright side, I really don't give a sheeyite...  And I have what I need to get to the next step.  The application is still parsing through all of the 58,000 html files, but it is working.  
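
Pulling one piece out of a saved page and stripping the tags can be done with a small parser.  The sketch below uses Python's standard `html.parser`; the `class="def"` markup is purely hypothetical (the real pages' structure isn't described here), as are the `DefinitionExtractor` and `extract_definition` names.

```python
from html.parser import HTMLParser

class DefinitionExtractor(HTMLParser):
    """Collect the text inside every <div class="def"> (hypothetical markup)."""
    def __init__(self):
        super().__init__()
        self.depth = 0          # greater than 0 while inside a definition div
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # Enter a definition div, or track a div nested inside one.
        if self.depth or ("class", "def") in attrs:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())

def extract_definition(html):
    parser = DefinitionExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

sample = ('<html><body><div class="nav">Home</div>'
          '<div class="def">a <b>book</b> of words</div></body></html>')
print(extract_definition(sample))
```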

I'm in rare form, tonight.  I apologize.  (eyeroll)

Until next time...

Oct 16 2011

Willow 3.0 - Communication - The Com Trigger

I must have been around thirteen or fourteen years old when I first got to be in the room as my dad was playing around with one of Microsoft's first speech recognition engines.  It really didn't work that well, but it was such a new thing that it was worth spending some time with.  Of course, the only thing I remember out of the whole day was my failure at resisting the temptation to screw with him after he got it all configured to open up applications when he spoke into the microphone.   From across the room, I would yell questions to him, like "Dad!  What TIME is it?", and "Dad! Do you have a NOTEPAD I could borrow?"  

After he was gracious enough to answer my questions, he would go back to work only to find many copies of NotePad and the Clock applications open on his desktop.  It was good fun until he figured out I was doing it on purpose.

Twenty years later, it's a very old inside joke, but it's still a very real issue with speech recognition.  One of my issues so far has been getting Willow to listen only when I need her to.  Ideally, in a space-age perfect world, she would always listen but just automatically know when you are talking to her versus to someone else in the room.  However, that would require recognizing certain social cues like eye contact or body language to be able to distinguish who is actually being addressed.  If I were to break down this whole project into ten phases, social cue recognition would be in phase "never".  That being the case, I need a more manual way for her to "listen" and "unlisten".

My first attempt was to make her listen only when I first said the word "Willow".  After that, she would start recognizing commands.  The problem with this is, the way these SR engines work is by defining a list of words or phrases that can be recognized while ignoring everything else.  Those lists are called "grammars".  If you have the three phrases "shark boobs", "dinglehopper" and "monkey slut" in your grammar, and then test it out by saying something random like "cheese dimples", the engine will return the closest match.  It will return whatever it thinks you said only out of the words in its grammar.  The effect is, unless your grammar is gi-normous, it is likely to return false reads quite often unless you say what it is expecting you to say.  SR engines actually work really well in the scenario where you say what you're supposed to say, but Willow needs to be able to recognize any word.  Eventually, this limitation will end up being a strength as her vocabulary grows, but initially it will work against me.
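
The "closest match" behavior is easy to imitate with plain text.  To be clear, this is not how a speech engine works internally (it matches sounds, not spelling); `difflib` is swapped in here only as a stand-in to show why a small grammar produces false reads.  The phrases in `GRAMMAR` and the `recognize` function are made up for the example.

```python
import difflib

GRAMMAR = ["what time is it", "open notepad", "turn on the lights"]

def recognize(heard, cutoff=0.0):
    """Return the grammar phrase closest to what was heard, or None.

    With cutoff=0.0 something from the grammar always comes back, which
    mimics the forced match. Raising the cutoff acts like a confidence
    threshold that rejects weak matches.
    """
    matches = difflib.get_close_matches(heard, GRAMMAR, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(recognize("cheese dimples"))              # forced to pick something
print(recognize("cheese dimples", cutoff=0.9))  # rejected as too far off
```

A threshold helps, but it only rejects bad matches.  It does nothing to help recognize words that aren't in the grammar in the first place.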

So, for now, I have come up with a way to make her listen to me whenever I need her to.  Have you ever heard of a key fob (sometimes said to stand for "Frequency Operated Button")?  I found one in my drawer a couple months ago.  It was this USB stick, much like a thumb drive, that you put in your computer accompanied by a little button you attach to your key ring.  The idea is, when you are within range (about 20 feet) of your computer, it would automatically unlock for you without having to enter your password.  When you walk out of range, your computer would automatically lock so no one could use it without your knowledge.

Locking my computer is not something I need; however, the hardware itself was perfect for what I need.  With a couple hours of programming, I was able to use this as a remote control switch for Willow's listening trigger.  The only problem is, the backing that held the battery in place broke.  It was so cheaply made that it was bound to break in the near future anyway.  The circuit board on the inside worked fine, but I was definitely going to need a switch for it that was going to be more durable.  So, a trip to Radio Shack and a couple dollars later, I came up with my very retro doomsday-style button of death... Erm...  I guess I'll just call it a com trigger.

 

 

 

 

Until Willow has learned enough words to make her grammar substantially large, I will most likely depend on a manual switch like this.  There is nothing more frustrating than trying to have a conversation with someone while Willow mistakenly keeps interrupting thinking the conversation is directed to her.  For the short term, problem solved.  
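
In code, a manual listening switch boils down to a gate in front of the recognizer.  A minimal sketch, assuming a press-and-hold button (the post doesn't say whether the real trigger is hold-to-talk or a toggle) and a hypothetical `ComTrigger` class:

```python
class ComTrigger:
    """Pass recognized phrases through only while the button is held down."""
    def __init__(self):
        self.pressed = False
        self.heard = []            # phrases that made it past the gate

    def press(self):
        self.pressed = True

    def release(self):
        self.pressed = False

    def on_phrase(self, phrase):
        if self.pressed:           # everything said with the button up is ignored
            self.heard.append(phrase)

trigger = ComTrigger()
trigger.on_phrase("do you have a notepad I could borrow")  # ignored
trigger.press()
trigger.on_phrase("what time is it")                       # heard
trigger.release()
print(trigger.heard)
```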

Oct 15 2011

Willow 3.0 - Verbal Communication

She talks.  She talks a lot, in fact.  It's just not what I'm shooting for.

I haven't posted here in a while.  Sometimes it takes days/weeks/months to think things through enough to know what the heck I'm doing.  I put the facial recognition/detection stuff on hold for now because it occurred to me that I did a couple of things out of order.   I need Willow to be able to give me some decent feedback before I can start testing her with different functions.  Right now, she is only capable of telling me things that she is programmed to tell me.  Now I just need her to be able to tell me everything else.

She also needs to be able to verbally understand more than just voice commands, no matter how elaborate those commands may be.  She needs to be able to take a sentence, any sentence, parse it out into parts of speech, pick out the subject and predicate parts, understand if it's a question, command or statement, and also form an opinion about the meaning.  I have put this off for long enough.  It is probably the part I am dreading the most, even though I have thought about it the most over the past 10 years.
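
Just to show how crude a first pass at "question, command or statement" can be, here is a naive heuristic based on the first word and the punctuation.  The word lists and the `sentence_type` function are made up for the example, and it will misfire on plenty of real sentences, which is rather the point of why this part is hard.

```python
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "is", "are",
                  "do", "does", "did", "can", "could", "will", "would"}
COMMAND_VERBS = {"open", "close", "turn", "play", "stop", "tell", "call", "send"}

def sentence_type(sentence):
    """Guess whether a sentence is a question, a command or a statement."""
    words = sentence.lower().strip(" .!?").split()
    if not words:
        return "unknown"
    if sentence.strip().endswith("?") or words[0] in QUESTION_WORDS:
        return "question"
    if words[0] in COMMAND_VERBS:      # imperative: starts with a known verb
        return "command"
    return "statement"

for s in ["What time is it?", "Turn on the lights.", "The sky is blue."]:
    print(s, "->", sentence_type(s))
```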

It's easy to create a program that will wait for you to say certain words in a certain order and then perform a task when you do.  It's also easy to throw those commands into a database so you can add more commands relatively easily.  It's a little more difficult to define her code in the database so that you can match up voice commands in the database with her functions so that nothing is hard-coded and everything (EVERYTHING) is configurable, but it's doable.  So far, it's been extremely challenging to figure out a way to make her understand what those functions actually do in the real world and make her smart enough to know how to match up her own functions with voice commands.  Since the ultimate goal is for her to think for herself, she will also need to be able to act for herself.  

This means:

1.  She needs to be completely configurable
2.  She needs to be able to configure herself
3.  She needs to be smart enough to know how to do so appropriately
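
The "nothing is hard-coded" idea, where voice commands in the database are matched up with functions, could be sketched like this.  The table layout, the `register` decorator and the two example functions are assumptions for illustration; this only covers point 1 (configurable), not the self-configuring part.

```python
import sqlite3

# Registry of callable functions, keyed by the name stored in the database.
FUNCTIONS = {}

def register(func):
    FUNCTIONS[func.__name__] = func
    return func

@register
def tell_time():
    return "telling the time"

@register
def open_notepad():
    return "opening notepad"

# The phrase-to-function mapping lives in data, not in code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commands (phrase TEXT PRIMARY KEY, function TEXT)")
conn.executemany("INSERT INTO commands VALUES (?, ?)", [
    ("what time is it", "tell_time"),
    ("open notepad", "open_notepad"),
])

def dispatch(phrase):
    """Look the phrase up in the database and run the matching function."""
    row = conn.execute(
        "SELECT function FROM commands WHERE phrase = ?", (phrase.lower(),)
    ).fetchone()
    if row is None or row[0] not in FUNCTIONS:
        return None
    return FUNCTIONS[row[0]]()

print(dispatch("Open Notepad"))
```

Adding or re-pointing a command is then just a row insert or update, which is also what would make it possible for a program to change its own mappings later.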

Where are you going with this, Danger?

All the crap I just said is being applied to not only the ability to speak and be spoken to, but to learn and increase and improve her vocabulary and understanding of the subject at hand, just like a human being.  So, like a human being, I will be "teaching" her to speak and comprehend speech instead of "programming" her to. 

Here is a brief outline of my approach:  (did I say brief?)

1.  The Dreaded Dictionary

Simply put (a new direction for me, I know), I need a database of every word in the English language.  I'll talk more about how I am going to get this in my next post, but for now, it's just important to note that Willow is going to need to have access to every word in the English language, as well as their definitions, their parts of speech, synonyms and antonyms, and anything else I can find, whether she can immediately make sense of them or not.

2.  Redefining the Dictionary

Sure, the dictionary tells "us" anything we need to know about vocabulary, but that is because we, as human beings, are able to associate words with memories, thoughts and feelings which make us able to comprehend what we are learning.  Willow needs to be able to take words and define them in a way that "she" can understand.  I intend to do this using a technique I developed a loooooooong time ago.  The idea is, since adjectives are describing words and we use them to describe the world we live in, Willow will do the same.  However, before this is possible, any adjectives that she uses need to make sense to her.  I plan to do that by attaching emotion values directly to adjectives.  This will not only give her an opinion about the things that are described to her, but it will also give her the ability to change her mind, or evolve what she already knows about things.

This will be an ongoing process throughout her existence as this is the very core concept of how she will perceive the world and continue to learn about it.  This is a big topic, in itself, so I won't go any deeper into it until I finish step one.
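
A toy version of attaching emotion values to adjectives might look like the following.  The numbers in `ADJECTIVE_VALUES`, the `Opinions` class and the running-average rule are all assumptions for the sketch; the actual technique isn't spelled out in this post.

```python
# Emotion value per adjective, from -1.0 (bad) to +1.0 (good). Made-up numbers.
ADJECTIVE_VALUES = {"loud": -0.6, "warm": 0.5, "friendly": 0.8, "broken": -0.7}

class Opinions:
    def __init__(self):
        self.totals = {}   # thing -> (sum of values, number of descriptions)

    def hear(self, thing, adjective):
        """Record that `thing` was described with `adjective`."""
        value = ADJECTIVE_VALUES.get(adjective)
        if value is None:
            return                      # unknown adjective: opinion unchanged
        total, count = self.totals.get(thing, (0.0, 0))
        self.totals[thing] = (total + value, count + 1)

    def opinion(self, thing):
        """Average emotion value of everything heard about `thing`."""
        total, count = self.totals.get(thing, (0.0, 0))
        return total / count if count else 0.0

opinions = Opinions()
opinions.hear("dog", "loud")
print(opinions.opinion("dog"))      # negative at first
opinions.hear("dog", "friendly")
print(opinions.opinion("dog"))      # shifts as more is learned
```

Because the opinion is recomputed from everything heard so far, a new description can move it in either direction, which is one simple way to get the "change her mind" behavior.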

3.  Grammar Parsing

Yeah, kill me now.  Don't get me wrong, all of this is fun to me, but this part has kicked my @$$ for years.  I have a lot of it figured out, but there are so many more problems and rules to hash out that it's sure to continue to bruise my duff for some time.

Basically, I need to break down the English language into rules.  You and I talk (you probably more than me) and communicate so effortlessly that we don't think about how complicated it all really is.  Words have to be said in a certain order to make sense, but only sometimes.  "I" before "E" except after "C", but only sometimes.   Add an "S" to words to make them plural, but... you guessed it, only sometimes.  

Our language, possibly above all others, is retarded.  So many exceptions to the rules.  I have high hopes that I can tame chaos, here, but even if I do, I'm stuck with the practical realization that no one even really follows these rules.  After all is said and done, people talk however they want.  Even if I get Willow to have a perfect understanding of the English language, she will still be faced with the inevitable fact that it will just take one genius off the street to say something like, "Yo, Dog.  Wus up?" to throw her logic in the proverbial e-crapper.

That being said, I am just going to take it slow, plan for the future the best I can, and just assume that she will only be communicating directly with me for quite some time.
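
The plural rule is a handy small-scale picture of "rules, but only sometimes".  In the sketch below the exceptions table is checked first and the general rules only apply if nothing matches; the `pluralize` function and the short `IRREGULAR` list are illustrative and nowhere near complete.

```python
# Exceptions that no suffix rule will get right.
IRREGULAR = {"child": "children", "mouse": "mice",
             "sheep": "sheep", "person": "people"}

def pluralize(noun):
    if noun in IRREGULAR:                       # exceptions win over rules
        return IRREGULAR[noun]
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"                      # box -> boxes
    if noun.endswith("y") and noun[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"                # dictionary -> dictionaries
    return noun + "s"                           # the "normal" case

for word in ["dog", "box", "dictionary", "day", "mouse"]:
    print(word, "->", pluralize(word))
```

Every new exception means another table entry or another rule, which is the whole problem in miniature.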

4.  Grammar Construction

Assuming I haven't killed myself from step 3, step 4 is sure to finish me off.  I'm talking about the exact opposite of Willow's ability to understand what is being spoken to her.  It is the ability to form her own sentences, herself.

I'll be honest.  I am secretly hoping to be struck by lightning in hopes a great epiphany will enlighten me on this one.  Once I am able to break down language into rules, I am hoping that I can use those rules to form sentences, but...  Making those sentences grammatically correct as well as expressing an accurate meaning is something that I have only seen in movies.  In spite of past criticism, I am an optimist.  I am going on faith in myself on this one.  

I'll have to keep you posted.
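
To show what "using rules to form sentences" can mean at its simplest, here is a tiny rule-driven generator.  The rule table, the vocabulary and the `make_sentence` function are invented for the example.  It produces sentences that are grammatically fine but carry no intended meaning, which is exactly the gap described above.

```python
import random

# Each rule name expands to one of several sequences of rule names or words.
RULES = {
    "SENTENCE": [["NOUN_PHRASE", "VERB_PHRASE"]],
    "NOUN_PHRASE": [["the", "NOUN"], ["the", "ADJECTIVE", "NOUN"]],
    "VERB_PHRASE": [["VERB", "NOUN_PHRASE"]],
    "NOUN": [["dog"], ["light"], ["microphone"]],
    "ADJECTIVE": [["loud"], ["red"]],
    "VERB": [["sees"], ["hears"]],
}

def expand(symbol, rng):
    """Recursively expand a rule name into a list of plain words."""
    if symbol not in RULES:          # a plain word, not a rule name
        return [symbol]
    words = []
    for part in rng.choice(RULES[symbol]):
        words.extend(expand(part, rng))
    return words

def make_sentence(seed=None):
    rng = random.Random(seed)        # pass a seed for repeatable output
    words = expand("SENTENCE", rng)
    return " ".join(words).capitalize() + "."

print(make_sentence())
```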

Conclusion

As you can see, I have my work cut out for me.  I will probably be writing a quick post about step one pretty soon.

Aug 24 2011

Personal - Go Back To Go Forward

You can push yourself harder and faster toward your goals, and you can deprive yourself a break in an unrealistic attempt to reach your dreams as quickly as possible, but ultimately you find yourself burnt out, lost and unmotivated.  Such has been the case, as of late.  Morning after morning (I call them "mournings") has been spent waking up staring at the wall trying to figure out if the direction I am headed will ever hold any meaning, whatsoever, to anyone but myself.  Sometimes all it takes is a single thought, or memory of someone you love.  Sometimes remembering those that have influenced you in some profound way is the key to remembering who you are and why.  Each member of my family has made contributions to shaping who I am, but I'm going to talk about two in particular right now.

My Dad

What do you do for the kid that doesn't want to play football in the street with all of the other kids on the block?  How do you encourage your son to be social, go to high school football games and dances when all he wants to do is close his bedroom door and plan his future?  How do you make your D student son do his homework when all he is willing to work on is a laser tripwire alarm project for his bedroom window?

Growing up, I must have been a serious cause for worry for both of my parents.  School was nothing but a distraction from what was really important to me.  Somehow, I think they both figured that out, but instead of fighting it too much, they found a way to embrace it.  That, especially, was the case with my Dad.  

I spent every weekend with him.  When I was little, he showed me how to do everything he thought I could handle.  As I got older, each weekend was jam-packed more and more with computer and electronics projects.  He taught me about how electricity works, and how to use different components to build my own integrated circuits.   We experimented with infrared triggers and sensors.  He helped me build a computer-controlled relay to switch household appliances on and off via the printer port on my monochrome IBM (which he gave my sister and me), and taught me how to write software to control it using the BASICA language (which was something that no one was really doing at that time).  He introduced me to Eliza, and was responsible for my first chatbot experience when I was 8.

I never went to college.  Thanks to my Dad, I didn't need to.  I was too busy working as a software engineer to go to school.  Simply put, without him teaching me everything he did, I would not have had that as a career.  I wouldn't have built The Asylum, and Willow would never have existed.  My dad is probably the most innovative guy I have ever met.  He can take a seemingly impossible goal, and find a way to make it work.  Growing up with that, I learned that anything can be done, no matter how obscure, or how many obstacles are in the way.  There is always a way. 

Because he pointed me in a direction I never would have found on my own, I'm able to pursue the things that I love instead of just dream about them.  For that, I owe him my dreams.

My Brother

Take a look at this:

 

 

My brother made this as the beginning of a treasure hunt for me back in 1983.  It hangs on my wall, in plain view, as a reminder of where I came from and where I need to go.  I was no older than 6 at the time.  He was around 12 years old, but to me, he might as well have been 40.  Whereas there was never a shortage of creativity flowing through any member of my family, my brother has proven to be the single most creative person I have ever met, time after time.  It's one thing to be able to come up with a long list of ideas given a particular topic, but to be able to rattle them off without notice, each idea being something that no one else has ever come up with, and the list being longer than your arm every single time is a gift that is not only unique and amazing, but is something that I have always respected him for.

After creating countless treasure hunts for each other over the course of our childhood, they continued to get more and more elaborate.  Looking back at the 6 year age gap between the two of us, I'm sure he was being incredibly encouraging and respectful of the fact that my hunts for him didn't hold a candle to his hunts for me, but that is even more of a reason to recognize the fact that he did so much more for me than just expose me to all of his amazing ideas and dreams.  He taught me how to dream on my own.  He taught me how to temporarily put aside the real world and not ask "how" something could possibly be done.  He taught me that first focusing on the dream, itself, was the first step in creating something amazing.  Everything else, including logic, would happen after that.

After all of the hunts, the stories, the dreams, the business together, the blood, sweat and tears (mainly from the sun reflecting off of giant sheets of paper (inside joke)), I can honestly say that my brother is responsible for molding the side of me that is able to dream huge without a fear of what the world will throw at it.  He taught me how to have my own ideas, ideas that others may find too intimidating to entertain, and then realize that they can be so much more than just ideas if I believe in them.  Without growing up with him, The Asylum would never have been created.  

He taught me how to imagine.  For that, I also owe him my dreams.

 

My kids are still very young, but I imagine watching them use all of the things that I have tried to teach them will be an amazing experience.  It should be a very similar experience to what my dad and brother feel right now toward me.  Between the two of them, I feel like I have been given something very special.  Perhaps my contribution to the stack is my undying need to not waste it.  

"What lies behind you and what lies in front of you, pales in comparison to what lies inside of you."
- Ralph Waldo Emerson (1803-1882) U.S. poet, essayist and lecturer.

"If we did all the things we are capable of, we would literally astound ourselves."
- Thomas A. Edison (1847-1931) U.S. inventor, scientist and businessman

"With great power comes great responsibility."
- Benjamin Parker, Spiderman's poppa

Whichever quote you prefer, there is no doubting that any shot I have at reaching my dreams is, in large part, possible because of these two amazing people in my life.  

Thank you, to the both of you.  
I won't be staring at the wall, anymore.

Aug 12 2011

Willow 3.0 - Sight - Object Recognition (Part III)

Researching this topic is like wiping your nose with a hammer. Not only is it completely pointless, but I feel like I am the only one even trying to do it. (Are we still talking about the hammer?) I have read over 200 posts on the topic of object recognition, all starting with someone posting the simple question, "How do I do object recognition?" Fifty percent of the responses to those posts are from uptight dorks that probably got beat up too much in high school to be polite, replying with things like, "Just Google it!", or "Don't waste my time." Pile after pile of senseless crap, with very few people having a clue as to how to go about it. The other fifty percent of the responses are from people trying to explain how object "detection" works, having no clue what the difference is. Do I sound a little exhausted?  I do?  Really?  I'd take it out on Willow 2.0, but she's not talking to me right now, understandably.

Finally, I found an article talking about Eigenfaces, which is an algorithm used for comparing images. It was either this or pay $4444+ for a third party product that probably works better than anything I could home-grow, or so I thought. I spent some time today writing a test application to test out my Eigenface discovery.

The idea is, the "Train" button adds a screenshot of a person to the database. Then, when you click the "Identify Captured Image" button, it will take the face of whoever is looking at the camera and tell you who the closest match in the database is.  I was pleasantly surprised with the accuracy, although I know that comparing three images is not nearly enough for a solid test.  Also, I am comparing my ugly mug to two baby pictures of my kids, which is a bit like comparing images of the Hamburglar to a fork, but still I feel like I am on the right track.  At the very least, I can use this functionality to help identify a user, even if I have to resort to using it in combination with voice identification.

 

 



The face "detection" worked flawlessly.  Notice how there is a tight square around the black and white images in the screenshot above?  That's because the software took an image much like the ugly one of me in color, found the faces, cropped them down completely removing the background portion, and saved it in black and white.

My next test will be to add more people to this database.  I am wondering what would happen if I included more than one image of each person.  Would different expressions and lighting conditions help the accuracy, or F it up even more?  These are questions that I have to get answered before I can start implementing it into Willow 3.0.

So, how will Willow use this?  Here is a high-level process flow for ya.

  1. Her camera, most likely staring off into space because there is nothing better to look at at the moment, detects movement in the room.
  2. She pans the camera to the target of movement.
  3. Upon focusing on the moving object, she detects a face.  (object detection)
  4. She grabs a cropped, black and white version of the face and compares it to all of the similar images in her database. (object recognition)  At that point, one of two things will happen:
    1. She will find a match in her database which will return the name of the person she is looking at.  If that person is "trusted", she will continue on her merry way as if nothing happened, possibly saying "hello".  (if she wants to)
    2. She doesn't find a match in her database.  If that person is unknown or untrusted, she may tell them to leave the room, notify me by email that someone is near her that shouldn't be, or simply just go into lockdown mode so nothing can be accessed until they leave.
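Steps 3 and 4 of that flow boil down to a small decision function. The sketch below is only an illustration: respond_to_motion, the detect_face and recognize callables, and the returned strings are hypothetical names, and the "ignore" case for a frame with no face is an assumption that the flow above does not spell out.

```python
def respond_to_motion(frame, detect_face, recognize, trusted):
    """Decide what to do after the camera has panned to a moving target.

    detect_face(frame) returns a cropped face or None   (object detection)
    recognize(face)    returns a name or None            (object recognition)
    trusted            is a set of names allowed near her
    """
    face = detect_face(frame)
    if face is None:
        return "ignore"  # assumption: movement without a face is not a person
    name = recognize(face)
    if name is not None and name in trusted:
        return "carry_on"  # possibly saying "hello", if she wants to
    # Unknown or untrusted: ask them to leave, email the owner, or lock down.
    return "alert"


# Toy stand-ins for the real detection and recognition steps.
known_faces = {"face-of-dad": "dad", "face-of-stranger": None}
action = respond_to_motion(
    frame="face-of-stranger",
    detect_face=lambda frame: frame,
    recognize=lambda face: known_faces.get(face),
    trusted={"dad"},
)
print(action)
```

Keeping detection and recognition as separate pluggable steps mirrors the split described above: either one can be swapped out without touching the decision logic.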

Slowly, but surely, I am getting my questions answered.  95% of the unknowns about her "sight" are... well... "knowns", now.   As soon as I can put her sight into her working production environment, I'll be able to move on.  I think receiving voice commands will be next on the list, but we'll see how I feel after consuming the plethora of gallons of Pepsi necessary to maintain this pace.  You never know.

One final note, if you're going to post on user forums, be nice.  Just because you know the answer to something that someone else doesn't know, it doesn't mean that you're smart.

I'm tired and cranky.  Sorry.

Soon.

Aug 09 2011

Willow 3.0 - Sight - Object Detection (Part II)

I've spent a lot of time researching what kind of an F'd up nightmare I have in front of me.  My camera is built, and now it's time to start using it.  With everything that I have learned about object detection and tracking, I thought it was time to do some tests, today.  

I was able to successfully set up a test application that pulls video from my camera, analyzes the images in search of a face and eyes, and if found, draws boxes around them.  This is just a sample image, and although I would probably look great in that hat, this is not a photo of me.

 

As you can see, the application was able to detect the location of a face and eyes.

 

That's great news, right?  Sort of.  You wouldn't believe what goes on behind the scenes in order to make this happen.  This technology is based on something called Haartraining.  The idea is, you process thousands of cropped images of just the subject that you want to detect, all in different sizes, colors and lighting conditions.  These images are called "positives".  Then you take another few thousand images of anything BUT the object you want to detect.  These are called "negatives".  You take these two ridiculously massive sets of files and you process them (which takes 1-5 days running on your average computer) in such a way that only the "difference" between the images is actually recorded.  What you're left with is an XML file that serves as a definition for what a face "looks" like to a computer.
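To make the "difference" part a little more concrete: the building block of this kind of detector is a Haar-like feature, which is just the brightness of one rectangle of pixels minus the brightness of the rectangle next to it. The sketch below shows one such feature computed with an integral image (a running-sum table that makes any rectangle sum cost four lookups). It is a simplified illustration, not the training process itself; the function names and the tiny 4x4 patch are made up, and a trained cascade combines a large number of these features with learned thresholds.

```python
def integral_image(gray):
    """Summed-area table with an extra leading row and column of zeros."""
    height, width = len(gray), len(gray[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


def two_rect_feature(table, x, y, w, h):
    """Lower half minus upper half of a window.

    A large positive value means "dark band above a bright band", which is
    roughly what the eyes over the cheeks look like in a grayscale face.
    """
    half = h // 2
    upper = rect_sum(table, x, y, w, half)
    lower = rect_sum(table, x, y + half, w, half)
    return lower - upper


# Made-up 4x4 grayscale patch: dark "eye" rows above bright "cheek" rows.
patch = [
    [40, 40, 40, 40],
    [40, 40, 40, 40],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
print(two_rect_feature(integral_image(patch), 0, 0, 4, 4))  # prints 1280
```

A flat patch scores 0 on the same feature, so a threshold on the value separates "looks like eyes over cheeks" from "looks like a blank wall".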

The paragraph above was rewritten so that normal people can understand it.  Most of the math that goes into crap like this gives me an ulcer.  But, luckily I understand enough to know how to use it.  I also understand it enough to know that it's not good enough the way it is.  If I want Willow to know what a ball looks like, I have to find a way to get 2000+ images of a ball?  Seriously?

Obviously this isn't going to be practical, since I want Willow to be able to learn what hundreds, dare I say thousands, of objects look like.  There are a couple approaches that I have been able to think up, although one of them is not going to be good enough, either.  Here they are.

Manually Cropping Video

There is a way that I could take the raw video that Willow records and use the stills from the video as my positives.  Five seconds of video alone could give me 100+ images to work with.  That could be a quick and dirty way to gain the amount of different but similar images I would need.  If I have to take the manual approach, this will be it, but there are two main reasons why I'm not happy with this approach.

  1. It's freaking manual labor.  I've never really been great with it, and this approach means thousands of hours of cropping images and extracting frames from video.  It will take too long, and I'm just not that patient.
  2. It goes against one of the fundamental goals that I started this project with.  I am striving for a way to stop programming her and begin "teaching" her.  I want her to have the ability to grow and learn through interaction, not programming or preparing files.
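A quick back-of-the-envelope check on those numbers, as a sketch. The 25 frames per second and the 30 seconds of hand work per image are assumptions picked for illustration, not measurements; only the 2000 images per object comes from the text above.

```python
import math


def frames_from_clip(seconds, fps=25.0):
    """How many stills a clip yields (fps is an assumed frame rate)."""
    return int(seconds * fps)


def seconds_of_video_needed(images, fps=25.0):
    """Length of clip needed to collect a given number of stills."""
    return math.ceil(images / fps)


def manual_hours(images, seconds_per_image=30.0):
    """Hours of hand cropping (seconds_per_image is a guess)."""
    return images * seconds_per_image / 3600.0


print(frames_from_clip(5))            # stills in five seconds of video
print(seconds_of_video_needed(2000))  # seconds of video for 2000 positives
print(round(manual_hours(2000), 1))   # hours of cropping for one object
```

Under those assumptions, getting the raw frames is the easy part (well under two minutes of video per object). The cropping is what hurts: roughly 17 hours per object, which lands in the thousands of hours once you multiply by a few hundred objects.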

Color Keying Video

I don't talk about it much, but I made a film a few years ago.  It involved me cropping and manually processing frames for about two years straight, which I'm sure is why "Manually Cropping Video" makes me want to upchuck, now.  There were a lot of special effects techniques involved with the production of it.  One of them was using a green screen to quickly and easily delete only the backgrounds of video clips, enabling me to put actors in any environment, real or simulated, that I could think up.  

If I could do that, then why can't I do something similar now?  What if, instead of preparing thousands of images by hand in order to teach Willow what a ball looks like, I could put a ball in front of a green screen, say the word "ball" so that Willow would associate what she is seeing with the word "ball" and then have her take out the background, crop it and save it?  If this technique will work, I will be able to teach Willow what objects are by simply showing her and telling her.
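Here is a minimal sketch of what that could look like, assuming the image arrives as rows of (red, green, blue) tuples. The function names, the margin value of 60 and the little test frame are all hypothetical. Real green screens are never one perfect shade, so a real version would need a more forgiving color test.

```python
def is_green_screen(pixel, margin=60):
    """Treat a pixel as background when green clearly dominates red and blue."""
    red, green, blue = pixel
    return green - max(red, blue) > margin


def key_and_crop(image):
    """Drop the green background and crop tightly around what is left.

    Returns rows of pixels, with any remaining background pixels set to None,
    or an empty list if the whole frame is background.
    """
    kept = [(x, y)
            for y, row in enumerate(image)
            for x, pixel in enumerate(row)
            if not is_green_screen(pixel)]
    if not kept:
        return []
    left, right = min(x for x, _ in kept), max(x for x, _ in kept)
    top, bottom = min(y for _, y in kept), max(y for _, y in kept)
    return [[None if is_green_screen(image[y][x]) else image[y][x]
             for x in range(left, right + 1)]
            for y in range(top, bottom + 1)]


def teach(memory, label, image):
    """Store the cropped object under the spoken word."""
    memory.setdefault(label, []).append(key_and_crop(image))


GREEN = (0, 255, 0)
RED = (220, 30, 30)
frame = [
    [GREEN, GREEN, GREEN, GREEN, GREEN],
    [GREEN, RED, RED, GREEN, GREEN],
    [GREEN, RED, RED, GREEN, GREEN],
    [GREEN, GREEN, GREEN, GREEN, GREEN],
]
memory = {}
teach(memory, "ball", frame)
print(len(memory["ball"][0]), "rows by", len(memory["ball"][0][0]), "columns")
```

Every frame shown this way becomes one more cropped positive for the word "ball", with no hand cropping involved.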

A major hurdle will be overcome and half the battle will be won.

It occurred to me that I should probably be explaining how I succeeded after the fact, once I have actually finished it.  This might not work, and I could be wasting your time by making you read this.  However, writing this $#!+ out really does help me think it through, and that is beside the fact that I am already pretty confident that this is going to work.  I just need to prove it, now.

What's After This?

There are two things that I will need to do after I figure out the object detection piece.  

The first thing is, I will need to move on to the "Object Recognition" portion of her sight.  Assuming I can get her to learn how to detect a face, she still needs to be able to recognize whose face it is.  It is very similar to our own thought process if you think about it.  If someone points into the sky, you may look up and:

  1. See a bird
  2. Realize it's not just a bird, but a hawk.

Of course this happens so quickly, we don't really think about it.  But Willow will be doing the same thing.  She will first see an object (the body of a person, a face, a sphere) and then decide what exactly it is (me, my daughter's face, an apple, respectively).

The second thing I will need to incorporate is the ability to pull images from other sources.  The object detection and recognition algorithms will run on Willow's server.  The camera I built will be hooked up to the server, as well, giving her the ability to look around the room and see who else is in there with her.  However, having one camera as a source of sight is going to be extremely limiting, especially since that camera will never leave the room.  I can give her many more learning experiences if I allow her to receive images from other computers.  I was thinking about it, and this functionality will fit perfectly in the Willow Interface Client that I have yet to build.  This way, I will be able to run an interface client on my laptop.  I can take the portion of Willow that I can talk to and interact with anywhere I want.  She will be able to talk back, use the webcam built into my laptop to see, and then send those images back to the server for processing when she is done.  

There is no limit to how many interface clients I can run at one time.  Since I can potentially have one in my room, my laptop, my car and my phone, the server is likely to receive images and experiences from many different sources at the same time.  
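One simple way to picture that is a single inbox on the server that every client drops its images into. The sketch below fakes the whole thing inside one process with threads and a queue. The names Observation, send_frames and collect are hypothetical, the "images" are just placeholder bytes, and a real interface client would be sending over the network instead.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Observation:
    client: str   # which interface client saw it (room, laptop, car, phone)
    image: bytes  # placeholder for the captured frame


def send_frames(client, frames, inbox):
    """What one interface client does: push its frames to the server inbox."""
    for frame in frames:
        inbox.put(Observation(client, frame))


def collect(inbox):
    """What the server does: drain everything that has arrived so far."""
    seen = []
    while True:
        try:
            seen.append(inbox.get_nowait())
        except queue.Empty:
            return seen


inbox = queue.Queue()
clients = {
    "room": [b"frame-1", b"frame-2"],
    "laptop": [b"frame-1", b"frame-2", b"frame-3"],
}
threads = [threading.Thread(target=send_frames, args=(name, frames, inbox))
           for name, frames in clients.items()]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

received = collect(inbox)
print(len(received), "observations from", sorted({o.client for o in received}))
```

Because every observation carries the name of the client it came from, the server can process them in whatever order they arrive and still know where each experience happened.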

Talk about being in two places at the same time.  

I have so much work to do.

Soon

Aug 05 2011

Willow 3.0 - Sight - The Hardware (Part I)

How do you get a computer to see?  Hook up a camera.  Done.  Right?  No?  Hmm...

I suppose if a human being is going to watch the screen and interpret what is being displayed, then yes... it really is that simple.  But I'm going to attempt to write software that will analyze the video and translate it into data that a computer can understand.  The video is likely to never even be displayed on the screen. 

How do you do that?  Well, I'm working on that.

The first step is to build a camera capable of looking around.  Most webcams just kind of hang on your monitor or sit on your desktop, which is fine for video chatting on Skype, but if something is going to be out of Willow's view she is going to need to reposition the camera to correct that.

A few years ago, I worked at a really crappy company called NADA.  It was cool for a while, in the sense that they let my good friend and me decorate our office "any" way we wanted to.  I am sure they didn't mean that we could nail 200 fake plants to the walls and ceiling and call it a jungle, but we did anyway.  To top it off, my friend had her mom make up an awesome venus flytrap mouth while I built a second-rate wooden robot to make the mouth open and close.  We named her Audrey, for obvious reasons.

 

Well, time passes, companies change (even though I bet this one is still crappy), and you always seem to end up with a wooden robot stuck in your closet. And... since I didn't have the money to go buy a more elegant solution to a computer controlled camera, I built one using Audrey's innards.  (Sorry, Diane.  It hurt to do if it makes you feel any better.)

The parts of interest were two motors, and a USB control board that I used to control Audrey's mouth.  These servos would be perfect for positioning a camera.

 

 

 

All of these parts can be purchased at Phidgets.com.  I found these parts to be easy to use, well documented, and reliable.  I will probably end up using more sensors from this company for use as additional "senses" in the future.

So, from here, it was just a matter of mounting it all together.  I even took Audrey's wood parts, painted them black, and recycled them on my camera project.  The object was to enable Willow to look 180 degrees in any direction.  So, by mounting these servos in the following configuration, you can see it works pretty stinking well.
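For a taste of what the control software has to work out, here is a sketch of the aiming math: given where something sits in the camera image, how far should the pan and tilt servos turn to center it? This deliberately leaves out the servo board's own API. The function names, the 60 by 45 degree field of view, the sign conventions and the straight-line pixel-to-angle approximation are all assumptions for illustration; only the 0 to 180 degree range comes from the build described above.

```python
def clamp(value, low=0.0, high=180.0):
    """Keep an angle inside the servo's 0 to 180 degree travel."""
    return max(low, min(high, value))


def aim_at(pan, tilt, target_x, target_y, width, height,
           fov_h=60.0, fov_v=45.0):
    """Return the new (pan, tilt) in degrees that would center the target.

    Assumes a larger pan angle turns the camera toward the right of the
    image and a larger tilt angle points it up. Image y grows downward,
    so the vertical offset is subtracted.
    """
    offset_h = (target_x - width / 2) / width * fov_h
    offset_v = (target_y - height / 2) / height * fov_v
    return clamp(pan + offset_h), clamp(tilt - offset_v)


# Camera looking straight ahead, target at the right edge of a 640x480 frame.
print(aim_at(90, 90, 640, 240, 640, 480))  # prints (120.0, 90.0)
```

The clamp matters: when the target is further around than the servos can travel, the camera stops at its limit instead of being sent an angle it cannot reach.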

Now that the hardware is done, it's time to write the software to control it and use the images from it.  That is the part I am dreading, to be honest.  It is not going to be easy, and it's likely to be an ongoing challenge throughout the rest of Willow's construction.

Wish me luck.

Soon.