Thursday, 22 December 2016

Mark Zuckerberg Shows Off Jarvis

How much of what Zuckerberg achieved with Jarvis can the average engineer set up in his own home?


Okay, let’s break this down into smaller parts. As long as you have a Mac and an iPhone, along with compatible hardware, all the software is free to use. There’s nothing that Facebook specifically has that we don’t.
First, at the most basic level, you need a communication channel. Zuckerberg started out by texting Jarvis. This requires an SMS gateway, so that a computer can send and receive texts just like a phone can. The Twilio API handles this at low cost and is fairly easy to set up.
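To make the SMS step concrete, here is a minimal sketch of the sending half. It assumes Twilio's REST endpoint for messages and uses only the Python standard library; the account SID, auth token, and phone numbers are placeholders, and the helper name `build_sms_request` is made up for this example. The request is only built here, not sent.

```python
import base64
import urllib.parse
import urllib.request

TWILIO_API_BASE = "https://api.twilio.com/2010-04-01"


def build_sms_request(account_sid, auth_token, from_number, to_number, body):
    """Build (but do not send) the HTTP request that asks Twilio to send one SMS."""
    url = f"{TWILIO_API_BASE}/Accounts/{account_sid}/Messages.json"
    # Twilio expects form-encoded fields named To, From and Body.
    form = urllib.parse.urlencode(
        {"To": to_number, "From": from_number, "Body": body}
    ).encode("utf-8")
    # Authentication is HTTP Basic with the account SID as the username.
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(url, data=form, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder credentials and numbers; replace with your own before sending.
    request = build_sms_request(
        "ACXXXXXXXX", "your_auth_token", "+15550000001", "+15550000002",
        "Lights are on.",
    )
    print(request.get_method(), request.full_url)
    # To actually send the text: urllib.request.urlopen(request)
```

Receiving texts works the other way around: Twilio calls a URL on your machine when a message arrives, so you would also need a small web server listening for those requests.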
Then he used speech-to-text so that Jarvis could understand his voice directly, without the SMS middleman. Both the iPhone and Android keyboards have speech-to-text (dictation) built in, so it is easy to speak into your keyboard and send the resulting message via Facebook Messenger. Jarvis would then run a Node.js or Python script that monitors Facebook messages and responds accordingly, which isn’t too hard to implement.
He used this basic functionality to have his computer monitor his Facebook messages and respond to them. I implemented something similar myself, available on GitHub: it plays music on my computer in response to Facebook chat messages, using the Facebook chat API, AppleScript, and Node.js. Feel free to check it out!
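The idea of turning a chat message into an action on the Mac can be sketched in a few lines. This is a rough Python illustration, not the Node.js project mentioned above: the command names and the helpers `build_osascript_command` and `handle_message` are invented for this example, and it assumes the music app is called "iTunes" (newer versions of macOS call it "Music"). How the message arrives from Facebook is left out entirely.

```python
import subprocess
import sys

# Hypothetical chat commands mapped to AppleScript verbs the music app understands.
COMMANDS = {
    "play": "play",
    "pause": "pause",
    "next": "next track",
    "previous": "previous track",
}


def build_osascript_command(message, app="iTunes"):
    """Turn a chat message such as 'Play' into an osascript argument list.

    Returns None when the message is not a known command.
    """
    verb = COMMANDS.get(message.strip().lower())
    if verb is None:
        return None
    script = f'tell application "{app}" to {verb}'
    return ["osascript", "-e", script]


def handle_message(message, app="iTunes"):
    """Build the command, run it on macOS only, and return a reply for the chat."""
    command = build_osascript_command(message, app)
    if command is None:
        return "Sorry, I don't know that command."
    if sys.platform == "darwin":  # osascript only exists on macOS
        subprocess.run(command, check=False)
    return f"OK: {message.strip().lower()}"
```

Keeping the command table separate from the code that runs it means new commands can be added without touching the message-handling logic.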
The hardest part of Zuck’s Jarvis is the natural language processing (NLP) involved. It takes a lot for a machine to know exactly what a human means, since natural language is fluid and offers many ways of communicating the same idea. This is where I would differ from Zuck. Instead of using PHP like he did (who uses PHP for anything anymore?), I would use Python’s NLTK library or a recurrent neural network (RNN) in TensorFlow (preferably the latter).
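To see why this is hard, it helps to try the naive approach first. The sketch below is a deliberately simple keyword matcher using only the standard library; it is a stand-in to show the problem, not NLTK and not a neural network, and the intent names and keyword lists are made up for this example.

```python
import re

# Hypothetical intents, each described by a few trigger words.
INTENT_KEYWORDS = {
    "lights_on": {"light", "lights", "lamp"},
    "play_music": {"play", "music", "song", "songs"},
    "weather": {"weather", "forecast", "rain", "hot", "cold"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify_intent(text):
    """Return the intent sharing the most words with text, or None if no overlap."""
    words = set(tokenize(text))
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent
```

This handles "Play some music" and "Turn the lights on", but a phrase like "It is dark in here" matches nothing even though a person would know what is being asked. That gap is what a proper NLP library or a trained model is meant to close.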
An RNN lets a machine take earlier inputs into account when producing each new output. For example, if I ask you “What’s the weather?” and you say “It’s pretty hot outside,” I use both your comment and my earlier question to come up with a reply like “Yeah, I know, right?” A regular feedforward neural network would process “It’s pretty hot outside” without the previous context and have no way of knowing what it refers to. An RNN, however, carries a hidden state from one input to the next, so its output can depend on multiple previous inputs.
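The context-carrying behavior can be shown with a toy recurrent cell in plain Python. This is only a sketch of the idea: the weights are arbitrary numbers rather than trained values, the inputs are single numbers standing in for encoded messages, and a real model in TensorFlow would use vectors and learned parameters.

```python
import math


def rnn_step(x, h_prev, w_x=0.8, w_h=0.5, b=0.0):
    """One Elman-style recurrent step: the new state mixes input and old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_h=0.5):
    """Feed a sequence through the cell and return the final hidden state."""
    h = 0.0
    for x in inputs:
        h = rnn_step(x, h, w_h=w_h)
    return h


if __name__ == "__main__":
    # Two sequences that end with the same input but have different histories.
    after_weather_question = [1.0, 0.3]
    after_unrelated_message = [-1.0, 0.3]

    # With recurrence the final states differ, because history is carried along.
    print(run_rnn(after_weather_question), run_rnn(after_unrelated_message))

    # With w_h = 0 the cell ignores history, like a plain feedforward layer,
    # so both sequences end in the same state.
    print(run_rnn(after_weather_question, w_h=0.0),
          run_rnn(after_unrelated_message, w_h=0.0))
```

The only difference between the two cases is the `w_h` term, which feeds the previous state back in. Setting it to zero removes the memory.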
Once you’ve done this, the next part is facial recognition. This is actually fairly approachable: OpenCV has Python bindings with built-in face detection, and its contrib modules add face recognizers.
Once you’ve done all of this, the only remaining step is hooking your devices up to your new computer. This can be done with a Raspberry Pi and an Arduino (about $70 altogether). The Raspberry Pi would be responsible for monitoring your Facebook messages and processing them, while the Arduino would handle the low-level circuitry and actually control the electronics.
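One way to split the work between the two boards is for the Raspberry Pi to send short text commands to the Arduino over a serial link. The sketch below shows the Pi side only. The `SET <pin> <0|1>` line format is a made-up protocol for this example, the helper names are hypothetical, and the Arduino would need its own code to parse the same format.

```python
import io


def encode_command(pin, on):
    """Encode a switch command as one ASCII line, e.g. b'SET 13 1\\n'.

    The 'SET <pin> <0|1>' format is invented for this sketch; the Arduino
    program on the other end would have to parse the same format.
    """
    if pin < 0:
        raise ValueError(f"pin must be non-negative, got {pin}")
    return f"SET {pin} {1 if on else 0}\n".encode("ascii")


def send_command(port, pin, on):
    """Write one command to any object with a write() method.

    On a real Raspberry Pi this would be a serial port object (for example
    one opened with the third-party pyserial package); here it is generic.
    """
    port.write(encode_command(pin, on))


if __name__ == "__main__":
    fake_port = io.BytesIO()  # stands in for the serial link to the Arduino
    send_command(fake_port, 13, True)
    send_command(fake_port, 13, False)
    print(fake_port.getvalue())
```

Because `send_command` only needs a `write()` method, the same code can be tried on a laptop with an in-memory buffer before any hardware is connected.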
