Deep Learning is a trending buzzword in the Machine Learning world. All the major players in Silicon Valley are investing heavily in it, and US universities are expanding their course offerings.

I’m really interested in artificial intelligence, both for fun and for work, and over the last few weeks I spent some hours searching for the best MOOCs on the topic. I found only a few courses, but they come from the most notable figures in Deep Learning and Neural Networks.

Machine Learning
Stanford University on Coursera, Andrew Ng

Andrew Ng has been Chief Scientist at Baidu Research since 2014; he is a co-founder of Coursera and teaches Machine Learning at Stanford University. He also founded the Google Brain project in 2011. His Machine Learning course at Stanford (CS229a) is almost legendary and was the obvious starting point for me.

Machine Learning, Coursera

Neural Networks for Machine Learning
University of Toronto on Coursera, Geoffrey Hinton

Geoffrey Hinton has been working at Google (probably on Google Brain) since 2013, when Google acquired his company, DNNResearch Inc. He is a cognitive psychologist best known for his work on artificial neural networks. His Coursera course on Neural Networks dates from 2012 but still seems to be one of the best resources on the topic.

Neural Networks for Machine Learning, Coursera

Deep Learning (2015)
New York University on TechTalks, Yann LeCun (videos on techtalks.tv)

In 2013 LeCun became the first director of Facebook AI Research. He is well known for his work on optical character recognition and computer vision using convolutional neural networks (CNNs), and is considered a founding father of convolutional nets. The 2015 Deep Learning course at NYU is the most recent course he has taught on this topic.

Yann LeCun. CIFAR NCAP pre-NIPS’ Workshop. Photo: Josh Valcarcel/WIRED

Big Data, Large Scale Machine Learning
New York University on TechTalks, John Langford and Yann LeCun

Another interesting Machine Learning course, taught by LeCun and John Langford, who has been a researcher at Yahoo Research, Microsoft Research and IBM’s Watson Research Center.

John Langford, NYU

Deep Learning Courses
NVIDIA Accelerated Computing

This is not a college course. NVIDIA was one of the most important graphics card manufacturers in the early 2000s and now, building on its experience with massively parallel computing on GPUs, is investing heavily in Deep Learning. The course focuses on using GPUs with the most common deep learning frameworks: DIGITS, Caffe, Theano and Torch.

Deep Learning Courses, NVIDIA

Mastering Apache Spark
Mike Frampton, Packt Publishing

Last summer I had the opportunity to help review this title. The chapter on MLlib contains a useful introduction to Artificial Neural Networks on Spark. The implementation still seems young, but it is already possible to distribute the network over a Spark cluster.

Mastering Apache Spark

[UPDATE 2016-01-31]

Deep Learning 
Vincent Vanhoucke, Google, Udacity

A few days ago Google released on Udacity a Deep Learning course focused on TensorFlow, its deep learning tool. It’s the first course officially sponsored by a big company, it is free, and it seems a great introduction. Thanks to Piotr Chromiec for pointing it out 🙂


Over the last year I have refined my RSS collection about big data, data science and analytics. I usually check it every day to discover a ton of cool new technologies and have fun. Here is the updated list.

Bloggers

News about emerging technologies, scalability and data

Data companies, social networks and search engines

Companies supporting and distributing big-data processing products

Recently I discovered the awesome data science list, which includes a list of interesting bloggers I haven’t had time to check yet. You can surely find something more in it. I’ll try to publish an update once I have gone through it.

[UPDATE 2014-09-22 11:35]

Thanks to @onurakpolat for correcting my link to the awesome data science list. The previous link pointed to his fork; the original repo is https://github.com/okulbilisim/awesome-datascience by @okulbilisim

For several years I thought MapReduce was the only paradigm for distributed data processing. Only a few months ago, while I was watching “Clash of the Titans Beyond MapReduce” (a really interesting talk at The Hive meetup), Matei Zaharia (@matei_zaharia), CTO of Databricks and co-creator of Spark, cited Dryad, a programming paradigm developed by Microsoft. I don’t know of any open project that implements it, but it was widely used by Microsoft Research.


Title: Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks (PDF), March 2007
Authors: Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly

Abstract

Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational “vertices” with communication “channels” to form a dataflow graph.

Dryad runs the application by executing the vertices of this graph on a set of available computers, communicating as appropriate through files, TCP pipes, and shared-memory FIFOs. The vertices provided by the application developer are quite simple and are usually written as sequential programs with no thread creation or locking. Concurrency arises from Dryad scheduling vertices to run simultaneously on multiple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation progresses to make efficient use of the available resources.

Dryad is designed to scale from powerful multi-core single computers, through small clusters of computers, to data centers with thousands of computers. The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
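
To make the "vertices and channels" idea from the abstract more concrete, here is a minimal Python sketch of a dataflow graph. It is not Dryad's API and is not taken from the paper: the names DataflowGraph, add_vertex, run and build_word_count are hypothetical, a local thread pool stands in for a cluster, and channels are simply in-memory results passed from one vertex to the next. The only point it tries to show is that each vertex is an ordinary sequential function, while concurrency comes from the scheduler running every vertex whose inputs are ready.

```python
from concurrent.futures import ThreadPoolExecutor


class DataflowGraph:
    """Toy dataflow graph (hypothetical, not Dryad's API).

    Vertices are plain sequential functions; channels are directed edges
    that carry one vertex's output to a downstream vertex.
    """

    def __init__(self):
        self.vertices = {}  # vertex name -> function
        self.inputs = {}    # vertex name -> ordered list of upstream names

    def add_vertex(self, name, func, inputs=()):
        self.vertices[name] = func
        self.inputs[name] = list(inputs)

    def run(self, max_workers=4):
        results = {}
        pending = set(self.vertices)
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while pending:
                # A vertex is ready once all of its input channels have data.
                ready = [v for v in pending
                         if all(u in results for u in self.inputs[v])]
                if not ready:
                    raise ValueError("cycle or missing input vertex in graph")
                # Ready vertices are independent, so run them concurrently.
                futures = {
                    v: pool.submit(self.vertices[v],
                                   *(results[u] for u in self.inputs[v]))
                    for v in ready
                }
                for v, future in futures.items():
                    results[v] = future.result()
                pending -= set(ready)
        return results


def build_word_count(chunks):
    """Build a small graph: one reader and one counter per chunk, then a sum."""
    graph = DataflowGraph()
    counters = []
    for i, chunk in enumerate(chunks):
        graph.add_vertex(f"read{i}", lambda c=chunk: c.split())
        graph.add_vertex(f"count{i}", lambda words: len(words),
                         inputs=[f"read{i}"])
        counters.append(f"count{i}")
    graph.add_vertex("total", lambda *counts: sum(counts), inputs=counters)
    return graph


results = build_word_count(["a b c", "d e"]).run()
print(results["total"])  # 5
```

The sketch leaves out everything that makes the real system hard, such as placing vertices on different machines, recovering from failures, and reshaping the graph at run time.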


Check out the list of interesting papers and projects (Github).