Presto: Interacting with petabytes of data at Facebook

Facebook is a data-driven company. Data processing and analytics are at the heart of building and delivering products for the 1 billion+ active users of Facebook. We have one of the largest data warehouses in the world, storing more than 300 petabytes. The data is used for a wide range of applications, from traditional batch processing…



How to analyze 100 million images for $624 #hipsteranalytics

Jetpac is building a modern version of Yelp, using big data rather than user reviews. People are taking more than a billion photos every single day, and many of these are shared publicly on social networks. We analyze these pictures to discover what they can tell us about bars, restaurants, hotels, and other venues around the…



#Bigdata startups pull in big money in 2013

Investors have pumped $3.6 billion into startups focused on big data this year. Not too shabby: It’s almost three quarters of all the money that’s gone into such companies from 2008 to 2012, according to a new infographic out today from burgeoning site Big Data Startups.

The infographic includes a roundup of the 10 data startups…



How Netflix Reinvented #HR

Sheryl Sandberg has called it one of the most important documents ever to come out of Silicon Valley. It’s been viewed more than 5 million times on the web. But when Reed Hastings and I (along with some colleagues) wrote a PowerPoint deck explaining how we shaped the culture and motivated performance at Netflix, where…



RACI Model: An Accountability Matrix for Every Team

RACI stands for responsible, accountable, consulted, and informed. That’s four levels of “answerability.” A RACI model is a table that lists the members of a team and delineates their level of answerability for every aspect of a project, as shown in the sample chart above.

via RACI Model: An Accountability Matrix for Every Team.



2014 – The Year You Learn to Master #BigData and #Analytics

As the hype of big data and analytics has passed, 2014 will see organisations needing to generate intelligence, ROI, profit and value from their structured and unstructured data to grow profits and market share and edge out the competition.

But how can this data strategy be executed while juggling the moving parts of architecture, analytics, strategy, process, technology,…



Data Scientist versus Business Analyst #analytics #bigdata

Business analysts focus on database design (database modeling, at a high level, including defining metrics, dashboard design, retrieving and producing executive reports and designing alarm systems), ROI assessment on various business projects and expenditures, and budget issues. Some work on marketing or finance planning and optimization, and risk management. Many work on high-level project…



Data Scientist versus Statistician #bigdata

Many statisticians think that data science is about analyzing data, but it is more than that. Data science also involves implementing algorithms that process data automatically, to provide automated predictions and actions, such as:

Automated bidding systems; estimating (in real time) the value of all houses in the United States (Zillow.com); high-frequency trading; matching a Google Ad with a…



Six categories of Data Scientists #datascience #bigdata

There are now 9 categories after a few updates. Just like there are a few categories of statisticians (biostatisticians, statisticians, econometricians, operations research specialists, actuaries) or business analysts (marketing-oriented, product-oriented, finance-oriented, etc.), we have different categories of data scientists. First, many data scientists have a job title different from data scientist, mine for instance…



5 Tips For Your First IT Internship

You’ve gotten your first IT internship, and it starts soon. How can you not only do well in this internship, but also make it count towards your long-term career success? Learn some tips for your first IT internship in this article.

via 5 Tips For Your First IT Internship.