This is part 2 of a clustering demo in R. Part 1 deals with assessing the clustering tendency of the data and deciding on the number of clusters. This part performs the clustering using the Partitioning Around Medoids (PAM) algorithm and validates the results.
Cluster analysis, more usually referred to as clustering, is a common data mining task. In clustering, the goal is to divide the data set into groups so that objects in the same group are similar to one another while objects in different groups are dissimilar. In other words, the goal is to minimize the intra-cluster distance while maximizing the inter-cluster distance.
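To make the two quantities concrete, here is a minimal sketch in Python (standard library only) that computes an average intra-cluster and inter-cluster distance for a toy grouping. The data points, the cluster labels and the function names are all made up for illustration, and averaging pairwise Euclidean distances is only one of several ways these quantities could be defined.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8

# Hypothetical toy data: two well-separated groups of 2-D points.
clusters = {
    "a": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "b": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    distances = [
        dist(p, q)
        for points in clusters.values()
        for p, q in combinations(points, 2)
    ]
    return sum(distances) / len(distances)


def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    distances = [
        dist(p, q)
        for (_, left), (_, right) in combinations(clusters.items(), 2)
        for p in left
        for q in right
    ]
    return sum(distances) / len(distances)


intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print(f"mean intra-cluster distance: {intra:.2f}")
print(f"mean inter-cluster distance: {inter:.2f}")
```

For a grouping that fits the data well, the first number should come out noticeably smaller than the second. If points were assigned to the wrong groups, the gap between the two would shrink.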
The General Data Protection Regulation (GDPR) is a new data protection regulation that will be effective across the EU from 25th May 2018. The GDPR applies to all companies that process the personal data of individuals in the EU, regardless of where the companies are based. It replaces Directive 95/46/EC, normally referred to as the Data Protection Directive, which dates back to the 1990s.
Data Science, Data Mining, Machine Learning, Artificial Intelligence, Big Data … the list goes on. These are all terms that eager cheerleaders of the data revolution say organisations need to embrace. With all the hype and attention, it’s not surprising that businesses feel they need to become more data-driven or risk losing competitive advantage. There is definitely substance to the hype; otherwise companies like IBM wouldn’t be pouring billions of dollars into their big data capabilities.
I have been interested in parapsychology ever since picking up a copy of the excellent Eysenck and Sargent book, Explaining the Unexplained, many years ago. It’s a bit dated now but still a great introduction for anyone interested in learning more about the topic.
Just as with data science, there is sometimes confusion about what parapsychology is. Perhaps it’s easier to start with what it’s not. It’s not astrology, ghost busting, monster hunting, fortune telling or investigating UFO sightings, though these are the things that often come to mind when one thinks of the paranormal, mainly because of the influence of television shows on the ‘paranormal’.
There is huge hype at the moment about Data Science and it seems like everybody is trying to get in on the game. While hype might help bring the topic to popular attention, it can also serve to obscure and confuse. What do people mean when they talk about data science? Is it all just hype?
Sometimes people assume that data science and related areas are all about consumer-facing businesses trying to learn ways to sell more product. I’m not sure whether this viewpoint is more naive or more cynical. Data science is about translating raw data into knowledge and insight to enable better decision making. It therefore has wide and varied applications. Data science and related fields like data mining, machine learning and big data have huge potential to drive innovation in social enterprise and business for social good.
Education, too, is being affected by the data revolution. This isn’t something that may happen in the future; it has already begun. Schools in America are already using systems that combine data points like attendance and grades to predict, years in advance, which students are at risk of dropping out.