Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery.
Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for pattern-based classification and some interesting applications of pattern discovery. This course gives you the opportunity to practice scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.
For this course you need basic computing proficiency, including some programming experience in a typical programming language such as C++, Java, or Python, and knowledge of basic concepts of databases, artificial intelligence, and statistics.
Join us on the frontier of bioinformatics and learn how to look for hidden messages in DNA without ever needing to put on a lab coat. In the first half of this course, we'll investigate DNA replication, and ask the question, where in the genome does DNA replication begin? You will learn how to answer this question for many bacteria using straightforward algorithms to look for hidden messages in the genome.
Learn to program with Java in an easy and interactive way! In this introductory Java programming course, you will be introduced to powerful concepts such as functional abstraction, the object-oriented programming (OOP) paradigm, and Application Programming Interfaces (APIs). Examples and case studies will be provided so that you can implement simple programs on your own or collaborate with peers.
This course begins a series of classes illustrating the power of computing in modern biology. Please join us on the frontier of bioinformatics to look for hidden messages in DNA without ever needing to put on a lab coat.
The world and the internet are full of textual information. We search for information using textual queries, and we read websites, books, and e-mails. All of these are strings from the point of view of computer science. To make sense of all that information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.
With every smartphone and computer now boasting multiple processors, the use of functional ideas to facilitate parallel programming is becoming increasingly widespread. In this course, you'll learn the fundamentals of parallel programming, from task parallelism to data parallelism. In particular, you'll see how many familiar ideas from functional programming map perfectly to the data parallel paradigm.
Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms, and so on). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression. In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select among models of various complexity. You will also analyze the impact of aspects of your data, such as outliers, on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets.
This course, which is designed to serve as the first course in the Recommender Systems specialization, introduces the concept of recommender systems, reviews several examples in detail, and leads you through non-personalized recommendation using summary statistics and product associations, basic stereotype-based or demographic recommendations, and content-based filtering recommendations.
Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies, all centered around distributed systems. Learn about these fundamental distributed computing concepts for cloud computing.
Want to learn the basics of large-scale data processing? Need to make predictive models but don’t know the right tools? This course will introduce you to open source tools you can use for parallel, distributed and scalable machine learning.
You've learned the basic algorithms and are now ready to step into the area of more complex problems and the algorithms that solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with network flows, which are used in more obvious applications such as optimal matchings, finding disjoint paths, and flight scheduling, as well as more surprising ones like image segmentation in computer vision or finding dense clusters in the advertiser-search query graphs at search engines. We then proceed to linear programming, with applications in optimizing budget allocation, portfolio optimization, finding the cheapest diet satisfying all requirements, call routing in telecommunications, and many others. Next we discuss inherently hard problems for which no good exact solutions are known (and none are likely to be found) and how to solve them approximately in a reasonable time. We finish with some applications to Big Data and Machine Learning, which are heavy on algorithms right now.
Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but also the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.
The primary topics in this part of the specialization are: data structures (heaps, balanced search trees, hash tables, Bloom filters), graph primitives (applications of breadth-first and depth-first search, connectivity, shortest paths), and their applications (ranging from deduplication to social network analysis).
MOOCs (Massive Open Online Courses) enable students around the world to take university courses online. This guide, by the instructors of Principles of Written English, edX’s most successful MOOC in 2013-2014 based on both enrollments and rate of completion, advises current and future students on how to get the most out of their online study. It covers areas such as what types of courses are offered and who offers them, what resources students need, how to register, how to work effectively with other students, how to interact with professors and staff, and how to handle assignments. This second edition offers a new chapter on how to stay motivated. This book is suitable for both native and non-native speakers of English, and is applicable to MOOC classes on any subject (and indeed, to just about any type of online study).