Data Science

 

 


 

Jun 5th 2017

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.

Average: 6.1 (16 votes)
Jun 5th 2017

This course will give you the foundations of the statistical programming language R, the lingua franca of statistics, which will let you write programs that read, manipulate, and analyze quantitative data. We will walk through installing the language; you will also get an introduction to the base graphics systems and to the ggplot2 plotting package for visualizing these data. You will also learn to use RStudio, one of the most popular IDEs in the R user community.

Average: 6.8 (6 votes)
Jun 5th 2017

An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Average: 7.2 (5 votes)
Jun 5th 2017

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies and nuance.

Average: 7.1 (11 votes)
Jun 5th 2017

Learn to use tools from the Bioconductor project to perform analysis of genomic data. This is the fifth course in the Genomic Big Data Specialization from Johns Hopkins University.

Average: 8.3 (3 votes)
Jun 5th 2017

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary.

Average: 6.7 (3 votes)
Jun 5th 2017

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

Average: 7.2 (5 votes)
Jun 5th 2017

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Average: 8.6 (5 votes)
Jun 5th 2017

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.

Average: 7.7 (7 votes)
Jun 5th 2017

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but also the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Average: 6.5 (6 votes)
Jun 5th 2017

Prediction and machine learning are among the most common tasks performed by data scientists and data analysts. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

Average: 5.9 (18 votes)
Jun 5th 2017

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Average: 7.9 (11 votes)
Jun 5th 2017

Get an overview of the data, questions, and tools that data analysts and data scientists work with. This is the first course in the Johns Hopkins Data Science Specialization. In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, like version control, Markdown, Git, GitHub, R, and RStudio.

Average: 4.5 (24 votes)
Jun 5th 2017

In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

Average: 5.6 (24 votes)
Jun 5th 2017

This course will introduce the learner to the basics of the Python programming environment, including how to download and install Python, expected fundamental Python programming techniques, and how to find help with Python programming questions. The course will also introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the DataFrame as the central data structure for data analysis.

Average: 7.3 (4 votes)
Jun 1st 2017

With full tuition under $20K, the University of Illinois Master of Computer Science - Data Science (MCS-DS) is the most affordable gateway to one of the most lucrative and fastest growing careers of the new millennium. The MCS-DS builds expertise in four core areas of computer science: data visualization, machine learning, data mining and cloud computing, in addition to building valuable skill sets in statistics and information science with courses taught in collaboration with the University’s Statistics Department and iSchool (ranked #1 among Library and Information Studies Schools).

No votes yet
May 29th 2017

By now you have definitely heard about data science and big data. In this one-week class, we will provide a crash course in what these terms mean and how they play a role in successful organizations. This class is for anyone who wants to learn what all the data science action is about, including those who will eventually need to manage data scientists. The goal is to get you up to speed as quickly as possible on data science without all the fluff. We've designed this course to be as convenient as possible without sacrificing any of the essentials.

Average: 7.2 (13 votes)
May 29th 2017

This one-week course describes the process of analyzing data and how to manage that process. We describe the iterative nature of data analysis and the role of stating a sharp question, exploratory data analysis, inference, formal statistical modeling, interpretation, and communication. In addition, we will describe how to direct analytic activities within a team and to drive the data analysis process towards coherent and useful results.

Average: 6.3 (14 votes)
May 29th 2017

This course covers advanced topics in R programming that are necessary for developing powerful, robust, and reusable data science tools. Topics covered include functional programming in R, robust error handling, object oriented programming, profiling and benchmarking, debugging, and proper design of functions.

Average: 6.4 (5 votes)
May 29th 2017

This course will introduce the learner to information visualization basics, with a focus on reporting and charting using the matplotlib library. The course will start with a design and information literacy perspective, touching on what makes a visualization good or bad and how statistical measures translate into visualizations.

Average: 2 (1 vote)
