Great data visualization starts with great ideas and design but only materializes with the right tools. Here is a list of software I have used to produce figures and what I think their best features are. … [Read more]
Category Archives: Biostatistics
[Literature collection] Statistics notes from The BMJ
Statistics notes from The BMJ
- Interpreting diagnostic accuracy studies for patient care
- Brackets (parentheses) in formulas
- How to obtain the P value from a confidence interval
- Comparisons within randomised groups can be very misleading
- Correlation in restricted ranges of data
- Analysis of continuous data from small samples
- Parametric v non-parametric methods for data analysis
- Missing data
- The cost of dichotomising continuous variables
- Standard deviations and standard errors
- Treatment allocation by minimisation
- Diagnostic tests 4: likelihood ratios
- The logrank test
- Interaction revisited: the difference between two estimates
- Validating scales and indexes
- Analysing controlled trials with baseline and follow up measurements
- Concealing treatment allocation in randomised trials
- Blinding in clinical trials and other studies
- …
Standard deviation and standard error
The terms “standard error” and “standard deviation” are often confused. The contrast between these two terms reflects the important distinction between data description and inference, one that all researchers should appreciate. … [Read more]
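To make the contrast concrete, here is a minimal Python sketch, not taken from the BMJ note itself. It assumes the usual definitions: the sample standard deviation describes the spread of the data, and the standard error of the mean is that standard deviation divided by the square root of the sample size. The function name `sd_and_se` and the sample values are made up for illustration.

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample standard deviation, standard error of the mean).

    Hypothetical helper for illustration only.
    """
    sd = statistics.stdev(values)        # describes spread of the data (n - 1 denominator)
    se = sd / math.sqrt(len(values))     # describes uncertainty of the sample mean
    return sd, se


# Made-up sample, used only to show the two quantities side by side.
sample = [4.0, 6.0, 8.0, 10.0, 12.0]
sd, se = sd_and_se(sample)
print(f"mean = {statistics.mean(sample):.2f}, SD = {sd:.4f}, SE = {se:.4f}")
```

The standard deviation stays roughly stable as more observations are collected, while the standard error shrinks with the square root of the sample size, which is why the first is used for description and the second for inference.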
[Recommended] Quickly create online and interactive plots using Plot.ly
As readers likely know, putting your data online as interactive visualizations can be a lot of work. We like to use D3, Highcharts and the Google visualization API, but all of these tools require some serious programming. When you’re building a website with custom data visualization the effort might make sense, but when you have data you want to share quickly and elegantly, a new data visualization tool, Plot.ly, is a nice option. You can create graphics by uploading data and manually setting plot options, or you can create and upload directly from Python, R or other environments. … [Read more]
[Recommended] Beautiful plotting in R: A ggplot2 cheatsheet
Even the most experienced R users need help creating elegant graphics. The ggplot2 library is a phenomenal tool for creating graphics in R, but even after many years of near-daily use we still need to refer to our Cheat Sheet. Up until now, we’ve kept these key tidbits on a local PDF. But for our own benefit (and hopefully yours) we decided to post the most useful bits of code. … [Read more]
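[Recommended] A data scientist teaches you how to date using data models
It is by now a well-known "scientific fact" that men and women come from different planets. Men always think that women are mysterious creatures whose emotional states seem to fluctuate by the second, hard to understand and even harder to predict! Women, for their part, think that men are unfeeling animals who simply cannot grasp what a feeling is, even though they have been told N times! This fundamental difference between men and women means that their relationships are governed by an extraordinarily complex system. … [Read more]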
[Recommended] Code School
Code School
Code School teaches web technologies in the comfort of your browser with video lessons, coding challenges, and screencasts. We strive to help you learn by doing. … [Read more]
[Recommended] ECplot: An online tool for making standardized plots from large datasets for bioinformatics publications
[Recommended] Box plots and BoxPlotR
box plots
- Bring on the box plots
- Points of View: Bar charts and box plots
- Points of Significance: Visualizing samples with box plots
- Box plot (Wikipedia)
- 40 years of boxplots
- Methods for Presenting Statistical Information: The Box Plot
- Beanplot: A Boxplot Alternative for Visual Comparison of Distributions
BoxPlotR
This application allows users to generate customized box plots in a number of variants based on their data. A data matrix can be uploaded as a file or pasted into the application. Basic box plots are generated from the data and can be modified to include additional information. Further features become available when the corresponding option is checked. Information about sample sizes can be represented by the width of each box, where the widths are proportional to the square roots of the number of observations n. Notches can be added to the boxes. These extend +/-1.58*IQR/sqrt(n) around the median; when the notches of two boxes do not overlap, this gives roughly 95% confidence that the two medians are different. It is also possible to define the whiskers based on the ideas of Spear and Tukey. Additional data visualization options (violin and bean plots) reveal more information about the underlying data distribution. Plots can be labeled, customized (colors, dimensions, orientation) and exported as eps, pdf and svg files. … [Read more]
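The notch and box-width rules described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BoxPlotR's own code: the function names `notch_limits` and `relative_box_width` are hypothetical, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which may differ slightly from the hinge definition a given plotting tool uses.

```python
import math
import statistics


def notch_limits(values):
    """Return (lower, upper) notch limits: median +/- 1.58 * IQR / sqrt(n).

    Hypothetical helper; the quartile method is an assumption and may not
    match the one used by any particular box plot implementation.
    """
    n = len(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median + half_width


def relative_box_width(n, n_max):
    """Box width relative to the largest group, proportional to sqrt(n)."""
    return math.sqrt(n / n_max)


# Made-up data, used only to exercise the two functions.
lower, upper = notch_limits(list(range(1, 10)))
print(f"notch: [{lower:.3f}, {upper:.3f}]")
print(f"relative width for n=25 vs n_max=100: {relative_box_width(25, 100)}")
```

Because the notch half-width shrinks with sqrt(n), larger samples give narrower notches, so two boxes with non-overlapping notches become easier to obtain as the sample size grows.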
[Recommended] Intermediate R/Bioconductor for High-Throughput Sequence Analysis
Intermediate R/Bioconductor for High-Throughput Sequence Analysis introduces users with some R experience to common Bioconductor workflows for sequence analysis. The course involves a combination of presentations and hands-on exercises. Our starting point is BAM files created by aligning short reads to a reference genome. Topics include exploratory analysis (GenomicRanges, Rsamtools); assessing differential expression of known genes (DESeq); and detection, calling, and manipulation of variants (VariantTools, VariantAnnotation). We learn how to integrate results with curated gene and genomic annotations (GenomicFeatures), and to visualize results (GViz, ggbio). … [Read more]