kb/data/en.wikipedia.org/wiki/Biostatistics-5.md

7.0 KiB

title chunk source category tags date_saved instance
Biostatistics 6/6 https://en.wikipedia.org/wiki/Biostatistics reference science, encyclopedia 2026-05-05T14:01:53.013815+00:00 kb-cron

=== Expression data === Studies for differential expression of genes from RNA-Seq data, as for RT-qPCR and microarrays, demands comparison of conditions. The goal is to identify genes which have a significant change in abundance between different conditions. Then, experiments are designed appropriately, with replicates for each condition/treatment, randomization and blocking, when necessary. In RNA-Seq, the quantification of expression uses the information of mapped reads that are summarized in some genetic unit, as exons that are part of a gene sequence. As microarray results can be approximated by a normal distribution, RNA-Seq counts data are better explained by other distributions. The first used distribution was the Poisson one, but it underestimate the sample error, leading to false positives. Currently, biological variation is considered by methods that estimate a dispersion parameter of a negative binomial distribution. Generalized linear models are used to perform the tests for statistical significance and as the number of genes is high, multiple tests correction have to be considered. Some examples of other analysis on genomics data comes from microarray or proteomics experiments. Often concerning diseases or disease stages.

=== Other studies === Ecology, ecological forecasting Biological sequence analysis Systems biology for gene network inference or pathways analysis. Clinical research and pharmaceutical development Population dynamics, especially in regards to fisheries science. Phylogenetics and evolution Pharmacodynamics Pharmacokinetics Neuroimaging

== Tools == There are a lot of tools that can be used to do statistical analysis in biological data. Most of them are useful in other areas of knowledge, covering a large number of applications (alphabetical). Here are brief descriptions of some of them:

ASReml: Another software developed by VSNi that can be used also in R environment as a package. It is developed to estimate variance components under a general linear mixed model using restricted maximum likelihood (REML). Models with fixed effects and random effects and nested or crossed ones are allowed. Gives the possibility to investigate different variance-covariance matrix structures. CycDesigN: A computer package developed by VSNi that helps the researchers create experimental designs and analyze data coming from a design present in one of three classes handled by CycDesigN. These classes are resolvable, non-resolvable, partially replicated and crossover designs. It includes less used designs the Latinized ones, as t-Latinized design. Orange: A programming interface for high-level data processing, data mining and data visualization. Include tools for gene expression and genomics. R: An open source environment and programming language dedicated to statistical computing and graphics. It is an implementation of S language maintained by CRAN. In addition to its functions to read data tables, take descriptive statistics, develop and evaluate models, its repository contains packages developed by researchers around the world. This allows the development of functions written to deal with the statistical analysis of data that comes from specific applications. In the case of Bioinformatics, for example, there are packages located in the main repository (CRAN) and in others, as Bioconductor. It is also possible to use packages under development that are shared in hosting-services as GitHub. SAS: A data analysis software widely used, going through universities, services and industry. Developed by a company with the same name (SAS Institute), it uses SAS language for programming. PLA 3.0: Is a biostatistical analysis software for regulated environments (e.g. drug testing) which supports Quantitative Response Assays (Parallel-Line, Parallel-Logistics, Slope-Ratio) and Dichotomous Assays (Quantal Response, Binary Assays). It also supports weighting methods for combination calculations and the automatic data aggregation of independent assay data. Weka: A Java software for machine learning and data mining, including tools and methods for visualization, clustering, regression, association rule, and classification. There are tools for cross-validation, bootstrapping and a module of algorithm comparison. Weka also can be run in other programming languages as Perl or R. Python (programming language) image analysis, deep-learning, machine-learning SQL databases NoSQL NumPy numerical python SciPy SageMath LAPACK linear algebra MATLAB Apache Hadoop Apache Spark Amazon Web Services

== Scope and training programs == Almost all educational programmes in biostatistics are at postgraduate level. They are most often found in schools of public health, affiliated with schools of medicine, forestry, or agriculture, or as a focus of application in departments of statistics. In the United States, where several universities have dedicated biostatistics departments, many other top-tier universities integrate biostatistics faculty into statistics or other departments, such as epidemiology. Thus, departments carrying the name "biostatistics" may exist under quite different structures. For instance, relatively new biostatistics departments have been founded with a focus on bioinformatics and computational biology, whereas older departments, typically affiliated with schools of public health, will have more traditional lines of research involving epidemiological studies and clinical trials as well as bioinformatics. In larger universities around the world, where both a statistics and a biostatistics department exist, the degree of integration between the two departments may range from the bare minimum to very close collaboration. In general, the difference between a statistics program and a biostatistics program is twofold: (i) statistics departments will often host theoretical/methodological research which are less common in biostatistics programs and (ii) statistics departments have lines of research that may include biomedical applications but also other areas such as industry (quality control), business and economics and biological areas other than medicine.

== Specialized journals ==

Biostatistics International Journal of Biostatistics Journal of Epidemiology and Biostatistics Biostatistics and Public Health Biometrics Biometrika Biometrical Journal Communications in Biometry and Crop Science Statistical Applications in Genetics and Molecular Biology Statistical Methods in Medical Research Pharmaceutical Statistics Statistics in Medicine

== See also == Bioinformatics Epidemiological method Epidemiology Group size measures Health indicator Mathematical and theoretical biology

== References ==

== External links == Media related to Biostatistics at Wikimedia Commons The International Biometric Society The Collection of Biostatistics Research Archive Guide to Biostatistics (MedPageToday.com) Archived 2012-05-22 at the Wayback Machine Biomedical Statistics